mirror of
https://github.com/netdata/netdata.git
synced 2025-04-15 01:58:34 +00:00
2 commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
|
5f72d4279b
|
Streaming improvements No 3 (#19168)
* ML uses synchronous queries
* do not call malloc_trim() to free memory, since it locks everything
* Reschedule dimensions for training from worker threads.
* terminology: what we collect or read from the database is SAMPLES; what we generate for a chart is POINTS
* keep the receiver send buffer 10x the default
* support autoscaling stream circular buffers
* nd_poll() prefers sending data vs receiving data - in an attempt to dequeue as soon as possible
* fix last commit
* allow removing receiver and senders inline, if the stream thread is not working on them
* fix logs
* Revert "nd_poll() prefers sending data vs receiving data - in an attempt to dequeue as soon as possible"
This reverts commit
|
||
|
9ecf021ec2
|
Streaming improvements #1 (#19137)
* prefer tinysleep over yielding the processor
* split spinlocks to separate files
* rename spinlock initializers
* Optimize ML queuing operations.
  - Allocate 25% of cores for ML.
  - Split queues by request type.
  - Accurate stats for queue operations by type.
* abstracted circular buffer into a new private structure to enable using it in receiver sending side - no features added yet, only abstracted the existing functionality - not tested yet
* completed the abstraction of stream circular buffer
* unified list of receivers and senders; opcodes now support both receivers and senders
* use strings in pluginsd
* stream receivers send data back to the child using the event loop
* do not share pgc aral between caches
* pgc uses 4 to 256 partitions, by default equal to the number of CPU cores
* add forgotten worker job
* workers now monitor spinlock contention
* stream sender tries to lock the sender, but does not wait for it - it will be handled later
* increase the number of web server threads to the number of cpu cores, with a minimum of 6
* use the nowait versions of nd_sock functions
* handle EAGAIN properly
* add spinlock contention tracing for rw_spinlock
* aral lock/unlock contention tracing
* allocate the compressed buffer
* use 128KiB for aral default page size; limit memory protection to 5GiB
* aral uses mmap() for big pages
* enrich log messages
* renamed telemetry to pulse
* unified sender and receiver socket event loops
* logging improvements
* NETDATA_LOG_STREAM_SENDER logs inbound and outbound traffic
* 16k receiver buffer size to improve interactivity
* fix NETDATA_LOG_STREAM_SENDER in sender_execute
* do not stream ML models for charts and dimensions that have not been exposed
* add support for sending QUIT to plugins and waiting for some time for them to quit gracefully
* global spinlock contention per function
* use an aral per pgc partition; use 8 partitions for PGD
* rrdcalc: do not change the frequency of alerts - it uses arbitrary values used during replication, changing permanently the frequency of alerts
* replication: use 1/3 of the cores or 1 core every 10 nodes (min of the two)
* pgd: use as many aral partitions as the CPU cores, up to 256
* aral does 1 allocation per page (the structure and the elements together), instead of two
* use the evictor thread only when we run out of memory; restore the optimization about prepending or appending clean pages based on their accesses; use the main cache free memory for the other caches, reducing I/O when the main cache has enough room
* reduce the number of events per poll() to 10
* aral allocates pages of up to 1MiB; restore processing 100 events per nd_poll() call
* drain the sockets while reading
* receiver sockets should be non-blocking
* add stability detector to aral
* increase the receivers send buffer
* do not remove the sender or the receiver while we drain the input sockets
---------
Co-authored-by: vkalintiris <vasilis@netdata.cloud> |
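
Several items in the commit message above describe sizing rules in words (pgc partitions, web server threads, ML cores, replication threads). The following is a minimal C sketch of those rules as stated, for illustration only. The function names are hypothetical and are not taken from the Netdata source. Rounding down and the lower bound of 1 for the ML and replication cases are assumptions, since the commit message does not specify them.

```c
#include <stddef.h>

/* Clamp v into the inclusive range [lo, hi]. */
static size_t clamp_size(size_t v, size_t lo, size_t hi) {
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

/* "pgc uses 4 to 256 partitions, by default equal to the number of CPU cores"
 * Hypothetical helper name. */
size_t pgc_partitions_default(size_t cpu_cores) {
    return clamp_size(cpu_cores, 4, 256);
}

/* "increase the number of web server threads to the number of cpu cores,
 * with a minimum of 6". Hypothetical helper name. */
size_t web_threads_default(size_t cpu_cores) {
    return cpu_cores < 6 ? 6 : cpu_cores;
}

/* "Allocate 25% of cores for ML."
 * Assumption: integer division (rounds down), never fewer than 1. */
size_t ml_threads_default(size_t cpu_cores) {
    size_t n = cpu_cores / 4;
    return n < 1 ? 1 : n;
}

/* "replication: use 1/3 of the cores or 1 core every 10 nodes (min of the two)"
 * Assumption: both terms round down, and the result is never fewer than 1. */
size_t replication_threads_default(size_t cpu_cores, size_t nodes) {
    size_t by_cores = cpu_cores / 3;
    size_t by_nodes = nodes / 10;
    size_t n = by_cores < by_nodes ? by_cores : by_nodes;
    return n < 1 ? 1 : n;
}
```

Under these assumptions, a 24-core parent with 50 child nodes would get 5 replication threads (the node-based term is the smaller of 8 and 5), while a 2-core machine would still get 4 pgc partitions and 6 web server threads because of the lower bounds.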
Renamed from src/daemon/telemetry/telemetry-aral.c