Mirror of https://github.com/netdata/netdata.git (synced 2025-04-06 22:38:55 +00:00)
Balance streaming parents (#18945)
* recreate the circular buffer from time to time
* do not update cloud url if the node id is not updated
* remove deadlock and optimize pipe size
* removed const
* finer control on randomized delays
* restore children re-connecting to parents
* handle partial pipe reads; sender_commit() now checks if the sender is still connected to avoid bombarding it with data that cannot be sent
* added commented code about optimizing the array of pollfds
* improve interactivity of sender; code cleanup
* do not use the pipe for sending messages, instead use a queue in memory (that can never be full)
* fix dictionaries families
* do not destroy aral on replication exit - it crashes the senders
* support multiple dispatchers and connectors; code cleanup
* more cleanup
* Add serde support for KMeans models:
  - Serialization/Deserialization support of KMeans models.
  - Send/receive ML models between a child/parent.
  - Fix some rare and old crash reports.
  - Reduce allocations by a couple thousand per second when training.
  - Enable ML statistics temporarily, which might increase CPU consumption.
* fix ml models streaming
* up to 10 dispatchers and 2 connectors
* experiment: limit the number of receivers to the number of cores - 2
* reworked compression at the receiver to minimize read operations
* multi-core receivers
* use slot 0 on receivers
* use slot 0 on receivers
* use half the cores for receivers with a minimum of 4
* cancel receiver threads
* use offsets instead of pointers in the compressed buffer; track last reads
* fix crash on using freed decompressor; core re-org
* fix incorrect job registration
* fix send_to_plugin() for SSL
* add reason to disconnect message
* fix signaling receivers to stop
* added --dev option to netdata-installer.sh to prevent it from removing the build directory
* Fix serde of double values: NaNs and +/- infinities are encoded as strings (see the sketch below).
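JSON has no literals for NaN or infinity, so encoding them as strings is the usual workaround when serializing doubles. A minimal illustrative sketch of the idea, not the actual netdata serde code; the helper names here are made up:

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: print a double as a JSON value, using string tokens
// for the values JSON cannot represent as numbers.
static void json_write_double(FILE *out, double v) {
    if (isnan(v))
        fputs("\"nan\"", out);
    else if (isinf(v))
        fputs(v > 0 ? "\"inf\"" : "\"-inf\"", out);
    else
        fprintf(out, "%.17g", v);   // 17 significant digits round-trip IEEE-754 doubles
}

// Hypothetical reverse mapping used while deserializing.
static double json_read_double(const char *token) {
    if (strcmp(token, "\"nan\"") == 0)  return NAN;
    if (strcmp(token, "\"inf\"") == 0)  return INFINITY;
    if (strcmp(token, "\"-inf\"") == 0) return -INFINITY;
    return strtod(token, NULL);
}
```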
* unused param
* reset max cbuffer size when it is recreated
* struct receiver_state is now private
* 1 dispatcher, 1 connector, 2/3 cores for receivers
* all replication requests are served by replication threads - never the dispatcher threads
* optimize partitions and cache lines for dbengine cache
* fix crash on receiver shutdown
* rw spinlock now prioritizes writers
* backfill all higher tiers
* extent cache to 10%
* automatic sizing of replication threads
* add more replication threads
* configure cache eviction parameters to avoid running in aggressive mode all the time
* run evictions and flushes every 100ms
* add missing initialization
* add missing initialization - again
* add evictors for all caches
* add dedicated evict thread per cache
* destroy the completion
* avoid sending too many signals to eviction threads
* alternative way to make sure there are data to evict
* measure inline cache events
* disable inline evictions and flushing for open and extent cache
* use a spinlock to avoid sending too many signals
* batch evictions are not in steps of pages
* fix wanted cache size when there are no clean entries in it
* fix wanted cache size when there are no clean entries in it
* fix wanted cache size again
* adaptive batch evictions; batch evictions first try all partitions
* move waste events to waste chart
* added evict_traversed
* evict in smaller steps
* removed obsolete code
* disabled inlining of evictions and flushing; added timings for evictions
* more detailed timings for evictions
* use inline evictors
* use aral for gorilla pages of 512 bytes, when they are loaded from disk
* use aral for all gorilla page sizes loaded from disk
* disable inlining again to test it after the memory optimization
* timings for dbengine evictions
* added timing names
* detailed timings
* detailed timings - again
* removed timings and restored inline evictions
* eviction on release only under critical pressure
* cleanup and replication tuning
* tune cache size calculation
* tune replication threads calculation
* make streaming receiver exit
* Do not allocate/copy extent data twice.
* Build/link mimalloc. Just for testing, it will be reverted.
* lower memory requirements
* Link mimalloc statically
* run replication with synchronous queries
* added missing worker jobs in sender dispatcher
* enable batch evictions in pgc
* fix sender-dispatcher workers
* set max dispatchers to 2
* increase the default replication threads
* log stream_info errors
* increase replication threads
* log the json text when we fail to parse json response of stream_info
* stream info response may come back in multiple steps
* print the socket error of stream info
* added debug to stream info socket error
* loop while content-length is smaller than the payload received (see the sketch below)
* Revert "Link mimalloc statically". This reverts commit c98e482d47.
* Revert "Build/link mimalloc". This reverts commit 8aae22a28a.
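The stream_info items above all deal with the same low-level issue: an HTTP response body can arrive in multiple reads, so the receiver has to keep reading until the promised Content-Length has been satisfied. A hedged POSIX sketch of that loop, not netdata's code, with illustrative names:

```c
#include <sys/types.h>
#include <sys/socket.h>

// Keep reading from a socket until content_length body bytes have arrived.
// A single recv() may return only part of the response.
static ssize_t read_full_body(int fd, char *buf, size_t content_length) {
    size_t received = 0;
    while (received < content_length) {
        ssize_t r = recv(fd, buf + received, content_length - received, 0);
        if (r <= 0)
            return -1;   // error, or the peer closed before the full payload arrived
        received += (size_t)r;
    }
    return (ssize_t)received;
}
```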
* Remove NEED_PROTOBUF
* Use mimalloc
* Revert "Use mimalloc". This reverts commit 9a68034786.
* Use mimalloc
* support 256 bytes gorilla pages, when they are loaded from disk
* added os_mem_available() (see the sketch below)
* test memory protection
* use protection only on one cache
* use the free memory of the main cache in the other caches too
* use the free memory of the main cache in the open cache too
* Batch gorilla writes by tracking the last written number. In a setup with 200 children, `perf` shows that the worst offender is the gorilla write operation, reporting ~17% overhead. With this change `perf` reports ~4% overhead and netdata's CPU consumption decreased by ~16%.
* make buffered_reader_next_line() a couple times faster
* flushing open cache
* Use re2c for the line splitting in pluginsd. The function gets optimized around 3x. We should delete the old code and use re2c for the rest of the functions, but we need to keep the PR size as minimal as possible. Will do in follow-up PRs.
* use cores - 1 for receivers, use only 1 sender
* move sender processing to a separate function
* Revert "Batch gorilla writes by tracking the last written number." This reverts commit 2e72a5c56d.
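The os_mem_available() item above suggests querying the OS for currently available memory; on Linux this is typically the MemAvailable field of /proc/meminfo. A minimal sketch under that assumption, not the actual netdata function:

```c
#include <stdio.h>

// Illustrative Linux-only helper: return MemAvailable in bytes, or 0 on failure.
static unsigned long long mem_available_bytes(void) {
    FILE *fp = fopen("/proc/meminfo", "r");
    if (!fp) return 0;

    char line[256];
    unsigned long long kib = 0;
    while (fgets(line, sizeof(line), fp)) {
        if (sscanf(line, "MemAvailable: %llu kB", &kib) == 1)
            break;
    }
    fclose(fp);
    return kib * 1024ULL;
}
```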
* Batch gorilla writes only from writers. This reapplies df79be2f01145bd79091a8934d7c80b4b3eb915b and introduces a couple of changes to remove writes from readers.
* log information for buffer overflow
* fix heap use after free
* added comments to the main stream receiver loop
* 3 dispatchers
* single threaded receiver and sender
* code cleanup
* de-associate hosts from streaming threads when both the receiver and sender stop, so that each time the threads are re-balanced
* fix heap use after free
* properly get the slot number of pollfd
* fixes
* fixes
* revert worker changes
* reuse streaming threads
* backfilling should be synchronous
* remove the node last
* do not keep a pointer to relocatable buffer
* give to pgc the right page size, not less
* restore spreading metrics size across time
* use the calculated slots for gorilla pages
* accurately track gorilla page size changes
* check the sth pointer for validity
* code cleanup, files re-org and renames to reflect the new structure of streaming
* updated referenced size when the size of a page changes; removed flush spins - flushes cancelled is a waste event
* improve families in netdata statistics
* page size histogram per cache
* page size histogram per cache queue (hot, dirty, clean)
* fix heap use after free in pdc.c
* rw_spinlocks: when preferring a writer, yield so that the writer has the chance to get the lock (see the sketch below)
* do not balloon open and extent caches more than needed (it fragments memory and there is not enough memory for the main cache)
* fixed typo
* enable trace allocations to work
* Skip adding kmeans model when ML dimension has not been created.
* PGD is now entirely on ARAL for all types of pages
* 2 partitions for PGD
* Check for ML queue prior to pushing as well.
* merge multiple arals, to avoid wasting memory
* significantly less arals; proper calculation of gorilla efficiency
* report pgd buffers separately from pgc
* aral only for sizes less than 512 bytes
* tune aral caches
* log the functions using the streaming buffer when concurrent use is detected
* aral supporting different pages for collected pages and clean pages - an attempt to minimize fragmentation at high performance
* fix misuse of sender thread buffers
* select the right buffer, based on the receiver tid
* no more rrdpush, renamed to stream
* lower aral max page size to 16KiB - in an attempt to lower fragmentation under memory pressure
* update opcode handling
* automatic sizing of aral, limiting its size to 200 items per page or 4 x system pages
* tune cache eviction strategy
* renamed global statistics to telemetry and split it into multiple files
* left over renames of global statistics to telemetry
* added heatmap to chart types
* note about re-balancing a parents cluster
* fix formatting
* added aral telemetry to find the fragmentation per aral
* experiment with a different strategy when making clean pages: always append so that the cache is being constantly rotated; aral telemetry reports utilization instead of fragmentation
* aral now takes into account waiting deallocators when it creates new pages
* split netdata-conf functions into multiple files; added dbengine use all caches and dbengine out of memory protection settings
* tune cache eviction strategy
* cache parameters cleanup
* rename mem_available to system_memory
* Fix variable type.
* Add fuzzer for pluginsd line splitter.
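The rw_spinlocks item above describes a writer-preference policy: readers back off (and yield the CPU) while a writer is waiting, so writers cannot be starved. A hedged sketch of that idea, not netdata's rw_spinlock implementation:

```c
#include <stdatomic.h>
#include <sched.h>

// Illustrative reader/writer spinlock with writer preference.
typedef struct {
    atomic_int readers;          // number of active readers
    atomic_int writer_waiting;   // set while a writer wants or holds the lock
} rw_spin_t;

static void rw_read_lock(rw_spin_t *l) {
    for (;;) {
        while (atomic_load(&l->writer_waiting))
            sched_yield();                        // give the writer a chance to run
        atomic_fetch_add(&l->readers, 1);
        if (!atomic_load(&l->writer_waiting))
            return;                               // no writer slipped in, we hold it
        atomic_fetch_sub(&l->readers, 1);         // back off and retry
    }
}

static void rw_read_unlock(rw_spin_t *l) { atomic_fetch_sub(&l->readers, 1); }

static void rw_write_lock(rw_spin_t *l) {
    int expected = 0;
    while (!atomic_compare_exchange_weak(&l->writer_waiting, &expected, 1)) {
        expected = 0;
        sched_yield();                            // another writer holds or wants it
    }
    while (atomic_load(&l->readers) > 0)
        sched_yield();                            // wait for active readers to drain
}

static void rw_write_unlock(rw_spin_t *l) { atomic_store(&l->writer_waiting, 0); }
```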
* use cgroup v1 and v2 to detect memory protection; log on start the detection of memory (see the sketch below)
* fixed typo
* added logs about system memory detection
* remove debug logs from system memory detection
* move the rest of dbengine config to netdata-conf
* respect streaming buffer size configured
* add workers to pgc eviction threads
* renamed worker
* fixed flip-flop in size and entries conversions
* use aral_by_size when we actually aggregate stats to aral by size
* use keyword definitions
* move opcode definitions to stream-thread.h
* swap struct pollfd slots to make sure all the sockets have an equal chance of being processed
* Revert "Add fuzzer for pluginsd line splitter." This reverts commit 454cbcf6e1.
* Revert "Use re2c for the line splitting pluginsd." This reverts commit 2b2f9d3887.
* stream thread: use judy arrays instead of linked lists and pre-allocated arrays
* added comment about pfd structure on sender and receiver
* fixed logs and made the default sender timeout 5 seconds
* Spawn ML worker threads based on number of CPUs.
* Add statistics for ML allocations/deallocations.
* Add host flag to check for pending alert transitions to save. Remove precompiled statements. Offload processing of alerts in the event loop. Queue alert transitions to the metadata event loop to be saved. Run metadata checks every 5 seconds.
* do not block doing socket retries when errno indicates EWOULDBLOCK; insist sending data in send_to_plugin()
* Revert "Add host flag to check for pending alert transitions to save". This reverts commit 86ade0e87e.
* fix error reasons
* Disable ML memory statistics when using mimalloc
* add reason when ml cannot acquire the dimension
* added ML memory and, depending on the DICT_WITH_STATS define, add aral by size too
* do not stream ML when the parent does not have ML enabled
* nd_poll() to overcome the starvation of poll() and use epoll() under Linux (see the sketch below)
* nd_poll() optimization to minimize the number of system calls
* nd_poll() fix
* nd_poll() fix again
* make glibc release memory to the system when the system is critical in memory
* try bigger aral pages, to enable releasing memory back to the system
* Queue alert transitions to the metadata event loop (global list, not per host). Add host count to check for pending alert transitions to save. Remove precompiled statements. Offload processing of alerts in the event loop. Run metadata checks every 5 seconds.
* round robin aral allocations
* fix aral round robin
* ask glibc to release memory when the allocations are aggressive
* tinysleep yields the processor instead of waiting
* run malloc_trim() more frequently
* Add reference count on alarm_entry
* selective tinysleep and processor yielding
* revert gorilla batch writes
* codacy fixes

--------

Co-authored-by: vkalintiris <vasilis@netdata.cloud>
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
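The nd_poll() items above describe replacing a large poll() descriptor array with epoll on Linux, so that the cost of waking up scales with the number of ready sockets rather than the number of watched ones. A hedged sketch of that pattern using the standard epoll API; the nd_poll_stub_* names are illustrative, not netdata's actual interface:

```c
#include <sys/epoll.h>

// Illustrative epoll-based event loop core (Linux only).
typedef struct { int epfd; } nd_poll_stub;

static int nd_poll_stub_init(nd_poll_stub *p) {
    p->epfd = epoll_create1(0);
    return p->epfd < 0 ? -1 : 0;
}

static int nd_poll_stub_add(nd_poll_stub *p, int fd, void *data) {
    struct epoll_event ev = { .events = EPOLLIN | EPOLLOUT, .data.ptr = data };
    return epoll_ctl(p->epfd, EPOLL_CTL_ADD, fd, &ev);
}

// Returns the number of ready descriptors; O(ready), not O(watched) as with poll().
static int nd_poll_stub_wait(nd_poll_stub *p, struct epoll_event *events,
                             int max_events, int timeout_ms) {
    return epoll_wait(p->epfd, events, max_events, timeout_ms);
}
```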
This commit is contained in: parent da1867879e, commit 6b8c6baac2
281 changed files with 21006 additions and 14570 deletions
CMakeLists.txt
docs
packaging/cmake
src
    aclk
    claim
    collectors
    daemon
        analytics.c, common.h
        config
            README.md, netdata-conf-backwards-compatibility.c, netdata-conf-backwards-compatibility.h, netdata-conf-db.c, netdata-conf-db.h, netdata-conf-directories.c, netdata-conf-directories.h, netdata-conf-global.c, netdata-conf-global.h, netdata-conf-logs.c, netdata-conf-logs.h, netdata-conf-web.c, netdata-conf-web.h, netdata-conf.c, netdata-conf.h
        daemon.c
        dyncfg
            dyncfg-echo.c, dyncfg-files.c, dyncfg-inline.c, dyncfg-intercept.c, dyncfg-internals.h, dyncfg-tree.c, dyncfg-unittest.c, dyncfg.c, dyncfg.h
        global_statistics.c, global_statistics.h, h2o-common.c, libuv_workers.c, libuv_workers.h, main.c, service.c, static_threads.c
        telemetry
            telemetry-aral.c, telemetry-aral.h, telemetry-daemon-memory.c, telemetry-daemon-memory.h, telemetry-daemon.c, telemetry-daemon.h, telemetry-dbengine.c, telemetry-dbengine.h, telemetry-dictionary.c, telemetry-dictionary.h, telemetry-gorilla.c, telemetry-gorilla.h, telemetry-heartbeat.c, telemetry-heartbeat.h, telemetry-http-api.c, telemetry-http-api.h, telemetry-ingestion.c, telemetry-ingestion.h, telemetry-ml.c, telemetry-ml.h, telemetry-queries.c, telemetry-queries.h, telemetry-sqlite3.c, telemetry-sqlite3.h, telemetry-string.c, telemetry-string.h, telemetry-trace-allocations.c, telemetry-trace-allocations.h, telemetry-workers.c, telemetry-workers.h, telemetry.c, telemetry.h
    database
        contexts
            api_v2_contexts.c, api_v2_contexts_agents.c, api_v2_contexts_alert_config.c, api_v2_contexts_alert_transitions.c
        engine
            README.md, cache.c, cache.h, datafile.c, dbengine-stresstest.c, dbengine-unittest.c, journalfile.c, metric.c, page.c, page.h, pagecache.c, pdc.c, rrdengine.c, rrdengine.h, rrdengineapi.c, rrdengineapi.h
        rrd-database-mode.c
CMakeLists.txt (148 changed lines)

@@ -438,6 +438,7 @@ check_function_exists(backtrace HAVE_BACKTRACE)
check_function_exists(arc4random_buf HAVE_ARC4RANDOM_BUF)
check_function_exists(arc4random_uniform HAVE_ARC4RANDOM_UNIFORM)
check_function_exists(getrandom HAVE_GETRANDOM)
+check_function_exists(sysinfo HAVE_SYSINFO)

#
# check source compilation

@@ -475,6 +476,14 @@ int main() {
}
" HAVE_C_MALLOPT)

+check_c_source_compiles("
+#include <malloc.h>
+int main() {
+  malloc_trim(0);
+  return 0;
+}
+" HAVE_C_MALLOC_TRIM)
+
check_c_source_compiles("
#define _GNU_SOURCE
#include <stdio.h>
@@ -920,6 +929,21 @@ set(LIBNETDATA_FILES
src/libnetdata/xxHash/xxhash.h
src/libnetdata/os/random.c
src/libnetdata/os/random.h
src/libnetdata/socket/nd-sock.c
src/libnetdata/socket/nd-sock.h
src/libnetdata/socket/listen-sockets.c
src/libnetdata/socket/listen-sockets.h
src/libnetdata/socket/poll-events.c
src/libnetdata/socket/poll-events.h
src/libnetdata/socket/connect-to.c
src/libnetdata/socket/connect-to.h
src/libnetdata/socket/socket-peers.c
src/libnetdata/socket/socket-peers.h
src/libnetdata/libjudy/judyl-typed.h
src/libnetdata/os/system_memory.c
src/libnetdata/os/system_memory.h
src/libnetdata/socket/nd-poll.c
src/libnetdata/socket/nd-poll.h
)

set(LIBH2O_FILES

@@ -1013,8 +1037,8 @@ set(DAEMON_FILES
src/daemon/daemon.h
src/daemon/libuv_workers.c
src/daemon/libuv_workers.h
-src/daemon/global_statistics.c
-src/daemon/global_statistics.h
+src/daemon/telemetry/telemetry.c
+src/daemon/telemetry/telemetry.h
src/daemon/analytics.c
src/daemon/analytics.h
src/daemon/main.c
@@ -1035,15 +1059,59 @@ set(DAEMON_FILES
src/daemon/pipename.h
src/daemon/unit_test.c
src/daemon/unit_test.h
-src/daemon/config/dyncfg.c
-src/daemon/config/dyncfg.h
-src/daemon/config/dyncfg-files.c
-src/daemon/config/dyncfg-unittest.c
-src/daemon/config/dyncfg-inline.c
-src/daemon/config/dyncfg-echo.c
-src/daemon/config/dyncfg-internals.h
-src/daemon/config/dyncfg-intercept.c
-src/daemon/config/dyncfg-tree.c
+src/daemon/dyncfg/dyncfg.c
+src/daemon/dyncfg/dyncfg.h
+src/daemon/dyncfg/dyncfg-files.c
+src/daemon/dyncfg/dyncfg-unittest.c
+src/daemon/dyncfg/dyncfg-inline.c
+src/daemon/dyncfg/dyncfg-echo.c
+src/daemon/dyncfg/dyncfg-internals.h
+src/daemon/dyncfg/dyncfg-intercept.c
+src/daemon/dyncfg/dyncfg-tree.c
+src/daemon/telemetry/telemetry-http-api.c
+src/daemon/telemetry/telemetry-http-api.h
+src/daemon/telemetry/telemetry-queries.c
+src/daemon/telemetry/telemetry-queries.h
+src/daemon/telemetry/telemetry-ingestion.c
+src/daemon/telemetry/telemetry-ingestion.h
+src/daemon/telemetry/telemetry-ml.c
+src/daemon/telemetry/telemetry-ml.h
+src/daemon/telemetry/telemetry-gorilla.c
+src/daemon/telemetry/telemetry-gorilla.h
+src/daemon/telemetry/telemetry-daemon.c
+src/daemon/telemetry/telemetry-daemon.h
+src/daemon/telemetry/telemetry-daemon-memory.c
+src/daemon/telemetry/telemetry-daemon-memory.h
+src/daemon/telemetry/telemetry-sqlite3.c
+src/daemon/telemetry/telemetry-sqlite3.h
+src/daemon/telemetry/telemetry-dbengine.c
+src/daemon/telemetry/telemetry-dbengine.h
+src/daemon/telemetry/telemetry-string.c
+src/daemon/telemetry/telemetry-string.h
+src/daemon/telemetry/telemetry-heartbeat.c
+src/daemon/telemetry/telemetry-heartbeat.h
+src/daemon/telemetry/telemetry-dictionary.c
+src/daemon/telemetry/telemetry-dictionary.h
+src/daemon/telemetry/telemetry-workers.c
+src/daemon/telemetry/telemetry-workers.h
+src/daemon/telemetry/telemetry-trace-allocations.c
+src/daemon/telemetry/telemetry-trace-allocations.h
+src/daemon/telemetry/telemetry-aral.c
+src/daemon/telemetry/telemetry-aral.h
+src/daemon/config/netdata-conf-db.c
+src/daemon/config/netdata-conf-db.h
+src/daemon/config/netdata-conf.h
+src/daemon/config/netdata-conf-backwards-compatibility.c
+src/daemon/config/netdata-conf-backwards-compatibility.h
+src/daemon/config/netdata-conf-web.c
+src/daemon/config/netdata-conf-web.h
+src/daemon/config/netdata-conf-directories.c
+src/daemon/config/netdata-conf-directories.h
+src/daemon/config/netdata-conf-logs.c
+src/daemon/config/netdata-conf-logs.h
+src/daemon/config/netdata-conf-global.c
+src/daemon/config/netdata-conf-global.h
+src/daemon/config/netdata-conf.c
)

set(H2O_FILES
@@ -1227,15 +1295,34 @@ if(ENABLE_ML)
set(ML_FILES
    src/ml/ad_charts.h
    src/ml/ad_charts.cc
    src/ml/Config.cc
    src/ml/dlib/dlib/all/source.cpp
    src/ml/ml.h
    src/ml/ml.cc
    src/ml/ml-private.h
    src/ml/ml_calculated_number.h
    src/ml/ml_host.h
    src/ml/ml_config.h
    src/ml/ml_config.cc
    src/ml/ml_dimension.h
    src/ml/ml_enums.h
    src/ml/ml_enums.cc
    src/ml/ml_features.h
    src/ml/ml_features.cc
    src/ml/ml_kmeans.h
    src/ml/ml_kmeans.cc
    src/ml/ml_queue.h
    src/ml/ml_worker.h
    src/ml/ml_string_wrapper.h
    src/ml/ml_queue.cc
    src/ml/ml_private.h
    src/ml/ml_public.h
    src/ml/ml_public.cc
)

if(NOT ENABLE_MIMALLOC)
    list(APPEND ML_FILES src/ml/ml_memory.cc)
endif()
else()
set(ML_FILES
    src/ml/ml.h
    src/ml/ml_public.h
    src/ml/ml-dummy.c
)
endif()

@@ -1338,6 +1425,8 @@ set(RRD_PLUGIN_FILES
src/database/rrdfunctions-exporters.h
src/database/rrdfunctions-internals.h
src/database/rrdcollector-internals.h
+src/database/rrd-database-mode.h
+src/database/rrd-database-mode.c
)

if(ENABLE_DBENGINE)
@@ -1405,7 +1494,7 @@ set(SYSTEMD_JOURNAL_PLUGIN_FILES
)

set(STREAMING_PLUGIN_FILES
-src/streaming/rrdpush.h
+src/streaming/stream.h
src/streaming/stream-compression/compression.c
src/streaming/stream-compression/compression.h
src/streaming/stream-compression/brotli.c

@@ -1416,8 +1505,8 @@ set(STREAMING_PLUGIN_FILES
src/streaming/stream-compression/lz4.h
src/streaming/stream-compression/zstd.c
src/streaming/stream-compression/zstd.h
-src/streaming/receiver.c
-src/streaming/sender.c
+src/streaming/stream-receiver.c
+src/streaming/stream-sender.c
src/streaming/replication.c
src/streaming/replication.h
src/streaming/h2o-common.h

@@ -1429,11 +1518,11 @@ set(STREAMING_PLUGIN_FILES
src/streaming/stream-path.h
src/streaming/stream-capabilities.c
src/streaming/stream-capabilities.h
-src/streaming/sender-connect.c
-src/streaming/sender-internals.h
-src/streaming/sender-execute.c
-src/streaming/sender-commit.c
-src/streaming/sender-destinations.c
+src/streaming/stream-connector.c
+src/streaming/stream-sender-internals.h
+src/streaming/stream-sender-execute.c
+src/streaming/stream-sender-commit.c
+src/streaming/stream-parents.c
src/streaming/stream-handshake.c
src/streaming/protocol/command-function.c
src/streaming/protocol/command-host-labels.c

@@ -1443,11 +1532,17 @@ set(STREAMING_PLUGIN_FILES
src/streaming/stream-conf.c
src/streaming/stream-conf.h
src/streaming/stream-handshake.h
src/streaming/sender.h
src/streaming/sender-destinations.h
src/streaming/stream-parents.h
src/streaming/rrdhost-status.c
src/streaming/rrdhost-status.h
src/streaming/receiver.h
src/streaming/stream-sender-api.c
src/streaming/stream-receiver-internals.h
src/streaming/stream-receiver-api.c
src/streaming/stream-thread.c
src/streaming/stream-thread.h
src/streaming/stream-receiver-connection.c
src/streaming/stream-sender-commit.h
src/streaming/stream-traffic-types.h
)

set(WEB_PLUGIN_FILES

@@ -1459,6 +1554,7 @@ set(WEB_PLUGIN_FILES
src/web/server/static/static-threaded.h
src/web/server/web_client_cache.c
src/web/server/web_client_cache.h
src/web/api/v3/api_v3_stream_info.c
src/web/api/v3/api_v3_stream_path.c
)
@@ -115,7 +115,7 @@ context, charttype]`, where:
- `family`: An identifier used to group charts together (can be null).
- `context`: An identifier used to group contextually similar charts together. The best practice is to provide a context
  that is `A.B`, with `A` being the name of the collector, and `B` being the name of the specific metric.
-- `charttype`: Either `line`, `area`, or `stacked`. If null line is the default value.
+- `charttype`: Either `line`, `area`, `stacked` or `heatmap`. If null line is the default value.

You can read more about `family` and `context` in the [Netdata Charts](/docs/dashboards-and-charts/netdata-charts.md) doc.
File diff suppressed because one or more lines are too long
(image changed) Before: 31 KiB, After: 31 KiB
@@ -73,6 +73,7 @@
#cmakedefine HAVE_ARC4RANDOM_UNIFORM
#cmakedefine HAVE_RAND_S
#cmakedefine HAVE_GETRANDOM
+#cmakedefine HAVE_SYSINFO

#cmakedefine HAVE_BACKTRACE
#cmakedefine HAVE_CLOSE_RANGE

@@ -95,6 +96,7 @@
#cmakedefine STRERROR_R_CHAR_P
#cmakedefine HAVE_C__GENERIC
#cmakedefine HAVE_C_MALLOPT
+#cmakedefine HAVE_C_MALLOC_TRIM
#cmakedefine HAVE_SETNS
#cmakedefine HAVE_STRNDUP
#cmakedefine SSL_HAS_PENDING
@@ -201,7 +201,7 @@ static int wait_till_agent_claim_ready()
    // We trap the impossible NULL here to keep the linter happy without using a fatal() in the code.
    const char *cloud_base_url = cloud_config_url_get();
    if (cloud_base_url == NULL) {
-       netdata_log_error("Do not move the \"url\" out of post_conf_load!!");
+       netdata_log_error("Do not move the \"url\" out of netdata_conf_section_global_run_as_user!!");
        return 1;
    }

@@ -559,7 +559,7 @@ static int aclk_attempt_to_connect(mqtt_wss_client client)
    while (service_running(SERVICE_ACLK)) {
        aclk_cloud_base_url = cloud_config_url_get();
        if (aclk_cloud_base_url == NULL) {
-           error_report("Do not move the \"url\" out of post_conf_load!!");
+           error_report("Do not move the \"url\" out of netdata_conf_section_global_run_as_user!!");
            aclk_status = ACLK_STATUS_NO_CLOUD_URL;
            return -1;
        }

@@ -868,7 +868,7 @@ void aclk_host_state_update(RRDHOST *host, int cmd, int queryable)
        create_query->data.bin_payload.topic = ACLK_TOPICID_CREATE_NODE;
        create_query->data.bin_payload.msg_name = "CreateNodeInstance";
        nd_log(NDLS_DAEMON, NDLP_DEBUG,
-              "Registering host=%s, hops=%u", host->machine_guid, host->system_info->hops);
+              "Registering host=%s, hops=%d", host->machine_guid, host->system_info->hops);

        aclk_execute_query(create_query);
        return;

@@ -892,7 +892,7 @@ void aclk_host_state_update(RRDHOST *host, int cmd, int queryable)
    query->data.bin_payload.payload = generate_node_instance_connection(&query->data.bin_payload.size, &node_state_update);

    nd_log(NDLS_DAEMON, NDLP_DEBUG,
-          "Queuing status update for node=%s, live=%d, hops=%u, queryable=%d",
+          "Queuing status update for node=%s, live=%d, hops=%d, queryable=%d",
           (char*)node_state_update.node_id, cmd, host->system_info->hops, queryable);
    freez((void*)node_state_update.node_id);
    query->data.bin_payload.msg_name = "UpdateNodeInstanceConnection";
@@ -2,7 +2,7 @@

#include "aclk_capas.h"

-#include "ml/ml.h"
+#include "ml/ml_public.h"

#define HTTP_API_V2_VERSION 7

@@ -31,14 +31,14 @@ const struct capability *aclk_get_agent_capas()
    agent_capabilities[3].version = metric_correlations_version;
    agent_capabilities[3].enabled = 1;

-   agent_capabilities[7].enabled = localhost->health.health_enabled;
+   agent_capabilities[7].enabled = localhost->health.enabled;

    return agent_capabilities;
}

struct capability *aclk_get_node_instance_capas(RRDHOST *host)
{
-   bool functions = (host == localhost || (host->receiver && stream_has_capability(host->receiver, STREAM_CAP_FUNCTIONS)));
+   bool functions = (host == localhost || receiver_has_capability(host, STREAM_CAP_FUNCTIONS));
    bool dyncfg = (host == localhost || dyncfg_available_for_rrdhost(host));

    struct capability ni_caps[] = {

@@ -48,7 +48,7 @@ struct capability *aclk_get_node_instance_capas(RRDHOST *host)
        { .name = "ctx", .version = 1, .enabled = 1 },
        { .name = "funcs", .version = functions ? 1 : 0, .enabled = functions ? 1 : 0 },
        { .name = "http_api_v2", .version = HTTP_API_V2_VERSION, .enabled = 1 },
-       { .name = "health", .version = 2, .enabled = host->health.health_enabled },
+       { .name = "health", .version = 2, .enabled = host->health.enabled},
        { .name = "req_cancel", .version = 1, .enabled = 1 },
        { .name = "dyncfg", .version = 2, .enabled = dyncfg },
        { .name = NULL, .version = 0, .enabled = 0 }
@@ -6,7 +6,7 @@

#include "aclk_util.h"

-#include "daemon/global_statistics.h"
+#include "daemon/telemetry/telemetry.h"

static const char *http_req_type_to_str(http_req_type_t req) {
    switch (req) {
@@ -189,7 +189,7 @@ CLOUD_STATUS claim_reload_and_wait_online(void) {
    cloud_conf_load(0);
    bool claimed = load_claiming_state();
    registry_update_cloud_base_url();
-   rrdpush_sender_send_claimed_id(localhost);
+   stream_sender_send_claimed_id(localhost);
    nd_log_limits_reset();

    CLOUD_STATUS status = cloud_status();
@@ -30,8 +30,8 @@ CLOUD_STATUS cloud_status(void) {
        return CLOUD_STATUS_ONLINE;

    if(localhost->sender &&
-       rrdhost_flag_check(localhost, RRDHOST_FLAG_RRDPUSH_SENDER_READY_4_METRICS) &&
-       stream_has_capability(localhost->sender, STREAM_CAP_NODE_ID) &&
+       rrdhost_flag_check(localhost, RRDHOST_FLAG_STREAM_SENDER_READY_4_METRICS) &&
+       stream_sender_has_capabilities(localhost, STREAM_CAP_NODE_ID) &&
        !UUIDiszero(localhost->node_id) &&
        !UUIDiszero(localhost->aclk.claim_id_of_parent))
        return CLOUD_STATUS_INDIRECT;
@@ -53,7 +53,7 @@ size_t all_pids_count(void) {
}

void apps_pids_init(void) {
-   pids.all_pids.aral = aral_create("pid_stat", sizeof(struct pid_stat), 1, 65536, NULL, NULL, NULL, false, true);
+   pids.all_pids.aral = aral_create("pid_stat", sizeof(struct pid_stat), 1, 0, NULL, NULL, NULL, false, true);
    simple_hashtable_init_PID(&pids.all_pids.ht, 1024);
}

@@ -739,7 +739,7 @@ ARAL *ebpf_allocate_pid_aral(char *name, size_t size)
    }

    return aral_create(name, size,
-                      0, max_elements,
+                      0, 0,
                       NULL, NULL, NULL, false, false);
}

@@ -2654,7 +2654,7 @@ void *statsd_main(void *ptr) {
    RRDSET *st_pcharts = NULL;
    RRDDIM *rd_pcharts = NULL;

-   if(global_statistics_enabled) {
+   if(telemetry_enabled) {
        st_metrics = rrdset_create_localhost(
            "netdata",
            "statsd_metrics",

@@ -2851,7 +2851,7 @@ void *statsd_main(void *ptr) {
        if(unlikely(!service_running(SERVICE_COLLECTORS)))
            break;

-       if(global_statistics_enabled) {
+       if(telemetry_enabled) {
            rrddim_set_by_pointer(st_metrics, rd_metrics_gauge, (collected_number)statsd.gauges.metrics);
            rrddim_set_by_pointer(st_metrics, rd_metrics_counter, (collected_number)statsd.counters.metrics);
            rrddim_set_by_pointer(st_metrics, rd_metrics_timer, (collected_number)statsd.timers.metrics);
@@ -377,10 +377,7 @@ void analytics_https(void)
    BUFFER *b = buffer_create(30, NULL);
    analytics_exporting_connectors_ssl(b);

-   buffer_strcat(b, netdata_ssl_streaming_sender_ctx &&
-                    rrdhost_flag_check(localhost, RRDHOST_FLAG_RRDPUSH_SENDER_CONNECTED) &&
-                    SSL_connection(&localhost->sender->ssl) ? "streaming|" : "|");
-
+   buffer_strcat(b, stream_sender_is_connected_with_ssl(localhost) ? "streaming|" : "|");
    buffer_strcat(b, netdata_ssl_web_server_ctx ? "web" : "");

    analytics_set_data_str(&analytics_data.netdata_config_https_available, (char *)buffer_tostring(b));

@@ -619,7 +616,7 @@ cleanup:
 */
void set_late_analytics_variables(struct rrdhost_system_info *system_info)
{
-   analytics_set_data(&analytics_data.netdata_config_stream_enabled, stream_conf_send_enabled ? "true" : "false");
+   analytics_set_data(&analytics_data.netdata_config_stream_enabled, stream_send.enabled ? "true" : "false");
    analytics_set_data_str(&analytics_data.netdata_config_memory_mode, (char *)rrd_memory_mode_name(default_rrd_memory_mode));
    analytics_set_data(&analytics_data.netdata_host_cloud_enabled, "true");

@@ -3,7 +3,12 @@
#ifndef NETDATA_COMMON_H
#define NETDATA_COMMON_H 1

+#ifdef __cplusplus
+extern "C" {
+#endif
+
#include "libnetdata/libnetdata.h"
+#include "config/netdata-conf.h"
#include "libuv_workers.h"

// ----------------------------------------------------------------------------

@@ -11,9 +16,10 @@

#include "web/api/maps/maps.h"

-#include "daemon/config/dyncfg.h"
+#include "daemon/config/netdata-conf.h"
+#include "daemon/dyncfg/dyncfg.h"

-#include "global_statistics.h"
+#include "daemon/telemetry/telemetry.h"

// health monitoring and alarm notifications
#include "health/health.h"

@@ -30,10 +36,10 @@
#endif

// streaming metrics between netdata servers
-#include "streaming/rrdpush.h"
+#include "streaming/stream.h"

// anomaly detection
-#include "ml/ml.h"
+#include "ml/ml_public.h"

// the netdata registry
// the registry is actually an API feature

@@ -94,4 +100,8 @@ long get_netdata_cpus(void);

void set_environment_for_plugins_and_scripts(void);

+#ifdef __cplusplus
+}
+#endif
+
#endif /* NETDATA_COMMON_H */
@@ -18,7 +18,7 @@ The configuration file uses an INI-style format with `[SECTION]` headers:
| [[health]](#health-section-options) | [Health monitoring](/src/health/README.md) |
| `[web]` | [Web Server](/src/web/server/README.md) |
| `[registry]` | [Registry](/src/registry/README.md) |
-| `[global statistics]` | Internal monitoring |
+| `[telemetry]` | Internal monitoring |
| `[statsd]` | [StatsD plugin](/src/collectors/statsd.plugin/README.md) |
| [`[plugins]`](#plugins-section-options) | Data collection Plugins (Collectors) |
| [[plugin:NAME]](#per-plugin-configuration) | Individual [Plugins](#per-plugin-configuration) |
src/daemon/config/netdata-conf-backwards-compatibility.c (new file, 285 lines)
@@ -0,0 +1,285 @@
|
|||
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#include "netdata-conf-backwards-compatibility.h"
|
||||
#include "database/engine/rrdengineapi.h"
|
||||
|
||||
void netdata_conf_backwards_compatibility(void) {
|
||||
static bool run = false;
|
||||
if(run) return;
|
||||
run = true;
|
||||
|
||||
// move [global] options to the [web] section
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "http port listen backlog",
|
||||
CONFIG_SECTION_WEB, "listen backlog");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "bind socket to IP",
|
||||
CONFIG_SECTION_WEB, "bind to");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "bind to",
|
||||
CONFIG_SECTION_WEB, "bind to");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "port",
|
||||
CONFIG_SECTION_WEB, "default port");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "default port",
|
||||
CONFIG_SECTION_WEB, "default port");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "disconnect idle web clients after seconds",
|
||||
CONFIG_SECTION_WEB, "disconnect idle clients after seconds");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "respect web browser do not track policy",
|
||||
CONFIG_SECTION_WEB, "respect do not track policy");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "web x-frame-options header",
|
||||
CONFIG_SECTION_WEB, "x-frame-options response header");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "enable web responses gzip compression",
|
||||
CONFIG_SECTION_WEB, "enable gzip compression");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "web compression strategy",
|
||||
CONFIG_SECTION_WEB, "gzip compression strategy");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "web compression level",
|
||||
CONFIG_SECTION_WEB, "gzip compression level");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "config directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "config");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "stock config directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "stock config");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "log directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "log");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "web files directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "web");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "cache directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "cache");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "lib directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "lib");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "home directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "home");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "lock directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "lock");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "plugins directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "plugins");
|
||||
|
||||
config_move(CONFIG_SECTION_HEALTH, "health configuration directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "health config");
|
||||
|
||||
config_move(CONFIG_SECTION_HEALTH, "stock health configuration directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "stock health config");
|
||||
|
||||
config_move(CONFIG_SECTION_REGISTRY, "registry db directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "registry");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "debug log",
|
||||
CONFIG_SECTION_LOGS, "debug");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "error log",
|
||||
CONFIG_SECTION_LOGS, "error");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "access log",
|
||||
CONFIG_SECTION_LOGS, "access");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "facility log",
|
||||
CONFIG_SECTION_LOGS, "facility");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "errors flood protection period",
|
||||
CONFIG_SECTION_LOGS, "errors flood protection period");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "errors to trigger flood protection",
|
||||
CONFIG_SECTION_LOGS, "errors to trigger flood protection");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "debug flags",
|
||||
CONFIG_SECTION_LOGS, "debug flags");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "TZ environment variable",
|
||||
CONFIG_SECTION_ENV_VARS, "TZ");
|
||||
|
||||
config_move(CONFIG_SECTION_PLUGINS, "PATH environment variable",
|
||||
CONFIG_SECTION_ENV_VARS, "PATH");
|
||||
|
||||
config_move(CONFIG_SECTION_PLUGINS, "PYTHONPATH environment variable",
|
||||
CONFIG_SECTION_ENV_VARS, "PYTHONPATH");
|
||||
|
||||
config_move(CONFIG_SECTION_STATSD, "enabled",
|
||||
CONFIG_SECTION_PLUGINS, "statsd");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "memory mode",
|
||||
CONFIG_SECTION_DB, "db");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "mode",
|
||||
CONFIG_SECTION_DB, "db");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "history",
|
||||
CONFIG_SECTION_DB, "retention");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "update every",
|
||||
CONFIG_SECTION_DB, "update every");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "page cache size",
|
||||
CONFIG_SECTION_DB, "dbengine page cache size");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "dbengine page cache size MB",
|
||||
CONFIG_SECTION_DB, "dbengine page cache size");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "dbengine extent cache size MB",
|
||||
CONFIG_SECTION_DB, "dbengine extent cache size");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "page cache size",
|
||||
CONFIG_SECTION_DB, "dbengine page cache size MB");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "page cache uses malloc",
|
||||
CONFIG_SECTION_DB, "dbengine page cache with malloc");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "page cache with malloc",
|
||||
CONFIG_SECTION_DB, "dbengine page cache with malloc");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "memory deduplication (ksm)",
|
||||
CONFIG_SECTION_DB, "memory deduplication (ksm)");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "dbengine page fetch timeout",
|
||||
CONFIG_SECTION_DB, "dbengine page fetch timeout secs");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "dbengine page fetch retries",
|
||||
CONFIG_SECTION_DB, "dbengine page fetch retries");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "dbengine extent pages",
|
||||
CONFIG_SECTION_DB, "dbengine pages per extent");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "cleanup obsolete charts after seconds",
|
||||
CONFIG_SECTION_DB, "cleanup obsolete charts after");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "cleanup obsolete charts after secs",
|
||||
CONFIG_SECTION_DB, "cleanup obsolete charts after");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "gap when lost iterations above",
|
||||
CONFIG_SECTION_DB, "gap when lost iterations above");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "cleanup orphan hosts after seconds",
|
||||
CONFIG_SECTION_DB, "cleanup orphan hosts after");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "cleanup orphan hosts after secs",
|
||||
CONFIG_SECTION_DB, "cleanup orphan hosts after");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "cleanup ephemeral hosts after secs",
|
||||
CONFIG_SECTION_DB, "cleanup ephemeral hosts after");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "seconds to replicate",
|
||||
CONFIG_SECTION_DB, "replication period");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "seconds per replication step",
|
||||
CONFIG_SECTION_DB, "replication step");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "enable zero metrics",
|
||||
CONFIG_SECTION_DB, "enable zero metrics");
|
||||
|
||||
config_move("global statistics", "update every",
|
||||
CONFIG_SECTION_TELEMETRY, "update every");
|
||||
|
||||
config_move(CONFIG_SECTION_PLUGINS, "netdata monitoring",
|
||||
CONFIG_SECTION_PLUGINS, "netdata telemetry");
|
||||
|
||||
config_move(CONFIG_SECTION_PLUGINS, "netdata monitoring extended",
|
||||
CONFIG_SECTION_TELEMETRY, "extended telemetry");
|
||||
|
||||
|
||||
// ----------------------------------------------------------------------------------------------------------------
|
||||
|
||||
bool found_old_config = false;
|
||||
|
||||
if(config_move(CONFIG_SECTION_GLOBAL, "dbengine disk space",
|
||||
CONFIG_SECTION_DB, "dbengine tier 0 retention size") != -1)
|
||||
found_old_config = true;
|
||||
|
||||
if(config_move(CONFIG_SECTION_GLOBAL, "dbengine multihost disk space",
|
||||
CONFIG_SECTION_DB, "dbengine tier 0 retention size") != -1)
|
||||
found_old_config = true;
|
||||
|
||||
if(config_move(CONFIG_SECTION_DB, "dbengine disk space MB",
|
||||
CONFIG_SECTION_DB, "dbengine tier 0 retention size") != -1)
|
||||
found_old_config = true;
|
||||
|
||||
for(size_t tier = 0; tier < RRD_STORAGE_TIERS ;tier++) {
|
||||
char old_config[128], new_config[128];
|
||||
|
||||
snprintfz(old_config, sizeof(old_config), "dbengine tier %zu retention days", tier);
|
||||
snprintfz(new_config, sizeof(new_config), "dbengine tier %zu retention time", tier);
|
||||
config_move(CONFIG_SECTION_DB, old_config,
|
||||
CONFIG_SECTION_DB, new_config);
|
||||
|
||||
if(tier == 0)
|
||||
snprintfz(old_config, sizeof(old_config), "dbengine multihost disk space MB");
|
||||
else
|
||||
snprintfz(old_config, sizeof(old_config), "dbengine tier %zu multihost disk space MB", tier);
|
||||
snprintfz(new_config, sizeof(new_config), "dbengine tier %zu retention size", tier);
|
||||
if(config_move(CONFIG_SECTION_DB, old_config,
|
||||
CONFIG_SECTION_DB, new_config) != -1 && tier == 0)
|
||||
found_old_config = true;
|
||||
|
||||
snprintfz(old_config, sizeof(old_config), "dbengine tier %zu disk space MB", tier);
|
||||
snprintfz(new_config, sizeof(new_config), "dbengine tier %zu retention size", tier);
|
||||
if(config_move(CONFIG_SECTION_DB, old_config,
|
||||
CONFIG_SECTION_DB, new_config) != -1 && tier == 0)
|
||||
found_old_config = true;
|
||||
}
|
||||
|
||||
legacy_multihost_db_space = found_old_config;
|
||||
|
||||
// ----------------------------------------------------------------------------------------------------------------
|
||||
|
||||
config_move(CONFIG_SECTION_LOGS, "error",
|
||||
CONFIG_SECTION_LOGS, "daemon");
|
||||
|
||||
config_move(CONFIG_SECTION_LOGS, "severity level",
|
||||
CONFIG_SECTION_LOGS, "level");
|
||||
|
||||
config_move(CONFIG_SECTION_LOGS, "errors to trigger flood protection",
|
||||
CONFIG_SECTION_LOGS, "logs to trigger flood protection");
|
||||
|
||||
config_move(CONFIG_SECTION_LOGS, "errors flood protection period",
|
||||
CONFIG_SECTION_LOGS, "logs flood protection period");
|
||||
|
||||
config_move(CONFIG_SECTION_HEALTH, "is ephemeral",
|
||||
CONFIG_SECTION_GLOBAL, "is ephemeral node");
|
||||
|
||||
config_move(CONFIG_SECTION_HEALTH, "has unstable connection",
|
||||
CONFIG_SECTION_GLOBAL, "has unstable connection");
|
||||
|
||||
config_move(CONFIG_SECTION_HEALTH, "run at least every seconds",
|
||||
CONFIG_SECTION_HEALTH, "run at least every");
|
||||
|
||||
config_move(CONFIG_SECTION_HEALTH, "postpone alarms during hibernation for seconds",
|
||||
CONFIG_SECTION_HEALTH, "postpone alarms during hibernation for");
|
||||
|
||||
config_move(CONFIG_SECTION_HEALTH, "health log history",
|
||||
CONFIG_SECTION_HEALTH, "health log retention");
|
||||
|
||||
config_move(CONFIG_SECTION_REGISTRY, "registry expire idle persons days",
|
||||
CONFIG_SECTION_REGISTRY, "registry expire idle persons");
|
||||
|
||||
config_move(CONFIG_SECTION_WEB, "disconnect idle clients after seconds",
|
||||
CONFIG_SECTION_WEB, "disconnect idle clients after");
|
||||
|
||||
config_move(CONFIG_SECTION_WEB, "accept a streaming request every seconds",
|
||||
CONFIG_SECTION_WEB, "accept a streaming request every");
|
||||
|
||||
config_move(CONFIG_SECTION_STATSD, "set charts as obsolete after secs",
|
||||
CONFIG_SECTION_STATSD, "set charts as obsolete after");
|
||||
|
||||
config_move(CONFIG_SECTION_STATSD, "disconnect idle tcp clients after seconds",
|
||||
CONFIG_SECTION_STATSD, "disconnect idle tcp clients after");
|
||||
|
||||
config_move("plugin:idlejitter", "loop time in ms",
|
||||
"plugin:idlejitter", "loop time");
|
||||
|
||||
config_move("plugin:proc:/sys/class/infiniband", "refresh ports state every seconds",
|
||||
"plugin:proc:/sys/class/infiniband", "refresh ports state every");
|
||||
}
src/daemon/config/netdata-conf-backwards-compatibility.h (new file, 10 lines)
@@ -0,0 +1,10 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_DAEMON_NETDATA_CONF_BACKWARDS_COMPATIBILITY_H
#define NETDATA_DAEMON_NETDATA_CONF_BACKWARDS_COMPATIBILITY_H

#include "config.h"

void netdata_conf_backwards_compatibility(void);

#endif //NETDATA_DAEMON_NETDATA_CONF_BACKWARDS_COMPATIBILITY_H
src/daemon/config/netdata-conf-db.c (new file, 418 lines)
@@ -0,0 +1,418 @@
|
|||
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#include "netdata-conf-db.h"
|
||||
|
||||
int default_rrd_update_every = UPDATE_EVERY;
|
||||
int default_rrd_history_entries = RRD_DEFAULT_HISTORY_ENTRIES;
|
||||
|
||||
bool dbengine_enabled = false; // will become true if and when dbengine is initialized
|
||||
size_t storage_tiers = 3;
|
||||
bool dbengine_use_direct_io = true;
|
||||
static size_t storage_tiers_grouping_iterations[RRD_STORAGE_TIERS] = {1, 60, 60, 60, 60};
|
||||
static double storage_tiers_retention_days[RRD_STORAGE_TIERS] = {14, 90, 2 * 365, 2 * 365, 2 * 365};
|
||||
|
||||
time_t rrdset_free_obsolete_time_s = 3600;
|
||||
time_t rrdhost_free_orphan_time_s = 3600;
|
||||
time_t rrdhost_free_ephemeral_time_s = 86400;
|
||||
|
||||
size_t get_tier_grouping(size_t tier) {
|
||||
if(unlikely(tier >= storage_tiers)) tier = storage_tiers - 1;
|
||||
|
||||
size_t grouping = 1;
|
||||
// first tier is always 1 iteration of whatever update every the chart has
|
||||
for(size_t i = 1; i <= tier ;i++)
|
||||
grouping *= storage_tiers_grouping_iterations[i];
|
||||
|
||||
return grouping;
|
||||
}
|
||||
|
||||
static void netdata_conf_dbengine_pre_logs(void) {
|
||||
static bool run = false;
|
||||
if(run) return;
|
||||
run = true;
|
||||
|
||||
errno_clear();
|
||||
|
||||
#ifdef ENABLE_DBENGINE
|
||||
// this is required for dbegnine to work, so call it here (it is ok, it won't run twice)
|
||||
netdata_conf_section_directories();
|
||||
|
||||
// ------------------------------------------------------------------------
|
||||
// get default Database Engine page type
|
||||
|
||||
const char *page_type = config_get(CONFIG_SECTION_DB, "dbengine page type", "gorilla");
|
||||
if (strcmp(page_type, "gorilla") == 0)
|
||||
tier_page_type[0] = RRDENG_PAGE_TYPE_GORILLA_32BIT;
|
||||
else if (strcmp(page_type, "raw") == 0)
|
||||
tier_page_type[0] = RRDENG_PAGE_TYPE_ARRAY_32BIT;
|
||||
else {
|
||||
tier_page_type[0] = RRDENG_PAGE_TYPE_ARRAY_32BIT;
|
||||
netdata_log_error("Invalid dbengine page type ''%s' given. Defaulting to 'raw'.", page_type);
|
||||
}
|
||||
|
||||
// ------------------------------------------------------------------------
|
||||
// get default Database Engine page cache size in MiB
|
||||
|
||||
default_rrdeng_page_cache_mb = (int) config_get_size_mb(CONFIG_SECTION_DB, "dbengine page cache size", default_rrdeng_page_cache_mb);
|
||||
default_rrdeng_extent_cache_mb = (int) config_get_size_mb(CONFIG_SECTION_DB, "dbengine extent cache size", default_rrdeng_extent_cache_mb);
|
||||
db_engine_journal_check = config_get_boolean(CONFIG_SECTION_DB, "dbengine enable journal integrity check", CONFIG_BOOLEAN_NO);
|
||||
|
||||
if(default_rrdeng_extent_cache_mb < 0) {
|
||||
default_rrdeng_extent_cache_mb = 0;
|
||||
config_set_size_mb(CONFIG_SECTION_DB, "dbengine extent cache size", default_rrdeng_extent_cache_mb);
|
||||
}
|
||||
|
||||
if(default_rrdeng_page_cache_mb < RRDENG_MIN_PAGE_CACHE_SIZE_MB) {
|
||||
netdata_log_error("Invalid page cache size %d given. Defaulting to %d.", default_rrdeng_page_cache_mb, RRDENG_MIN_PAGE_CACHE_SIZE_MB);
|
||||
default_rrdeng_page_cache_mb = RRDENG_MIN_PAGE_CACHE_SIZE_MB;
|
||||
config_set_size_mb(CONFIG_SECTION_DB, "dbengine page cache size", default_rrdeng_page_cache_mb);
|
||||
}
|
||||
|
||||
// ------------------------------------------------------------------------
|
||||
// get default Database Engine disk space quota in MiB
|
||||
//
|
||||
// // if (!config_exists(CONFIG_SECTION_DB, "dbengine disk space MB") && !config_exists(CONFIG_SECTION_DB, "dbengine multihost disk space MB"))
|
||||
//
|
||||
// default_rrdeng_disk_quota_mb = (int) config_get_number(CONFIG_SECTION_DB, "dbengine disk space MB", default_rrdeng_disk_quota_mb);
|
||||
// if(default_rrdeng_disk_quota_mb < RRDENG_MIN_DISK_SPACE_MB) {
|
||||
// netdata_log_error("Invalid dbengine disk space %d given. Defaulting to %d.", default_rrdeng_disk_quota_mb, RRDENG_MIN_DISK_SPACE_MB);
|
||||
// default_rrdeng_disk_quota_mb = RRDENG_MIN_DISK_SPACE_MB;
|
||||
// config_set_number(CONFIG_SECTION_DB, "dbengine disk space MB", default_rrdeng_disk_quota_mb);
|
||||
// }
|
||||
//
|
||||
// default_multidb_disk_quota_mb = (int) config_get_number(CONFIG_SECTION_DB, "dbengine multihost disk space MB", compute_multidb_diskspace());
|
||||
// if(default_multidb_disk_quota_mb < RRDENG_MIN_DISK_SPACE_MB) {
|
||||
// netdata_log_error("Invalid multidb disk space %d given. Defaulting to %d.", default_multidb_disk_quota_mb, default_rrdeng_disk_quota_mb);
|
||||
// default_multidb_disk_quota_mb = default_rrdeng_disk_quota_mb;
|
||||
// config_set_number(CONFIG_SECTION_DB, "dbengine multihost disk space MB", default_multidb_disk_quota_mb);
|
||||
// }
|
||||
|
||||
#else
|
||||
if (default_rrd_memory_mode == RRD_MEMORY_MODE_DBENGINE) {
|
||||
error_report("RRD_MEMORY_MODE_DBENGINE is not supported in this platform. The agent will use db mode 'save' instead.");
|
||||
default_rrd_memory_mode = RRD_MEMORY_MODE_RAM;
|
||||
}
|
||||
#endif
|
||||
}
|
||||
|
||||
#ifdef ENABLE_DBENGINE
|
||||
struct dbengine_initialization {
|
||||
ND_THREAD *thread;
|
||||
char path[FILENAME_MAX + 1];
|
||||
int disk_space_mb;
|
||||
size_t retention_seconds;
|
||||
size_t tier;
|
||||
int ret;
|
||||
};
|
||||
|
||||
void *dbengine_tier_init(void *ptr) {
|
||||
struct dbengine_initialization *dbi = ptr;
|
||||
dbi->ret = rrdeng_init(NULL, dbi->path, dbi->disk_space_mb, dbi->tier, dbi->retention_seconds);
|
||||
return ptr;
|
||||
}
|
||||
|
||||
RRD_BACKFILL get_dbengine_backfill(RRD_BACKFILL backfill)
|
||||
{
|
||||
const char *bf = config_get(
|
||||
CONFIG_SECTION_DB,
|
||||
"dbengine tier backfill",
|
||||
backfill == RRD_BACKFILL_NEW ? "new" :
|
||||
backfill == RRD_BACKFILL_FULL ? "full" :
|
||||
"none");
|
||||
|
||||
if (strcmp(bf, "new") == 0)
|
||||
backfill = RRD_BACKFILL_NEW;
|
||||
else if (strcmp(bf, "full") == 0)
|
||||
backfill = RRD_BACKFILL_FULL;
|
||||
else if (strcmp(bf, "none") == 0)
|
||||
backfill = RRD_BACKFILL_NONE;
|
||||
else {
|
||||
nd_log(NDLS_DAEMON, NDLP_WARNING, "DBENGINE: unknown backfill value '%s', assuming 'new'", bf);
|
||||
config_set(CONFIG_SECTION_DB, "dbengine tier backfill", "new");
|
||||
backfill = RRD_BACKFILL_NEW;
|
||||
}
|
||||
return backfill;
|
||||
}
|
||||
#endif
|
||||
|
||||
void netdata_conf_dbengine_init(const char *hostname) {
|
||||
#ifdef ENABLE_DBENGINE
|
||||
|
||||
// ----------------------------------------------------------------------------------------------------------------
|
||||
// out of memory protection and use all ram for caches
|
||||
|
||||
dbengine_out_of_memory_protection = 0; // will be calculated below
|
||||
OS_SYSTEM_MEMORY sm = os_system_memory(true);
|
||||
if(sm.ram_total_bytes && sm.ram_available_bytes && sm.ram_total_bytes > sm.ram_available_bytes) {
|
||||
// calculate the default out of memory protection size
|
||||
char buf[64];
|
||||
size_snprintf(buf, sizeof(buf), sm.ram_total_bytes / 10, "B", false);
|
||||
size_parse(buf, &dbengine_out_of_memory_protection, "B");
|
||||
}

    if(dbengine_out_of_memory_protection) {
        dbengine_use_all_ram_for_caches = config_get_boolean(CONFIG_SECTION_DB, "dbengine use all ram for caches", dbengine_use_all_ram_for_caches);
        dbengine_out_of_memory_protection = config_get_size_bytes(CONFIG_SECTION_DB, "dbengine out of memory protection", dbengine_out_of_memory_protection);

        char buf_total[64], buf_avail[64], buf_oom[64];
        size_snprintf(buf_total, sizeof(buf_total), sm.ram_total_bytes, "B", false);
        size_snprintf(buf_avail, sizeof(buf_avail), sm.ram_available_bytes, "B", false);
        size_snprintf(buf_oom, sizeof(buf_oom), dbengine_out_of_memory_protection, "B", false);

        nd_log(NDLS_DAEMON, NDLP_NOTICE,
               "DBENGINE Out of Memory Protection. "
               "System Memory Total: %s, Currently Available: %s, Out of Memory Protection: %s, Use All RAM: %s",
               buf_total, buf_avail, buf_oom, dbengine_use_all_ram_for_caches ? "enabled" : "disabled");
    }
    else {
        dbengine_out_of_memory_protection = 0;
        dbengine_use_all_ram_for_caches = false;

        nd_log(NDLS_DAEMON, NDLP_WARNING,
               "DBENGINE Out of Memory Protection and Use All RAM cannot be enabled. "
               "Failed to detect memory size on this system.");
    }

    // ----------------------------------------------------------------------------------------------------------------

    dbengine_use_direct_io = config_get_boolean(CONFIG_SECTION_DB, "dbengine use direct io", dbengine_use_direct_io);

    unsigned read_num = (unsigned)config_get_number(CONFIG_SECTION_DB, "dbengine pages per extent", DEFAULT_PAGES_PER_EXTENT);
    if (read_num > 0 && read_num <= DEFAULT_PAGES_PER_EXTENT)
        rrdeng_pages_per_extent = read_num;
    else {
        nd_log(NDLS_DAEMON, NDLP_WARNING,
               "Invalid dbengine pages per extent %u given. Using %u.",
               read_num, rrdeng_pages_per_extent);

        config_set_number(CONFIG_SECTION_DB, "dbengine pages per extent", rrdeng_pages_per_extent);
    }

    storage_tiers = config_get_number(CONFIG_SECTION_DB, "storage tiers", storage_tiers);
    if(storage_tiers < 1) {
        nd_log(NDLS_DAEMON, NDLP_WARNING, "At least 1 storage tier is required. Assuming 1.");

        storage_tiers = 1;
        config_set_number(CONFIG_SECTION_DB, "storage tiers", storage_tiers);
    }
    if(storage_tiers > RRD_STORAGE_TIERS) {
        nd_log(NDLS_DAEMON, NDLP_WARNING,
               "Up to %d storage tiers are supported. Assuming %d.",
               RRD_STORAGE_TIERS, RRD_STORAGE_TIERS);

        storage_tiers = RRD_STORAGE_TIERS;
        config_set_number(CONFIG_SECTION_DB, "storage tiers", storage_tiers);
    }

    new_dbengine_defaults =
        (!legacy_multihost_db_space &&
         !config_exists(CONFIG_SECTION_DB, "dbengine tier 1 update every iterations") &&
         !config_exists(CONFIG_SECTION_DB, "dbengine tier 2 update every iterations") &&
         !config_exists(CONFIG_SECTION_DB, "dbengine tier 3 update every iterations") &&
         !config_exists(CONFIG_SECTION_DB, "dbengine tier 4 update every iterations") &&
         !config_exists(CONFIG_SECTION_DB, "dbengine tier 1 retention size") &&
         !config_exists(CONFIG_SECTION_DB, "dbengine tier 2 retention size") &&
         !config_exists(CONFIG_SECTION_DB, "dbengine tier 3 retention size") &&
         !config_exists(CONFIG_SECTION_DB, "dbengine tier 4 retention size"));
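    // The new dbengine defaults are used only when none of the legacy per-tier options above
    // (and no legacy multihost disk space setting) exist in netdata.conf; when legacy options
    // are present, the per-tier retention time further below defaults to 0 instead.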

    default_backfill = get_dbengine_backfill(RRD_BACKFILL_NEW);
    char dbengineconfig[200 + 1];

    size_t grouping_iterations = default_rrd_update_every;
    storage_tiers_grouping_iterations[0] = default_rrd_update_every;

    for (size_t tier = 1; tier < storage_tiers; tier++) {
        grouping_iterations = storage_tiers_grouping_iterations[tier];
        snprintfz(dbengineconfig, sizeof(dbengineconfig) - 1, "dbengine tier %zu update every iterations", tier);
        grouping_iterations = config_get_number(CONFIG_SECTION_DB, dbengineconfig, grouping_iterations);
        if(grouping_iterations < 2) {
            grouping_iterations = 2;
            config_set_number(CONFIG_SECTION_DB, dbengineconfig, grouping_iterations);
            nd_log(NDLS_DAEMON, NDLP_WARNING,
                   "DBENGINE on '%s': 'dbengine tier %zu update every iterations' cannot be less than 2. Assuming 2.",
                   hostname, tier);
        }
        storage_tiers_grouping_iterations[tier] = grouping_iterations;
    }
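    // These per-tier iteration counts compound across tiers (see get_tier_grouping()): for
    // example, with 'update every = 1' and 60 iterations at each tier, tier 1 would hold
    // roughly per-minute points and tier 2 roughly per-hour points.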

    default_multidb_disk_quota_mb = (int) config_get_size_mb(CONFIG_SECTION_DB, "dbengine tier 0 retention size", RRDENG_DEFAULT_TIER_DISK_SPACE_MB);
    if(default_multidb_disk_quota_mb && default_multidb_disk_quota_mb < RRDENG_MIN_DISK_SPACE_MB) {
        netdata_log_error("Invalid disk space %d for tier 0 given. Defaulting to %d.", default_multidb_disk_quota_mb, RRDENG_MIN_DISK_SPACE_MB);
        default_multidb_disk_quota_mb = RRDENG_MIN_DISK_SPACE_MB;
        config_set_size_mb(CONFIG_SECTION_DB, "dbengine tier 0 retention size", default_multidb_disk_quota_mb);
    }

#ifdef OS_WINDOWS
    // FIXME: for whatever reason joining the initialization threads
    // fails on Windows.
    bool parallel_initialization = false;
#else
    bool parallel_initialization = (storage_tiers <= (size_t)get_netdata_cpus()) ? true : false;
#endif
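    // Tiers are initialized concurrently (one thread per tier) only when there are at least as
    // many CPUs as storage tiers; otherwise, and always on Windows, they are initialized serially.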

    struct dbengine_initialization tiers_init[RRD_STORAGE_TIERS] = {};

    size_t created_tiers = 0;
    char dbenginepath[FILENAME_MAX + 1];

    for (size_t tier = 0; tier < storage_tiers; tier++) {

        if (tier == 0)
            snprintfz(dbenginepath, FILENAME_MAX, "%s/dbengine", netdata_configured_cache_dir);
        else
            snprintfz(dbenginepath, FILENAME_MAX, "%s/dbengine-tier%zu", netdata_configured_cache_dir, tier);

        int ret = mkdir(dbenginepath, 0775);
        if (ret != 0 && errno != EEXIST) {
            nd_log(NDLS_DAEMON, NDLP_CRIT, "DBENGINE on '%s': cannot create directory '%s'", hostname, dbenginepath);
            continue;
        }

        int disk_space_mb = tier ? RRDENG_DEFAULT_TIER_DISK_SPACE_MB : default_multidb_disk_quota_mb;
        snprintfz(dbengineconfig, sizeof(dbengineconfig) - 1, "dbengine tier %zu retention size", tier);
        disk_space_mb = config_get_size_mb(CONFIG_SECTION_DB, dbengineconfig, disk_space_mb);

        snprintfz(dbengineconfig, sizeof(dbengineconfig) - 1, "dbengine tier %zu retention time", tier);
        storage_tiers_retention_days[tier] = config_get_duration_days(
            CONFIG_SECTION_DB, dbengineconfig, new_dbengine_defaults ? storage_tiers_retention_days[tier] : 0);

        tiers_init[tier].disk_space_mb = (int) disk_space_mb;
        tiers_init[tier].tier = tier;
        tiers_init[tier].retention_seconds = (size_t) (86400.0 * storage_tiers_retention_days[tier]);
        strncpyz(tiers_init[tier].path, dbenginepath, FILENAME_MAX);
        tiers_init[tier].ret = 0;

        if(parallel_initialization) {
            char tag[NETDATA_THREAD_TAG_MAX + 1];
            snprintfz(tag, NETDATA_THREAD_TAG_MAX, "DBENGINIT[%zu]", tier);
            tiers_init[tier].thread = nd_thread_create(tag, NETDATA_THREAD_OPTION_JOINABLE, dbengine_tier_init, &tiers_init[tier]);
        }
        else
            dbengine_tier_init(&tiers_init[tier]);
    }

    for(size_t tier = 0; tier < storage_tiers ;tier++) {
        if(parallel_initialization)
            nd_thread_join(tiers_init[tier].thread);

        if(tiers_init[tier].ret != 0) {
            nd_log(NDLS_DAEMON, NDLP_ERR,
                   "DBENGINE on '%s': Failed to initialize multi-host database tier %zu on path '%s'",
                   hostname, tiers_init[tier].tier, tiers_init[tier].path);
        }
        else if(created_tiers == tier)
            created_tiers++;
    }
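    // created_tiers only counts a contiguous run of successfully initialized tiers starting at
    // tier 0; a failure at any tier stops the count even if a higher tier initialized fine, so
    // the agent never continues with a gap in the tier hierarchy.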

    if(created_tiers && created_tiers < storage_tiers) {
        nd_log(NDLS_DAEMON, NDLP_WARNING,
               "DBENGINE on '%s': Managed to create %zu tiers instead of %zu. Continuing with %zu available.",
               hostname, created_tiers, storage_tiers, created_tiers);

        storage_tiers = created_tiers;
    }
    else if(!created_tiers)
        fatal("DBENGINE on '%s', failed to initialize databases at '%s'.", hostname, netdata_configured_cache_dir);

    for(size_t tier = 0; tier < storage_tiers ;tier++)
        rrdeng_readiness_wait(multidb_ctx[tier]);

    calculate_tier_disk_space_percentage();

    dbengine_enabled = true;
#else
    storage_tiers = config_get_number(CONFIG_SECTION_DB, "storage tiers", 1);
    if(storage_tiers != 1) {
        nd_log(NDLS_DAEMON, NDLP_WARNING,
               "DBENGINE is not available on '%s', so only 1 database tier can be supported.",
               hostname);

        storage_tiers = 1;
        config_set_number(CONFIG_SECTION_DB, "storage tiers", storage_tiers);
    }
    dbengine_enabled = false;
#endif
}

void netdata_conf_section_db(void) {
    static bool run = false;
    if(run) return;
    run = true;

    // ------------------------------------------------------------------------

    rrdhost_free_orphan_time_s =
        config_get_duration_seconds(CONFIG_SECTION_DB, "cleanup orphan hosts after", rrdhost_free_orphan_time_s);

    // ------------------------------------------------------------------------
    // get default database update frequency

    default_rrd_update_every = (int) config_get_duration_seconds(CONFIG_SECTION_DB, "update every", UPDATE_EVERY);
    if(default_rrd_update_every < 1 || default_rrd_update_every > 600) {
        netdata_log_error("Invalid data collection frequency (update every) %d given. Defaulting to %d.", default_rrd_update_every, UPDATE_EVERY);
        default_rrd_update_every = UPDATE_EVERY;
        config_set_duration_seconds(CONFIG_SECTION_DB, "update every", default_rrd_update_every);
    }

    // ------------------------------------------------------------------------
    // get the database selection

    {
        const char *mode = config_get(CONFIG_SECTION_DB, "db", rrd_memory_mode_name(default_rrd_memory_mode));
        default_rrd_memory_mode = rrd_memory_mode_id(mode);
        if(strcmp(mode, rrd_memory_mode_name(default_rrd_memory_mode)) != 0) {
            netdata_log_error("Invalid memory mode '%s' given. Using '%s'", mode, rrd_memory_mode_name(default_rrd_memory_mode));
            config_set(CONFIG_SECTION_DB, "db", rrd_memory_mode_name(default_rrd_memory_mode));
        }
    }
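    // The configured mode is validated by converting it to an id and back to a name: if the
    // round-trip does not reproduce the configured string, the value was unknown and the
    // default memory mode is used and written back to netdata.conf.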

    // ------------------------------------------------------------------------
    // get default database size

    if(default_rrd_memory_mode != RRD_MEMORY_MODE_DBENGINE && default_rrd_memory_mode != RRD_MEMORY_MODE_NONE) {
        default_rrd_history_entries = (int)config_get_number(
            CONFIG_SECTION_DB, "retention",
            align_entries_to_pagesize(default_rrd_memory_mode, RRD_DEFAULT_HISTORY_ENTRIES));

        long h = align_entries_to_pagesize(default_rrd_memory_mode, default_rrd_history_entries);
        if (h != default_rrd_history_entries) {
            config_set_number(CONFIG_SECTION_DB, "retention", h);
            default_rrd_history_entries = (int)h;
        }
    }

    // --------------------------------------------------------------------
    // get KSM settings

#ifdef MADV_MERGEABLE
    enable_ksm = config_get_boolean_ondemand(CONFIG_SECTION_DB, "memory deduplication (ksm)", enable_ksm);
#endif

    // --------------------------------------------------------------------

    rrdhost_free_ephemeral_time_s =
        config_get_duration_seconds(CONFIG_SECTION_DB, "cleanup ephemeral hosts after", rrdhost_free_ephemeral_time_s);

    rrdset_free_obsolete_time_s =
        config_get_duration_seconds(CONFIG_SECTION_DB, "cleanup obsolete charts after", rrdset_free_obsolete_time_s);

    // Current chart locking and invalidation scheme doesn't prevent Netdata from segmentation faults if a short
    // cleanup delay is set. Extensive stress tests showed that 10 seconds is quite a safe delay. Look at
    // https://github.com/netdata/netdata/pull/11222#issuecomment-868367920 for more information.
    if (rrdset_free_obsolete_time_s < 10) {
        rrdset_free_obsolete_time_s = 10;
        netdata_log_info("The \"cleanup obsolete charts after\" option was set to 10 seconds.");
        config_set_duration_seconds(CONFIG_SECTION_DB, "cleanup obsolete charts after", rrdset_free_obsolete_time_s);
    }

    gap_when_lost_iterations_above = (int)config_get_number(CONFIG_SECTION_DB, "gap when lost iterations above", gap_when_lost_iterations_above);
    if (gap_when_lost_iterations_above < 1) {
        gap_when_lost_iterations_above = 1;
        config_set_number(CONFIG_SECTION_DB, "gap when lost iterations above", gap_when_lost_iterations_above);
    }
    gap_when_lost_iterations_above += 2;

    // ------------------------------------------------------------------------

    netdata_conf_dbengine_pre_logs();
}
src/daemon/config/netdata-conf-db.h (new file, 24 lines)
@@ -0,0 +1,24 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_DAEMON_NETDATA_CONF_DBENGINE_H
#define NETDATA_DAEMON_NETDATA_CONF_DBENGINE_H

#include "libnetdata/libnetdata.h"

extern bool dbengine_enabled;
extern size_t storage_tiers;
extern bool dbengine_use_direct_io;

extern int default_rrd_update_every;
extern int default_rrd_history_entries;
extern int gap_when_lost_iterations_above;
extern time_t rrdset_free_obsolete_time_s;

size_t get_tier_grouping(size_t tier);

void netdata_conf_section_db(void);
void netdata_conf_dbengine_init(const char *hostname);

#include "netdata-conf.h"

#endif //NETDATA_DAEMON_NETDATA_CONF_DBENGINE_H
src/daemon/config/netdata-conf-directories.c (new file, 31 lines)
@@ -0,0 +1,31 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#include "netdata-conf-directories.h"

static const char *get_varlib_subdir_from_config(const char *prefix, const char *dir) {
    char filename[FILENAME_MAX + 1];
    snprintfz(filename, FILENAME_MAX, "%s/%s", prefix, dir);
    return config_get(CONFIG_SECTION_DIRECTORIES, dir, filename);
}
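// get_varlib_subdir_from_config() builds the default path as "<lib directory>/<dir>" but lets
// the [directories] section override it; e.g. with the typical /var/lib/netdata lib directory,
// "lock" resolves to /var/lib/netdata/lock unless a "lock" entry exists in [directories].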

void netdata_conf_section_directories(void) {
    static bool run = false;
    if(run) return;
    run = true;

    // ------------------------------------------------------------------------
    // get system paths

    netdata_configured_user_config_dir = config_get(CONFIG_SECTION_DIRECTORIES, "config", netdata_configured_user_config_dir);
    netdata_configured_stock_config_dir = config_get(CONFIG_SECTION_DIRECTORIES, "stock config", netdata_configured_stock_config_dir);
    netdata_configured_log_dir = config_get(CONFIG_SECTION_DIRECTORIES, "log", netdata_configured_log_dir);
    netdata_configured_web_dir = config_get(CONFIG_SECTION_DIRECTORIES, "web", netdata_configured_web_dir);
    netdata_configured_cache_dir = config_get(CONFIG_SECTION_DIRECTORIES, "cache", netdata_configured_cache_dir);
    netdata_configured_varlib_dir = config_get(CONFIG_SECTION_DIRECTORIES, "lib", netdata_configured_varlib_dir);

    netdata_configured_lock_dir = get_varlib_subdir_from_config(netdata_configured_varlib_dir, "lock");
    netdata_configured_cloud_dir = get_varlib_subdir_from_config(netdata_configured_varlib_dir, "cloud.d");

    pluginsd_initialize_plugin_directories();
    netdata_configured_primary_plugins_dir = plugin_directories[PLUGINSD_STOCK_PLUGINS_DIRECTORY_PATH];
}
src/daemon/config/netdata-conf-directories.h (new file, 10 lines)
@@ -0,0 +1,10 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_NETDATA_CONF_DIRECTORIES_H
#define NETDATA_NETDATA_CONF_DIRECTORIES_H

#include "netdata-conf.h"

void netdata_conf_section_directories(void);

#endif //NETDATA_NETDATA_CONF_DIRECTORIES_H
src/daemon/config/netdata-conf-global.c (new file, 57 lines)
@@ -0,0 +1,57 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#include "netdata-conf-global.h"

static int get_hostname(char *buf, size_t buf_size) {
    if (netdata_configured_host_prefix && *netdata_configured_host_prefix) {
        char filename[FILENAME_MAX + 1];
        snprintfz(filename, FILENAME_MAX, "%s/etc/hostname", netdata_configured_host_prefix);

        if (!read_txt_file(filename, buf, buf_size)) {
            trim(buf);
            return 0;
        }
    }

    return gethostname(buf, buf_size);
}
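// get_hostname(): when a host access prefix is configured (typically when Netdata runs in a
// container with the host filesystem mounted), the hostname is read from the host's
// /etc/hostname; otherwise it falls back to gethostname() of the local environment.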

void netdata_conf_section_global(void) {
    netdata_conf_backwards_compatibility();

    // ------------------------------------------------------------------------
    // get the hostname

    netdata_configured_host_prefix = config_get(CONFIG_SECTION_GLOBAL, "host access prefix", "");
    (void) verify_netdata_host_prefix(true);

    char buf[HOSTNAME_MAX + 1];
    if (get_hostname(buf, HOSTNAME_MAX))
        netdata_log_error("Cannot get machine hostname.");

    netdata_configured_hostname = config_get(CONFIG_SECTION_GLOBAL, "hostname", buf);
    netdata_log_debug(D_OPTIONS, "hostname set to '%s'", netdata_configured_hostname);

    netdata_conf_section_directories();
    netdata_conf_section_db();

    // --------------------------------------------------------------------
    // get various system parameters

    os_get_system_cpus_uncached();
    os_get_system_pid_max();
}

void netdata_conf_section_global_run_as_user(const char **user) {
    // --------------------------------------------------------------------
    // get the user we should run

    // IMPORTANT: this is required before web_files_uid()
    if(getuid() == 0) {
        *user = config_get(CONFIG_SECTION_GLOBAL, "run as user", NETDATA_USER);
    }
    else {
        struct passwd *passwd = getpwuid(getuid());
        *user = config_get(CONFIG_SECTION_GLOBAL, "run as user", (passwd && passwd->pw_name)?passwd->pw_name:"");
    }
}
src/daemon/config/netdata-conf-global.h (new file, 13 lines)
@@ -0,0 +1,13 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_NETDATA_CONF_GLOBAL_H
#define NETDATA_NETDATA_CONF_GLOBAL_H

#include "libnetdata/libnetdata.h"

void netdata_conf_section_global(void);
void netdata_conf_section_global_run_as_user(const char **user);

#include "netdata-conf.h"

#endif //NETDATA_NETDATA_CONF_GLOBAL_H
src/daemon/config/netdata-conf-logs.c (new file, 82 lines)
@@ -0,0 +1,82 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#include "netdata-conf-logs.h"

void netdata_conf_section_logs(void) {
    static bool run = false;
    if(run) return;
    run = true;

    nd_log_set_facility(config_get(CONFIG_SECTION_LOGS, "facility", "daemon"));

    time_t period = ND_LOG_DEFAULT_THROTTLE_PERIOD;
    size_t logs = ND_LOG_DEFAULT_THROTTLE_LOGS;
    period = config_get_duration_seconds(CONFIG_SECTION_LOGS, "logs flood protection period", period);
    logs = (unsigned long)config_get_number(CONFIG_SECTION_LOGS, "logs to trigger flood protection", (long long int)logs);
    nd_log_set_flood_protection(logs, period);

    const char *netdata_log_level = getenv("NETDATA_LOG_LEVEL");
    netdata_log_level = netdata_log_level ? nd_log_id2priority(nd_log_priority2id(netdata_log_level)) : NDLP_INFO_STR;

    nd_log_set_priority_level(config_get(CONFIG_SECTION_LOGS, "level", netdata_log_level));

    char filename[FILENAME_MAX + 1];
    char* os_default_method = NULL;
#if defined(OS_LINUX)
    os_default_method = is_stderr_connected_to_journal() /* || nd_log_journal_socket_available() */ ? "journal" : NULL;
#elif defined(OS_WINDOWS)
#if defined(HAVE_ETW)
    os_default_method = "etw";
#elif defined(HAVE_WEL)
    os_default_method = "wel";
#endif
#endif
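    // Default log destination: on Linux, systemd-journal when stderr is already connected to
    // journald; on Windows, ETW or WEL when available at build time; when no OS-specific method
    // applies, the per-stream files below (debug.log, daemon.log, collector.log, ...) are used.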

#if defined(OS_WINDOWS)
    // on windows, debug log goes to windows events
    snprintfz(filename, FILENAME_MAX, "%s", os_default_method);
#else
    snprintfz(filename, FILENAME_MAX, "%s/debug.log", netdata_configured_log_dir);
#endif

    nd_log_set_user_settings(NDLS_DEBUG, config_get(CONFIG_SECTION_LOGS, "debug", filename));

    if(os_default_method)
        snprintfz(filename, FILENAME_MAX, "%s", os_default_method);
    else
        snprintfz(filename, FILENAME_MAX, "%s/daemon.log", netdata_configured_log_dir);
    nd_log_set_user_settings(NDLS_DAEMON, config_get(CONFIG_SECTION_LOGS, "daemon", filename));

    if(os_default_method)
        snprintfz(filename, FILENAME_MAX, "%s", os_default_method);
    else
        snprintfz(filename, FILENAME_MAX, "%s/collector.log", netdata_configured_log_dir);
    nd_log_set_user_settings(NDLS_COLLECTORS, config_get(CONFIG_SECTION_LOGS, "collector", filename));

#if defined(OS_WINDOWS)
    // on windows, access log goes to windows events
    snprintfz(filename, FILENAME_MAX, "%s", os_default_method);
#else
    snprintfz(filename, FILENAME_MAX, "%s/access.log", netdata_configured_log_dir);
#endif
    nd_log_set_user_settings(NDLS_ACCESS, config_get(CONFIG_SECTION_LOGS, "access", filename));

    if(os_default_method)
        snprintfz(filename, FILENAME_MAX, "%s", os_default_method);
    else
        snprintfz(filename, FILENAME_MAX, "%s/health.log", netdata_configured_log_dir);
    nd_log_set_user_settings(NDLS_HEALTH, config_get(CONFIG_SECTION_LOGS, "health", filename));

    aclklog_enabled = config_get_boolean(CONFIG_SECTION_CLOUD, "conversation log", CONFIG_BOOLEAN_NO);
    if (aclklog_enabled) {
#if defined(OS_WINDOWS)
        // on windows, aclk log goes to windows events
        snprintfz(filename, FILENAME_MAX, "%s", os_default_method);
#else
        snprintfz(filename, FILENAME_MAX, "%s/aclk.log", netdata_configured_log_dir);
#endif
        nd_log_set_user_settings(NDLS_ACLK, config_get(CONFIG_SECTION_CLOUD, "conversation log file", filename));
    }

    aclk_config_get_query_scope();
}
src/daemon/config/netdata-conf-logs.h (new file, 10 lines)
@@ -0,0 +1,10 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_NETDATA_CONF_LOGS_H
#define NETDATA_NETDATA_CONF_LOGS_H

#include "netdata-conf.h"

void netdata_conf_section_logs(void);

#endif //NETDATA_NETDATA_CONF_LOGS_H
src/daemon/config/netdata-conf-web.c (new file, 143 lines)
@@ -0,0 +1,143 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#include "netdata-conf-web.h"
#include "daemon/static_threads.h"

static int make_dns_decision(const char *section_name, const char *config_name, const char *default_value, SIMPLE_PATTERN *p) {
    const char *value = config_get(section_name,config_name,default_value);

    if(!strcmp("yes",value))
        return 1;

    if(!strcmp("no",value))
        return 0;

    if(strcmp("heuristic",value) != 0)
        netdata_log_error("Invalid configuration option '%s' for '%s'/'%s'. Valid options are 'yes', 'no' and 'heuristic'. Proceeding with 'heuristic'",
                          value, section_name, config_name);

    return simple_pattern_is_potential_name(p);
}
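// make_dns_decision(): "heuristic" enables reverse-DNS matching only when the pattern could
// plausibly contain hostnames (simple_pattern_is_potential_name()), so a plain IP/wildcard
// list such as "10.* 192.168.*" does not trigger DNS lookups for every connection.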

extern struct netdata_static_thread *static_threads;
void web_server_threading_selection(void) {
    static bool run = false;
    if(run) return;
    run = true;

    web_server_mode = web_server_mode_id(config_get(CONFIG_SECTION_WEB, "mode", web_server_mode_name(web_server_mode)));

    int static_threaded = (web_server_mode == WEB_SERVER_MODE_STATIC_THREADED);

    int i;
    for (i = 0; static_threads[i].name; i++) {
        if (static_threads[i].start_routine == socket_listen_main_static_threaded)
            static_threads[i].enabled = static_threaded;
    }
}

void netdata_conf_section_web(void) {
    static bool run = false;
    if(run) return;
    run = true;

    web_client_timeout =
        (int)config_get_duration_seconds(CONFIG_SECTION_WEB, "disconnect idle clients after", web_client_timeout);

    web_client_first_request_timeout =
        (int)config_get_duration_seconds(CONFIG_SECTION_WEB, "timeout for first request", web_client_first_request_timeout);

    web_client_streaming_rate_t =
        config_get_duration_seconds(CONFIG_SECTION_WEB, "accept a streaming request every", web_client_streaming_rate_t);

    respect_web_browser_do_not_track_policy =
        config_get_boolean(CONFIG_SECTION_WEB, "respect do not track policy", respect_web_browser_do_not_track_policy);
    web_x_frame_options = config_get(CONFIG_SECTION_WEB, "x-frame-options response header", "");
    if(!*web_x_frame_options)
        web_x_frame_options = NULL;

    web_allow_connections_from =
        simple_pattern_create(config_get(CONFIG_SECTION_WEB, "allow connections from", "localhost *"),
                              NULL, SIMPLE_PATTERN_EXACT, true);
    web_allow_connections_dns =
        make_dns_decision(CONFIG_SECTION_WEB, "allow connections by dns", "heuristic", web_allow_connections_from);
    web_allow_dashboard_from =
        simple_pattern_create(config_get(CONFIG_SECTION_WEB, "allow dashboard from", "localhost *"),
                              NULL, SIMPLE_PATTERN_EXACT, true);
    web_allow_dashboard_dns =
        make_dns_decision(CONFIG_SECTION_WEB, "allow dashboard by dns", "heuristic", web_allow_dashboard_from);
    web_allow_badges_from =
        simple_pattern_create(config_get(CONFIG_SECTION_WEB, "allow badges from", "*"), NULL, SIMPLE_PATTERN_EXACT,
                              true);
    web_allow_badges_dns =
        make_dns_decision(CONFIG_SECTION_WEB, "allow badges by dns", "heuristic", web_allow_badges_from);
    web_allow_registry_from =
        simple_pattern_create(config_get(CONFIG_SECTION_REGISTRY, "allow from", "*"), NULL, SIMPLE_PATTERN_EXACT,
                              true);
    web_allow_registry_dns = make_dns_decision(CONFIG_SECTION_REGISTRY, "allow by dns", "heuristic",
                                               web_allow_registry_from);
    web_allow_streaming_from = simple_pattern_create(config_get(CONFIG_SECTION_WEB, "allow streaming from", "*"),
                                                     NULL, SIMPLE_PATTERN_EXACT, true);
    web_allow_streaming_dns = make_dns_decision(CONFIG_SECTION_WEB, "allow streaming by dns", "heuristic",
                                                web_allow_streaming_from);
    // Note the default is not heuristic, the wildcards could match DNS but the intent is ip-addresses.
    web_allow_netdataconf_from = simple_pattern_create(config_get(CONFIG_SECTION_WEB, "allow netdata.conf from",
                                                                  "localhost fd* 10.* 192.168.* 172.16.* 172.17.* 172.18.*"
                                                                  " 172.19.* 172.20.* 172.21.* 172.22.* 172.23.* 172.24.*"
                                                                  " 172.25.* 172.26.* 172.27.* 172.28.* 172.29.* 172.30.*"
                                                                  " 172.31.* UNKNOWN"), NULL, SIMPLE_PATTERN_EXACT,
                                                       true);
    web_allow_netdataconf_dns =
        make_dns_decision(CONFIG_SECTION_WEB, "allow netdata.conf by dns", "no", web_allow_netdataconf_from);
    web_allow_mgmt_from =
        simple_pattern_create(config_get(CONFIG_SECTION_WEB, "allow management from", "localhost"),
                              NULL, SIMPLE_PATTERN_EXACT, true);
    web_allow_mgmt_dns =
        make_dns_decision(CONFIG_SECTION_WEB, "allow management by dns","heuristic",web_allow_mgmt_from);

    web_enable_gzip = config_get_boolean(CONFIG_SECTION_WEB, "enable gzip compression", web_enable_gzip);

    const char *s = config_get(CONFIG_SECTION_WEB, "gzip compression strategy", "default");
    if(!strcmp(s, "default"))
        web_gzip_strategy = Z_DEFAULT_STRATEGY;
    else if(!strcmp(s, "filtered"))
        web_gzip_strategy = Z_FILTERED;
    else if(!strcmp(s, "huffman only"))
        web_gzip_strategy = Z_HUFFMAN_ONLY;
    else if(!strcmp(s, "rle"))
        web_gzip_strategy = Z_RLE;
    else if(!strcmp(s, "fixed"))
        web_gzip_strategy = Z_FIXED;
    else {
        netdata_log_error("Invalid compression strategy '%s'. Valid strategies are 'default', 'filtered', 'huffman only', 'rle' and 'fixed'. Proceeding with 'default'.", s);
        web_gzip_strategy = Z_DEFAULT_STRATEGY;
    }

    web_gzip_level = (int)config_get_number(CONFIG_SECTION_WEB, "gzip compression level", 3);
    if(web_gzip_level < 1) {
        netdata_log_error("Invalid compression level %d. Valid levels are 1 (fastest) to 9 (best ratio). Proceeding with level 1 (fastest compression).", web_gzip_level);
        web_gzip_level = 1;
    }
    else if(web_gzip_level > 9) {
        netdata_log_error("Invalid compression level %d. Valid levels are 1 (fastest) to 9 (best ratio). Proceeding with level 9 (best compression).", web_gzip_level);
        web_gzip_level = 9;
    }
}

void netdata_conf_web_security_init(void) {
    static bool run = false;
    if(run) return;
    run = true;

    char filename[FILENAME_MAX + 1];
    snprintfz(filename, FILENAME_MAX, "%s/ssl/key.pem", netdata_configured_user_config_dir);
    netdata_ssl_security_key = config_get(CONFIG_SECTION_WEB, "ssl key", filename);

    snprintfz(filename, FILENAME_MAX, "%s/ssl/cert.pem", netdata_configured_user_config_dir);
    netdata_ssl_security_cert = config_get(CONFIG_SECTION_WEB, "ssl certificate", filename);

    tls_version = config_get(CONFIG_SECTION_WEB, "tls version", "1.3");
    tls_ciphers = config_get(CONFIG_SECTION_WEB, "tls ciphers", "none");

    netdata_ssl_initialize_openssl();
}
src/daemon/config/netdata-conf-web.h (new file, 12 lines)
@@ -0,0 +1,12 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_NETDATA_CONF_WEB_H
#define NETDATA_NETDATA_CONF_WEB_H

#include "netdata-conf.h"

void netdata_conf_section_web(void);
void web_server_threading_selection(void);
void netdata_conf_web_security_init(void);

#endif //NETDATA_NETDATA_CONF_WEB_H
src/daemon/config/netdata-conf.c (new file, 34 lines)
@@ -0,0 +1,34 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#include "netdata-conf.h"

bool netdata_conf_load(char *filename, char overwrite_used, const char **user) {
    errno_clear();

    int ret = 0;

    if(filename && *filename) {
        ret = config_load(filename, overwrite_used, NULL);
        if(!ret)
            netdata_log_error("CONFIG: cannot load config file '%s'.", filename);
    }
    else {
        filename = filename_from_path_entry_strdupz(netdata_configured_user_config_dir, "netdata.conf");

        ret = config_load(filename, overwrite_used, NULL);
        if(!ret) {
            netdata_log_info("CONFIG: cannot load user config '%s'. Will try the stock version.", filename);
            freez(filename);

            filename = filename_from_path_entry_strdupz(netdata_configured_stock_config_dir, "netdata.conf");
            ret = config_load(filename, overwrite_used, NULL);
            if(!ret)
                netdata_log_info("CONFIG: cannot load stock config '%s'. Running with internal defaults.", filename);
        }

        freez(filename);
    }

    netdata_conf_section_global_run_as_user(user);
    return ret;
}
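// netdata_conf_load() load order: an explicitly given file wins; otherwise the user
// netdata.conf is tried first, then the stock netdata.conf, and if neither loads the
// daemon continues with its internal defaults.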
src/daemon/config/netdata-conf.h (new file, 19 lines)
@@ -0,0 +1,19 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_DAEMON_NETDATA_CONF_H
#define NETDATA_DAEMON_NETDATA_CONF_H

#include "libnetdata/libnetdata.h"

bool netdata_conf_load(char *filename, char overwrite_used, const char **user);

#include "netdata-conf-backwards-compatibility.h"
#include "netdata-conf-db.h"
#include "netdata-conf-directories.h"
#include "netdata-conf-global.h"
#include "netdata-conf-logs.h"
#include "netdata-conf-web.h"

#include "daemon/common.h"

#endif //NETDATA_DAEMON_NETDATA_CONF_H

@@ -6,6 +6,45 @@
char *pidfile = NULL;
char *netdata_exe_path = NULL;

long get_netdata_cpus(void) {
    static long processors = 0;

    if(processors)
        return processors;

    long cores_proc_stat = os_get_system_cpus_cached(false, true);
    long cores_cpuset_v1 = (long)os_read_cpuset_cpus("/sys/fs/cgroup/cpuset/cpuset.cpus", cores_proc_stat);
    long cores_cpuset_v2 = (long)os_read_cpuset_cpus("/sys/fs/cgroup/cpuset.cpus", cores_proc_stat);

    if(cores_cpuset_v2)
        processors = cores_cpuset_v2;
    else if(cores_cpuset_v1)
        processors = cores_cpuset_v1;
    else
        processors = cores_proc_stat;

    long cores_user_configured = config_get_number(CONFIG_SECTION_GLOBAL, "cpu cores", processors);

    errno_clear();
    internal_error(true,
        "System CPUs: %ld, ("
        "system: %ld, cgroups cpuset v1: %ld, cgroups cpuset v2: %ld, netdata.conf: %ld"
        ")"
        , processors
        , cores_proc_stat
        , cores_cpuset_v1
        , cores_cpuset_v2
        , cores_user_configured
    );

    processors = cores_user_configured;

    if(processors < 1)
        processors = 1;

    return processors;
}
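// get_netdata_cpus() precedence: cgroup v2 cpuset, then cgroup v1 cpuset, then the system
// count from /proc/stat; the result can still be overridden with 'cpu cores' in netdata.conf
// and is clamped to at least 1.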

void get_netdata_execution_path(void) {
    struct passwd *passwd = getpwuid(getuid());
    char *user = (passwd && passwd->pw_name) ? passwd->pw_name : "";

File diff suppressed because it is too large
@ -1,64 +0,0 @@
|
|||
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#ifndef NETDATA_GLOBAL_STATISTICS_H
|
||||
#define NETDATA_GLOBAL_STATISTICS_H 1
|
||||
|
||||
#include "database/rrd.h"
|
||||
|
||||
extern struct netdata_buffers_statistics {
|
||||
size_t rrdhost_allocations_size;
|
||||
size_t rrdhost_senders;
|
||||
size_t rrdhost_receivers;
|
||||
size_t query_targets_size;
|
||||
size_t rrdset_done_rda_size;
|
||||
size_t buffers_aclk;
|
||||
size_t buffers_api;
|
||||
size_t buffers_functions;
|
||||
size_t buffers_sqlite;
|
||||
size_t buffers_exporters;
|
||||
size_t buffers_health;
|
||||
size_t buffers_streaming;
|
||||
size_t cbuffers_streaming;
|
||||
size_t buffers_web;
|
||||
} netdata_buffers_statistics;
|
||||
|
||||
extern struct dictionary_stats dictionary_stats_category_collectors;
|
||||
extern struct dictionary_stats dictionary_stats_category_rrdhost;
|
||||
extern struct dictionary_stats dictionary_stats_category_rrdset_rrddim;
|
||||
extern struct dictionary_stats dictionary_stats_category_rrdcontext;
|
||||
extern struct dictionary_stats dictionary_stats_category_rrdlabels;
|
||||
extern struct dictionary_stats dictionary_stats_category_rrdhealth;
|
||||
extern struct dictionary_stats dictionary_stats_category_functions;
|
||||
extern struct dictionary_stats dictionary_stats_category_replication;
|
||||
|
||||
extern size_t rrddim_db_memory_size;
|
||||
|
||||
// ----------------------------------------------------------------------------
|
||||
// global statistics
|
||||
|
||||
void global_statistics_ml_query_completed(size_t points_read);
|
||||
void global_statistics_ml_models_consulted(size_t models_consulted);
|
||||
void global_statistics_exporters_query_completed(size_t points_read);
|
||||
void global_statistics_backfill_query_completed(size_t points_read);
|
||||
void global_statistics_rrdr_query_completed(size_t queries, uint64_t db_points_read, uint64_t result_points_generated, QUERY_SOURCE query_source);
|
||||
void global_statistics_sqlite3_query_completed(bool success, bool busy, bool locked);
|
||||
void global_statistics_sqlite3_row_completed(void);
|
||||
void global_statistics_rrdset_done_chart_collection_completed(size_t *points_read_per_tier_array);
|
||||
|
||||
void global_statistics_gorilla_buffer_add_hot();
|
||||
|
||||
void global_statistics_tier0_disk_compressed_bytes(uint32_t size);
|
||||
void global_statistics_tier0_disk_uncompressed_bytes(uint32_t size);
|
||||
|
||||
void global_statistics_web_request_completed(uint64_t dt,
|
||||
uint64_t bytes_received,
|
||||
uint64_t bytes_sent,
|
||||
uint64_t content_size,
|
||||
uint64_t compressed_content_size);
|
||||
|
||||
uint64_t global_statistics_web_client_connected(void);
|
||||
void global_statistics_web_client_disconnected(void);
|
||||
|
||||
extern bool global_statistics_enabled;
|
||||
|
||||
#endif /* NETDATA_GLOBAL_STATISTICS_H */
|
|
@ -19,42 +19,3 @@ const char *netdata_configured_abbrev_timezone = NULL;
|
|||
int32_t netdata_configured_utc_offset = 0;
|
||||
|
||||
bool netdata_ready = false;
|
||||
|
||||
long get_netdata_cpus(void) {
|
||||
static long processors = 0;
|
||||
|
||||
if(processors)
|
||||
return processors;
|
||||
|
||||
long cores_proc_stat = os_get_system_cpus_cached(false, true);
|
||||
long cores_cpuset_v1 = (long)os_read_cpuset_cpus("/sys/fs/cgroup/cpuset/cpuset.cpus", cores_proc_stat);
|
||||
long cores_cpuset_v2 = (long)os_read_cpuset_cpus("/sys/fs/cgroup/cpuset.cpus", cores_proc_stat);
|
||||
|
||||
if(cores_cpuset_v2)
|
||||
processors = cores_cpuset_v2;
|
||||
else if(cores_cpuset_v1)
|
||||
processors = cores_cpuset_v1;
|
||||
else
|
||||
processors = cores_proc_stat;
|
||||
|
||||
long cores_user_configured = config_get_number(CONFIG_SECTION_GLOBAL, "cpu cores", processors);
|
||||
|
||||
errno_clear();
|
||||
internal_error(true,
|
||||
"System CPUs: %ld, ("
|
||||
"system: %ld, cgroups cpuset v1: %ld, cgroups cpuset v2: %ld, netdata.conf: %ld"
|
||||
")"
|
||||
, processors
|
||||
, cores_proc_stat
|
||||
, cores_cpuset_v1
|
||||
, cores_cpuset_v2
|
||||
, cores_user_configured
|
||||
);
|
||||
|
||||
processors = cores_user_configured;
|
||||
|
||||
if(processors < 1)
|
||||
processors = 1;
|
||||
|
||||
return processors;
|
||||
}
|
||||
|
|
|
@ -44,6 +44,8 @@ void register_libuv_worker_jobs() {
|
|||
|
||||
// other dbengine events
|
||||
worker_register_job_name(UV_EVENT_DBENGINE_EVICT_MAIN_CACHE, "evict main");
|
||||
worker_register_job_name(UV_EVENT_DBENGINE_EVICT_OPEN_CACHE, "evict open");
|
||||
worker_register_job_name(UV_EVENT_DBENGINE_EVICT_EXTENT_CACHE, "evict extent");
|
||||
worker_register_job_name(UV_EVENT_DBENGINE_BUFFERS_CLEANUP, "dbengine buffers cleanup");
|
||||
worker_register_job_name(UV_EVENT_DBENGINE_QUIESCE, "dbengine quiesce");
|
||||
worker_register_job_name(UV_EVENT_DBENGINE_SHUTDOWN, "dbengine shutdown");
|
||||
|
|
|
@ -36,6 +36,8 @@ enum event_loop_job {
|
|||
|
||||
// other dbengine events
|
||||
UV_EVENT_DBENGINE_EVICT_MAIN_CACHE,
|
||||
UV_EVENT_DBENGINE_EVICT_OPEN_CACHE,
|
||||
UV_EVENT_DBENGINE_EVICT_EXTENT_CACHE,
|
||||
UV_EVENT_DBENGINE_BUFFERS_CLEANUP,
|
||||
UV_EVENT_DBENGINE_QUIESCE,
|
||||
UV_EVENT_DBENGINE_SHUTDOWN,
|
||||
|
|
|
@ -361,6 +361,7 @@ void netdata_cleanup_and_exit(int ret, const char *action, const char *action_re
|
|||
service_wait_exit(SERVICE_EXPORTERS | SERVICE_HEALTH | SERVICE_WEB_SERVER | SERVICE_HTTPD, 3 * USEC_PER_SEC);
|
||||
watcher_step_complete(WATCHER_STEP_ID_STOP_EXPORTERS_HEALTH_AND_WEB_SERVERS_THREADS);
|
||||
|
||||
stream_threads_cancel();
|
||||
service_wait_exit(SERVICE_COLLECTORS | SERVICE_STREAMING, 3 * USEC_PER_SEC);
|
||||
watcher_step_complete(WATCHER_STEP_ID_STOP_COLLECTORS_AND_STREAMING_THREADS);
|
||||
|
||||
|
@ -501,117 +502,6 @@ void netdata_cleanup_and_exit(int ret, const char *action, const char *action_re
|
|||
exit(ret);
|
||||
}
|
||||
|
||||
void web_server_threading_selection(void) {
|
||||
web_server_mode = web_server_mode_id(config_get(CONFIG_SECTION_WEB, "mode", web_server_mode_name(web_server_mode)));
|
||||
|
||||
int static_threaded = (web_server_mode == WEB_SERVER_MODE_STATIC_THREADED);
|
||||
|
||||
int i;
|
||||
for (i = 0; static_threads[i].name; i++) {
|
||||
if (static_threads[i].start_routine == socket_listen_main_static_threaded)
|
||||
static_threads[i].enabled = static_threaded;
|
||||
}
|
||||
}
|
||||
|
||||
int make_dns_decision(const char *section_name, const char *config_name, const char *default_value, SIMPLE_PATTERN *p)
|
||||
{
|
||||
const char *value = config_get(section_name,config_name,default_value);
|
||||
if(!strcmp("yes",value))
|
||||
return 1;
|
||||
if(!strcmp("no",value))
|
||||
return 0;
|
||||
if(strcmp("heuristic",value) != 0)
|
||||
netdata_log_error("Invalid configuration option '%s' for '%s'/'%s'. Valid options are 'yes', 'no' and 'heuristic'. Proceeding with 'heuristic'",
|
||||
value, section_name, config_name);
|
||||
|
||||
return simple_pattern_is_potential_name(p);
|
||||
}
|
||||
|
||||
void web_server_config_options(void)
|
||||
{
|
||||
web_client_timeout =
|
||||
(int)config_get_duration_seconds(CONFIG_SECTION_WEB, "disconnect idle clients after", web_client_timeout);
|
||||
|
||||
web_client_first_request_timeout =
|
||||
(int)config_get_duration_seconds(CONFIG_SECTION_WEB, "timeout for first request", web_client_first_request_timeout);
|
||||
|
||||
web_client_streaming_rate_t =
|
||||
config_get_duration_seconds(CONFIG_SECTION_WEB, "accept a streaming request every", web_client_streaming_rate_t);
|
||||
|
||||
respect_web_browser_do_not_track_policy =
|
||||
config_get_boolean(CONFIG_SECTION_WEB, "respect do not track policy", respect_web_browser_do_not_track_policy);
|
||||
web_x_frame_options = config_get(CONFIG_SECTION_WEB, "x-frame-options response header", "");
|
||||
if(!*web_x_frame_options)
|
||||
web_x_frame_options = NULL;
|
||||
|
||||
web_allow_connections_from =
|
||||
simple_pattern_create(config_get(CONFIG_SECTION_WEB, "allow connections from", "localhost *"),
|
||||
NULL, SIMPLE_PATTERN_EXACT, true);
|
||||
web_allow_connections_dns =
|
||||
make_dns_decision(CONFIG_SECTION_WEB, "allow connections by dns", "heuristic", web_allow_connections_from);
|
||||
web_allow_dashboard_from =
|
||||
simple_pattern_create(config_get(CONFIG_SECTION_WEB, "allow dashboard from", "localhost *"),
|
||||
NULL, SIMPLE_PATTERN_EXACT, true);
|
||||
web_allow_dashboard_dns =
|
||||
make_dns_decision(CONFIG_SECTION_WEB, "allow dashboard by dns", "heuristic", web_allow_dashboard_from);
|
||||
web_allow_badges_from =
|
||||
simple_pattern_create(config_get(CONFIG_SECTION_WEB, "allow badges from", "*"), NULL, SIMPLE_PATTERN_EXACT,
|
||||
true);
|
||||
web_allow_badges_dns =
|
||||
make_dns_decision(CONFIG_SECTION_WEB, "allow badges by dns", "heuristic", web_allow_badges_from);
|
||||
web_allow_registry_from =
|
||||
simple_pattern_create(config_get(CONFIG_SECTION_REGISTRY, "allow from", "*"), NULL, SIMPLE_PATTERN_EXACT,
|
||||
true);
|
||||
web_allow_registry_dns = make_dns_decision(CONFIG_SECTION_REGISTRY, "allow by dns", "heuristic",
|
||||
web_allow_registry_from);
|
||||
web_allow_streaming_from = simple_pattern_create(config_get(CONFIG_SECTION_WEB, "allow streaming from", "*"),
|
||||
NULL, SIMPLE_PATTERN_EXACT, true);
|
||||
web_allow_streaming_dns = make_dns_decision(CONFIG_SECTION_WEB, "allow streaming by dns", "heuristic",
|
||||
web_allow_streaming_from);
|
||||
// Note the default is not heuristic, the wildcards could match DNS but the intent is ip-addresses.
|
||||
web_allow_netdataconf_from = simple_pattern_create(config_get(CONFIG_SECTION_WEB, "allow netdata.conf from",
|
||||
"localhost fd* 10.* 192.168.* 172.16.* 172.17.* 172.18.*"
|
||||
" 172.19.* 172.20.* 172.21.* 172.22.* 172.23.* 172.24.*"
|
||||
" 172.25.* 172.26.* 172.27.* 172.28.* 172.29.* 172.30.*"
|
||||
" 172.31.* UNKNOWN"), NULL, SIMPLE_PATTERN_EXACT,
|
||||
true);
|
||||
web_allow_netdataconf_dns =
|
||||
make_dns_decision(CONFIG_SECTION_WEB, "allow netdata.conf by dns", "no", web_allow_netdataconf_from);
|
||||
web_allow_mgmt_from =
|
||||
simple_pattern_create(config_get(CONFIG_SECTION_WEB, "allow management from", "localhost"),
|
||||
NULL, SIMPLE_PATTERN_EXACT, true);
|
||||
web_allow_mgmt_dns =
|
||||
make_dns_decision(CONFIG_SECTION_WEB, "allow management by dns","heuristic",web_allow_mgmt_from);
|
||||
|
||||
web_enable_gzip = config_get_boolean(CONFIG_SECTION_WEB, "enable gzip compression", web_enable_gzip);
|
||||
|
||||
const char *s = config_get(CONFIG_SECTION_WEB, "gzip compression strategy", "default");
|
||||
if(!strcmp(s, "default"))
|
||||
web_gzip_strategy = Z_DEFAULT_STRATEGY;
|
||||
else if(!strcmp(s, "filtered"))
|
||||
web_gzip_strategy = Z_FILTERED;
|
||||
else if(!strcmp(s, "huffman only"))
|
||||
web_gzip_strategy = Z_HUFFMAN_ONLY;
|
||||
else if(!strcmp(s, "rle"))
|
||||
web_gzip_strategy = Z_RLE;
|
||||
else if(!strcmp(s, "fixed"))
|
||||
web_gzip_strategy = Z_FIXED;
|
||||
else {
|
||||
netdata_log_error("Invalid compression strategy '%s'. Valid strategies are 'default', 'filtered', 'huffman only', 'rle' and 'fixed'. Proceeding with 'default'.", s);
|
||||
web_gzip_strategy = Z_DEFAULT_STRATEGY;
|
||||
}
|
||||
|
||||
web_gzip_level = (int)config_get_number(CONFIG_SECTION_WEB, "gzip compression level", 3);
|
||||
if(web_gzip_level < 1) {
|
||||
netdata_log_error("Invalid compression level %d. Valid levels are 1 (fastest) to 9 (best ratio). Proceeding with level 1 (fastest compression).", web_gzip_level);
|
||||
web_gzip_level = 1;
|
||||
}
|
||||
else if(web_gzip_level > 9) {
|
||||
netdata_log_error("Invalid compression level %d. Valid levels are 1 (fastest) to 9 (best ratio). Proceeding with level 9 (best compression).", web_gzip_level);
|
||||
web_gzip_level = 9;
|
||||
}
|
||||
}
|
||||
|
||||
static void set_nofile_limit(struct rlimit *rl) {
|
||||
// get the num files allowed
|
||||
if(getrlimit(RLIMIT_NOFILE, rl) != 0) {
|
||||
|
@ -813,597 +703,6 @@ int help(int exitcode) {
|
|||
return exitcode;
|
||||
}
|
||||
|
||||
static void security_init(){
|
||||
char filename[FILENAME_MAX + 1];
|
||||
snprintfz(filename, FILENAME_MAX, "%s/ssl/key.pem",netdata_configured_user_config_dir);
|
||||
netdata_ssl_security_key = config_get(CONFIG_SECTION_WEB, "ssl key", filename);
|
||||
|
||||
snprintfz(filename, FILENAME_MAX, "%s/ssl/cert.pem",netdata_configured_user_config_dir);
|
||||
netdata_ssl_security_cert = config_get(CONFIG_SECTION_WEB, "ssl certificate", filename);
|
||||
|
||||
tls_version = config_get(CONFIG_SECTION_WEB, "tls version", "1.3");
|
||||
tls_ciphers = config_get(CONFIG_SECTION_WEB, "tls ciphers", "none");
|
||||
|
||||
netdata_ssl_initialize_openssl();
|
||||
}
|
||||
|
||||
static void log_init(void) {
|
||||
nd_log_set_facility(config_get(CONFIG_SECTION_LOGS, "facility", "daemon"));
|
||||
|
||||
time_t period = ND_LOG_DEFAULT_THROTTLE_PERIOD;
|
||||
size_t logs = ND_LOG_DEFAULT_THROTTLE_LOGS;
|
||||
period = config_get_duration_seconds(CONFIG_SECTION_LOGS, "logs flood protection period", period);
|
||||
logs = (unsigned long)config_get_number(CONFIG_SECTION_LOGS, "logs to trigger flood protection", (long long int)logs);
|
||||
nd_log_set_flood_protection(logs, period);
|
||||
|
||||
const char *netdata_log_level = getenv("NETDATA_LOG_LEVEL");
|
||||
netdata_log_level = netdata_log_level ? nd_log_id2priority(nd_log_priority2id(netdata_log_level)) : NDLP_INFO_STR;
|
||||
|
||||
nd_log_set_priority_level(config_get(CONFIG_SECTION_LOGS, "level", netdata_log_level));
|
||||
|
||||
char filename[FILENAME_MAX + 1];
|
||||
char* os_default_method = NULL;
|
||||
#if defined(OS_LINUX)
|
||||
os_default_method = is_stderr_connected_to_journal() /* || nd_log_journal_socket_available() */ ? "journal" : NULL;
|
||||
#elif defined(OS_WINDOWS)
|
||||
#if defined(HAVE_ETW)
|
||||
os_default_method = "etw";
|
||||
#elif defined(HAVE_WEL)
|
||||
os_default_method = "wel";
|
||||
#endif
|
||||
#endif
|
||||
|
||||
#if defined(OS_WINDOWS)
|
||||
// on windows, debug log goes to windows events
|
||||
snprintfz(filename, FILENAME_MAX, "%s", os_default_method);
|
||||
#else
|
||||
snprintfz(filename, FILENAME_MAX, "%s/debug.log", netdata_configured_log_dir);
|
||||
#endif
|
||||
|
||||
nd_log_set_user_settings(NDLS_DEBUG, config_get(CONFIG_SECTION_LOGS, "debug", filename));
|
||||
|
||||
if(os_default_method)
|
||||
snprintfz(filename, FILENAME_MAX, "%s", os_default_method);
|
||||
else
|
||||
snprintfz(filename, FILENAME_MAX, "%s/daemon.log", netdata_configured_log_dir);
|
||||
nd_log_set_user_settings(NDLS_DAEMON, config_get(CONFIG_SECTION_LOGS, "daemon", filename));
|
||||
|
||||
if(os_default_method)
|
||||
snprintfz(filename, FILENAME_MAX, "%s", os_default_method);
|
||||
else
|
||||
snprintfz(filename, FILENAME_MAX, "%s/collector.log", netdata_configured_log_dir);
|
||||
nd_log_set_user_settings(NDLS_COLLECTORS, config_get(CONFIG_SECTION_LOGS, "collector", filename));
|
||||
|
||||
#if defined(OS_WINDOWS)
|
||||
// on windows, access log goes to windows events
|
||||
snprintfz(filename, FILENAME_MAX, "%s", os_default_method);
|
||||
#else
|
||||
snprintfz(filename, FILENAME_MAX, "%s/access.log", netdata_configured_log_dir);
|
||||
#endif
|
||||
nd_log_set_user_settings(NDLS_ACCESS, config_get(CONFIG_SECTION_LOGS, "access", filename));
|
||||
|
||||
if(os_default_method)
|
||||
snprintfz(filename, FILENAME_MAX, "%s", os_default_method);
|
||||
else
|
||||
snprintfz(filename, FILENAME_MAX, "%s/health.log", netdata_configured_log_dir);
|
||||
nd_log_set_user_settings(NDLS_HEALTH, config_get(CONFIG_SECTION_LOGS, "health", filename));
|
||||
|
||||
aclklog_enabled = config_get_boolean(CONFIG_SECTION_CLOUD, "conversation log", CONFIG_BOOLEAN_NO);
|
||||
if (aclklog_enabled) {
|
||||
#if defined(OS_WINDOWS)
|
||||
// on windows, aclk log goes to windows events
|
||||
snprintfz(filename, FILENAME_MAX, "%s", os_default_method);
|
||||
#else
|
||||
snprintfz(filename, FILENAME_MAX, "%s/aclk.log", netdata_configured_log_dir);
|
||||
#endif
|
||||
nd_log_set_user_settings(NDLS_ACLK, config_get(CONFIG_SECTION_CLOUD, "conversation log file", filename));
|
||||
}
|
||||
|
||||
aclk_config_get_query_scope();
|
||||
}
|
||||
|
||||
static const char *get_varlib_subdir_from_config(const char *prefix, const char *dir) {
|
||||
char filename[FILENAME_MAX + 1];
|
||||
snprintfz(filename, FILENAME_MAX, "%s/%s", prefix, dir);
|
||||
return config_get(CONFIG_SECTION_DIRECTORIES, dir, filename);
|
||||
}
|
||||
|
||||
static void backwards_compatible_config() {
|
||||
// move [global] options to the [web] section
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "http port listen backlog",
|
||||
CONFIG_SECTION_WEB, "listen backlog");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "bind socket to IP",
|
||||
CONFIG_SECTION_WEB, "bind to");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "bind to",
|
||||
CONFIG_SECTION_WEB, "bind to");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "port",
|
||||
CONFIG_SECTION_WEB, "default port");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "default port",
|
||||
CONFIG_SECTION_WEB, "default port");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "disconnect idle web clients after seconds",
|
||||
CONFIG_SECTION_WEB, "disconnect idle clients after seconds");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "respect web browser do not track policy",
|
||||
CONFIG_SECTION_WEB, "respect do not track policy");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "web x-frame-options header",
|
||||
CONFIG_SECTION_WEB, "x-frame-options response header");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "enable web responses gzip compression",
|
||||
CONFIG_SECTION_WEB, "enable gzip compression");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "web compression strategy",
|
||||
CONFIG_SECTION_WEB, "gzip compression strategy");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "web compression level",
|
||||
CONFIG_SECTION_WEB, "gzip compression level");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "config directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "config");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "stock config directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "stock config");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "log directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "log");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "web files directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "web");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "cache directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "cache");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "lib directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "lib");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "home directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "home");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "lock directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "lock");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "plugins directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "plugins");
|
||||
|
||||
config_move(CONFIG_SECTION_HEALTH, "health configuration directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "health config");
|
||||
|
||||
config_move(CONFIG_SECTION_HEALTH, "stock health configuration directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "stock health config");
|
||||
|
||||
config_move(CONFIG_SECTION_REGISTRY, "registry db directory",
|
||||
CONFIG_SECTION_DIRECTORIES, "registry");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "debug log",
|
||||
CONFIG_SECTION_LOGS, "debug");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "error log",
|
||||
CONFIG_SECTION_LOGS, "error");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "access log",
|
||||
CONFIG_SECTION_LOGS, "access");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "facility log",
|
||||
CONFIG_SECTION_LOGS, "facility");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "errors flood protection period",
|
||||
CONFIG_SECTION_LOGS, "errors flood protection period");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "errors to trigger flood protection",
|
||||
CONFIG_SECTION_LOGS, "errors to trigger flood protection");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "debug flags",
|
||||
CONFIG_SECTION_LOGS, "debug flags");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "TZ environment variable",
|
||||
CONFIG_SECTION_ENV_VARS, "TZ");
|
||||
|
||||
config_move(CONFIG_SECTION_PLUGINS, "PATH environment variable",
|
||||
CONFIG_SECTION_ENV_VARS, "PATH");
|
||||
|
||||
config_move(CONFIG_SECTION_PLUGINS, "PYTHONPATH environment variable",
|
||||
CONFIG_SECTION_ENV_VARS, "PYTHONPATH");
|
||||
|
||||
config_move(CONFIG_SECTION_STATSD, "enabled",
|
||||
CONFIG_SECTION_PLUGINS, "statsd");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "memory mode",
|
||||
CONFIG_SECTION_DB, "db");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "mode",
|
||||
CONFIG_SECTION_DB, "db");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "history",
|
||||
CONFIG_SECTION_DB, "retention");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "update every",
|
||||
CONFIG_SECTION_DB, "update every");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "page cache size",
|
||||
CONFIG_SECTION_DB, "dbengine page cache size");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "dbengine page cache size MB",
|
||||
CONFIG_SECTION_DB, "dbengine page cache size");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "dbengine extent cache size MB",
|
||||
CONFIG_SECTION_DB, "dbengine extent cache size");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "page cache size",
|
||||
CONFIG_SECTION_DB, "dbengine page cache size MB");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "page cache uses malloc",
|
||||
CONFIG_SECTION_DB, "dbengine page cache with malloc");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "page cache with malloc",
|
||||
CONFIG_SECTION_DB, "dbengine page cache with malloc");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "memory deduplication (ksm)",
|
||||
CONFIG_SECTION_DB, "memory deduplication (ksm)");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "dbengine page fetch timeout",
|
||||
CONFIG_SECTION_DB, "dbengine page fetch timeout secs");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "dbengine page fetch retries",
|
||||
CONFIG_SECTION_DB, "dbengine page fetch retries");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "dbengine extent pages",
|
||||
CONFIG_SECTION_DB, "dbengine pages per extent");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "cleanup obsolete charts after seconds",
|
||||
CONFIG_SECTION_DB, "cleanup obsolete charts after");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "cleanup obsolete charts after secs",
|
||||
CONFIG_SECTION_DB, "cleanup obsolete charts after");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "gap when lost iterations above",
|
||||
CONFIG_SECTION_DB, "gap when lost iterations above");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "cleanup orphan hosts after seconds",
|
||||
CONFIG_SECTION_DB, "cleanup orphan hosts after");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "cleanup orphan hosts after secs",
|
||||
CONFIG_SECTION_DB, "cleanup orphan hosts after");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "cleanup ephemeral hosts after secs",
|
||||
CONFIG_SECTION_DB, "cleanup ephemeral hosts after");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "seconds to replicate",
|
||||
CONFIG_SECTION_DB, "replication period");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "seconds per replication step",
|
||||
CONFIG_SECTION_DB, "replication step");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "enable zero metrics",
|
||||
CONFIG_SECTION_DB, "enable zero metrics");
|
||||
|
||||
// ----------------------------------------------------------------------------------------------------------------
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "dbengine disk space",
|
||||
CONFIG_SECTION_DB, "dbengine tier 0 retention size");
|
||||
|
||||
config_move(CONFIG_SECTION_GLOBAL, "dbengine multihost disk space",
|
||||
CONFIG_SECTION_DB, "dbengine tier 0 retention size");
|
||||
|
||||
config_move(CONFIG_SECTION_DB, "dbengine disk space MB",
|
||||
CONFIG_SECTION_DB, "dbengine tier 0 retention size");
|
||||
|
||||
    for(size_t tier = 0; tier < RRD_STORAGE_TIERS ;tier++) {
        char old_config[128], new_config[128];

        snprintfz(old_config, sizeof(old_config), "dbengine tier %zu retention days", tier);
        snprintfz(new_config, sizeof(new_config), "dbengine tier %zu retention time", tier);
        config_move(CONFIG_SECTION_DB, old_config,
                    CONFIG_SECTION_DB, new_config);

        if(tier == 0)
            snprintfz(old_config, sizeof(old_config), "dbengine multihost disk space MB");
        else
            snprintfz(old_config, sizeof(old_config), "dbengine tier %zu multihost disk space MB", tier);
        snprintfz(new_config, sizeof(new_config), "dbengine tier %zu retention size", tier);
        config_move(CONFIG_SECTION_DB, old_config,
                    CONFIG_SECTION_DB, new_config);

        snprintfz(old_config, sizeof(old_config), "dbengine tier %zu disk space MB", tier);
        snprintfz(new_config, sizeof(new_config), "dbengine tier %zu retention size", tier);
        config_move(CONFIG_SECTION_DB, old_config,
                    CONFIG_SECTION_DB, new_config);
    }

// ----------------------------------------------------------------------------------------------------------------

    config_move(CONFIG_SECTION_LOGS, "error",
                CONFIG_SECTION_LOGS, "daemon");

    config_move(CONFIG_SECTION_LOGS, "severity level",
                CONFIG_SECTION_LOGS, "level");

    config_move(CONFIG_SECTION_LOGS, "errors to trigger flood protection",
                CONFIG_SECTION_LOGS, "logs to trigger flood protection");

    config_move(CONFIG_SECTION_LOGS, "errors flood protection period",
                CONFIG_SECTION_LOGS, "logs flood protection period");

    config_move(CONFIG_SECTION_HEALTH, "is ephemeral",
                CONFIG_SECTION_GLOBAL, "is ephemeral node");

    config_move(CONFIG_SECTION_HEALTH, "has unstable connection",
                CONFIG_SECTION_GLOBAL, "has unstable connection");

    config_move(CONFIG_SECTION_HEALTH, "run at least every seconds",
                CONFIG_SECTION_HEALTH, "run at least every");

    config_move(CONFIG_SECTION_HEALTH, "postpone alarms during hibernation for seconds",
                CONFIG_SECTION_HEALTH, "postpone alarms during hibernation for");

    config_move(CONFIG_SECTION_HEALTH, "health log history",
                CONFIG_SECTION_HEALTH, "health log retention");

    config_move(CONFIG_SECTION_REGISTRY, "registry expire idle persons days",
                CONFIG_SECTION_REGISTRY, "registry expire idle persons");

    config_move(CONFIG_SECTION_WEB, "disconnect idle clients after seconds",
                CONFIG_SECTION_WEB, "disconnect idle clients after");

    config_move(CONFIG_SECTION_WEB, "accept a streaming request every seconds",
                CONFIG_SECTION_WEB, "accept a streaming request every");

    config_move(CONFIG_SECTION_STATSD, "set charts as obsolete after secs",
                CONFIG_SECTION_STATSD, "set charts as obsolete after");

    config_move(CONFIG_SECTION_STATSD, "disconnect idle tcp clients after seconds",
                CONFIG_SECTION_STATSD, "disconnect idle tcp clients after");

    config_move("plugin:idlejitter", "loop time in ms",
                "plugin:idlejitter", "loop time");

    config_move("plugin:proc:/sys/class/infiniband", "refresh ports state every seconds",
                "plugin:proc:/sys/class/infiniband", "refresh ports state every");
}
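
// Reading aid, not netdata code: config_move() is defined elsewhere in the tree and its implementation is not
// part of this diff. The self-contained sketch below illustrates the semantics the calls above are assumed to
// have: when the user still has the old [section]/option set and the new one is absent, the value is carried
// over, so existing netdata.conf files keep working after a rename. All demo_* names are invented for this
// illustration only.
//
//     #include <stdio.h>
//     #include <string.h>
//
//     struct demo_option { const char *section, *name, *value; };
//
//     static struct demo_option demo_conf[8] = {
//         { "global", "dbengine multihost disk space", "1024" },   // user still uses the pre-rename option
//     };
//     static int demo_used = 1;
//
//     static const char *demo_get(const char *section, const char *name) {
//         for (int i = 0; i < demo_used; i++)
//             if (!strcmp(demo_conf[i].section, section) && !strcmp(demo_conf[i].name, name))
//                 return demo_conf[i].value;
//         return NULL;
//     }
//
//     // move the old option to its new section/name, unless the new one is already set
//     static void demo_move(const char *old_sec, const char *old_name,
//                           const char *new_sec, const char *new_name) {
//         const char *old_value = demo_get(old_sec, old_name);
//         if (old_value && !demo_get(new_sec, new_name) && demo_used < 8)
//             demo_conf[demo_used++] = (struct demo_option){ new_sec, new_name, old_value };
//     }
//
//     int main(void) {
//         demo_move("global", "dbengine multihost disk space", "db", "dbengine tier 0 retention size");
//         printf("%s\n", demo_get("db", "dbengine tier 0 retention size"));   // prints "1024"
//         return 0;
//     }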

static int get_hostname(char *buf, size_t buf_size) {
    if (netdata_configured_host_prefix && *netdata_configured_host_prefix) {
        char filename[FILENAME_MAX + 1];
        snprintfz(filename, FILENAME_MAX, "%s/etc/hostname", netdata_configured_host_prefix);

        if (!read_txt_file(filename, buf, buf_size)) {
            trim(buf);
            return 0;
        }
    }

    return gethostname(buf, buf_size);
}

static void get_netdata_configured_variables()
{
#ifdef ENABLE_DBENGINE
    legacy_multihost_db_space = config_exists(CONFIG_SECTION_DB, "dbengine multihost disk space MB");
    if (!legacy_multihost_db_space)
        legacy_multihost_db_space = config_exists(CONFIG_SECTION_GLOBAL, "dbengine multihost disk space");
    if (!legacy_multihost_db_space)
        legacy_multihost_db_space = config_exists(CONFIG_SECTION_GLOBAL, "dbengine disk space");
#endif

    backwards_compatible_config();

    // ------------------------------------------------------------------------
    // get the hostname

    netdata_configured_host_prefix = config_get(CONFIG_SECTION_GLOBAL, "host access prefix", "");
    (void) verify_netdata_host_prefix(true);

    char buf[HOSTNAME_MAX + 1];
    if (get_hostname(buf, HOSTNAME_MAX))
        netdata_log_error("Cannot get machine hostname.");

    netdata_configured_hostname = config_get(CONFIG_SECTION_GLOBAL, "hostname", buf);
    netdata_log_debug(D_OPTIONS, "hostname set to '%s'", netdata_configured_hostname);

    // ------------------------------------------------------------------------
    // get default database update frequency

    default_rrd_update_every = (int) config_get_duration_seconds(CONFIG_SECTION_DB, "update every", UPDATE_EVERY);
    if(default_rrd_update_every < 1 || default_rrd_update_every > 600) {
        netdata_log_error("Invalid data collection frequency (update every) %d given. Defaulting to %d.", default_rrd_update_every, UPDATE_EVERY);
        default_rrd_update_every = UPDATE_EVERY;
        config_set_duration_seconds(CONFIG_SECTION_DB, "update every", default_rrd_update_every);
    }

    // ------------------------------------------------------------------------
    // get the database selection

    {
        const char *mode = config_get(CONFIG_SECTION_DB, "db", rrd_memory_mode_name(default_rrd_memory_mode));
        default_rrd_memory_mode = rrd_memory_mode_id(mode);
        if(strcmp(mode, rrd_memory_mode_name(default_rrd_memory_mode)) != 0) {
            netdata_log_error("Invalid memory mode '%s' given. Using '%s'", mode, rrd_memory_mode_name(default_rrd_memory_mode));
            config_set(CONFIG_SECTION_DB, "db", rrd_memory_mode_name(default_rrd_memory_mode));
        }
    }

    // ------------------------------------------------------------------------
    // get default database size

    if(default_rrd_memory_mode != RRD_MEMORY_MODE_DBENGINE && default_rrd_memory_mode != RRD_MEMORY_MODE_NONE) {
        default_rrd_history_entries = (int)config_get_number(
            CONFIG_SECTION_DB, "retention",
            align_entries_to_pagesize(default_rrd_memory_mode, RRD_DEFAULT_HISTORY_ENTRIES));

        long h = align_entries_to_pagesize(default_rrd_memory_mode, default_rrd_history_entries);
        if (h != default_rrd_history_entries) {
            config_set_number(CONFIG_SECTION_DB, "retention", h);
            default_rrd_history_entries = (int)h;
        }
    }

    // ------------------------------------------------------------------------
    // get system paths

    netdata_configured_user_config_dir = config_get(CONFIG_SECTION_DIRECTORIES, "config", netdata_configured_user_config_dir);
    netdata_configured_stock_config_dir = config_get(CONFIG_SECTION_DIRECTORIES, "stock config", netdata_configured_stock_config_dir);
    netdata_configured_log_dir = config_get(CONFIG_SECTION_DIRECTORIES, "log", netdata_configured_log_dir);
    netdata_configured_web_dir = config_get(CONFIG_SECTION_DIRECTORIES, "web", netdata_configured_web_dir);
    netdata_configured_cache_dir = config_get(CONFIG_SECTION_DIRECTORIES, "cache", netdata_configured_cache_dir);
    netdata_configured_varlib_dir = config_get(CONFIG_SECTION_DIRECTORIES, "lib", netdata_configured_varlib_dir);

    netdata_configured_lock_dir = get_varlib_subdir_from_config(netdata_configured_varlib_dir, "lock");
    netdata_configured_cloud_dir = get_varlib_subdir_from_config(netdata_configured_varlib_dir, "cloud.d");

    {
        pluginsd_initialize_plugin_directories();
        netdata_configured_primary_plugins_dir = plugin_directories[PLUGINSD_STOCK_PLUGINS_DIRECTORY_PATH];
    }

#ifdef ENABLE_DBENGINE
    // ------------------------------------------------------------------------
    // get default Database Engine page type

    const char *page_type = config_get(CONFIG_SECTION_DB, "dbengine page type", "gorilla");
    if (strcmp(page_type, "gorilla") == 0)
        tier_page_type[0] = RRDENG_PAGE_TYPE_GORILLA_32BIT;
    else if (strcmp(page_type, "raw") == 0)
        tier_page_type[0] = RRDENG_PAGE_TYPE_ARRAY_32BIT;
    else {
        tier_page_type[0] = RRDENG_PAGE_TYPE_ARRAY_32BIT;
        netdata_log_error("Invalid dbengine page type '%s' given. Defaulting to 'raw'.", page_type);
    }

    // ------------------------------------------------------------------------
    // get default Database Engine page cache size in MiB

    default_rrdeng_page_cache_mb = (int) config_get_size_mb(CONFIG_SECTION_DB, "dbengine page cache size", default_rrdeng_page_cache_mb);
    default_rrdeng_extent_cache_mb = (int) config_get_size_mb(CONFIG_SECTION_DB, "dbengine extent cache size", default_rrdeng_extent_cache_mb);
    db_engine_journal_check = config_get_boolean(CONFIG_SECTION_DB, "dbengine enable journal integrity check", CONFIG_BOOLEAN_NO);

    if(default_rrdeng_extent_cache_mb < 0) {
        default_rrdeng_extent_cache_mb = 0;
        config_set_size_mb(CONFIG_SECTION_DB, "dbengine extent cache size", default_rrdeng_extent_cache_mb);
    }

    if(default_rrdeng_page_cache_mb < RRDENG_MIN_PAGE_CACHE_SIZE_MB) {
        netdata_log_error("Invalid page cache size %d given. Defaulting to %d.", default_rrdeng_page_cache_mb, RRDENG_MIN_PAGE_CACHE_SIZE_MB);
        default_rrdeng_page_cache_mb = RRDENG_MIN_PAGE_CACHE_SIZE_MB;
        config_set_size_mb(CONFIG_SECTION_DB, "dbengine page cache size", default_rrdeng_page_cache_mb);
    }

    // ------------------------------------------------------------------------
    // get default Database Engine disk space quota in MiB
    //
    // // if (!config_exists(CONFIG_SECTION_DB, "dbengine disk space MB") && !config_exists(CONFIG_SECTION_DB, "dbengine multihost disk space MB"))
    //
    // default_rrdeng_disk_quota_mb = (int) config_get_number(CONFIG_SECTION_DB, "dbengine disk space MB", default_rrdeng_disk_quota_mb);
    // if(default_rrdeng_disk_quota_mb < RRDENG_MIN_DISK_SPACE_MB) {
    //     netdata_log_error("Invalid dbengine disk space %d given. Defaulting to %d.", default_rrdeng_disk_quota_mb, RRDENG_MIN_DISK_SPACE_MB);
    //     default_rrdeng_disk_quota_mb = RRDENG_MIN_DISK_SPACE_MB;
    //     config_set_number(CONFIG_SECTION_DB, "dbengine disk space MB", default_rrdeng_disk_quota_mb);
    // }
    //
    // default_multidb_disk_quota_mb = (int) config_get_number(CONFIG_SECTION_DB, "dbengine multihost disk space MB", compute_multidb_diskspace());
    // if(default_multidb_disk_quota_mb < RRDENG_MIN_DISK_SPACE_MB) {
    //     netdata_log_error("Invalid multidb disk space %d given. Defaulting to %d.", default_multidb_disk_quota_mb, default_rrdeng_disk_quota_mb);
    //     default_multidb_disk_quota_mb = default_rrdeng_disk_quota_mb;
    //     config_set_number(CONFIG_SECTION_DB, "dbengine multihost disk space MB", default_multidb_disk_quota_mb);
    // }

#else
    if (default_rrd_memory_mode == RRD_MEMORY_MODE_DBENGINE) {
        error_report("RRD_MEMORY_MODE_DBENGINE is not supported in this platform. The agent will use db mode 'save' instead.");
        default_rrd_memory_mode = RRD_MEMORY_MODE_RAM;
    }
#endif

    // --------------------------------------------------------------------
    // get KSM settings

#ifdef MADV_MERGEABLE
    enable_ksm = config_get_boolean_ondemand(CONFIG_SECTION_DB, "memory deduplication (ksm)", enable_ksm);
#endif

    // --------------------------------------------------------------------

    rrdhost_free_ephemeral_time_s =
        config_get_duration_seconds(CONFIG_SECTION_DB, "cleanup ephemeral hosts after", rrdhost_free_ephemeral_time_s);

    rrdset_free_obsolete_time_s =
        config_get_duration_seconds(CONFIG_SECTION_DB, "cleanup obsolete charts after", rrdset_free_obsolete_time_s);

    // Current chart locking and invalidation scheme doesn't prevent Netdata from segmentation faults if a short
    // cleanup delay is set. Extensive stress tests showed that 10 seconds is quite a safe delay. Look at
    // https://github.com/netdata/netdata/pull/11222#issuecomment-868367920 for more information.
    if (rrdset_free_obsolete_time_s < 10) {
        rrdset_free_obsolete_time_s = 10;
        netdata_log_info("The \"cleanup obsolete charts after\" option was set to 10 seconds.");
        config_set_duration_seconds(CONFIG_SECTION_DB, "cleanup obsolete charts after", rrdset_free_obsolete_time_s);
    }

    gap_when_lost_iterations_above = (int)config_get_number(CONFIG_SECTION_DB, "gap when lost iterations above", gap_when_lost_iterations_above);
    if (gap_when_lost_iterations_above < 1) {
        gap_when_lost_iterations_above = 1;
        config_set_number(CONFIG_SECTION_DB, "gap when lost iterations above", gap_when_lost_iterations_above);
    }
    gap_when_lost_iterations_above += 2;

    // --------------------------------------------------------------------
    // get various system parameters

    os_get_system_cpus_uncached();
    os_get_system_pid_max();
}

static void post_conf_load(const char **user)
{
    // --------------------------------------------------------------------
    // get the user we should run

    // IMPORTANT: this is required before web_files_uid()
    if(getuid() == 0) {
        *user = config_get(CONFIG_SECTION_GLOBAL, "run as user", NETDATA_USER);
    }
    else {
        struct passwd *passwd = getpwuid(getuid());
        *user = config_get(CONFIG_SECTION_GLOBAL, "run as user", (passwd && passwd->pw_name)?passwd->pw_name:"");
    }
}

static bool load_netdata_conf(char *filename, char overwrite_used, const char **user) {
    errno_clear();

    int ret = 0;

    if(filename && *filename) {
        ret = config_load(filename, overwrite_used, NULL);
        if(!ret)
            netdata_log_error("CONFIG: cannot load config file '%s'.", filename);
    }
    else {
        filename = filename_from_path_entry_strdupz(netdata_configured_user_config_dir, "netdata.conf");

        ret = config_load(filename, overwrite_used, NULL);
        if(!ret) {
            netdata_log_info("CONFIG: cannot load user config '%s'. Will try the stock version.", filename);
            freez(filename);

            filename = filename_from_path_entry_strdupz(netdata_configured_stock_config_dir, "netdata.conf");
            ret = config_load(filename, overwrite_used, NULL);
            if(!ret)
                netdata_log_info("CONFIG: cannot load stock config '%s'. Running with internal defaults.", filename);
        }

        freez(filename);
    }

    post_conf_load(user);
    return ret;
}

// coverity[ +tainted_string_sanitize_content : arg-0 ]
static inline void coverity_remove_taint(char *s)
{
|
||||
|
@ -1476,7 +775,7 @@ int julytest(void);
|
|||
int pluginsd_parser_unittest(void);
|
||||
void replication_initialize(void);
|
||||
void bearer_tokens_init(void);
|
||||
int unittest_rrdpush_compressions(void);
|
||||
int unittest_stream_compressions(void);
|
||||
int uuid_unittest(void);
|
||||
int progress_unittest(void);
|
||||
int dyncfg_unittest(void);
|
||||
|
@ -1487,8 +786,8 @@ int windows_perflib_dump(const char *key);
|
|||
#endif
|
||||
|
||||
int unittest_prepare_rrd(const char **user) {
|
||||
post_conf_load(user);
|
||||
get_netdata_configured_variables();
|
||||
netdata_conf_section_global_run_as_user(user);
|
||||
netdata_conf_section_global();
|
||||
default_rrd_update_every = 1;
|
||||
default_rrd_memory_mode = RRD_MEMORY_MODE_RAM;
|
||||
health_plugin_disable();
|
||||
|
@ -1498,7 +797,7 @@ int unittest_prepare_rrd(const char **user) {
|
|||
fprintf(stderr, "rrd_init failed for unittest\n");
|
||||
return 1;
|
||||
}
|
||||
stream_conf_send_enabled = 0;
|
||||
stream_send.enabled = false;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
@ -1554,7 +853,7 @@ int netdata_main(int argc, char **argv) {
|
|||
while( (opt = getopt(argc, argv, optstring)) != -1 ) {
|
||||
switch(opt) {
|
||||
case 'c':
|
||||
if(!load_netdata_conf(optarg, 1, &user)) {
|
||||
if(!netdata_conf_load(optarg, 1, &user)) {
|
||||
netdata_log_error("Cannot load configuration file %s.", optarg);
|
||||
return 1;
|
||||
}
|
||||
|
@ -1725,9 +1024,9 @@ int netdata_main(int argc, char **argv) {
|
|||
unittest_running = true;
|
||||
return pluginsd_parser_unittest();
|
||||
}
|
||||
else if(strcmp(optarg, "rrdpush_compressions_test") == 0) {
|
||||
else if(strcmp(optarg, "stream_compressions_test") == 0) {
|
||||
unittest_running = true;
|
||||
return unittest_rrdpush_compressions();
|
||||
return unittest_stream_compressions();
|
||||
}
|
||||
else if(strcmp(optarg, "progresstest") == 0) {
|
||||
unittest_running = true;
|
||||
|
@ -1742,8 +1041,8 @@ int netdata_main(int argc, char **argv) {
|
|||
else if(strncmp(optarg, createdataset_string, strlen(createdataset_string)) == 0) {
|
||||
optarg += strlen(createdataset_string);
|
||||
unsigned history_seconds = strtoul(optarg, NULL, 0);
|
||||
post_conf_load(&user);
|
||||
get_netdata_configured_variables();
|
||||
netdata_conf_section_global_run_as_user(&user);
|
||||
netdata_conf_section_global();
|
||||
default_rrd_update_every = 1;
|
||||
registry_init();
|
||||
if(rrd_init("dbengine-dataset", NULL, true)) {
|
||||
|
@ -1917,10 +1216,10 @@ int netdata_main(int argc, char **argv) {
|
|||
|
||||
if(!config_loaded) {
|
||||
fprintf(stderr, "warning: no configuration file has been loaded. Use -c CONFIG_FILE, before -W get. Using default config.\n");
|
||||
load_netdata_conf(NULL, 0, &user);
|
||||
netdata_conf_load(NULL, 0, &user);
|
||||
}
|
||||
|
||||
get_netdata_configured_variables();
|
||||
netdata_conf_section_global();
|
||||
|
||||
const char *section = argv[optind];
|
||||
const char *key = argv[optind + 1];
|
||||
|
@ -1944,11 +1243,11 @@ int netdata_main(int argc, char **argv) {
|
|||
|
||||
if(!config_loaded) {
|
||||
fprintf(stderr, "warning: no configuration file has been loaded. Use -c CONFIG_FILE, before -W get. Using default config.\n");
|
||||
load_netdata_conf(NULL, 0, &user);
|
||||
netdata_conf_load(NULL, 0, &user);
|
||||
cloud_conf_load(1);
|
||||
}
|
||||
|
||||
get_netdata_configured_variables();
|
||||
netdata_conf_section_global();
|
||||
|
||||
const char *conf_file = argv[optind]; /* "cloud" is cloud.conf, otherwise netdata.conf */
|
||||
struct config *tmp_config = strcmp(conf_file, "cloud") ? &netdata_config : &cloud_config;
|
||||
|
@ -1994,7 +1293,7 @@ int netdata_main(int argc, char **argv) {
|
|||
}
|
||||
|
||||
if(!config_loaded) {
|
||||
load_netdata_conf(NULL, 0, &user);
|
||||
netdata_conf_load(NULL, 0, &user);
|
||||
cloud_conf_load(0);
|
||||
}
|
||||
|
||||
|
@ -2040,7 +1339,7 @@ int netdata_main(int argc, char **argv) {
|
|||
}
|
||||
|
||||
// prepare configuration environment variables for the plugins
|
||||
get_netdata_configured_variables();
|
||||
netdata_conf_section_global();
|
||||
set_environment_for_plugins_and_scripts();
|
||||
analytics_reset();
|
||||
|
||||
|
@ -2078,7 +1377,7 @@ int netdata_main(int argc, char **argv) {
|
|||
// --------------------------------------------------------------------
|
||||
// get log filenames and settings
|
||||
|
||||
log_init();
|
||||
netdata_conf_section_logs();
|
||||
nd_log_limits_unlimited();
|
||||
|
||||
// initialize the log files
|
||||
|
@ -2096,7 +1395,7 @@ int netdata_main(int argc, char **argv) {
|
|||
// --------------------------------------------------------------------
|
||||
// get the certificate and start security
|
||||
|
||||
security_init();
|
||||
netdata_conf_web_security_init();
|
||||
|
||||
// --------------------------------------------------------------------
|
||||
// This is the safest place to start the SILENCERS structure
|
||||
|
@ -2125,11 +1424,14 @@ int netdata_main(int argc, char **argv) {
|
|||
default_stacksize = 1 * 1024 * 1024;
|
||||
|
||||
#ifdef NETDATA_INTERNAL_CHECKS
|
||||
config_set_boolean(CONFIG_SECTION_PLUGINS, "netdata monitoring", true);
|
||||
config_set_boolean(CONFIG_SECTION_PLUGINS, "netdata monitoring extended", true);
|
||||
telemetry_enabled = true;
|
||||
telemetry_extended_enabled = true;
|
||||
#endif
|
||||
|
||||
if(config_get_boolean(CONFIG_SECTION_PLUGINS, "netdata monitoring extended", false))
|
||||
telemetry_extended_enabled =
|
||||
config_get_boolean(CONFIG_SECTION_TELEMETRY, "extended telemetry", telemetry_extended_enabled);
|
||||
|
||||
if(telemetry_extended_enabled)
|
||||
// this has to run before starting any other threads that use workers
|
||||
workers_utilization_enable();
|
||||
|
||||
|
@ -2239,7 +1541,7 @@ int netdata_main(int argc, char **argv) {
|
|||
netdata_random_session_id_generate();
|
||||
|
||||
// ------------------------------------------------------------------------
|
||||
// initialize rrd, registry, health, rrdpush, etc.
|
||||
// initialize rrd, registry, health, streaming, etc.
|
||||
|
||||
delta_startup_time("collecting system info");
|
||||
|
||||
|
@ -2295,11 +1597,12 @@ int netdata_main(int argc, char **argv) {
|
|||
// ------------------------------------------------------------------------
|
||||
// spawn the threads
|
||||
|
||||
get_agent_event_time_median_init();
|
||||
bearer_tokens_init();
|
||||
|
||||
delta_startup_time("start the static threads");
|
||||
|
||||
web_server_config_options();
|
||||
netdata_conf_section_web();
|
||||
|
||||
set_late_analytics_variables(system_info);
|
||||
for (i = 0; static_threads[i].name != NULL ; i++) {
|
||||
|
|
|
@ -166,7 +166,7 @@ static void svc_rrdhost_detect_obsolete_charts(RRDHOST *host) {
|
|||
time_t last_entry_t;
|
||||
RRDSET *st;
|
||||
|
||||
time_t child_connect_time = host->child_connect_time;
|
||||
time_t child_connect_time = host->stream.rcv.status.last_connected;
|
||||
|
||||
rrdset_foreach_read(st, host) {
|
||||
if(rrdset_is_replicating(st))
|
||||
|
@ -203,19 +203,19 @@ static void svc_rrd_cleanup_obsolete_charts_from_all_hosts() {
|
|||
if (host == localhost)
|
||||
continue;
|
||||
|
||||
spinlock_lock(&host->receiver_lock);
|
||||
rrdhost_receiver_lock(host);
|
||||
|
||||
time_t now = now_realtime_sec();
|
||||
|
||||
if (host->trigger_chart_obsoletion_check &&
|
||||
((host->child_last_chart_command &&
|
||||
host->child_last_chart_command + host->health.health_delay_up_to < now) ||
|
||||
(host->child_connect_time + TIME_TO_RUN_OBSOLETIONS_ON_CHILD_CONNECT < now))) {
|
||||
if (host->stream.rcv.status.check_obsolete &&
|
||||
((host->stream.rcv.status.last_chart &&
|
||||
host->stream.rcv.status.last_chart + host->health.delay_up_to < now) ||
|
||||
(host->stream.rcv.status.last_connected + TIME_TO_RUN_OBSOLETIONS_ON_CHILD_CONNECT < now))) {
|
||||
svc_rrdhost_detect_obsolete_charts(host);
|
||||
host->trigger_chart_obsoletion_check = 0;
|
||||
host->stream.rcv.status.check_obsolete = false;
|
||||
}
|
||||
|
||||
spinlock_unlock(&host->receiver_lock);
|
||||
rrdhost_receiver_unlock(host);
|
||||
}
|
||||
|
||||
rrd_rdunlock();
|
||||
|
@ -235,7 +235,8 @@ restart_after_removal:
|
|||
continue;
|
||||
|
||||
bool force = false;
|
||||
if (rrdhost_option_check(host, RRDHOST_OPTION_EPHEMERAL_HOST) && now - host->last_connected > rrdhost_free_ephemeral_time_s)
|
||||
if (rrdhost_option_check(host, RRDHOST_OPTION_EPHEMERAL_HOST) &&
|
||||
now - host->stream.snd.status.last_connected > rrdhost_free_ephemeral_time_s)
|
||||
force = true;
|
||||
|
||||
bool is_archived = rrdhost_flag_check(host, RRDHOST_FLAG_ARCHIVED);
|
||||
|
|
|
@ -5,8 +5,6 @@
|
|||
void *aclk_main(void *ptr);
|
||||
void *analytics_main(void *ptr);
|
||||
void *cpuidlejitter_main(void *ptr);
|
||||
void *global_statistics_main(void *ptr);
|
||||
void *global_statistics_extended_main(void *ptr);
|
||||
void *health_main(void *ptr);
|
||||
void *pluginsd_main(void *ptr);
|
||||
void *service_main(void *ptr);
|
||||
|
@ -14,7 +12,7 @@ void *statsd_main(void *ptr);
|
|||
void *profile_main(void *ptr);
|
||||
void *replication_thread_main(void *ptr);
|
||||
|
||||
extern bool global_statistics_enabled;
|
||||
extern bool telemetry_enabled;
|
||||
|
||||
const struct netdata_static_thread static_threads_common[] = {
|
||||
{
|
||||
|
@ -45,26 +43,26 @@ const struct netdata_static_thread static_threads_common[] = {
|
|||
.start_routine = analytics_main
|
||||
},
|
||||
{
|
||||
.name = "STATS_GLOBAL",
|
||||
.name = "TELEMETRY",
|
||||
.config_section = CONFIG_SECTION_PLUGINS,
|
||||
.config_name = "netdata monitoring",
|
||||
.config_name = "netdata telemetry",
|
||||
.env_name = "NETDATA_INTERNALS_MONITORING",
|
||||
.global_variable = &global_statistics_enabled,
|
||||
.global_variable = &telemetry_enabled,
|
||||
.enabled = 1,
|
||||
.thread = NULL,
|
||||
.init_routine = NULL,
|
||||
.start_routine = global_statistics_main
|
||||
.start_routine = telemetry_thread_main
|
||||
},
|
||||
{
|
||||
.name = "STATS_GLOBAL_EXT",
|
||||
.config_section = CONFIG_SECTION_PLUGINS,
|
||||
.config_name = "netdata monitoring extended",
|
||||
.env_name = "NETDATA_INTERNALS_EXTENDED_MONITORING",
|
||||
.global_variable = &global_statistics_enabled,
|
||||
.enabled = 0, // this is ignored - check main() for "netdata monitoring extended"
|
||||
.name = "TLMTRY-SQLITE3",
|
||||
.config_section = CONFIG_SECTION_TELEMETRY,
|
||||
.config_name = "extended telemetry",
|
||||
.env_name = NULL,
|
||||
.global_variable = &telemetry_extended_enabled,
|
||||
.enabled = 0, // the default value - it uses netdata.conf for users to enable it
|
||||
.thread = NULL,
|
||||
.init_routine = NULL,
|
||||
.start_routine = global_statistics_extended_main
|
||||
.start_routine = telemetry_thread_sqlite3_main
|
||||
},
|
||||
{
|
||||
.name = "PLUGINSD",
|
||||
|
@ -109,8 +107,7 @@ const struct netdata_static_thread static_threads_common[] = {
|
|||
.enabled = 0,
|
||||
.thread = NULL,
|
||||
.init_routine = NULL,
|
||||
.start_routine = rrdpush_sender_thread
|
||||
},
|
||||
.start_routine = stream_sender_start_localhost},
|
||||
{
|
||||
.name = "WEB[1]",
|
||||
.config_section = NULL,
|
||||
|
|
src/daemon/telemetry/telemetry-aral.c (new file, 164 lines)
@@ -0,0 +1,164 @@
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#define TELEMETRY_INTERNALS 1
|
||||
#include "telemetry-aral.h"
|
||||
|
||||
struct aral_info {
|
||||
const char *name;
|
||||
RRDSET *st_memory;
|
||||
RRDDIM *rd_used, *rd_free, *rd_structures;
|
||||
|
||||
RRDSET *st_utilization;
|
||||
RRDDIM *rd_utilization;
|
||||
};
|
||||
|
||||
DEFINE_JUDYL_TYPED(ARAL_STATS, struct aral_info *);
|
||||
|
||||
static struct {
|
||||
SPINLOCK spinlock;
|
||||
ARAL_STATS_JudyLSet idx;
|
||||
} globals = { 0 };
|
||||
|
||||
static void telemetry_aral_register_statistics(struct aral_statistics *stats, const char *name) {
|
||||
if(!name || !stats)
|
||||
return;
|
||||
|
||||
spinlock_lock(&globals.spinlock);
|
||||
struct aral_info *ai = ARAL_STATS_GET(&globals.idx, (Word_t)stats);
|
||||
if(!ai) {
|
||||
ai = callocz(1, sizeof(struct aral_info));
|
||||
ai->name = strdupz(name);
|
||||
ARAL_STATS_SET(&globals.idx, (Word_t)stats, ai);
|
||||
}
|
||||
spinlock_unlock(&globals.spinlock);
|
||||
}
|
||||
|
||||
void telemetry_aral_register(ARAL *ar, const char *name) {
|
||||
if(!ar) return;
|
||||
|
||||
if(!name)
|
||||
name = aral_name(ar);
|
||||
|
||||
struct aral_statistics *stats = aral_get_statistics(ar);
|
||||
|
||||
telemetry_aral_register_statistics(stats, name);
|
||||
}
|
||||
|
||||
void telemetry_aral_unregister(ARAL *ar) {
|
||||
if(!ar) return;
|
||||
struct aral_statistics *stats = aral_get_statistics(ar);
|
||||
|
||||
spinlock_lock(&globals.spinlock);
|
||||
struct aral_info *ai = ARAL_STATS_GET(&globals.idx, (Word_t)stats);
|
||||
if(ai) {
|
||||
ARAL_STATS_DEL(&globals.idx, (Word_t)stats);
|
||||
freez((void *)ai->name);
|
||||
freez(ai);
|
||||
}
|
||||
spinlock_unlock(&globals.spinlock);
|
||||
}
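// Reading aid, not part of this diff: how a subsystem would be expected to use the two entry points above.
// my_subsystem_create_aral() is a placeholder for however the caller obtains its ARAL; it is not a real
// netdata function.
//
//     #include "telemetry-aral.h"
//
//     ARAL *my_subsystem_create_aral(void);   // placeholder, not a real netdata call
//     static ARAL *my_aral = NULL;
//
//     void my_subsystem_start(void) {
//         my_aral = my_subsystem_create_aral();
//         telemetry_aral_register(my_aral, "my-subsystem");   // charts: aral_<name>_memory / aral_<name>_utilization
//     }
//
//     void my_subsystem_stop(void) {
//         telemetry_aral_unregister(my_aral);                 // stop charting before destroying the allocator
//         my_aral = NULL;
//     }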
|
||||
|
||||
void telemerty_aral_init(void) {
|
||||
telemetry_aral_register_statistics(aral_by_size_statistics(), "by-size");
|
||||
}
|
||||
|
||||
void telemetry_aral_do(bool extended) {
|
||||
if(!extended) return;
|
||||
|
||||
spinlock_lock(&globals.spinlock);
|
||||
Word_t s = 0;
|
||||
for(struct aral_info *ai = ARAL_STATS_FIRST(&globals.idx, &s);
|
||||
ai;
|
||||
ai = ARAL_STATS_NEXT(&globals.idx, &s)) {
|
||||
struct aral_statistics *stats = (void *)(uintptr_t)s;
|
||||
if (!stats)
|
||||
continue;
|
||||
|
||||
size_t allocated_bytes = __atomic_load_n(&stats->malloc.allocated_bytes, __ATOMIC_RELAXED) +
|
||||
__atomic_load_n(&stats->mmap.allocated_bytes, __ATOMIC_RELAXED);
|
||||
|
||||
size_t used_bytes = __atomic_load_n(&stats->malloc.used_bytes, __ATOMIC_RELAXED) +
|
||||
__atomic_load_n(&stats->mmap.used_bytes, __ATOMIC_RELAXED);
|
||||
|
||||
// slight difference may exist, due to the time needed to get these values
|
||||
// fix the obvious discrepancies
|
||||
if(used_bytes > allocated_bytes)
|
||||
used_bytes = allocated_bytes;
|
||||
|
||||
size_t structures_bytes = __atomic_load_n(&stats->structures.allocated_bytes, __ATOMIC_RELAXED);
|
||||
|
||||
size_t free_bytes = allocated_bytes - used_bytes;
|
||||
|
||||
NETDATA_DOUBLE utilization;
|
||||
if(used_bytes && allocated_bytes)
|
||||
utilization = 100.0 * (NETDATA_DOUBLE)used_bytes / (NETDATA_DOUBLE)allocated_bytes;
|
||||
else
|
||||
utilization = 100.0;
|
||||
|
||||
{
|
||||
if (unlikely(!ai->st_memory)) {
|
||||
char id[256];
|
||||
|
||||
snprintfz(id, sizeof(id), "aral_%s_memory", ai->name);
|
||||
netdata_fix_chart_id(id);
|
||||
|
||||
ai->st_memory = rrdset_create_localhost(
|
||||
"netdata",
|
||||
id,
|
||||
NULL,
|
||||
"ARAL",
|
||||
"netdata.aral_memory",
|
||||
"Array Allocator Memory Utilization",
|
||||
"bytes",
|
||||
"netdata",
|
||||
"telemetry",
|
||||
910000,
|
||||
localhost->rrd_update_every,
|
||||
RRDSET_TYPE_STACKED);
|
||||
|
||||
rrdlabels_add(ai->st_memory->rrdlabels, "ARAL", ai->name, RRDLABEL_SRC_AUTO);
|
||||
|
||||
ai->rd_free = rrddim_add(ai->st_memory, "free", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
ai->rd_used = rrddim_add(ai->st_memory, "used", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
ai->rd_structures = rrddim_add(ai->st_memory, "structures", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
}
|
||||
|
||||
rrddim_set_by_pointer(ai->st_memory, ai->rd_used, (collected_number)allocated_bytes);
|
||||
rrddim_set_by_pointer(ai->st_memory, ai->rd_free, (collected_number)free_bytes);
|
||||
rrddim_set_by_pointer(ai->st_memory, ai->rd_structures, (collected_number)structures_bytes);
|
||||
rrdset_done(ai->st_memory);
|
||||
}
|
||||
|
||||
{
|
||||
if (unlikely(!ai->st_utilization)) {
|
||||
char id[256];
|
||||
|
||||
snprintfz(id, sizeof(id), "aral_%s_utilization", ai->name);
|
||||
netdata_fix_chart_id(id);
|
||||
|
||||
ai->st_utilization = rrdset_create_localhost(
|
||||
"netdata",
|
||||
id,
|
||||
NULL,
|
||||
"ARAL",
|
||||
"netdata.aral_utilization",
|
||||
"Array Allocator Memory Utilization",
|
||||
"%",
|
||||
"netdata",
|
||||
"telemetry",
|
||||
910001,
|
||||
localhost->rrd_update_every,
|
||||
RRDSET_TYPE_LINE);
|
||||
|
||||
rrdlabels_add(ai->st_utilization->rrdlabels, "ARAL", ai->name, RRDLABEL_SRC_AUTO);
|
||||
|
||||
ai->rd_utilization = rrddim_add(ai->st_utilization, "utilization", NULL, 1, 10000, RRD_ALGORITHM_ABSOLUTE);
|
||||
}
|
||||
|
||||
rrddim_set_by_pointer(ai->st_utilization, ai->rd_utilization, (collected_number)(utilization * 10000.0));
|
||||
rrdset_done(ai->st_utilization);
|
||||
}
|
||||
}
|
||||
|
||||
spinlock_unlock(&globals.spinlock);
|
||||
}
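// Reading aid: a worked example of the fixed-point encoding used by the utilization dimension above
// (rrddim_add() with multiplier 1 and divisor 10000, RRD_ALGORITHM_ABSOLUTE):
//
//     utilization = 87.6543 %  (the NETDATA_DOUBLE computed above)
//     stored      = (collected_number)(87.6543 * 10000.0) = 876543
//     charted     = 876543 * 1 / 10000 = 87.6543
//
// i.e. four decimal places of precision while the collected value remains an integer.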
|
src/daemon/telemetry/telemetry-aral.h (new file, 16 lines)
@@ -0,0 +1,16 @@
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#ifndef NETDATA_TELEMETRY_ARAL_H
|
||||
#define NETDATA_TELEMETRY_ARAL_H
|
||||
|
||||
#include "daemon/common.h"
|
||||
|
||||
void telemetry_aral_register(ARAL *ar, const char *name);
|
||||
void telemetry_aral_unregister(ARAL *ar);
|
||||
|
||||
#if defined(TELEMETRY_INTERNALS)
|
||||
void telemerty_aral_init(void);
|
||||
void telemetry_aral_do(bool extended);
|
||||
#endif
|
||||
|
||||
#endif //NETDATA_TELEMETRY_ARAL_H
|
src/daemon/telemetry/telemetry-daemon-memory.c (new file, 243 lines)
@@ -0,0 +1,243 @@
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#define TELEMETRY_INTERNALS 1
|
||||
#include "telemetry-daemon-memory.h"
|
||||
|
||||
#define dictionary_stats_memory_total(stats) \
|
||||
((stats).memory.dict + (stats).memory.values + (stats).memory.index)
|
||||
|
||||
struct netdata_buffers_statistics netdata_buffers_statistics = {};
|
||||
|
||||
void telemetry_daemon_memory_do(bool extended) {
|
||||
{
|
||||
static RRDSET *st_memory = NULL;
|
||||
static RRDDIM *rd_database = NULL;
|
||||
#ifdef DICT_WITH_STATS
|
||||
static RRDDIM *rd_collectors = NULL;
|
||||
static RRDDIM *rd_rrdhosts = NULL;
|
||||
static RRDDIM *rd_rrdsets = NULL;
|
||||
static RRDDIM *rd_rrddims = NULL;
|
||||
static RRDDIM *rd_contexts = NULL;
|
||||
static RRDDIM *rd_health = NULL;
|
||||
static RRDDIM *rd_functions = NULL;
|
||||
static RRDDIM *rd_replication = NULL;
|
||||
#else
|
||||
static RRDDIM *rd_metadata = NULL;
|
||||
#endif
|
||||
static RRDDIM *rd_labels = NULL; // labels use dictionary like statistics, but it is not ARAL based dictionary
|
||||
static RRDDIM *rd_ml = NULL;
|
||||
static RRDDIM *rd_strings = NULL;
|
||||
static RRDDIM *rd_streaming = NULL;
|
||||
static RRDDIM *rd_buffers = NULL;
|
||||
static RRDDIM *rd_workers = NULL;
|
||||
static RRDDIM *rd_aral = NULL;
|
||||
static RRDDIM *rd_judy = NULL;
|
||||
static RRDDIM *rd_other = NULL;
|
||||
|
||||
if (unlikely(!st_memory)) {
|
||||
st_memory = rrdset_create_localhost(
|
||||
"netdata",
|
||||
"memory",
|
||||
NULL,
|
||||
"Memory Usage",
|
||||
NULL,
|
||||
"Netdata Memory",
|
||||
"bytes",
|
||||
"netdata",
|
||||
"stats",
|
||||
130100,
|
||||
localhost->rrd_update_every,
|
||||
RRDSET_TYPE_STACKED);
|
||||
|
||||
rd_database = rrddim_add(st_memory, "db", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
|
||||
#ifdef DICT_WITH_STATS
|
||||
rd_collectors = rrddim_add(st_memory, "collectors", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_rrdhosts = rrddim_add(st_memory, "hosts", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_rrdsets = rrddim_add(st_memory, "rrdset", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_rrddims = rrddim_add(st_memory, "rrddim", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_contexts = rrddim_add(st_memory, "contexts", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_health = rrddim_add(st_memory, "health", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_functions = rrddim_add(st_memory, "functions", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_replication = rrddim_add(st_memory, "replication", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
#else
|
||||
rd_metadata = rrddim_add(st_memory, "metadata", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
#endif
|
||||
rd_labels = rrddim_add(st_memory, "labels", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_ml = rrddim_add(st_memory, "ML", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_strings = rrddim_add(st_memory, "strings", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_streaming = rrddim_add(st_memory, "streaming", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_buffers = rrddim_add(st_memory, "buffers", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_workers = rrddim_add(st_memory, "workers", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_aral = rrddim_add(st_memory, "aral", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_judy = rrddim_add(st_memory, "judy", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_other = rrddim_add(st_memory, "other", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
}
|
||||
|
||||
size_t buffers =
|
||||
netdata_buffers_statistics.query_targets_size +
|
||||
netdata_buffers_statistics.rrdset_done_rda_size +
|
||||
netdata_buffers_statistics.buffers_aclk +
|
||||
netdata_buffers_statistics.buffers_api +
|
||||
netdata_buffers_statistics.buffers_functions +
|
||||
netdata_buffers_statistics.buffers_sqlite +
|
||||
netdata_buffers_statistics.buffers_exporters +
|
||||
netdata_buffers_statistics.buffers_health +
|
||||
netdata_buffers_statistics.buffers_streaming +
|
||||
netdata_buffers_statistics.cbuffers_streaming +
|
||||
netdata_buffers_statistics.buffers_web +
|
||||
replication_allocated_buffers() +
|
||||
aral_by_size_overhead() +
|
||||
judy_aral_overhead();
|
||||
|
||||
size_t strings = 0;
|
||||
string_statistics(NULL, NULL, NULL, NULL, NULL, &strings, NULL, NULL);
|
||||
|
||||
rrddim_set_by_pointer(st_memory, rd_database,
|
||||
(collected_number)telemetry_dbengine_total_memory + (collected_number)rrddim_db_memory_size);
|
||||
|
||||
#ifdef DICT_WITH_STATS
|
||||
rrddim_set_by_pointer(st_memory, rd_collectors,
|
||||
(collected_number)dictionary_stats_memory_total(dictionary_stats_category_collectors));
|
||||
|
||||
rrddim_set_by_pointer(st_memory,
|
||||
rd_rrdhosts,
|
||||
(collected_number)dictionary_stats_memory_total(dictionary_stats_category_rrdhost) + (collected_number)netdata_buffers_statistics.rrdhost_allocations_size);
|
||||
|
||||
rrddim_set_by_pointer(st_memory, rd_rrdsets,
|
||||
(collected_number)dictionary_stats_memory_total(dictionary_stats_category_rrdset));
|
||||
|
||||
rrddim_set_by_pointer(st_memory, rd_rrddims,
|
||||
(collected_number)dictionary_stats_memory_total(dictionary_stats_category_rrddim));
|
||||
|
||||
rrddim_set_by_pointer(st_memory, rd_contexts,
|
||||
(collected_number)dictionary_stats_memory_total(dictionary_stats_category_rrdcontext));
|
||||
|
||||
rrddim_set_by_pointer(st_memory, rd_health,
|
||||
(collected_number)dictionary_stats_memory_total(dictionary_stats_category_rrdhealth));
|
||||
|
||||
rrddim_set_by_pointer(st_memory, rd_functions,
|
||||
(collected_number)dictionary_stats_memory_total(dictionary_stats_category_functions));
|
||||
|
||||
rrddim_set_by_pointer(st_memory, rd_replication,
|
||||
(collected_number)dictionary_stats_memory_total(dictionary_stats_category_replication) + (collected_number)replication_allocated_memory());
|
||||
#else
|
||||
uint64_t metadata =
|
||||
aral_by_size_used_bytes() +
|
||||
dictionary_stats_category_rrdhost.memory.dict +
|
||||
dictionary_stats_category_rrdset.memory.dict +
|
||||
dictionary_stats_category_rrddim.memory.dict +
|
||||
dictionary_stats_category_rrdcontext.memory.dict +
|
||||
dictionary_stats_category_rrdhealth.memory.dict +
|
||||
dictionary_stats_category_functions.memory.dict +
|
||||
dictionary_stats_category_replication.memory.dict +
|
||||
replication_allocated_memory();
|
||||
|
||||
rrddim_set_by_pointer(st_memory, rd_metadata, (collected_number)metadata);
|
||||
#endif
|
||||
|
||||
// labels use dictionary like statistics, but it is not ARAL based dictionary
|
||||
rrddim_set_by_pointer(st_memory, rd_labels,
|
||||
(collected_number)dictionary_stats_memory_total(dictionary_stats_category_rrdlabels));
|
||||
|
||||
rrddim_set_by_pointer(st_memory, rd_ml,
|
||||
(collected_number)telemetry_ml_get_current_memory_usage());
|
||||
|
||||
rrddim_set_by_pointer(st_memory, rd_strings,
|
||||
(collected_number)strings);
|
||||
|
||||
rrddim_set_by_pointer(st_memory, rd_streaming,
|
||||
(collected_number)netdata_buffers_statistics.rrdhost_senders + (collected_number)netdata_buffers_statistics.rrdhost_receivers);
|
||||
|
||||
rrddim_set_by_pointer(st_memory, rd_buffers,
|
||||
(collected_number)buffers);
|
||||
|
||||
rrddim_set_by_pointer(st_memory, rd_workers,
|
||||
(collected_number) workers_allocated_memory());
|
||||
|
||||
rrddim_set_by_pointer(st_memory, rd_aral,
|
||||
(collected_number) aral_by_size_structures());
|
||||
|
||||
rrddim_set_by_pointer(st_memory,
|
||||
rd_judy, (collected_number) judy_aral_structures());
|
||||
|
||||
rrddim_set_by_pointer(st_memory,
|
||||
rd_other, (collected_number)dictionary_stats_memory_total(dictionary_stats_category_other));
|
||||
|
||||
rrdset_done(st_memory);
|
||||
}
|
||||
|
||||
{
|
||||
static RRDSET *st_memory_buffers = NULL;
|
||||
static RRDDIM *rd_queries = NULL;
|
||||
static RRDDIM *rd_collectors = NULL;
|
||||
static RRDDIM *rd_buffers_aclk = NULL;
|
||||
static RRDDIM *rd_buffers_api = NULL;
|
||||
static RRDDIM *rd_buffers_functions = NULL;
|
||||
static RRDDIM *rd_buffers_sqlite = NULL;
|
||||
static RRDDIM *rd_buffers_exporters = NULL;
|
||||
static RRDDIM *rd_buffers_health = NULL;
|
||||
static RRDDIM *rd_buffers_streaming = NULL;
|
||||
static RRDDIM *rd_cbuffers_streaming = NULL;
|
||||
static RRDDIM *rd_buffers_replication = NULL;
|
||||
static RRDDIM *rd_buffers_web = NULL;
|
||||
static RRDDIM *rd_buffers_aral = NULL;
|
||||
static RRDDIM *rd_buffers_judy = NULL;
|
||||
|
||||
if (unlikely(!st_memory_buffers)) {
|
||||
st_memory_buffers = rrdset_create_localhost(
|
||||
"netdata",
|
||||
"memory_buffers",
|
||||
NULL,
|
||||
"Memory Usage",
|
||||
NULL,
|
||||
"Netdata Memory Buffers",
|
||||
"bytes",
|
||||
"netdata",
|
||||
"stats",
|
||||
130101,
|
||||
localhost->rrd_update_every,
|
||||
RRDSET_TYPE_STACKED);
|
||||
|
||||
rd_queries = rrddim_add(st_memory_buffers, "queries", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_collectors = rrddim_add(st_memory_buffers, "collection", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_buffers_aclk = rrddim_add(st_memory_buffers, "aclk", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_buffers_api = rrddim_add(st_memory_buffers, "api", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_buffers_functions = rrddim_add(st_memory_buffers, "functions", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_buffers_sqlite = rrddim_add(st_memory_buffers, "sqlite", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_buffers_exporters = rrddim_add(st_memory_buffers, "exporters", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_buffers_health = rrddim_add(st_memory_buffers, "health", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_buffers_streaming = rrddim_add(st_memory_buffers, "streaming", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_cbuffers_streaming = rrddim_add(st_memory_buffers, "streaming cbuf", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_buffers_replication = rrddim_add(st_memory_buffers, "replication", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_buffers_web = rrddim_add(st_memory_buffers, "web", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_buffers_aral = rrddim_add(st_memory_buffers, "aral", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
rd_buffers_judy = rrddim_add(st_memory_buffers, "judy", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
}
|
||||
|
||||
rrddim_set_by_pointer(st_memory_buffers, rd_queries, (collected_number)netdata_buffers_statistics.query_targets_size + (collected_number) onewayalloc_allocated_memory());
|
||||
rrddim_set_by_pointer(st_memory_buffers, rd_collectors, (collected_number)netdata_buffers_statistics.rrdset_done_rda_size);
|
||||
rrddim_set_by_pointer(st_memory_buffers, rd_buffers_aclk, (collected_number)netdata_buffers_statistics.buffers_aclk);
|
||||
rrddim_set_by_pointer(st_memory_buffers, rd_buffers_api, (collected_number)netdata_buffers_statistics.buffers_api);
|
||||
rrddim_set_by_pointer(st_memory_buffers, rd_buffers_functions, (collected_number)netdata_buffers_statistics.buffers_functions);
|
||||
rrddim_set_by_pointer(st_memory_buffers, rd_buffers_sqlite, (collected_number)netdata_buffers_statistics.buffers_sqlite);
|
||||
rrddim_set_by_pointer(st_memory_buffers, rd_buffers_exporters, (collected_number)netdata_buffers_statistics.buffers_exporters);
|
||||
rrddim_set_by_pointer(st_memory_buffers, rd_buffers_health, (collected_number)netdata_buffers_statistics.buffers_health);
|
||||
rrddim_set_by_pointer(st_memory_buffers, rd_buffers_streaming, (collected_number)netdata_buffers_statistics.buffers_streaming);
|
||||
rrddim_set_by_pointer(st_memory_buffers, rd_cbuffers_streaming, (collected_number)netdata_buffers_statistics.cbuffers_streaming);
|
||||
rrddim_set_by_pointer(st_memory_buffers, rd_buffers_replication, (collected_number)replication_allocated_buffers());
|
||||
rrddim_set_by_pointer(st_memory_buffers, rd_buffers_web, (collected_number)netdata_buffers_statistics.buffers_web);
|
||||
rrddim_set_by_pointer(st_memory_buffers, rd_buffers_aral, (collected_number)aral_by_size_overhead());
|
||||
rrddim_set_by_pointer(st_memory_buffers, rd_buffers_judy, (collected_number)judy_aral_overhead());
|
||||
|
||||
rrdset_done(st_memory_buffers);
|
||||
}
|
||||
|
||||
// ----------------------------------------------------------------------------------------------------------------
|
||||
|
||||
if(!extended)
|
||||
return;
|
||||
|
||||
// ----------------------------------------------------------------------------------------------------------------
|
||||
|
||||
}
|
src/daemon/telemetry/telemetry-daemon-memory.h (new file, 29 lines)
@@ -0,0 +1,29 @@
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#ifndef NETDATA_TELEMETRY_DAEMON_MEMORY_H
|
||||
#define NETDATA_TELEMETRY_DAEMON_MEMORY_H
|
||||
|
||||
#include "daemon/common.h"
|
||||
|
||||
extern struct netdata_buffers_statistics {
|
||||
size_t rrdhost_allocations_size;
|
||||
size_t rrdhost_senders;
|
||||
size_t rrdhost_receivers;
|
||||
size_t query_targets_size;
|
||||
size_t rrdset_done_rda_size;
|
||||
size_t buffers_aclk;
|
||||
size_t buffers_api;
|
||||
size_t buffers_functions;
|
||||
size_t buffers_sqlite;
|
||||
size_t buffers_exporters;
|
||||
size_t buffers_health;
|
||||
size_t buffers_streaming;
|
||||
size_t cbuffers_streaming;
|
||||
size_t buffers_web;
|
||||
} netdata_buffers_statistics;
|
||||
|
||||
#if defined(TELEMETRY_INTERNALS)
|
||||
void telemetry_daemon_memory_do(bool extended);
|
||||
#endif
|
||||
|
||||
#endif //NETDATA_TELEMETRY_DAEMON_MEMORY_H
|
src/daemon/telemetry/telemetry-daemon.c (new file, 79 lines)
@@ -0,0 +1,79 @@
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#define TELEMETRY_INTERNALS 1
|
||||
#include "telemetry-daemon.h"
|
||||
|
||||
static void telemetry_daemon_cpu_usage_do(bool extended __maybe_unused) {
|
||||
struct rusage me;
|
||||
getrusage(RUSAGE_SELF, &me);
|
||||
|
||||
{
|
||||
static RRDSET *st_cpu = NULL;
|
||||
static RRDDIM *rd_cpu_user = NULL,
|
||||
*rd_cpu_system = NULL;
|
||||
|
||||
if (unlikely(!st_cpu)) {
|
||||
st_cpu = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, "server_cpu"
|
||||
, NULL
|
||||
, "CPU usage"
|
||||
, NULL
|
||||
, "Netdata CPU usage"
|
||||
, "milliseconds/s"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, 130000
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_STACKED
|
||||
);
|
||||
|
||||
rd_cpu_user = rrddim_add(st_cpu, "user", NULL, 1, 1000, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_cpu_system = rrddim_add(st_cpu, "system", NULL, 1, 1000, RRD_ALGORITHM_INCREMENTAL);
|
||||
}
|
||||
|
||||
rrddim_set_by_pointer(st_cpu, rd_cpu_user, (collected_number )(me.ru_utime.tv_sec * 1000000ULL + me.ru_utime.tv_usec));
|
||||
rrddim_set_by_pointer(st_cpu, rd_cpu_system, (collected_number )(me.ru_stime.tv_sec * 1000000ULL + me.ru_stime.tv_usec));
|
||||
rrdset_done(st_cpu);
|
||||
}
|
||||
}
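// Reading aid: the dimensions above store cumulative getrusage() CPU time in microseconds; the divisor of
// 1000 together with RRD_ALGORITHM_INCREMENTAL is what turns that into the charted milliseconds/s rate.
//
//     sample at t      : ru_utime = 12.340000 s  -> stored 12340000
//     sample at t + 1s : ru_utime = 12.355000 s  -> stored 12355000
//     charted          : (12355000 - 12340000) / 1000 = 15 ms of user CPU per second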
|
||||
|
||||
static void telemetry_daemon_uptime_do(bool extended __maybe_unused) {
|
||||
{
|
||||
static time_t netdata_boottime_time = 0;
|
||||
if (!netdata_boottime_time)
|
||||
netdata_boottime_time = now_boottime_sec();
|
||||
|
||||
time_t netdata_uptime = now_boottime_sec() - netdata_boottime_time;
|
||||
|
||||
static RRDSET *st_uptime = NULL;
|
||||
static RRDDIM *rd_uptime = NULL;
|
||||
|
||||
if (unlikely(!st_uptime)) {
|
||||
st_uptime = rrdset_create_localhost(
|
||||
"netdata",
|
||||
"uptime",
|
||||
NULL,
|
||||
"Uptime",
|
||||
NULL,
|
||||
"Netdata uptime",
|
||||
"seconds",
|
||||
"netdata",
|
||||
"stats",
|
||||
130150,
|
||||
localhost->rrd_update_every,
|
||||
RRDSET_TYPE_LINE);
|
||||
|
||||
rd_uptime = rrddim_add(st_uptime, "uptime", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
}
|
||||
|
||||
rrddim_set_by_pointer(st_uptime, rd_uptime, netdata_uptime);
|
||||
rrdset_done(st_uptime);
|
||||
}
|
||||
}
|
||||
|
||||
void telemetry_daemon_do(bool extended) {
|
||||
telemetry_daemon_cpu_usage_do(extended);
|
||||
telemetry_daemon_uptime_do(extended);
|
||||
telemetry_daemon_memory_do(extended);
|
||||
}
|
src/daemon/telemetry/telemetry-daemon.h (new file, 12 lines)
@@ -0,0 +1,12 @@
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#ifndef NETDATA_TELEMETRY_DAEMON_H
|
||||
#define NETDATA_TELEMETRY_DAEMON_H
|
||||
|
||||
#include "daemon/common.h"
|
||||
|
||||
#if defined(TELEMETRY_INTERNALS)
|
||||
void telemetry_daemon_do(bool extended);
|
||||
#endif
|
||||
|
||||
#endif //NETDATA_TELEMETRY_DAEMON_H
|
src/daemon/telemetry/telemetry-dbengine.c (new file, 1624 lines; diff suppressed because it is too large)

src/daemon/telemetry/telemetry-dbengine.h (new file, 17 lines)
@@ -0,0 +1,17 @@
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#ifndef NETDATA_TELEMETRY_DBENGINE_H
|
||||
#define NETDATA_TELEMETRY_DBENGINE_H
|
||||
|
||||
#include "daemon/common.h"
|
||||
|
||||
#if defined(TELEMETRY_INTERNALS)
|
||||
extern size_t telemetry_dbengine_total_memory;
|
||||
|
||||
#if defined(ENABLE_DBENGINE)
|
||||
void telemetry_dbengine_do(bool extended);
|
||||
#endif
|
||||
|
||||
#endif
|
||||
|
||||
#endif //NETDATA_TELEMETRY_DBENGINE_H
|
src/daemon/telemetry/telemetry-dictionary.c (new file, 378 lines)
@@ -0,0 +1,378 @@
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#define TELEMETRY_INTERNALS 1
|
||||
#include "telemetry-dictionary.h"
|
||||
|
||||
struct dictionary_stats dictionary_stats_category_collectors = { .name = "collectors" };
|
||||
struct dictionary_stats dictionary_stats_category_rrdhost = { .name = "rrdhost" };
|
||||
struct dictionary_stats dictionary_stats_category_rrdset = { .name = "rrdset" };
|
||||
struct dictionary_stats dictionary_stats_category_rrddim = { .name = "rrddim" };
|
||||
struct dictionary_stats dictionary_stats_category_rrdcontext = { .name = "context" };
|
||||
struct dictionary_stats dictionary_stats_category_rrdlabels = { .name = "labels" };
|
||||
struct dictionary_stats dictionary_stats_category_rrdhealth = { .name = "health" };
|
||||
struct dictionary_stats dictionary_stats_category_functions = { .name = "functions" };
|
||||
struct dictionary_stats dictionary_stats_category_replication = { .name = "replication" };
|
||||
|
||||
size_t rrddim_db_memory_size = 0;
|
||||
|
||||
#ifdef DICT_WITH_STATS
|
||||
struct dictionary_categories {
|
||||
struct dictionary_stats *stats;
|
||||
|
||||
RRDSET *st_dicts;
|
||||
RRDDIM *rd_dicts_active;
|
||||
RRDDIM *rd_dicts_deleted;
|
||||
|
||||
RRDSET *st_items;
|
||||
RRDDIM *rd_items_entries;
|
||||
RRDDIM *rd_items_referenced;
|
||||
RRDDIM *rd_items_pending_deletion;
|
||||
|
||||
RRDSET *st_ops;
|
||||
RRDDIM *rd_ops_creations;
|
||||
RRDDIM *rd_ops_destructions;
|
||||
RRDDIM *rd_ops_flushes;
|
||||
RRDDIM *rd_ops_traversals;
|
||||
RRDDIM *rd_ops_walkthroughs;
|
||||
RRDDIM *rd_ops_garbage_collections;
|
||||
RRDDIM *rd_ops_searches;
|
||||
RRDDIM *rd_ops_inserts;
|
||||
RRDDIM *rd_ops_resets;
|
||||
RRDDIM *rd_ops_deletes;
|
||||
|
||||
RRDSET *st_callbacks;
|
||||
RRDDIM *rd_callbacks_inserts;
|
||||
RRDDIM *rd_callbacks_conflicts;
|
||||
RRDDIM *rd_callbacks_reacts;
|
||||
RRDDIM *rd_callbacks_deletes;
|
||||
|
||||
RRDSET *st_memory;
|
||||
RRDDIM *rd_memory_indexed;
|
||||
RRDDIM *rd_memory_values;
|
||||
RRDDIM *rd_memory_dict;
|
||||
|
||||
RRDSET *st_spins;
|
||||
RRDDIM *rd_spins_use;
|
||||
RRDDIM *rd_spins_search;
|
||||
RRDDIM *rd_spins_insert;
|
||||
RRDDIM *rd_spins_delete;
|
||||
|
||||
} dictionary_categories[] = {
|
||||
{ .stats = &dictionary_stats_category_collectors, },
|
||||
{ .stats = &dictionary_stats_category_rrdhost, },
|
||||
{ .stats = &dictionary_stats_category_rrdset, },
|
||||
{ .stats = &dictionary_stats_category_rrdcontext, },
|
||||
{ .stats = &dictionary_stats_category_rrdlabels, },
|
||||
{ .stats = &dictionary_stats_category_rrdhealth, },
|
||||
{ .stats = &dictionary_stats_category_functions, },
|
||||
{ .stats = &dictionary_stats_category_replication, },
|
||||
{ .stats = &dictionary_stats_category_other, },
|
||||
|
||||
// terminator
|
||||
{ .stats = NULL, NULL, NULL, 0 },
|
||||
};
|
||||
|
||||
#define load_dictionary_stats_entry(x) total += (size_t)(stats.x = __atomic_load_n(&c->stats->x, __ATOMIC_RELAXED))
|
||||
|
||||
static void update_dictionary_category_charts(struct dictionary_categories *c) {
|
||||
struct dictionary_stats stats;
|
||||
stats.name = c->stats->name;
|
||||
int priority = 900000;
|
||||
const char *family = "dictionaries";
|
||||
const char *context_prefix = "dictionaries";
|
||||
|
||||
// ------------------------------------------------------------------------
|
||||
|
||||
size_t total = 0;
|
||||
    load_dictionary_stats_entry(dictionaries.active);
    load_dictionary_stats_entry(dictionaries.deleted);

    if(c->st_dicts || total != 0) {
        if (unlikely(!c->st_dicts)) {
            char id[RRD_ID_LENGTH_MAX + 1];
            snprintfz(id, RRD_ID_LENGTH_MAX, "%s.%s.dictionaries", context_prefix, stats.name);

            char context[RRD_ID_LENGTH_MAX + 1];
            snprintfz(context, RRD_ID_LENGTH_MAX, "netdata.%s.category.dictionaries", context_prefix);

            c->st_dicts = rrdset_create_localhost(
                "netdata"
                , id
                , NULL
                , family
                , context
                , "Dictionaries"
                , "dictionaries"
                , "netdata"
                , "stats"
                , priority + 0
                , localhost->rrd_update_every
                , RRDSET_TYPE_LINE
            );

            c->rd_dicts_active = rrddim_add(c->st_dicts, "active", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
            c->rd_dicts_deleted = rrddim_add(c->st_dicts, "deleted", NULL, -1, 1, RRD_ALGORITHM_ABSOLUTE);

            rrdlabels_add(c->st_dicts->rrdlabels, "category", stats.name, RRDLABEL_SRC_AUTO);
        }

        rrddim_set_by_pointer(c->st_dicts, c->rd_dicts_active, (collected_number)stats.dictionaries.active);
        rrddim_set_by_pointer(c->st_dicts, c->rd_dicts_deleted, (collected_number)stats.dictionaries.deleted);
        rrdset_done(c->st_dicts);
    }

    // ------------------------------------------------------------------------

    total = 0;
    load_dictionary_stats_entry(items.entries);
    load_dictionary_stats_entry(items.referenced);
    load_dictionary_stats_entry(items.pending_deletion);

    if(c->st_items || total != 0) {
        if (unlikely(!c->st_items)) {
            char id[RRD_ID_LENGTH_MAX + 1];
            snprintfz(id, RRD_ID_LENGTH_MAX, "%s.%s.items", context_prefix, stats.name);

            char context[RRD_ID_LENGTH_MAX + 1];
            snprintfz(context, RRD_ID_LENGTH_MAX, "netdata.%s.category.items", context_prefix);

            c->st_items = rrdset_create_localhost(
                "netdata"
                , id
                , NULL
                , family
                , context
                , "Dictionary Items"
                , "items"
                , "netdata"
                , "stats"
                , priority + 1
                , localhost->rrd_update_every
                , RRDSET_TYPE_LINE
            );

            c->rd_items_entries = rrddim_add(c->st_items, "active", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
            c->rd_items_pending_deletion = rrddim_add(c->st_items, "deleted", NULL, -1, 1, RRD_ALGORITHM_ABSOLUTE);
            c->rd_items_referenced = rrddim_add(c->st_items, "referenced", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);

            rrdlabels_add(c->st_items->rrdlabels, "category", stats.name, RRDLABEL_SRC_AUTO);
        }

        rrddim_set_by_pointer(c->st_items, c->rd_items_entries, stats.items.entries);
        rrddim_set_by_pointer(c->st_items, c->rd_items_pending_deletion, stats.items.pending_deletion);
        rrddim_set_by_pointer(c->st_items, c->rd_items_referenced, stats.items.referenced);
        rrdset_done(c->st_items);
    }

    // ------------------------------------------------------------------------

    total = 0;
    load_dictionary_stats_entry(ops.creations);
    load_dictionary_stats_entry(ops.destructions);
    load_dictionary_stats_entry(ops.flushes);
    load_dictionary_stats_entry(ops.traversals);
    load_dictionary_stats_entry(ops.walkthroughs);
    load_dictionary_stats_entry(ops.garbage_collections);
    load_dictionary_stats_entry(ops.searches);
    load_dictionary_stats_entry(ops.inserts);
    load_dictionary_stats_entry(ops.resets);
    load_dictionary_stats_entry(ops.deletes);

    if(c->st_ops || total != 0) {
        if (unlikely(!c->st_ops)) {
            char id[RRD_ID_LENGTH_MAX + 1];
            snprintfz(id, RRD_ID_LENGTH_MAX, "%s.%s.ops", context_prefix, stats.name);

            char context[RRD_ID_LENGTH_MAX + 1];
            snprintfz(context, RRD_ID_LENGTH_MAX, "netdata.%s.category.ops", context_prefix);

            c->st_ops = rrdset_create_localhost(
                "netdata"
                , id
                , NULL
                , family
                , context
                , "Dictionary Operations"
                , "ops/s"
                , "netdata"
                , "stats"
                , priority + 2
                , localhost->rrd_update_every
                , RRDSET_TYPE_LINE
            );

            c->rd_ops_creations = rrddim_add(c->st_ops, "creations", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_ops_destructions = rrddim_add(c->st_ops, "destructions", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_ops_flushes = rrddim_add(c->st_ops, "flushes", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_ops_traversals = rrddim_add(c->st_ops, "traversals", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_ops_walkthroughs = rrddim_add(c->st_ops, "walkthroughs", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_ops_garbage_collections = rrddim_add(c->st_ops, "garbage_collections", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_ops_searches = rrddim_add(c->st_ops, "searches", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_ops_inserts = rrddim_add(c->st_ops, "inserts", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_ops_resets = rrddim_add(c->st_ops, "resets", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_ops_deletes = rrddim_add(c->st_ops, "deletes", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);

            rrdlabels_add(c->st_ops->rrdlabels, "category", stats.name, RRDLABEL_SRC_AUTO);
        }

        rrddim_set_by_pointer(c->st_ops, c->rd_ops_creations, (collected_number)stats.ops.creations);
        rrddim_set_by_pointer(c->st_ops, c->rd_ops_destructions, (collected_number)stats.ops.destructions);
        rrddim_set_by_pointer(c->st_ops, c->rd_ops_flushes, (collected_number)stats.ops.flushes);
        rrddim_set_by_pointer(c->st_ops, c->rd_ops_traversals, (collected_number)stats.ops.traversals);
        rrddim_set_by_pointer(c->st_ops, c->rd_ops_walkthroughs, (collected_number)stats.ops.walkthroughs);
        rrddim_set_by_pointer(c->st_ops, c->rd_ops_garbage_collections, (collected_number)stats.ops.garbage_collections);
        rrddim_set_by_pointer(c->st_ops, c->rd_ops_searches, (collected_number)stats.ops.searches);
        rrddim_set_by_pointer(c->st_ops, c->rd_ops_inserts, (collected_number)stats.ops.inserts);
        rrddim_set_by_pointer(c->st_ops, c->rd_ops_resets, (collected_number)stats.ops.resets);
        rrddim_set_by_pointer(c->st_ops, c->rd_ops_deletes, (collected_number)stats.ops.deletes);

        rrdset_done(c->st_ops);
    }

    // ------------------------------------------------------------------------

    total = 0;
    load_dictionary_stats_entry(callbacks.inserts);
    load_dictionary_stats_entry(callbacks.conflicts);
    load_dictionary_stats_entry(callbacks.reacts);
    load_dictionary_stats_entry(callbacks.deletes);

    if(c->st_callbacks || total != 0) {
        if (unlikely(!c->st_callbacks)) {
            char id[RRD_ID_LENGTH_MAX + 1];
            snprintfz(id, RRD_ID_LENGTH_MAX, "%s.%s.callbacks", context_prefix, stats.name);

            char context[RRD_ID_LENGTH_MAX + 1];
            snprintfz(context, RRD_ID_LENGTH_MAX, "netdata.%s.category.callbacks", context_prefix);

            c->st_callbacks = rrdset_create_localhost(
                "netdata"
                , id
                , NULL
                , family
                , context
                , "Dictionary Callbacks"
                , "callbacks/s"
                , "netdata"
                , "stats"
                , priority + 3
                , localhost->rrd_update_every
                , RRDSET_TYPE_LINE
            );

            c->rd_callbacks_inserts = rrddim_add(c->st_callbacks, "inserts", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_callbacks_deletes = rrddim_add(c->st_callbacks, "deletes", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_callbacks_conflicts = rrddim_add(c->st_callbacks, "conflicts", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_callbacks_reacts = rrddim_add(c->st_callbacks, "reacts", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);

            rrdlabels_add(c->st_callbacks->rrdlabels, "category", stats.name, RRDLABEL_SRC_AUTO);
        }

        rrddim_set_by_pointer(c->st_callbacks, c->rd_callbacks_inserts, (collected_number)stats.callbacks.inserts);
        rrddim_set_by_pointer(c->st_callbacks, c->rd_callbacks_conflicts, (collected_number)stats.callbacks.conflicts);
        rrddim_set_by_pointer(c->st_callbacks, c->rd_callbacks_reacts, (collected_number)stats.callbacks.reacts);
        rrddim_set_by_pointer(c->st_callbacks, c->rd_callbacks_deletes, (collected_number)stats.callbacks.deletes);

        rrdset_done(c->st_callbacks);
    }

    // ------------------------------------------------------------------------

    total = 0;
    load_dictionary_stats_entry(memory.index);
    load_dictionary_stats_entry(memory.values);
    load_dictionary_stats_entry(memory.dict);

    if(c->st_memory || total != 0) {
        if (unlikely(!c->st_memory)) {
            char id[RRD_ID_LENGTH_MAX + 1];
            snprintfz(id, RRD_ID_LENGTH_MAX, "%s.%s.memory", context_prefix, stats.name);

            char context[RRD_ID_LENGTH_MAX + 1];
            snprintfz(context, RRD_ID_LENGTH_MAX, "netdata.%s.category.memory", context_prefix);

            c->st_memory = rrdset_create_localhost(
                "netdata"
                , id
                , NULL
                , family
                , context
                , "Dictionary Memory"
                , "bytes"
                , "netdata"
                , "stats"
                , priority + 4
                , localhost->rrd_update_every
                , RRDSET_TYPE_STACKED
            );

            c->rd_memory_indexed = rrddim_add(c->st_memory, "index", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
            c->rd_memory_values = rrddim_add(c->st_memory, "data", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
            c->rd_memory_dict = rrddim_add(c->st_memory, "structures", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);

            rrdlabels_add(c->st_memory->rrdlabels, "category", stats.name, RRDLABEL_SRC_AUTO);
        }

        rrddim_set_by_pointer(c->st_memory, c->rd_memory_indexed, (collected_number)stats.memory.index);
        rrddim_set_by_pointer(c->st_memory, c->rd_memory_values, (collected_number)stats.memory.values);
        rrddim_set_by_pointer(c->st_memory, c->rd_memory_dict, (collected_number)stats.memory.dict);

        rrdset_done(c->st_memory);
    }

    // ------------------------------------------------------------------------

    total = 0;
    load_dictionary_stats_entry(spin_locks.use_spins);
    load_dictionary_stats_entry(spin_locks.search_spins);
    load_dictionary_stats_entry(spin_locks.insert_spins);
    load_dictionary_stats_entry(spin_locks.delete_spins);

    if(c->st_spins || total != 0) {
        if (unlikely(!c->st_spins)) {
            char id[RRD_ID_LENGTH_MAX + 1];
            snprintfz(id, RRD_ID_LENGTH_MAX, "%s.%s.spins", context_prefix, stats.name);

            char context[RRD_ID_LENGTH_MAX + 1];
            snprintfz(context, RRD_ID_LENGTH_MAX, "netdata.%s.category.spins", context_prefix);

            c->st_spins = rrdset_create_localhost(
                "netdata"
                , id
                , NULL
                , family
                , context
                , "Dictionary Spins"
                , "count"
                , "netdata"
                , "stats"
                , priority + 5
                , localhost->rrd_update_every
                , RRDSET_TYPE_LINE
            );

            c->rd_spins_use = rrddim_add(c->st_spins, "use", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_spins_search = rrddim_add(c->st_spins, "search", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_spins_insert = rrddim_add(c->st_spins, "insert", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            c->rd_spins_delete = rrddim_add(c->st_spins, "delete", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);

            rrdlabels_add(c->st_spins->rrdlabels, "category", stats.name, RRDLABEL_SRC_AUTO);
        }

        rrddim_set_by_pointer(c->st_spins, c->rd_spins_use, (collected_number)stats.spin_locks.use_spins);
        rrddim_set_by_pointer(c->st_spins, c->rd_spins_search, (collected_number)stats.spin_locks.search_spins);
        rrddim_set_by_pointer(c->st_spins, c->rd_spins_insert, (collected_number)stats.spin_locks.insert_spins);
        rrddim_set_by_pointer(c->st_spins, c->rd_spins_delete, (collected_number)stats.spin_locks.delete_spins);

        rrdset_done(c->st_spins);
    }
}

void telemetry_dictionary_do(bool extended) {
    if(!extended) return;

    for(int i = 0; dictionary_categories[i].stats ;i++) {
        update_dictionary_category_charts(&dictionary_categories[i]);
    }
}
#endif // DICT_WITH_STATS
src/daemon/telemetry/telemetry-dictionary.h (new file, +24 lines)
@@ -0,0 +1,24 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_TELEMETRY_DICTIONARY_H
#define NETDATA_TELEMETRY_DICTIONARY_H

#include "daemon/common.h"

extern struct dictionary_stats dictionary_stats_category_collectors;
extern struct dictionary_stats dictionary_stats_category_rrdhost;
extern struct dictionary_stats dictionary_stats_category_rrdset;
extern struct dictionary_stats dictionary_stats_category_rrddim;
extern struct dictionary_stats dictionary_stats_category_rrdcontext;
extern struct dictionary_stats dictionary_stats_category_rrdlabels;
extern struct dictionary_stats dictionary_stats_category_rrdhealth;
extern struct dictionary_stats dictionary_stats_category_functions;
extern struct dictionary_stats dictionary_stats_category_replication;

extern size_t rrddim_db_memory_size;

#if defined(TELEMETRY_INTERNALS)
void telemetry_dictionary_do(bool extended);
#endif

#endif //NETDATA_TELEMETRY_DICTIONARY_H
src/daemon/telemetry/telemetry-gorilla.c (new file, +110 lines)
@@ -0,0 +1,110 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#define TELEMETRY_INTERNALS 1
#include "telemetry-gorilla.h"

static struct gorilla_statistics {
    bool enabled;

    alignas(64) uint64_t tier0_hot_gorilla_buffers;

    alignas(64) uint64_t gorilla_tier0_disk_actual_bytes;
    alignas(64) uint64_t gorilla_tier0_disk_optimal_bytes;
    alignas(64) uint64_t gorilla_tier0_disk_original_bytes;
} gorilla_statistics = { 0 };

void telemetry_gorilla_hot_buffer_added() {
    if(!gorilla_statistics.enabled) return;

    __atomic_fetch_add(&gorilla_statistics.tier0_hot_gorilla_buffers, 1, __ATOMIC_RELAXED);
}

void telemetry_gorilla_tier0_page_flush(uint32_t actual, uint32_t optimal, uint32_t original) {
    if(!gorilla_statistics.enabled) return;

    __atomic_fetch_add(&gorilla_statistics.gorilla_tier0_disk_actual_bytes, actual, __ATOMIC_RELAXED);
    __atomic_fetch_add(&gorilla_statistics.gorilla_tier0_disk_optimal_bytes, optimal, __ATOMIC_RELAXED);
    __atomic_fetch_add(&gorilla_statistics.gorilla_tier0_disk_original_bytes, original, __ATOMIC_RELAXED);
}

static inline void global_statistics_copy(struct gorilla_statistics *gs) {
    gs->tier0_hot_gorilla_buffers = __atomic_load_n(&gorilla_statistics.tier0_hot_gorilla_buffers, __ATOMIC_RELAXED);
    gs->gorilla_tier0_disk_actual_bytes = __atomic_load_n(&gorilla_statistics.gorilla_tier0_disk_actual_bytes, __ATOMIC_RELAXED);
    gs->gorilla_tier0_disk_optimal_bytes = __atomic_load_n(&gorilla_statistics.gorilla_tier0_disk_optimal_bytes, __ATOMIC_RELAXED);
    gs->gorilla_tier0_disk_original_bytes = __atomic_load_n(&gorilla_statistics.gorilla_tier0_disk_original_bytes, __ATOMIC_RELAXED);
}

void telemetry_gorilla_do(bool extended __maybe_unused) {
#ifdef ENABLE_DBENGINE
    if(!extended) return;
    gorilla_statistics.enabled = true;

    struct gorilla_statistics gs;
    global_statistics_copy(&gs);

    if (tier_page_type[0] == RRDENG_PAGE_TYPE_GORILLA_32BIT)
    {
        static RRDSET *st_tier0_gorilla_pages = NULL;
        static RRDDIM *rd_num_gorilla_pages = NULL;

        if (unlikely(!st_tier0_gorilla_pages)) {
            st_tier0_gorilla_pages = rrdset_create_localhost(
                "netdata"
                , "tier0_gorilla_pages"
                , NULL
                , "dbengine gorilla"
                , NULL
                , "Number of gorilla_pages"
                , "count"
                , "netdata"
                , "stats"
                , 131004
                , localhost->rrd_update_every
                , RRDSET_TYPE_LINE
            );

            rd_num_gorilla_pages = rrddim_add(st_tier0_gorilla_pages, "count", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
        }

        rrddim_set_by_pointer(st_tier0_gorilla_pages, rd_num_gorilla_pages, (collected_number)gs.tier0_hot_gorilla_buffers);

        rrdset_done(st_tier0_gorilla_pages);
    }

    if (tier_page_type[0] == RRDENG_PAGE_TYPE_GORILLA_32BIT)
    {
        static RRDSET *st_tier0_compression_info = NULL;

        static RRDDIM *rd_actual_bytes = NULL;
        static RRDDIM *rd_optimal_bytes = NULL;
        static RRDDIM *rd_uncompressed_bytes = NULL;

        if (unlikely(!st_tier0_compression_info)) {
            st_tier0_compression_info = rrdset_create_localhost(
                "netdata"
                , "tier0_gorilla_efficiency"
                , NULL
                , "dbengine gorilla"
                , NULL
                , "DBENGINE Gorilla Compression Efficiency on Tier 0"
                , "bytes"
                , "netdata"
                , "stats"
                , 131005
                , localhost->rrd_update_every
                , RRDSET_TYPE_LINE
            );

            rd_actual_bytes = rrddim_add(st_tier0_compression_info, "actual", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
            rd_optimal_bytes = rrddim_add(st_tier0_compression_info, "optimal", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
            rd_uncompressed_bytes = rrddim_add(st_tier0_compression_info, "uncompressed", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
        }

        rrddim_set_by_pointer(st_tier0_compression_info, rd_actual_bytes, (collected_number)gs.gorilla_tier0_disk_actual_bytes);
        rrddim_set_by_pointer(st_tier0_compression_info, rd_optimal_bytes, (collected_number)gs.gorilla_tier0_disk_optimal_bytes);
        rrddim_set_by_pointer(st_tier0_compression_info, rd_uncompressed_bytes, (collected_number)gs.gorilla_tier0_disk_original_bytes);

        rrdset_done(st_tier0_compression_info);
    }
#endif
}
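The efficiency chart above plots the actual, optimal and original (uncompressed) byte counters side by side; the savings they imply can be derived directly from the same copied structure. A minimal sketch, assuming the struct above; gorilla_tier0_savings_pct() is an illustrative helper, not part of these files:

// Hypothetical helper: percentage of bytes saved by gorilla pages versus the uncompressed size.
static double gorilla_tier0_savings_pct(const struct gorilla_statistics *gs) {
    if(!gs->gorilla_tier0_disk_original_bytes)
        return 0.0; // nothing flushed to disk yet

    double saved = (double)gs->gorilla_tier0_disk_original_bytes - (double)gs->gorilla_tier0_disk_actual_bytes;
    return 100.0 * saved / (double)gs->gorilla_tier0_disk_original_bytes;
}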
src/daemon/telemetry/telemetry-gorilla.h (new file, +15 lines)
@@ -0,0 +1,15 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_TELEMETRY_GORILLA_H
#define NETDATA_TELEMETRY_GORILLA_H

#include "daemon/common.h"

void telemetry_gorilla_hot_buffer_added();
void telemetry_gorilla_tier0_page_flush(uint32_t actual, uint32_t optimal, uint32_t original);

#if defined(TELEMETRY_INTERNALS)
void telemetry_gorilla_do(bool extended);
#endif

#endif //NETDATA_TELEMETRY_GORILLA_H
src/daemon/telemetry/telemetry-heartbeat.c (new file, +44 lines)
@@ -0,0 +1,44 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#define TELEMETRY_INTERNALS 1
#include "telemetry-heartbeat.h"

void telemetry_heartbeat_do(bool extended) {
    if(!extended) return;

    static RRDSET *st_heartbeat = NULL;
    static RRDDIM *rd_heartbeat_min = NULL;
    static RRDDIM *rd_heartbeat_max = NULL;
    static RRDDIM *rd_heartbeat_avg = NULL;

    if (unlikely(!st_heartbeat)) {
        st_heartbeat = rrdset_create_localhost(
            "netdata"
            , "heartbeat"
            , NULL
            , "heartbeat"
            , NULL
            , "System clock jitter"
            , "microseconds"
            , "netdata"
            , "stats"
            , 900000
            , localhost->rrd_update_every
            , RRDSET_TYPE_AREA);

        rd_heartbeat_min = rrddim_add(st_heartbeat, "min", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
        rd_heartbeat_max = rrddim_add(st_heartbeat, "max", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
        rd_heartbeat_avg = rrddim_add(st_heartbeat, "average", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
    }

    usec_t min, max, average;
    size_t count;

    heartbeat_statistics(&min, &max, &average, &count);

    rrddim_set_by_pointer(st_heartbeat, rd_heartbeat_min, (collected_number)min);
    rrddim_set_by_pointer(st_heartbeat, rd_heartbeat_max, (collected_number)max);
    rrddim_set_by_pointer(st_heartbeat, rd_heartbeat_avg, (collected_number)average);

    rrdset_done(st_heartbeat);
}
src/daemon/telemetry/telemetry-heartbeat.h (new file, +12 lines)
@@ -0,0 +1,12 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_TELEMETRY_HEARTBEAT_H
#define NETDATA_TELEMETRY_HEARTBEAT_H

#include "daemon/common.h"

#if defined(TELEMETRY_INTERNALS)
void telemetry_heartbeat_do(bool extended);
#endif

#endif //NETDATA_TELEMETRY_HEARTBEAT_H
src/daemon/telemetry/telemetry-http-api.c (new file, +263 lines)
@@ -0,0 +1,263 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#define TELEMETRY_INTERNALS 1
#include "telemetry-http-api.h"

#define GLOBAL_STATS_RESET_WEB_USEC_MAX 0x01

static struct web_statistics {
    bool extended;

    uint16_t connected_clients;
    uint64_t web_client_count; // oops! this is used for giving unique IDs to web_clients!

    uint64_t web_requests;
    uint64_t web_usec;
    uint64_t web_usec_max;
    uint64_t bytes_received;
    uint64_t bytes_sent;

    uint64_t content_size_uncompressed;
    uint64_t content_size_compressed;
} web_statistics;

uint64_t telemetry_web_client_connected(void) {
    __atomic_fetch_add(&web_statistics.connected_clients, 1, __ATOMIC_RELAXED);
    return __atomic_fetch_add(&web_statistics.web_client_count, 1, __ATOMIC_RELAXED);
}

void telemetry_web_client_disconnected(void) {
    __atomic_fetch_sub(&web_statistics.connected_clients, 1, __ATOMIC_RELAXED);
}

void telemetry_web_request_completed(uint64_t dt,
                                     uint64_t bytes_received,
                                     uint64_t bytes_sent,
                                     uint64_t content_size,
                                     uint64_t compressed_content_size) {
    uint64_t old_web_usec_max = web_statistics.web_usec_max;
    while(dt > old_web_usec_max)
        __atomic_compare_exchange(&web_statistics.web_usec_max, &old_web_usec_max, &dt, 1, __ATOMIC_RELAXED, __ATOMIC_RELAXED);

    __atomic_fetch_add(&web_statistics.web_requests, 1, __ATOMIC_RELAXED);
    __atomic_fetch_add(&web_statistics.web_usec, dt, __ATOMIC_RELAXED);
    __atomic_fetch_add(&web_statistics.bytes_received, bytes_received, __ATOMIC_RELAXED);
    __atomic_fetch_add(&web_statistics.bytes_sent, bytes_sent, __ATOMIC_RELAXED);
    __atomic_fetch_add(&web_statistics.content_size_uncompressed, content_size, __ATOMIC_RELAXED);
    __atomic_fetch_add(&web_statistics.content_size_compressed, compressed_content_size, __ATOMIC_RELAXED);
}

static inline void telemetry_web_copy(struct web_statistics *gs, uint8_t options) {
    gs->connected_clients = __atomic_load_n(&web_statistics.connected_clients, __ATOMIC_RELAXED);
    gs->web_requests = __atomic_load_n(&web_statistics.web_requests, __ATOMIC_RELAXED);
    gs->web_usec = __atomic_load_n(&web_statistics.web_usec, __ATOMIC_RELAXED);
    gs->web_usec_max = __atomic_load_n(&web_statistics.web_usec_max, __ATOMIC_RELAXED);
    gs->bytes_received = __atomic_load_n(&web_statistics.bytes_received, __ATOMIC_RELAXED);
    gs->bytes_sent = __atomic_load_n(&web_statistics.bytes_sent, __ATOMIC_RELAXED);
    gs->content_size_uncompressed = __atomic_load_n(&web_statistics.content_size_uncompressed, __ATOMIC_RELAXED);
    gs->content_size_compressed = __atomic_load_n(&web_statistics.content_size_compressed, __ATOMIC_RELAXED);
    gs->web_client_count = __atomic_load_n(&web_statistics.web_client_count, __ATOMIC_RELAXED);

    if(options & GLOBAL_STATS_RESET_WEB_USEC_MAX) {
        uint64_t n = 0;
        __atomic_compare_exchange(&web_statistics.web_usec_max, (uint64_t *) &gs->web_usec_max, &n, 1, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
    }
}

void telemetry_web_do(bool extended) {
    static struct web_statistics gs;
    telemetry_web_copy(&gs, GLOBAL_STATS_RESET_WEB_USEC_MAX);

    // ----------------------------------------------------------------

    {
        static RRDSET *st_clients = NULL;
        static RRDDIM *rd_clients = NULL;

        if (unlikely(!st_clients)) {
            st_clients = rrdset_create_localhost(
                "netdata"
                , "clients"
                , NULL
                , "HTTP API"
                , NULL
                , "Netdata Web API Clients"
                , "connected clients"
                , "netdata"
                , "stats"
                , 130200
                , localhost->rrd_update_every
                , RRDSET_TYPE_LINE
            );

            rd_clients = rrddim_add(st_clients, "clients", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
        }

        rrddim_set_by_pointer(st_clients, rd_clients, gs.connected_clients);
        rrdset_done(st_clients);
    }

    // ----------------------------------------------------------------

    {
        static RRDSET *st_reqs = NULL;
        static RRDDIM *rd_requests = NULL;

        if (unlikely(!st_reqs)) {
            st_reqs = rrdset_create_localhost(
                "netdata"
                , "requests"
                , NULL
                , "HTTP API"
                , NULL
                , "Netdata Web API Requests Received"
                , "requests/s"
                , "netdata"
                , "stats"
                , 130300
                , localhost->rrd_update_every
                , RRDSET_TYPE_LINE
            );

            rd_requests = rrddim_add(st_reqs, "requests", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
        }

        rrddim_set_by_pointer(st_reqs, rd_requests, (collected_number) gs.web_requests);
        rrdset_done(st_reqs);
    }

    // ----------------------------------------------------------------

    {
        static RRDSET *st_bytes = NULL;
        static RRDDIM *rd_in = NULL,
                      *rd_out = NULL;

        if (unlikely(!st_bytes)) {
            st_bytes = rrdset_create_localhost(
                "netdata"
                , "net"
                , NULL
                , "HTTP API"
                , NULL
                , "Netdata Web API Network Traffic"
                , "kilobits/s"
                , "netdata"
                , "stats"
                , 130400
                , localhost->rrd_update_every
                , RRDSET_TYPE_AREA
            );

            rd_in = rrddim_add(st_bytes, "in", NULL, 8, BITS_IN_A_KILOBIT, RRD_ALGORITHM_INCREMENTAL);
            rd_out = rrddim_add(st_bytes, "out", NULL, -8, BITS_IN_A_KILOBIT, RRD_ALGORITHM_INCREMENTAL);
        }

        rrddim_set_by_pointer(st_bytes, rd_in, (collected_number) gs.bytes_received);
        rrddim_set_by_pointer(st_bytes, rd_out, (collected_number) gs.bytes_sent);
        rrdset_done(st_bytes);
    }

    // ----------------------------------------------------------------

    {
        static unsigned long long old_web_requests = 0, old_web_usec = 0;
        static collected_number average_response_time = -1;

        static RRDSET *st_duration = NULL;
        static RRDDIM *rd_average = NULL,
                      *rd_max = NULL;

        if (unlikely(!st_duration)) {
            st_duration = rrdset_create_localhost(
                "netdata"
                , "response_time"
                , NULL
                , "HTTP API"
                , NULL
                , "Netdata Web API Response Time"
                , "milliseconds/request"
                , "netdata"
                , "stats"
                , 130500
                , localhost->rrd_update_every
                , RRDSET_TYPE_LINE
            );

            rd_average = rrddim_add(st_duration, "average", NULL, 1, 1000, RRD_ALGORITHM_ABSOLUTE);
            rd_max = rrddim_add(st_duration, "max", NULL, 1, 1000, RRD_ALGORITHM_ABSOLUTE);
        }

        uint64_t gweb_usec = gs.web_usec;
        uint64_t gweb_requests = gs.web_requests;

        uint64_t web_usec = (gweb_usec >= old_web_usec) ? gweb_usec - old_web_usec : 0;
        uint64_t web_requests = (gweb_requests >= old_web_requests) ? gweb_requests - old_web_requests : 0;

        old_web_usec = gweb_usec;
        old_web_requests = gweb_requests;

        if (web_requests)
            average_response_time = (collected_number) (web_usec / web_requests);

        if (unlikely(average_response_time != -1))
            rrddim_set_by_pointer(st_duration, rd_average, average_response_time);
        else
            rrddim_set_by_pointer(st_duration, rd_average, 0);

        rrddim_set_by_pointer(st_duration, rd_max, ((gs.web_usec_max)?(collected_number)gs.web_usec_max:average_response_time));
        rrdset_done(st_duration);
    }

    // ----------------------------------------------------------------

    if(!extended) return;

    // ----------------------------------------------------------------

    {
        static unsigned long long old_content_size = 0, old_compressed_content_size = 0;
        static collected_number compression_ratio = -1;

        static RRDSET *st_compression = NULL;
        static RRDDIM *rd_savings = NULL;

        if (unlikely(!st_compression)) {
            st_compression = rrdset_create_localhost(
                "netdata"
                , "compression_ratio"
                , NULL
                , "HTTP API"
                , NULL
                , "Netdata Web API Responses Compression Savings Ratio"
                , "percentage"
                , "netdata"
                , "stats"
                , 130600
                , localhost->rrd_update_every
                , RRDSET_TYPE_LINE
            );

            rd_savings = rrddim_add(st_compression, "savings", NULL, 1, 1000, RRD_ALGORITHM_ABSOLUTE);
        }

        // since we don't lock here to read the telemetry
        // read the smaller value first
        unsigned long long gcompressed_content_size = gs.content_size_compressed;
        unsigned long long gcontent_size = gs.content_size_uncompressed;

        unsigned long long compressed_content_size = gcompressed_content_size - old_compressed_content_size;
        unsigned long long content_size = gcontent_size - old_content_size;

        old_compressed_content_size = gcompressed_content_size;
        old_content_size = gcontent_size;

        if (content_size && content_size >= compressed_content_size)
            compression_ratio = ((content_size - compressed_content_size) * 100 * 1000) / content_size;

        if (compression_ratio != -1)
            rrddim_set_by_pointer(st_compression, rd_savings, compression_ratio);

        rrdset_done(st_compression);
    }
}
src/daemon/telemetry/telemetry-http-api.h (new file, +21 lines)
@@ -0,0 +1,21 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_TELEMETRY_HTTP_API_H
#define NETDATA_TELEMETRY_HTTP_API_H

#include "daemon/common.h"

uint64_t telemetry_web_client_connected(void);
void telemetry_web_client_disconnected(void);

void telemetry_web_request_completed(uint64_t dt,
                                     uint64_t bytes_received,
                                     uint64_t bytes_sent,
                                     uint64_t content_size,
                                     uint64_t compressed_content_size);

#if defined(TELEMETRY_INTERNALS)
void telemetry_web_do(bool extended);
#endif

#endif //NETDATA_TELEMETRY_HTTP_API_H
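The counters behind the HTTP API charts are fed from the web server path through the three functions declared above. A minimal sketch of a caller, assuming these declarations are in scope; handle_one_client() and the byte counts are illustrative only:

// Hypothetical request lifecycle: the real web server wires these calls into its own client handling.
static void handle_one_client(void) {
    uint64_t client_id = telemetry_web_client_connected();   // also hands out the unique client id
    (void)client_id;

    usec_t started_ut = now_monotonic_usec();

    // assumed sizes, for illustration only
    uint64_t bytes_received = 512, bytes_sent = 4096;
    uint64_t content_size = 16384, compressed_size = 4096;

    // ... serve the request ...

    telemetry_web_request_completed(now_monotonic_usec() - started_ut,
                                    bytes_received, bytes_sent,
                                    content_size, compressed_size);
    telemetry_web_client_disconnected();
}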
src/daemon/telemetry/telemetry-ingestion.c (new file, +58 lines)
@@ -0,0 +1,58 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#define TELEMETRY_INTERNALS 1
#include "telemetry-ingestion.h"

static struct ingest_statistics {
    uint64_t db_points_stored_per_tier[RRD_STORAGE_TIERS];
} ingest_statistics;

void telemetry_queries_rrdset_collection_completed(size_t *points_read_per_tier_array) {
    for(size_t tier = 0; tier < storage_tiers ;tier++) {
        __atomic_fetch_add(&ingest_statistics.db_points_stored_per_tier[tier], points_read_per_tier_array[tier], __ATOMIC_RELAXED);
        points_read_per_tier_array[tier] = 0;
    }
}

static inline void telemetry_ingestion_copy(struct ingest_statistics *gs) {
    for(size_t tier = 0; tier < storage_tiers ;tier++)
        gs->db_points_stored_per_tier[tier] = __atomic_load_n(&ingest_statistics.db_points_stored_per_tier[tier], __ATOMIC_RELAXED);
}

void telemetry_ingestion_do(bool extended __maybe_unused) {
    static struct ingest_statistics gs;
    telemetry_ingestion_copy(&gs);

    {
        static RRDSET *st_points_stored = NULL;
        static RRDDIM *rds[RRD_STORAGE_TIERS] = {};

        if (unlikely(!st_points_stored)) {
            st_points_stored = rrdset_create_localhost(
                "netdata"
                , "db_samples_collected"
                , NULL
                , "Data Collection Samples"
                , NULL
                , "Netdata Time-Series Collected Samples"
                , "samples/s"
                , "netdata"
                , "stats"
                , 131003
                , localhost->rrd_update_every
                , RRDSET_TYPE_STACKED
            );

            for(size_t tier = 0; tier < storage_tiers ;tier++) {
                char buf[30 + 1];
                snprintfz(buf, sizeof(buf) - 1, "tier%zu", tier);
                rds[tier] = rrddim_add(st_points_stored, buf, NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
            }
        }

        for(size_t tier = 0; tier < storage_tiers ;tier++)
            rrddim_set_by_pointer(st_points_stored, rds[tier], (collected_number)gs.db_points_stored_per_tier[tier]);

        rrdset_done(st_points_stored);
    }
}
src/daemon/telemetry/telemetry-ingestion.h (new file, +14 lines)
@@ -0,0 +1,14 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_TELEMETRY_INGESTION_H
#define NETDATA_TELEMETRY_INGESTION_H

#include "daemon/common.h"

void telemetry_queries_rrdset_collection_completed(size_t *points_read_per_tier_array);

#if defined(TELEMETRY_INTERNALS)
void telemetry_ingestion_do(bool extended);
#endif

#endif //NETDATA_TELEMETRY_INGESTION_H
src/daemon/telemetry/telemetry-ml.c (new file, +89 lines)
@@ -0,0 +1,89 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#define TELEMETRY_INTERNALS 1
#include "telemetry-ml.h"

static struct ml_statistics {
    alignas(64) uint64_t ml_models_consulted;
    alignas(64) uint64_t ml_models_received;
    alignas(64) uint64_t ml_models_ignored;
    alignas(64) uint64_t ml_models_sent;
    alignas(64) uint64_t ml_models_deserialization_failures;
    alignas(64) uint64_t ml_memory_consumption;
    alignas(64) uint64_t ml_memory_new;
    alignas(64) uint64_t ml_memory_delete;
} ml_statistics = {0};

void telemetry_ml_models_received()
{
    __atomic_fetch_add(&ml_statistics.ml_models_received, 1, __ATOMIC_RELAXED);
}

void telemetry_ml_models_ignored()
{
    __atomic_fetch_add(&ml_statistics.ml_models_ignored, 1, __ATOMIC_RELAXED);
}

void telemetry_ml_models_sent()
{
    __atomic_fetch_add(&ml_statistics.ml_models_sent, 1, __ATOMIC_RELAXED);
}

void global_statistics_ml_models_deserialization_failures()
{
    __atomic_fetch_add(&ml_statistics.ml_models_deserialization_failures, 1, __ATOMIC_RELAXED);
}

void telemetry_ml_models_consulted(size_t models_consulted)
{
    __atomic_fetch_add(&ml_statistics.ml_models_consulted, models_consulted, __ATOMIC_RELAXED);
}

void telemetry_ml_memory_allocated(size_t n)
{
    __atomic_fetch_add(&ml_statistics.ml_memory_consumption, n, __ATOMIC_RELAXED);
    __atomic_fetch_add(&ml_statistics.ml_memory_new, 1, __ATOMIC_RELAXED);
}

void telemetry_ml_memory_freed(size_t n)
{
    __atomic_fetch_sub(&ml_statistics.ml_memory_consumption, n, __ATOMIC_RELAXED);
    __atomic_fetch_add(&ml_statistics.ml_memory_delete, 1, __ATOMIC_RELAXED);
}

uint64_t telemetry_ml_get_current_memory_usage(void) {
    return __atomic_load_n(&ml_statistics.ml_memory_consumption, __ATOMIC_RELAXED);
}

static inline void ml_statistics_copy(struct ml_statistics *gs)
{
    gs->ml_models_consulted = __atomic_load_n(&ml_statistics.ml_models_consulted, __ATOMIC_RELAXED);
    gs->ml_models_received = __atomic_load_n(&ml_statistics.ml_models_received, __ATOMIC_RELAXED);
    gs->ml_models_sent = __atomic_load_n(&ml_statistics.ml_models_sent, __ATOMIC_RELAXED);
    gs->ml_models_ignored = __atomic_load_n(&ml_statistics.ml_models_ignored, __ATOMIC_RELAXED);
    gs->ml_models_deserialization_failures =
        __atomic_load_n(&ml_statistics.ml_models_deserialization_failures, __ATOMIC_RELAXED);

    gs->ml_memory_consumption = __atomic_load_n(&ml_statistics.ml_memory_consumption, __ATOMIC_RELAXED);
    gs->ml_memory_new = __atomic_load_n(&ml_statistics.ml_memory_new, __ATOMIC_RELAXED);
    gs->ml_memory_delete = __atomic_load_n(&ml_statistics.ml_memory_delete, __ATOMIC_RELAXED);
}

void telemetry_ml_do(bool extended)
{
    if (!extended)
        return;

    struct ml_statistics gs;
    ml_statistics_copy(&gs);

    ml_update_global_statistics_charts(
        gs.ml_models_consulted,
        gs.ml_models_received,
        gs.ml_models_sent,
        gs.ml_models_ignored,
        gs.ml_models_deserialization_failures,
        gs.ml_memory_consumption,
        gs.ml_memory_new,
        gs.ml_memory_delete);
}
src/daemon/telemetry/telemetry-ml.h (new file, +33 lines)
@@ -0,0 +1,33 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_TELEMETRY_ML_H
#define NETDATA_TELEMETRY_ML_H

#include "daemon/common.h"

#ifdef __cplusplus
extern "C" {
#endif

void telemetry_ml_models_consulted(size_t models_consulted);
void telemetry_ml_models_received();
void telemetry_ml_models_ignored();
void telemetry_ml_models_sent();

void telemetry_ml_memory_allocated(size_t n);
void telemetry_ml_memory_freed(size_t n);

void global_statistics_ml_models_deserialization_failures();

uint64_t telemetry_ml_get_current_memory_usage(void);

#if defined(TELEMETRY_INTERNALS)
void telemetry_ml_do(bool extended);
#endif

#ifdef __cplusplus
}
#endif


#endif //NETDATA_TELEMETRY_ML_H
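The two memory hooks above are meant to bracket ML allocations so ml_memory_consumption stays balanced and telemetry_ml_get_current_memory_usage() reflects live usage. A minimal sketch of a caller, assuming the declarations above are in scope; the tracked wrappers are illustrative only:

#include <stdlib.h>

// Hypothetical allocation site inside the ML code path.
static void *ml_alloc_tracked(size_t size) {
    void *ptr = malloc(size);
    if(ptr)
        telemetry_ml_memory_allocated(size);   // counted once per successful allocation
    return ptr;
}

static void ml_free_tracked(void *ptr, size_t size) {
    if(!ptr) return;
    free(ptr);
    telemetry_ml_memory_freed(size);           // keeps ml_memory_consumption balanced with the allocation above
}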
src/daemon/telemetry/telemetry-queries.c (new file, +261 lines)
@@ -0,0 +1,261 @@
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#define TELEMETRY_INTERNALS 1
|
||||
#include "telemetry-queries.h"
|
||||
|
||||
static struct query_statistics {
|
||||
uint64_t api_data_queries_made;
|
||||
uint64_t api_data_db_points_read;
|
||||
uint64_t api_data_result_points_generated;
|
||||
|
||||
uint64_t api_weights_queries_made;
|
||||
uint64_t api_weights_db_points_read;
|
||||
uint64_t api_weights_result_points_generated;
|
||||
|
||||
uint64_t api_badges_queries_made;
|
||||
uint64_t api_badges_db_points_read;
|
||||
uint64_t api_badges_result_points_generated;
|
||||
|
||||
uint64_t health_queries_made;
|
||||
uint64_t health_db_points_read;
|
||||
uint64_t health_result_points_generated;
|
||||
|
||||
uint64_t ml_queries_made;
|
||||
uint64_t ml_db_points_read;
|
||||
uint64_t ml_result_points_generated;
|
||||
|
||||
uint64_t backfill_queries_made;
|
||||
uint64_t backfill_db_points_read;
|
||||
|
||||
uint64_t exporters_queries_made;
|
||||
uint64_t exporters_db_points_read;
|
||||
} query_statistics;
|
||||
|
||||
void telemetry_queries_ml_query_completed(size_t points_read) {
|
||||
__atomic_fetch_add(&query_statistics.ml_queries_made, 1, __ATOMIC_RELAXED);
|
||||
__atomic_fetch_add(&query_statistics.ml_db_points_read, points_read, __ATOMIC_RELAXED);
|
||||
}
|
||||
|
||||
void telemetry_queries_exporters_query_completed(size_t points_read) {
|
||||
__atomic_fetch_add(&query_statistics.exporters_queries_made, 1, __ATOMIC_RELAXED);
|
||||
__atomic_fetch_add(&query_statistics.exporters_db_points_read, points_read, __ATOMIC_RELAXED);
|
||||
}
|
||||
|
||||
void telemetry_queries_backfill_query_completed(size_t points_read) {
|
||||
__atomic_fetch_add(&query_statistics.backfill_queries_made, 1, __ATOMIC_RELAXED);
|
||||
__atomic_fetch_add(&query_statistics.backfill_db_points_read, points_read, __ATOMIC_RELAXED);
|
||||
}
|
||||
|
||||
void telemetry_queries_rrdr_query_completed(size_t queries, uint64_t db_points_read, uint64_t result_points_generated, QUERY_SOURCE query_source) {
|
||||
switch(query_source) {
|
||||
case QUERY_SOURCE_API_DATA:
|
||||
__atomic_fetch_add(&query_statistics.api_data_queries_made, queries, __ATOMIC_RELAXED);
|
||||
__atomic_fetch_add(&query_statistics.api_data_db_points_read, db_points_read, __ATOMIC_RELAXED);
|
||||
__atomic_fetch_add(&query_statistics.api_data_result_points_generated, result_points_generated, __ATOMIC_RELAXED);
|
||||
break;
|
||||
|
||||
case QUERY_SOURCE_ML:
|
||||
__atomic_fetch_add(&query_statistics.ml_queries_made, queries, __ATOMIC_RELAXED);
|
||||
__atomic_fetch_add(&query_statistics.ml_db_points_read, db_points_read, __ATOMIC_RELAXED);
|
||||
__atomic_fetch_add(&query_statistics.ml_result_points_generated, result_points_generated, __ATOMIC_RELAXED);
|
||||
break;
|
||||
|
||||
case QUERY_SOURCE_API_WEIGHTS:
|
||||
__atomic_fetch_add(&query_statistics.api_weights_queries_made, queries, __ATOMIC_RELAXED);
|
||||
__atomic_fetch_add(&query_statistics.api_weights_db_points_read, db_points_read, __ATOMIC_RELAXED);
|
||||
__atomic_fetch_add(&query_statistics.api_weights_result_points_generated, result_points_generated, __ATOMIC_RELAXED);
|
||||
break;
|
||||
|
||||
case QUERY_SOURCE_API_BADGE:
|
||||
__atomic_fetch_add(&query_statistics.api_badges_queries_made, queries, __ATOMIC_RELAXED);
|
||||
__atomic_fetch_add(&query_statistics.api_badges_db_points_read, db_points_read, __ATOMIC_RELAXED);
|
||||
__atomic_fetch_add(&query_statistics.api_badges_result_points_generated, result_points_generated, __ATOMIC_RELAXED);
|
||||
break;
|
||||
|
||||
case QUERY_SOURCE_HEALTH:
|
||||
__atomic_fetch_add(&query_statistics.health_queries_made, queries, __ATOMIC_RELAXED);
|
||||
__atomic_fetch_add(&query_statistics.health_db_points_read, db_points_read, __ATOMIC_RELAXED);
|
||||
__atomic_fetch_add(&query_statistics.health_result_points_generated, result_points_generated, __ATOMIC_RELAXED);
|
||||
break;
|
||||
|
||||
default:
|
||||
case QUERY_SOURCE_UNITTEST:
|
||||
case QUERY_SOURCE_UNKNOWN:
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
static inline void telemetry_queries_copy(struct query_statistics *gs) {
|
||||
gs->api_data_queries_made = __atomic_load_n(&query_statistics.api_data_queries_made, __ATOMIC_RELAXED);
|
||||
gs->api_data_db_points_read = __atomic_load_n(&query_statistics.api_data_db_points_read, __ATOMIC_RELAXED);
|
||||
gs->api_data_result_points_generated = __atomic_load_n(&query_statistics.api_data_result_points_generated, __ATOMIC_RELAXED);
|
||||
|
||||
gs->api_weights_queries_made = __atomic_load_n(&query_statistics.api_weights_queries_made, __ATOMIC_RELAXED);
|
||||
gs->api_weights_db_points_read = __atomic_load_n(&query_statistics.api_weights_db_points_read, __ATOMIC_RELAXED);
|
||||
gs->api_weights_result_points_generated = __atomic_load_n(&query_statistics.api_weights_result_points_generated, __ATOMIC_RELAXED);
|
||||
|
||||
gs->api_badges_queries_made = __atomic_load_n(&query_statistics.api_badges_queries_made, __ATOMIC_RELAXED);
|
||||
gs->api_badges_db_points_read = __atomic_load_n(&query_statistics.api_badges_db_points_read, __ATOMIC_RELAXED);
|
||||
gs->api_badges_result_points_generated = __atomic_load_n(&query_statistics.api_badges_result_points_generated, __ATOMIC_RELAXED);
|
||||
|
||||
gs->health_queries_made = __atomic_load_n(&query_statistics.health_queries_made, __ATOMIC_RELAXED);
|
||||
gs->health_db_points_read = __atomic_load_n(&query_statistics.health_db_points_read, __ATOMIC_RELAXED);
|
||||
gs->health_result_points_generated = __atomic_load_n(&query_statistics.health_result_points_generated, __ATOMIC_RELAXED);
|
||||
|
||||
gs->ml_queries_made = __atomic_load_n(&query_statistics.ml_queries_made, __ATOMIC_RELAXED);
|
||||
gs->ml_db_points_read = __atomic_load_n(&query_statistics.ml_db_points_read, __ATOMIC_RELAXED);
|
||||
gs->ml_result_points_generated = __atomic_load_n(&query_statistics.ml_result_points_generated, __ATOMIC_RELAXED);
|
||||
|
||||
gs->exporters_queries_made = __atomic_load_n(&query_statistics.exporters_queries_made, __ATOMIC_RELAXED);
|
||||
gs->exporters_db_points_read = __atomic_load_n(&query_statistics.exporters_db_points_read, __ATOMIC_RELAXED);
|
||||
gs->backfill_queries_made = __atomic_load_n(&query_statistics.backfill_queries_made, __ATOMIC_RELAXED);
|
||||
gs->backfill_db_points_read = __atomic_load_n(&query_statistics.backfill_db_points_read, __ATOMIC_RELAXED);
|
||||
}
|
||||
|
||||
void telemetry_queries_do(bool extended __maybe_unused) {
|
||||
static struct query_statistics gs;
|
||||
telemetry_queries_copy(&gs);
|
||||
|
||||
struct replication_query_statistics replication = replication_get_query_statistics();
|
||||
|
||||
{
|
||||
static RRDSET *st_queries = NULL;
|
||||
static RRDDIM *rd_api_data_queries = NULL;
|
||||
static RRDDIM *rd_api_weights_queries = NULL;
|
||||
static RRDDIM *rd_api_badges_queries = NULL;
|
||||
static RRDDIM *rd_health_queries = NULL;
|
||||
static RRDDIM *rd_ml_queries = NULL;
|
||||
static RRDDIM *rd_exporters_queries = NULL;
|
||||
static RRDDIM *rd_backfill_queries = NULL;
|
||||
static RRDDIM *rd_replication_queries = NULL;
|
||||
|
||||
if (unlikely(!st_queries)) {
|
||||
st_queries = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, "queries"
|
||||
, NULL
|
||||
, "Time-Series Queries"
|
||||
, NULL
|
||||
, "Netdata Time-Series DB Queries"
|
||||
, "queries/s"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, 131000
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_STACKED
|
||||
);
|
||||
|
||||
rd_api_data_queries = rrddim_add(st_queries, "/api/vX/data", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_api_weights_queries = rrddim_add(st_queries, "/api/vX/weights", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_api_badges_queries = rrddim_add(st_queries, "/api/vX/badge", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_health_queries = rrddim_add(st_queries, "health", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_ml_queries = rrddim_add(st_queries, "ml", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_exporters_queries = rrddim_add(st_queries, "exporters", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_backfill_queries = rrddim_add(st_queries, "backfill", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_replication_queries = rrddim_add(st_queries, "replication", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
}
|
||||
|
||||
rrddim_set_by_pointer(st_queries, rd_api_data_queries, (collected_number)gs.api_data_queries_made);
|
||||
rrddim_set_by_pointer(st_queries, rd_api_weights_queries, (collected_number)gs.api_weights_queries_made);
|
||||
rrddim_set_by_pointer(st_queries, rd_api_badges_queries, (collected_number)gs.api_badges_queries_made);
|
||||
rrddim_set_by_pointer(st_queries, rd_health_queries, (collected_number)gs.health_queries_made);
|
||||
rrddim_set_by_pointer(st_queries, rd_ml_queries, (collected_number)gs.ml_queries_made);
|
||||
rrddim_set_by_pointer(st_queries, rd_exporters_queries, (collected_number)gs.exporters_queries_made);
|
||||
rrddim_set_by_pointer(st_queries, rd_backfill_queries, (collected_number)gs.backfill_queries_made);
|
||||
rrddim_set_by_pointer(st_queries, rd_replication_queries, (collected_number)replication.queries_finished);
|
||||
|
||||
rrdset_done(st_queries);
|
||||
}
|
||||
|
||||
{
|
||||
static RRDSET *st_points_read = NULL;
|
||||
static RRDDIM *rd_api_data_points_read = NULL;
|
||||
static RRDDIM *rd_api_weights_points_read = NULL;
|
||||
static RRDDIM *rd_api_badges_points_read = NULL;
|
||||
static RRDDIM *rd_health_points_read = NULL;
|
||||
static RRDDIM *rd_ml_points_read = NULL;
|
||||
static RRDDIM *rd_exporters_points_read = NULL;
|
||||
static RRDDIM *rd_backfill_points_read = NULL;
|
||||
static RRDDIM *rd_replication_points_read = NULL;
|
||||
|
||||
if (unlikely(!st_points_read)) {
|
||||
st_points_read = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, "db_points_read"
|
||||
, NULL
|
||||
, "Time-Series Queries"
|
||||
, NULL
|
||||
, "Netdata Time-Series DB Samples Read"
|
||||
, "points/s"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, 131001
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_STACKED
|
||||
);
|
||||
|
||||
rd_api_data_points_read = rrddim_add(st_points_read, "/api/vX/data", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_api_weights_points_read = rrddim_add(st_points_read, "/api/vX/weights", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_api_badges_points_read = rrddim_add(st_points_read, "/api/vX/badge", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_health_points_read = rrddim_add(st_points_read, "health", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_ml_points_read = rrddim_add(st_points_read, "ml", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_exporters_points_read = rrddim_add(st_points_read, "exporters", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_backfill_points_read = rrddim_add(st_points_read, "backfill", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_replication_points_read = rrddim_add(st_points_read, "replication", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
}
|
||||
|
||||
rrddim_set_by_pointer(st_points_read, rd_api_data_points_read, (collected_number)gs.api_data_db_points_read);
|
||||
rrddim_set_by_pointer(st_points_read, rd_api_weights_points_read, (collected_number)gs.api_weights_db_points_read);
|
||||
rrddim_set_by_pointer(st_points_read, rd_api_badges_points_read, (collected_number)gs.api_badges_db_points_read);
|
||||
rrddim_set_by_pointer(st_points_read, rd_health_points_read, (collected_number)gs.health_db_points_read);
|
||||
rrddim_set_by_pointer(st_points_read, rd_ml_points_read, (collected_number)gs.ml_db_points_read);
|
||||
rrddim_set_by_pointer(st_points_read, rd_exporters_points_read, (collected_number)gs.exporters_db_points_read);
|
||||
rrddim_set_by_pointer(st_points_read, rd_backfill_points_read, (collected_number)gs.backfill_db_points_read);
|
||||
rrddim_set_by_pointer(st_points_read, rd_replication_points_read, (collected_number)replication.points_read);
|
||||
|
||||
rrdset_done(st_points_read);
|
||||
}
|
||||
|
||||
if(gs.api_data_result_points_generated || replication.points_generated) {
|
||||
static RRDSET *st_points_generated = NULL;
|
||||
static RRDDIM *rd_api_data_points_generated = NULL;
|
||||
static RRDDIM *rd_api_weights_points_generated = NULL;
|
||||
static RRDDIM *rd_api_badges_points_generated = NULL;
|
||||
static RRDDIM *rd_health_points_generated = NULL;
|
||||
static RRDDIM *rd_ml_points_generated = NULL;
|
||||
static RRDDIM *rd_replication_points_generated = NULL;
|
||||
|
||||
if (unlikely(!st_points_generated)) {
|
||||
st_points_generated = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, "db_points_results"
|
||||
, NULL
|
||||
, "Time-Series Queries"
|
||||
, NULL
|
||||
, "Netdata Time-Series Samples Generated"
|
||||
, "points/s"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, 131002
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_STACKED
|
||||
);
|
||||
|
||||
rd_api_data_points_generated = rrddim_add(st_points_generated, "/api/vX/data", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_api_weights_points_generated = rrddim_add(st_points_generated, "/api/vX/weights", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_api_badges_points_generated = rrddim_add(st_points_generated, "/api/vX/badge", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_health_points_generated = rrddim_add(st_points_generated, "health", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_ml_points_generated = rrddim_add(st_points_generated, "ml", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_replication_points_generated = rrddim_add(st_points_generated, "replication", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
}
|
||||
|
||||
rrddim_set_by_pointer(st_points_generated, rd_api_data_points_generated, (collected_number)gs.api_data_result_points_generated);
|
||||
rrddim_set_by_pointer(st_points_generated, rd_api_weights_points_generated, (collected_number)gs.api_weights_result_points_generated);
|
||||
rrddim_set_by_pointer(st_points_generated, rd_api_badges_points_generated, (collected_number)gs.api_badges_result_points_generated);
|
||||
rrddim_set_by_pointer(st_points_generated, rd_health_points_generated, (collected_number)gs.health_result_points_generated);
|
||||
rrddim_set_by_pointer(st_points_generated, rd_ml_points_generated, (collected_number)gs.ml_result_points_generated);
|
||||
rrddim_set_by_pointer(st_points_generated, rd_replication_points_generated, (collected_number)replication.points_generated);
|
||||
|
||||
rrdset_done(st_points_generated);
|
||||
}
|
||||
}
|
src/daemon/telemetry/telemetry-queries.h (new file, +17 lines)
@@ -0,0 +1,17 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_TELEMETRY_QUERIES_H
#define NETDATA_TELEMETRY_QUERIES_H

#include "daemon/common.h"

void telemetry_queries_ml_query_completed(size_t points_read);
void telemetry_queries_exporters_query_completed(size_t points_read);
void telemetry_queries_backfill_query_completed(size_t points_read);
void telemetry_queries_rrdr_query_completed(size_t queries, uint64_t db_points_read, uint64_t result_points_generated, QUERY_SOURCE query_source);

#if defined(TELEMETRY_INTERNALS)
void telemetry_queries_do(bool extended);
#endif

#endif //NETDATA_TELEMETRY_QUERIES_H
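Query engines report their work through these hooks once per completed query, tagged with the QUERY_SOURCE that ran it. A minimal sketch of a call site, assuming the declarations above are in scope; report_data_query() and its counts are illustrative only:

// Hypothetical call site after an /api/vX/data query finishes.
static void report_data_query(uint64_t db_points_read, uint64_t points_generated) {
    telemetry_queries_rrdr_query_completed(1 /* one query */, db_points_read, points_generated, QUERY_SOURCE_API_DATA);
}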
src/daemon/telemetry/telemetry-sqlite3.c (new file, +314 lines)
@@ -0,0 +1,314 @@
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#define TELEMETRY_INTERNALS 1
|
||||
#include "telemetry-sqlite3.h"
|
||||
|
||||
static struct sqlite3_statistics {
|
||||
bool enabled;
|
||||
|
||||
alignas(64) uint64_t sqlite3_queries_made;
|
||||
alignas(64) uint64_t sqlite3_queries_ok;
|
||||
alignas(64) uint64_t sqlite3_queries_failed;
|
||||
alignas(64) uint64_t sqlite3_queries_failed_busy;
|
||||
alignas(64) uint64_t sqlite3_queries_failed_locked;
|
||||
alignas(64) uint64_t sqlite3_rows;
|
||||
alignas(64) uint64_t sqlite3_metadata_cache_hit;
|
||||
alignas(64) uint64_t sqlite3_context_cache_hit;
|
||||
alignas(64) uint64_t sqlite3_metadata_cache_miss;
|
||||
alignas(64) uint64_t sqlite3_context_cache_miss;
|
||||
alignas(64) uint64_t sqlite3_metadata_cache_spill;
|
||||
alignas(64) uint64_t sqlite3_context_cache_spill;
|
||||
alignas(64) uint64_t sqlite3_metadata_cache_write;
|
||||
alignas(64) uint64_t sqlite3_context_cache_write;
|
||||
} sqlite3_statistics = { };
|
||||
|
||||
void telemetry_sqlite3_query_completed(bool success, bool busy, bool locked) {
|
||||
if(!sqlite3_statistics.enabled) return;
|
||||
|
||||
__atomic_fetch_add(&sqlite3_statistics.sqlite3_queries_made, 1, __ATOMIC_RELAXED);
|
||||
|
||||
if(success) {
|
||||
__atomic_fetch_add(&sqlite3_statistics.sqlite3_queries_ok, 1, __ATOMIC_RELAXED);
|
||||
}
|
||||
else {
|
||||
__atomic_fetch_add(&sqlite3_statistics.sqlite3_queries_failed, 1, __ATOMIC_RELAXED);
|
||||
|
||||
if(busy)
|
||||
__atomic_fetch_add(&sqlite3_statistics.sqlite3_queries_failed_busy, 1, __ATOMIC_RELAXED);
|
||||
|
||||
if(locked)
|
||||
__atomic_fetch_add(&sqlite3_statistics.sqlite3_queries_failed_locked, 1, __ATOMIC_RELAXED);
|
||||
}
|
||||
}
|
||||
|
||||
void telemetry_sqlite3_row_completed(void) {
|
||||
if(!sqlite3_statistics.enabled) return;
|
||||
|
||||
__atomic_fetch_add(&sqlite3_statistics.sqlite3_rows, 1, __ATOMIC_RELAXED);
|
||||
}
|
||||
|
||||
static inline void sqlite3_statistics_copy(struct sqlite3_statistics *gs) {
|
||||
static usec_t last_run = 0;
|
||||
|
||||
gs->sqlite3_queries_made = __atomic_load_n(&sqlite3_statistics.sqlite3_queries_made, __ATOMIC_RELAXED);
|
||||
gs->sqlite3_queries_ok = __atomic_load_n(&sqlite3_statistics.sqlite3_queries_ok, __ATOMIC_RELAXED);
|
||||
gs->sqlite3_queries_failed = __atomic_load_n(&sqlite3_statistics.sqlite3_queries_failed, __ATOMIC_RELAXED);
|
||||
gs->sqlite3_queries_failed_busy = __atomic_load_n(&sqlite3_statistics.sqlite3_queries_failed_busy, __ATOMIC_RELAXED);
|
||||
gs->sqlite3_queries_failed_locked = __atomic_load_n(&sqlite3_statistics.sqlite3_queries_failed_locked, __ATOMIC_RELAXED);
|
||||
gs->sqlite3_rows = __atomic_load_n(&sqlite3_statistics.sqlite3_rows, __ATOMIC_RELAXED);
|
||||
|
||||
usec_t timeout = default_rrd_update_every * USEC_PER_SEC + default_rrd_update_every * USEC_PER_SEC / 3;
|
||||
usec_t now = now_monotonic_usec();
|
||||
if(!last_run)
|
||||
last_run = now;
|
||||
usec_t delta = now - last_run;
|
||||
bool query_sqlite3 = delta < timeout;
|
||||
|
||||
if(query_sqlite3 && now_monotonic_usec() - last_run < timeout)
|
||||
gs->sqlite3_metadata_cache_hit = (uint64_t) sql_metadata_cache_stats(SQLITE_DBSTATUS_CACHE_HIT);
|
||||
else {
|
||||
gs->sqlite3_metadata_cache_hit = UINT64_MAX;
|
||||
query_sqlite3 = false;
|
||||
}
|
||||
|
||||
if(query_sqlite3 && now_monotonic_usec() - last_run < timeout)
|
||||
gs->sqlite3_context_cache_hit = (uint64_t) sql_context_cache_stats(SQLITE_DBSTATUS_CACHE_HIT);
|
||||
else {
|
||||
gs->sqlite3_context_cache_hit = UINT64_MAX;
|
||||
query_sqlite3 = false;
|
||||
}
|
||||
|
||||
if(query_sqlite3 && now_monotonic_usec() - last_run < timeout)
|
||||
gs->sqlite3_metadata_cache_miss = (uint64_t) sql_metadata_cache_stats(SQLITE_DBSTATUS_CACHE_MISS);
|
||||
else {
|
||||
gs->sqlite3_metadata_cache_miss = UINT64_MAX;
|
||||
query_sqlite3 = false;
|
||||
}
|
||||
|
||||
if(query_sqlite3 && now_monotonic_usec() - last_run < timeout)
|
||||
gs->sqlite3_context_cache_miss = (uint64_t) sql_context_cache_stats(SQLITE_DBSTATUS_CACHE_MISS);
|
||||
else {
|
||||
gs->sqlite3_context_cache_miss = UINT64_MAX;
|
||||
query_sqlite3 = false;
|
||||
}
|
||||
|
||||
if(query_sqlite3 && now_monotonic_usec() - last_run < timeout)
|
||||
gs->sqlite3_metadata_cache_spill = (uint64_t) sql_metadata_cache_stats(SQLITE_DBSTATUS_CACHE_SPILL);
|
||||
else {
|
||||
gs->sqlite3_metadata_cache_spill = UINT64_MAX;
|
||||
query_sqlite3 = false;
|
||||
}
|
||||
|
||||
if(query_sqlite3 && now_monotonic_usec() - last_run < timeout)
|
||||
gs->sqlite3_context_cache_spill = (uint64_t) sql_context_cache_stats(SQLITE_DBSTATUS_CACHE_SPILL);
|
||||
else {
|
||||
gs->sqlite3_context_cache_spill = UINT64_MAX;
|
||||
query_sqlite3 = false;
|
||||
}
|
||||
|
||||
if(query_sqlite3 && now_monotonic_usec() - last_run < timeout)
|
||||
gs->sqlite3_metadata_cache_write = (uint64_t) sql_metadata_cache_stats(SQLITE_DBSTATUS_CACHE_WRITE);
|
||||
else {
|
||||
gs->sqlite3_metadata_cache_write = UINT64_MAX;
|
||||
query_sqlite3 = false;
|
||||
}
|
||||
|
||||
if(query_sqlite3 && now_monotonic_usec() - last_run < timeout)
|
||||
gs->sqlite3_context_cache_write = (uint64_t) sql_context_cache_stats(SQLITE_DBSTATUS_CACHE_WRITE);
|
||||
else {
|
||||
gs->sqlite3_context_cache_write = UINT64_MAX;
|
||||
query_sqlite3 = false;
|
||||
}
|
||||
|
||||
last_run = now_monotonic_usec();
|
||||
}
|
||||
|
||||
void telemetry_sqlite3_do(bool extended) {
|
||||
if(!extended) return;
|
||||
sqlite3_statistics.enabled = true;
|
||||
|
||||
struct sqlite3_statistics gs;
|
||||
sqlite3_statistics_copy(&gs);
|
||||
|
||||
if(gs.sqlite3_queries_made) {
|
||||
static RRDSET *st_sqlite3_queries = NULL;
|
||||
static RRDDIM *rd_queries = NULL;
|
||||
|
||||
if (unlikely(!st_sqlite3_queries)) {
|
||||
st_sqlite3_queries = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, "sqlite3_queries"
|
||||
, NULL
|
||||
, "sqlite3"
|
||||
, NULL
|
||||
, "Netdata SQLite3 Queries"
|
||||
, "queries/s"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, 131100
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_LINE
|
||||
);
|
||||
|
||||
rd_queries = rrddim_add(st_sqlite3_queries, "queries", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
}
|
||||
|
||||
rrddim_set_by_pointer(st_sqlite3_queries, rd_queries, (collected_number)gs.sqlite3_queries_made);
|
||||
|
||||
rrdset_done(st_sqlite3_queries);
|
||||
}
|
||||
|
||||
// ----------------------------------------------------------------
|
||||
|
||||
if(gs.sqlite3_queries_ok || gs.sqlite3_queries_failed) {
|
||||
static RRDSET *st_sqlite3_queries_by_status = NULL;
|
||||
static RRDDIM *rd_ok = NULL, *rd_failed = NULL, *rd_busy = NULL, *rd_locked = NULL;
|
||||
|
||||
if (unlikely(!st_sqlite3_queries_by_status)) {
|
||||
st_sqlite3_queries_by_status = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, "sqlite3_queries_by_status"
|
||||
, NULL
|
||||
, "sqlite3"
|
||||
, NULL
|
||||
, "Netdata SQLite3 Queries by status"
|
||||
, "queries/s"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, 131101
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_LINE
|
||||
);
|
||||
|
||||
rd_ok = rrddim_add(st_sqlite3_queries_by_status, "ok", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_failed = rrddim_add(st_sqlite3_queries_by_status, "failed", NULL, -1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_busy = rrddim_add(st_sqlite3_queries_by_status, "busy", NULL, -1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_locked = rrddim_add(st_sqlite3_queries_by_status, "locked", NULL, -1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
}
|
||||
|
||||
rrddim_set_by_pointer(st_sqlite3_queries_by_status, rd_ok, (collected_number)gs.sqlite3_queries_ok);
|
||||
rrddim_set_by_pointer(st_sqlite3_queries_by_status, rd_failed, (collected_number)gs.sqlite3_queries_failed);
|
||||
rrddim_set_by_pointer(st_sqlite3_queries_by_status, rd_busy, (collected_number)gs.sqlite3_queries_failed_busy);
|
||||
rrddim_set_by_pointer(st_sqlite3_queries_by_status, rd_locked, (collected_number)gs.sqlite3_queries_failed_locked);
|
||||
|
||||
rrdset_done(st_sqlite3_queries_by_status);
|
||||
}
|
||||
|
||||
// ----------------------------------------------------------------
|
||||
|
||||
if(gs.sqlite3_rows) {
|
||||
static RRDSET *st_sqlite3_rows = NULL;
|
||||
static RRDDIM *rd_rows = NULL;
|
||||
|
||||
if (unlikely(!st_sqlite3_rows)) {
|
||||
st_sqlite3_rows = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, "sqlite3_rows"
|
||||
, NULL
|
||||
, "sqlite3"
|
||||
, NULL
|
||||
, "Netdata SQLite3 Rows"
|
||||
, "rows/s"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, 131102
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_LINE
|
||||
);
|
||||
|
||||
rd_rows = rrddim_add(st_sqlite3_rows, "rows", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
}
|
||||
|
||||
rrddim_set_by_pointer(st_sqlite3_rows, rd_rows, (collected_number)gs.sqlite3_rows);
|
||||
|
||||
rrdset_done(st_sqlite3_rows);
|
||||
}
|
||||
|
||||
if(gs.sqlite3_metadata_cache_hit) {
|
||||
static RRDSET *st_sqlite3_cache = NULL;
|
||||
static RRDDIM *rd_cache_hit = NULL;
|
||||
static RRDDIM *rd_cache_miss= NULL;
|
||||
static RRDDIM *rd_cache_spill= NULL;
|
||||
static RRDDIM *rd_cache_write= NULL;
|
||||
|
||||
if (unlikely(!st_sqlite3_cache)) {
|
||||
st_sqlite3_cache = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, "sqlite3_metatada_cache"
|
||||
, NULL
|
||||
, "sqlite3"
|
||||
, NULL
|
||||
, "Netdata SQLite3 metadata cache"
|
||||
, "ops/s"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, 131103
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_LINE
|
||||
);
|
||||
|
||||
rd_cache_hit = rrddim_add(st_sqlite3_cache, "cache_hit", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_cache_miss = rrddim_add(st_sqlite3_cache, "cache_miss", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_cache_spill = rrddim_add(st_sqlite3_cache, "cache_spill", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_cache_write = rrddim_add(st_sqlite3_cache, "cache_write", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
}
|
||||
|
||||
if(gs.sqlite3_metadata_cache_hit != UINT64_MAX)
|
||||
rrddim_set_by_pointer(st_sqlite3_cache, rd_cache_hit, (collected_number)gs.sqlite3_metadata_cache_hit);
|
||||
|
||||
if(gs.sqlite3_metadata_cache_miss != UINT64_MAX)
|
||||
rrddim_set_by_pointer(st_sqlite3_cache, rd_cache_miss, (collected_number)gs.sqlite3_metadata_cache_miss);
|
||||
|
||||
if(gs.sqlite3_metadata_cache_spill != UINT64_MAX)
|
||||
rrddim_set_by_pointer(st_sqlite3_cache, rd_cache_spill, (collected_number)gs.sqlite3_metadata_cache_spill);
|
||||
|
||||
if(gs.sqlite3_metadata_cache_write != UINT64_MAX)
|
||||
rrddim_set_by_pointer(st_sqlite3_cache, rd_cache_write, (collected_number)gs.sqlite3_metadata_cache_write);
|
||||
|
||||
rrdset_done(st_sqlite3_cache);
|
||||
}
|
||||
|
||||
if(gs.sqlite3_context_cache_hit) {
|
||||
static RRDSET *st_sqlite3_cache = NULL;
|
||||
static RRDDIM *rd_cache_hit = NULL;
|
||||
static RRDDIM *rd_cache_miss= NULL;
|
||||
static RRDDIM *rd_cache_spill= NULL;
|
||||
static RRDDIM *rd_cache_write= NULL;
|
||||
|
||||
if (unlikely(!st_sqlite3_cache)) {
|
||||
st_sqlite3_cache = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, "sqlite3_context_cache"
|
||||
, NULL
|
||||
, "sqlite3"
|
||||
, NULL
|
||||
, "Netdata SQLite3 context cache"
|
||||
, "ops/s"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, 131104
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_LINE
|
||||
);
|
||||
|
||||
rd_cache_hit = rrddim_add(st_sqlite3_cache, "cache_hit", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_cache_miss = rrddim_add(st_sqlite3_cache, "cache_miss", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_cache_spill = rrddim_add(st_sqlite3_cache, "cache_spill", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
rd_cache_write = rrddim_add(st_sqlite3_cache, "cache_write", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
}
|
||||
|
||||
if(gs.sqlite3_context_cache_hit != UINT64_MAX)
|
||||
rrddim_set_by_pointer(st_sqlite3_cache, rd_cache_hit, (collected_number)gs.sqlite3_context_cache_hit);
|
||||
|
||||
if(gs.sqlite3_context_cache_miss != UINT64_MAX)
|
||||
rrddim_set_by_pointer(st_sqlite3_cache, rd_cache_miss, (collected_number)gs.sqlite3_context_cache_miss);
|
||||
|
||||
if(gs.sqlite3_context_cache_spill != UINT64_MAX)
|
||||
rrddim_set_by_pointer(st_sqlite3_cache, rd_cache_spill, (collected_number)gs.sqlite3_context_cache_spill);
|
||||
|
||||
if(gs.sqlite3_context_cache_write != UINT64_MAX)
|
||||
rrddim_set_by_pointer(st_sqlite3_cache, rd_cache_write, (collected_number)gs.sqlite3_context_cache_write);
|
||||
|
||||
rrdset_done(st_sqlite3_cache);
|
||||
}
|
||||
}
|
src/daemon/telemetry/telemetry-sqlite3.h (new file, 15 lines)
@@ -0,0 +1,15 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_TELEMETRY_SQLITE3_H
#define NETDATA_TELEMETRY_SQLITE3_H

#include "daemon/common.h"

void telemetry_sqlite3_query_completed(bool success, bool busy, bool locked);
void telemetry_sqlite3_row_completed(void);

#if defined(TELEMETRY_INTERNALS)
void telemetry_sqlite3_do(bool extended);
#endif

#endif //NETDATA_TELEMETRY_SQLITE3_H
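Editor's note: a minimal sketch of how the two counters above are meant to be driven from the
daemon's SQLite wrapper. The wrapper itself is not part of this hunk, so execute_statement()
below is hypothetical; only the telemetry_sqlite3_*() calls and the standard SQLite3 return
codes are real APIs (requires <sqlite3.h> and telemetry-sqlite3.h).

// Hypothetical caller, for illustration only - the real call sites live in the daemon's sqlite code.
static int execute_statement(sqlite3_stmt *stmt) {
    int rc;
    while ((rc = sqlite3_step(stmt)) == SQLITE_ROW)
        telemetry_sqlite3_row_completed();          // count every row produced

    telemetry_sqlite3_query_completed(
        rc == SQLITE_DONE,                          // success
        rc == SQLITE_BUSY,                          // failed because the database was busy
        rc == SQLITE_LOCKED);                       // failed because the database was locked
    return rc;
}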
src/daemon/telemetry/telemetry-string.c (new file, 101 lines)
@@ -0,0 +1,101 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#define TELEMETRY_INTERNALS 1
#include "telemetry-string.h"

void telemetry_string_do(bool extended) {
    if(!extended) return;

    static RRDSET *st_ops = NULL, *st_entries = NULL, *st_mem = NULL;
    static RRDDIM *rd_ops_inserts = NULL, *rd_ops_deletes = NULL;
    static RRDDIM *rd_entries_entries = NULL;
    static RRDDIM *rd_mem = NULL;
#ifdef NETDATA_INTERNAL_CHECKS
    static RRDDIM *rd_entries_refs = NULL, *rd_ops_releases = NULL, *rd_ops_duplications = NULL, *rd_ops_searches = NULL;
#endif

    size_t inserts, deletes, searches, entries, references, memory, duplications, releases;

    string_statistics(&inserts, &deletes, &searches, &entries, &references, &memory, &duplications, &releases);

    if (unlikely(!st_ops)) {
        st_ops = rrdset_create_localhost(
            "netdata"
            , "strings_ops"
            , NULL
            , "strings"
            , NULL
            , "Strings operations"
            , "ops/s"
            , "netdata"
            , "stats"
            , 910000
            , localhost->rrd_update_every
            , RRDSET_TYPE_LINE);

        rd_ops_inserts = rrddim_add(st_ops, "inserts", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
        rd_ops_deletes = rrddim_add(st_ops, "deletes", NULL, -1, 1, RRD_ALGORITHM_INCREMENTAL);
#ifdef NETDATA_INTERNAL_CHECKS
        rd_ops_searches = rrddim_add(st_ops, "searches", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
        rd_ops_duplications = rrddim_add(st_ops, "duplications", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
        rd_ops_releases = rrddim_add(st_ops, "releases", NULL, -1, 1, RRD_ALGORITHM_INCREMENTAL);
#endif
    }

    rrddim_set_by_pointer(st_ops, rd_ops_inserts, (collected_number)inserts);
    rrddim_set_by_pointer(st_ops, rd_ops_deletes, (collected_number)deletes);
#ifdef NETDATA_INTERNAL_CHECKS
    rrddim_set_by_pointer(st_ops, rd_ops_searches, (collected_number)searches);
    rrddim_set_by_pointer(st_ops, rd_ops_duplications, (collected_number)duplications);
    rrddim_set_by_pointer(st_ops, rd_ops_releases, (collected_number)releases);
#endif
    rrdset_done(st_ops);

    if (unlikely(!st_entries)) {
        st_entries = rrdset_create_localhost(
            "netdata"
            , "strings_entries"
            , NULL
            , "strings"
            , NULL
            , "Strings entries"
            , "entries"
            , "netdata"
            , "stats"
            , 910001
            , localhost->rrd_update_every
            , RRDSET_TYPE_AREA);

        rd_entries_entries = rrddim_add(st_entries, "entries", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
#ifdef NETDATA_INTERNAL_CHECKS
        rd_entries_refs = rrddim_add(st_entries, "references", NULL, 1, -1, RRD_ALGORITHM_ABSOLUTE);
#endif
    }

    rrddim_set_by_pointer(st_entries, rd_entries_entries, (collected_number)entries);
#ifdef NETDATA_INTERNAL_CHECKS
    rrddim_set_by_pointer(st_entries, rd_entries_refs, (collected_number)references);
#endif
    rrdset_done(st_entries);

    if (unlikely(!st_mem)) {
        st_mem = rrdset_create_localhost(
            "netdata"
            , "strings_memory"
            , NULL
            , "strings"
            , NULL
            , "Strings memory"
            , "bytes"
            , "netdata"
            , "stats"
            , 910001
            , localhost->rrd_update_every
            , RRDSET_TYPE_AREA);

        rd_mem = rrddim_add(st_mem, "memory", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
    }

    rrddim_set_by_pointer(st_mem, rd_mem, (collected_number)memory);
    rrdset_done(st_mem);
}
src/daemon/telemetry/telemetry-string.h (new file, 12 lines)
@@ -0,0 +1,12 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_TELEMETRY_STRING_H
#define NETDATA_TELEMETRY_STRING_H

#include "daemon/common.h"

#if defined(TELEMETRY_INTERNALS)
void telemetry_string_do(bool extended);
#endif

#endif //NETDATA_TELEMETRY_STRING_H
src/daemon/telemetry/telemetry-trace-allocations.c (new file, 147 lines)
@@ -0,0 +1,147 @@
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#define TELEMETRY_INTERNALS 1
|
||||
#include "telemetry-trace-allocations.h"
|
||||
|
||||
#ifdef NETDATA_TRACE_ALLOCATIONS
|
||||
|
||||
struct memory_trace_data {
|
||||
RRDSET *st_memory;
|
||||
RRDSET *st_allocations;
|
||||
RRDSET *st_avg_alloc;
|
||||
RRDSET *st_ops;
|
||||
};
|
||||
|
||||
static int do_memory_trace_item(void *item, void *data) {
|
||||
struct memory_trace_data *tmp = data;
|
||||
struct malloc_trace *p = item;
|
||||
|
||||
// ------------------------------------------------------------------------
|
||||
|
||||
if(!p->rd_bytes)
|
||||
p->rd_bytes = rrddim_add(tmp->st_memory, p->function, NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
|
||||
collected_number bytes = (collected_number)__atomic_load_n(&p->bytes, __ATOMIC_RELAXED);
|
||||
rrddim_set_by_pointer(tmp->st_memory, p->rd_bytes, bytes);
|
||||
|
||||
// ------------------------------------------------------------------------
|
||||
|
||||
if(!p->rd_allocations)
|
||||
p->rd_allocations = rrddim_add(tmp->st_allocations, p->function, NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
|
||||
collected_number allocs = (collected_number)__atomic_load_n(&p->allocations, __ATOMIC_RELAXED);
|
||||
rrddim_set_by_pointer(tmp->st_allocations, p->rd_allocations, allocs);
|
||||
|
||||
// ------------------------------------------------------------------------
|
||||
|
||||
if(!p->rd_avg_alloc)
|
||||
p->rd_avg_alloc = rrddim_add(tmp->st_avg_alloc, p->function, NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE);
|
||||
|
||||
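// NOTE (editor): the x100 here pairs with the divisor of 100 passed to rrddim_add() above,
// so the chart shows the average allocation size in bytes with two decimal places.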
collected_number avg_alloc = (allocs)?(bytes * 100 / allocs):0;
|
||||
rrddim_set_by_pointer(tmp->st_avg_alloc, p->rd_avg_alloc, avg_alloc);
|
||||
|
||||
// ------------------------------------------------------------------------
|
||||
|
||||
if(!p->rd_ops)
|
||||
p->rd_ops = rrddim_add(tmp->st_ops, p->function, NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL);
|
||||
|
||||
collected_number ops = 0;
|
||||
ops += (collected_number)__atomic_load_n(&p->malloc_calls, __ATOMIC_RELAXED);
|
||||
ops += (collected_number)__atomic_load_n(&p->calloc_calls, __ATOMIC_RELAXED);
|
||||
ops += (collected_number)__atomic_load_n(&p->realloc_calls, __ATOMIC_RELAXED);
|
||||
ops += (collected_number)__atomic_load_n(&p->strdup_calls, __ATOMIC_RELAXED);
|
||||
ops += (collected_number)__atomic_load_n(&p->free_calls, __ATOMIC_RELAXED);
|
||||
rrddim_set_by_pointer(tmp->st_ops, p->rd_ops, ops);
|
||||
|
||||
// ------------------------------------------------------------------------
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
void telemetry_trace_allocations_do(bool extended) {
|
||||
if(!extended) return;
|
||||
|
||||
static struct memory_trace_data tmp = {
|
||||
.st_memory = NULL,
|
||||
.st_allocations = NULL,
|
||||
.st_avg_alloc = NULL,
|
||||
.st_ops = NULL,
|
||||
};
|
||||
|
||||
if(!tmp.st_memory) {
|
||||
tmp.st_memory = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, "memory_size"
|
||||
, NULL
|
||||
, "memory"
|
||||
, "netdata.memory.size"
|
||||
, "Netdata Memory Used by Function"
|
||||
, "bytes"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, 900000
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_STACKED
|
||||
);
|
||||
}
|
||||
|
||||
if(!tmp.st_ops) {
|
||||
tmp.st_ops = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, "memory_operations"
|
||||
, NULL
|
||||
, "memory"
|
||||
, "netdata.memory.operations"
|
||||
, "Netdata Memory Operations by Function"
|
||||
, "ops/s"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, 900001
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_LINE
|
||||
);
|
||||
}
|
||||
|
||||
if(!tmp.st_allocations) {
|
||||
tmp.st_allocations = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, "memory_allocations"
|
||||
, NULL
|
||||
, "memory"
|
||||
, "netdata.memory.allocations"
|
||||
, "Netdata Memory Allocations by Function"
|
||||
, "allocations"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, 900002
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_STACKED
|
||||
);
|
||||
}
|
||||
|
||||
if(!tmp.st_avg_alloc) {
|
||||
tmp.st_avg_alloc = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, "memory_avg_alloc"
|
||||
, NULL
|
||||
, "memory"
|
||||
, "netdata.memory.avg_alloc"
|
||||
, "Netdata Average Allocation Size by Function"
|
||||
, "bytes"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, 900003
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_LINE
|
||||
);
|
||||
}
|
||||
|
||||
malloc_trace_walkthrough(do_memory_trace_item, &tmp);
|
||||
|
||||
rrdset_done(tmp.st_memory);
|
||||
rrdset_done(tmp.st_ops);
|
||||
rrdset_done(tmp.st_allocations);
|
||||
rrdset_done(tmp.st_avg_alloc);
|
||||
}
|
||||
|
||||
#endif
|
src/daemon/telemetry/telemetry-trace-allocations.h (new file, 14 lines)
@@ -0,0 +1,14 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_TELEMETRY_TRACE_ALLOCATIONS_H
#define NETDATA_TELEMETRY_TRACE_ALLOCATIONS_H

#include "daemon/common.h"

#if defined(TELEMETRY_INTERNALS)
#ifdef NETDATA_TRACE_ALLOCATIONS
void telemetry_trace_allocations_do(bool extended);
#endif
#endif

#endif //NETDATA_TELEMETRY_TRACE_ALLOCATIONS_H
src/daemon/telemetry/telemetry-workers.c (new file, 791 lines)
@@ -0,0 +1,791 @@
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#define TELEMETRY_INTERNALS 1
|
||||
#include "telemetry-workers.h"
|
||||
|
||||
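// NOTE (editor): an "impossibly high" initial minimum, so the first sample always replaces it;
// if no sample arrives it is reset to 0.0 before the charts are updated.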
#define WORKERS_MIN_PERCENT_DEFAULT 10000.0
|
||||
|
||||
struct worker_job_type_gs {
|
||||
STRING *name;
|
||||
STRING *units;
|
||||
|
||||
size_t jobs_started;
|
||||
usec_t busy_time;
|
||||
|
||||
RRDDIM *rd_jobs_started;
|
||||
RRDDIM *rd_busy_time;
|
||||
|
||||
WORKER_METRIC_TYPE type;
|
||||
NETDATA_DOUBLE min_value;
|
||||
NETDATA_DOUBLE max_value;
|
||||
NETDATA_DOUBLE sum_value;
|
||||
size_t count_value;
|
||||
|
||||
RRDSET *st;
|
||||
RRDDIM *rd_min;
|
||||
RRDDIM *rd_max;
|
||||
RRDDIM *rd_avg;
|
||||
};
|
||||
|
||||
struct worker_thread {
|
||||
pid_t pid;
|
||||
bool enabled;
|
||||
|
||||
bool cpu_enabled;
|
||||
double cpu;
|
||||
|
||||
kernel_uint_t utime;
|
||||
kernel_uint_t stime;
|
||||
|
||||
kernel_uint_t utime_old;
|
||||
kernel_uint_t stime_old;
|
||||
|
||||
usec_t collected_time;
|
||||
usec_t collected_time_old;
|
||||
|
||||
size_t jobs_started;
|
||||
usec_t busy_time;
|
||||
|
||||
struct worker_thread *next;
|
||||
struct worker_thread *prev;
|
||||
};
|
||||
|
||||
struct worker_utilization {
|
||||
const char *name;
|
||||
const char *family;
|
||||
size_t priority;
|
||||
uint32_t flags;
|
||||
|
||||
char *name_lowercase;
|
||||
|
||||
struct worker_job_type_gs per_job_type[WORKER_UTILIZATION_MAX_JOB_TYPES];
|
||||
|
||||
size_t workers_max_job_id;
|
||||
size_t workers_registered;
|
||||
size_t workers_busy;
|
||||
usec_t workers_total_busy_time;
|
||||
usec_t workers_total_duration;
|
||||
size_t workers_total_jobs_started;
|
||||
double workers_min_busy_time;
|
||||
double workers_max_busy_time;
|
||||
|
||||
size_t workers_cpu_registered;
|
||||
double workers_cpu_min;
|
||||
double workers_cpu_max;
|
||||
double workers_cpu_total;
|
||||
|
||||
struct worker_thread *threads;
|
||||
|
||||
RRDSET *st_workers_time;
|
||||
RRDDIM *rd_workers_time_avg;
|
||||
RRDDIM *rd_workers_time_min;
|
||||
RRDDIM *rd_workers_time_max;
|
||||
|
||||
RRDSET *st_workers_cpu;
|
||||
RRDDIM *rd_workers_cpu_avg;
|
||||
RRDDIM *rd_workers_cpu_min;
|
||||
RRDDIM *rd_workers_cpu_max;
|
||||
|
||||
RRDSET *st_workers_threads;
|
||||
RRDDIM *rd_workers_threads_free;
|
||||
RRDDIM *rd_workers_threads_busy;
|
||||
|
||||
RRDSET *st_workers_jobs_per_job_type;
|
||||
RRDSET *st_workers_busy_per_job_type;
|
||||
|
||||
RRDDIM *rd_total_cpu_utilizaton;
|
||||
};
|
||||
|
||||
static struct worker_utilization all_workers_utilization[] = {
|
||||
{ .name = "STATS", .family = "workers telemetry", .priority = 1000000 },
|
||||
{ .name = "HEALTH", .family = "workers health alarms", .priority = 1000000 },
|
||||
{ .name = "MLTRAIN", .family = "workers ML training", .priority = 1000000 },
|
||||
{ .name = "MLDETECT", .family = "workers ML detection", .priority = 1000000 },
|
||||
{ .name = "STREAM", .family = "workers streaming", .priority = 1000000 },
|
||||
{ .name = "STREAMCNT", .family = "workers streaming connect", .priority = 1000000 },
|
||||
{ .name = "DBENGINE", .family = "workers dbengine instances", .priority = 1000000 },
|
||||
{ .name = "LIBUV", .family = "workers libuv threadpool", .priority = 1000000 },
|
||||
{ .name = "WEB", .family = "workers web server", .priority = 1000000 },
|
||||
{ .name = "ACLKSYNC", .family = "workers aclk sync", .priority = 1000000 },
|
||||
{ .name = "METASYNC", .family = "workers metadata sync", .priority = 1000000 },
|
||||
{ .name = "PLUGINSD", .family = "workers plugins.d", .priority = 1000000 },
|
||||
{ .name = "STATSD", .family = "workers plugin statsd", .priority = 1000000 },
|
||||
{ .name = "STATSDFLUSH", .family = "workers plugin statsd flush", .priority = 1000000 },
|
||||
{ .name = "PROC", .family = "workers plugin proc", .priority = 1000000 },
|
||||
{ .name = "WIN", .family = "workers plugin windows", .priority = 1000000 },
|
||||
{ .name = "NETDEV", .family = "workers plugin proc netdev", .priority = 1000000 },
|
||||
{ .name = "FREEBSD", .family = "workers plugin freebsd", .priority = 1000000 },
|
||||
{ .name = "MACOS", .family = "workers plugin macos", .priority = 1000000 },
|
||||
{ .name = "CGROUPS", .family = "workers plugin cgroups", .priority = 1000000 },
|
||||
{ .name = "CGROUPSDISC", .family = "workers plugin cgroups find", .priority = 1000000 },
|
||||
{ .name = "DISKSPACE", .family = "workers plugin diskspace", .priority = 1000000 },
|
||||
{ .name = "TC", .family = "workers plugin tc", .priority = 1000000 },
|
||||
{ .name = "TIMEX", .family = "workers plugin timex", .priority = 1000000 },
|
||||
{ .name = "IDLEJITTER", .family = "workers plugin idlejitter", .priority = 1000000 },
|
||||
{ .name = "RRDCONTEXT", .family = "workers contexts", .priority = 1000000 },
|
||||
{ .name = "REPLICATION", .family = "workers replication sender", .priority = 1000000 },
|
||||
{ .name = "SERVICE", .family = "workers service", .priority = 1000000 },
|
||||
{ .name = "PROFILER", .family = "workers profile", .priority = 1000000 },
|
||||
{ .name = "PGCEVICT", .family = "workers dbengine eviction", .priority = 1000000 },
|
||||
|
||||
// has to be terminated with a NULL
|
||||
{ .name = NULL, .family = NULL }
|
||||
};
|
||||
|
||||
static void workers_total_cpu_utilization_chart(void) {
|
||||
size_t i, cpu_enabled = 0;
|
||||
for(i = 0; all_workers_utilization[i].name ;i++)
|
||||
if(all_workers_utilization[i].workers_cpu_registered) cpu_enabled++;
|
||||
|
||||
if(!cpu_enabled) return;
|
||||
|
||||
static RRDSET *st = NULL;
|
||||
|
||||
if(!st) {
|
||||
st = rrdset_create_localhost(
|
||||
"netdata",
|
||||
"workers_cpu",
|
||||
NULL,
|
||||
"workers",
|
||||
"netdata.workers.cpu_total",
|
||||
"Netdata Workers CPU Utilization (100% = 1 core)",
|
||||
"%",
|
||||
"netdata",
|
||||
"stats",
|
||||
999000,
|
||||
localhost->rrd_update_every,
|
||||
RRDSET_TYPE_STACKED);
|
||||
}
|
||||
|
||||
for(i = 0; all_workers_utilization[i].name ;i++) {
|
||||
struct worker_utilization *wu = &all_workers_utilization[i];
|
||||
if(!wu->workers_cpu_registered) continue;
|
||||
|
||||
if(!wu->rd_total_cpu_utilizaton)
|
||||
wu->rd_total_cpu_utilizaton = rrddim_add(st, wu->name_lowercase, NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE);
|
||||
|
||||
rrddim_set_by_pointer(st, wu->rd_total_cpu_utilizaton, (collected_number)((double)wu->workers_cpu_total * 100.0));
|
||||
}
|
||||
|
||||
rrdset_done(st);
|
||||
}
|
||||
|
||||
#define WORKER_CHART_DECIMAL_PRECISION 100
|
||||
|
||||
static void workers_utilization_update_chart(struct worker_utilization *wu) {
|
||||
if(!wu->workers_registered) return;
|
||||
|
||||
//fprintf(stderr, "%-12s WORKER UTILIZATION: %-3.2f%%, %zu jobs done, %zu running, on %zu workers, min %-3.02f%%, max %-3.02f%%.\n",
|
||||
// wu->name,
|
||||
// (double)wu->workers_total_busy_time * 100.0 / (double)wu->workers_total_duration,
|
||||
// wu->workers_total_jobs_started, wu->workers_busy, wu->workers_registered,
|
||||
// wu->workers_min_busy_time, wu->workers_max_busy_time);
|
||||
|
||||
// ----------------------------------------------------------------------
|
||||
|
||||
if(unlikely(!wu->st_workers_time)) {
|
||||
char name[RRD_ID_LENGTH_MAX + 1];
|
||||
snprintfz(name, RRD_ID_LENGTH_MAX, "workers_time_%s", wu->name_lowercase);
|
||||
|
||||
char context[RRD_ID_LENGTH_MAX + 1];
|
||||
snprintf(context, RRD_ID_LENGTH_MAX, "netdata.workers.%s.time", wu->name_lowercase);
|
||||
|
||||
wu->st_workers_time = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, name
|
||||
, NULL
|
||||
, wu->family
|
||||
, context
|
||||
, "Netdata Workers Busy Time (100% = all workers busy)"
|
||||
, "%"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, wu->priority
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_AREA
|
||||
);
|
||||
}
|
||||
|
||||
// we add the min and max dimensions only when we have multiple workers
|
||||
|
||||
if(unlikely(!wu->rd_workers_time_min && wu->workers_registered > 1))
|
||||
wu->rd_workers_time_min = rrddim_add(wu->st_workers_time, "min", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE);
|
||||
|
||||
if(unlikely(!wu->rd_workers_time_max && wu->workers_registered > 1))
|
||||
wu->rd_workers_time_max = rrddim_add(wu->st_workers_time, "max", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE);
|
||||
|
||||
if(unlikely(!wu->rd_workers_time_avg))
|
||||
wu->rd_workers_time_avg = rrddim_add(wu->st_workers_time, "average", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE);
|
||||
|
||||
if(unlikely(wu->workers_min_busy_time == WORKERS_MIN_PERCENT_DEFAULT)) wu->workers_min_busy_time = 0.0;
|
||||
|
||||
if(wu->rd_workers_time_min)
|
||||
rrddim_set_by_pointer(wu->st_workers_time, wu->rd_workers_time_min, (collected_number)((double)wu->workers_min_busy_time * WORKER_CHART_DECIMAL_PRECISION));
|
||||
|
||||
if(wu->rd_workers_time_max)
|
||||
rrddim_set_by_pointer(wu->st_workers_time, wu->rd_workers_time_max, (collected_number)((double)wu->workers_max_busy_time * WORKER_CHART_DECIMAL_PRECISION));
|
||||
|
||||
if(wu->workers_total_duration == 0)
|
||||
rrddim_set_by_pointer(wu->st_workers_time, wu->rd_workers_time_avg, 0);
|
||||
else
|
||||
rrddim_set_by_pointer(wu->st_workers_time, wu->rd_workers_time_avg, (collected_number)((double)wu->workers_total_busy_time * 100.0 * WORKER_CHART_DECIMAL_PRECISION / (double)wu->workers_total_duration));
|
||||
|
||||
rrdset_done(wu->st_workers_time);
|
||||
|
||||
// ----------------------------------------------------------------------
|
||||
|
||||
#ifdef __linux__
|
||||
if(wu->workers_cpu_registered || wu->st_workers_cpu) {
|
||||
if(unlikely(!wu->st_workers_cpu)) {
|
||||
char name[RRD_ID_LENGTH_MAX + 1];
|
||||
snprintfz(name, RRD_ID_LENGTH_MAX, "workers_cpu_%s", wu->name_lowercase);
|
||||
|
||||
char context[RRD_ID_LENGTH_MAX + 1];
|
||||
snprintf(context, RRD_ID_LENGTH_MAX, "netdata.workers.%s.cpu", wu->name_lowercase);
|
||||
|
||||
wu->st_workers_cpu = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, name
|
||||
, NULL
|
||||
, wu->family
|
||||
, context
|
||||
, "Netdata Workers CPU Utilization (100% = all workers busy)"
|
||||
, "%"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, wu->priority + 1
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_AREA
|
||||
);
|
||||
}
|
||||
|
||||
if (unlikely(!wu->rd_workers_cpu_min && wu->workers_registered > 1))
|
||||
wu->rd_workers_cpu_min = rrddim_add(wu->st_workers_cpu, "min", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE);
|
||||
|
||||
if (unlikely(!wu->rd_workers_cpu_max && wu->workers_registered > 1))
|
||||
wu->rd_workers_cpu_max = rrddim_add(wu->st_workers_cpu, "max", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE);
|
||||
|
||||
if(unlikely(!wu->rd_workers_cpu_avg))
|
||||
wu->rd_workers_cpu_avg = rrddim_add(wu->st_workers_cpu, "average", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE);
|
||||
|
||||
if(unlikely(wu->workers_cpu_min == WORKERS_MIN_PERCENT_DEFAULT)) wu->workers_cpu_min = 0.0;
|
||||
|
||||
if(wu->rd_workers_cpu_min)
|
||||
rrddim_set_by_pointer(wu->st_workers_cpu, wu->rd_workers_cpu_min, (collected_number)(wu->workers_cpu_min * WORKER_CHART_DECIMAL_PRECISION));
|
||||
|
||||
if(wu->rd_workers_cpu_max)
|
||||
rrddim_set_by_pointer(wu->st_workers_cpu, wu->rd_workers_cpu_max, (collected_number)(wu->workers_cpu_max * WORKER_CHART_DECIMAL_PRECISION));
|
||||
|
||||
if(wu->workers_cpu_registered == 0)
|
||||
rrddim_set_by_pointer(wu->st_workers_cpu, wu->rd_workers_cpu_avg, 0);
|
||||
else
|
||||
rrddim_set_by_pointer(wu->st_workers_cpu, wu->rd_workers_cpu_avg, (collected_number)( wu->workers_cpu_total * WORKER_CHART_DECIMAL_PRECISION / (NETDATA_DOUBLE)wu->workers_cpu_registered ));
|
||||
|
||||
rrdset_done(wu->st_workers_cpu);
|
||||
}
|
||||
#endif
|
||||
|
||||
// ----------------------------------------------------------------------
|
||||
|
||||
if(unlikely(!wu->st_workers_jobs_per_job_type)) {
|
||||
char name[RRD_ID_LENGTH_MAX + 1];
|
||||
snprintfz(name, RRD_ID_LENGTH_MAX, "workers_jobs_by_type_%s", wu->name_lowercase);
|
||||
|
||||
char context[RRD_ID_LENGTH_MAX + 1];
|
||||
snprintf(context, RRD_ID_LENGTH_MAX, "netdata.workers.%s.jobs_started_by_type", wu->name_lowercase);
|
||||
|
||||
wu->st_workers_jobs_per_job_type = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, name
|
||||
, NULL
|
||||
, wu->family
|
||||
, context
|
||||
, "Netdata Workers Jobs Started by Type"
|
||||
, "jobs"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, wu->priority + 2
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_STACKED
|
||||
);
|
||||
}
|
||||
|
||||
{
|
||||
size_t i;
|
||||
for(i = 0; i <= wu->workers_max_job_id ;i++) {
|
||||
if(unlikely(wu->per_job_type[i].type != WORKER_METRIC_IDLE_BUSY))
|
||||
continue;
|
||||
|
||||
if (wu->per_job_type[i].name) {
|
||||
|
||||
if(unlikely(!wu->per_job_type[i].rd_jobs_started))
|
||||
wu->per_job_type[i].rd_jobs_started = rrddim_add(wu->st_workers_jobs_per_job_type, string2str(wu->per_job_type[i].name), NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
|
||||
rrddim_set_by_pointer(wu->st_workers_jobs_per_job_type, wu->per_job_type[i].rd_jobs_started, (collected_number)(wu->per_job_type[i].jobs_started));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
rrdset_done(wu->st_workers_jobs_per_job_type);
|
||||
|
||||
// ----------------------------------------------------------------------
|
||||
|
||||
if(unlikely(!wu->st_workers_busy_per_job_type)) {
|
||||
char name[RRD_ID_LENGTH_MAX + 1];
|
||||
snprintfz(name, RRD_ID_LENGTH_MAX, "workers_busy_time_by_type_%s", wu->name_lowercase);
|
||||
|
||||
char context[RRD_ID_LENGTH_MAX + 1];
|
||||
snprintf(context, RRD_ID_LENGTH_MAX, "netdata.workers.%s.time_by_type", wu->name_lowercase);
|
||||
|
||||
wu->st_workers_busy_per_job_type = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, name
|
||||
, NULL
|
||||
, wu->family
|
||||
, context
|
||||
, "Netdata Workers Busy Time by Type"
|
||||
, "ms"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, wu->priority + 3
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_STACKED
|
||||
);
|
||||
}
|
||||
|
||||
{
|
||||
size_t i;
|
||||
for(i = 0; i <= wu->workers_max_job_id ;i++) {
|
||||
if(unlikely(wu->per_job_type[i].type != WORKER_METRIC_IDLE_BUSY))
|
||||
continue;
|
||||
|
||||
if (wu->per_job_type[i].name) {
|
||||
|
||||
if(unlikely(!wu->per_job_type[i].rd_busy_time))
|
||||
wu->per_job_type[i].rd_busy_time = rrddim_add(wu->st_workers_busy_per_job_type, string2str(wu->per_job_type[i].name), NULL, 1, USEC_PER_MS, RRD_ALGORITHM_ABSOLUTE);
|
||||
|
||||
rrddim_set_by_pointer(wu->st_workers_busy_per_job_type, wu->per_job_type[i].rd_busy_time, (collected_number)(wu->per_job_type[i].busy_time));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
rrdset_done(wu->st_workers_busy_per_job_type);
|
||||
|
||||
// ----------------------------------------------------------------------
|
||||
|
||||
if(wu->st_workers_threads || wu->workers_registered > 1) {
|
||||
if(unlikely(!wu->st_workers_threads)) {
|
||||
char name[RRD_ID_LENGTH_MAX + 1];
|
||||
snprintfz(name, RRD_ID_LENGTH_MAX, "workers_threads_%s", wu->name_lowercase);
|
||||
|
||||
char context[RRD_ID_LENGTH_MAX + 1];
|
||||
snprintf(context, RRD_ID_LENGTH_MAX, "netdata.workers.%s.threads", wu->name_lowercase);
|
||||
|
||||
wu->st_workers_threads = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, name
|
||||
, NULL
|
||||
, wu->family
|
||||
, context
|
||||
, "Netdata Workers Threads"
|
||||
, "threads"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, wu->priority + 4
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_STACKED
|
||||
);
|
||||
|
||||
wu->rd_workers_threads_free = rrddim_add(wu->st_workers_threads, "free", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
wu->rd_workers_threads_busy = rrddim_add(wu->st_workers_threads, "busy", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
|
||||
}
|
||||
|
||||
rrddim_set_by_pointer(wu->st_workers_threads, wu->rd_workers_threads_free, (collected_number)(wu->workers_registered - wu->workers_busy));
|
||||
rrddim_set_by_pointer(wu->st_workers_threads, wu->rd_workers_threads_busy, (collected_number)(wu->workers_busy));
|
||||
rrdset_done(wu->st_workers_threads);
|
||||
}
|
||||
|
||||
// ----------------------------------------------------------------------
|
||||
// custom metric types WORKER_METRIC_ABSOLUTE
|
||||
|
||||
{
|
||||
size_t i;
|
||||
for (i = 0; i <= wu->workers_max_job_id; i++) {
|
||||
if(wu->per_job_type[i].type != WORKER_METRIC_ABSOLUTE)
|
||||
continue;
|
||||
|
||||
if(!wu->per_job_type[i].count_value)
|
||||
continue;
|
||||
|
||||
if(!wu->per_job_type[i].st) {
|
||||
size_t job_name_len = string_strlen(wu->per_job_type[i].name);
|
||||
if(job_name_len > RRD_ID_LENGTH_MAX) job_name_len = RRD_ID_LENGTH_MAX;
|
||||
|
||||
char job_name_sanitized[job_name_len + 1];
|
||||
rrdset_strncpyz_name(job_name_sanitized, string2str(wu->per_job_type[i].name), job_name_len);
|
||||
|
||||
char name[RRD_ID_LENGTH_MAX + 1];
|
||||
snprintfz(name, RRD_ID_LENGTH_MAX, "workers_%s_value_%s", wu->name_lowercase, job_name_sanitized);
|
||||
|
||||
char context[RRD_ID_LENGTH_MAX + 1];
|
||||
snprintf(context, RRD_ID_LENGTH_MAX, "netdata.workers.%s.value.%s", wu->name_lowercase, job_name_sanitized);
|
||||
|
||||
char title[1000 + 1];
|
||||
snprintf(title, 1000, "Netdata Workers %s value of %s", wu->name_lowercase, string2str(wu->per_job_type[i].name));
|
||||
|
||||
wu->per_job_type[i].st = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, name
|
||||
, NULL
|
||||
, wu->family
|
||||
, context
|
||||
, title
|
||||
, (wu->per_job_type[i].units)?string2str(wu->per_job_type[i].units):"value"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, wu->priority + 5 + i
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_LINE
|
||||
);
|
||||
|
||||
wu->per_job_type[i].rd_min = rrddim_add(wu->per_job_type[i].st, "min", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE);
|
||||
wu->per_job_type[i].rd_max = rrddim_add(wu->per_job_type[i].st, "max", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE);
|
||||
wu->per_job_type[i].rd_avg = rrddim_add(wu->per_job_type[i].st, "average", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE);
|
||||
}
|
||||
|
||||
rrddim_set_by_pointer(wu->per_job_type[i].st, wu->per_job_type[i].rd_min, (collected_number)(wu->per_job_type[i].min_value * WORKER_CHART_DECIMAL_PRECISION));
|
||||
rrddim_set_by_pointer(wu->per_job_type[i].st, wu->per_job_type[i].rd_max, (collected_number)(wu->per_job_type[i].max_value * WORKER_CHART_DECIMAL_PRECISION));
|
||||
rrddim_set_by_pointer(wu->per_job_type[i].st, wu->per_job_type[i].rd_avg, (collected_number)(wu->per_job_type[i].sum_value / wu->per_job_type[i].count_value * WORKER_CHART_DECIMAL_PRECISION));
|
||||
|
||||
rrdset_done(wu->per_job_type[i].st);
|
||||
}
|
||||
}
|
||||
|
||||
// ----------------------------------------------------------------------
|
||||
// custom metric types WORKER_METRIC_INCREMENTAL
|
||||
|
||||
{
|
||||
size_t i;
|
||||
for (i = 0; i <= wu->workers_max_job_id ; i++) {
|
||||
if(wu->per_job_type[i].type != WORKER_METRIC_INCREMENT && wu->per_job_type[i].type != WORKER_METRIC_INCREMENTAL_TOTAL)
|
||||
continue;
|
||||
|
||||
if(!wu->per_job_type[i].count_value)
|
||||
continue;
|
||||
|
||||
if(!wu->per_job_type[i].st) {
|
||||
size_t job_name_len = string_strlen(wu->per_job_type[i].name);
|
||||
if(job_name_len > RRD_ID_LENGTH_MAX) job_name_len = RRD_ID_LENGTH_MAX;
|
||||
|
||||
char job_name_sanitized[job_name_len + 1];
|
||||
rrdset_strncpyz_name(job_name_sanitized, string2str(wu->per_job_type[i].name), job_name_len);
|
||||
|
||||
char name[RRD_ID_LENGTH_MAX + 1];
|
||||
snprintfz(name, RRD_ID_LENGTH_MAX, "workers_%s_rate_%s", wu->name_lowercase, job_name_sanitized);
|
||||
|
||||
char context[RRD_ID_LENGTH_MAX + 1];
|
||||
snprintf(context, RRD_ID_LENGTH_MAX, "netdata.workers.%s.rate.%s", wu->name_lowercase, job_name_sanitized);
|
||||
|
||||
char title[1000 + 1];
|
||||
snprintf(title, 1000, "Netdata Workers %s rate of %s", wu->name_lowercase, string2str(wu->per_job_type[i].name));
|
||||
|
||||
wu->per_job_type[i].st = rrdset_create_localhost(
|
||||
"netdata"
|
||||
, name
|
||||
, NULL
|
||||
, wu->family
|
||||
, context
|
||||
, title
|
||||
, (wu->per_job_type[i].units)?string2str(wu->per_job_type[i].units):"rate"
|
||||
, "netdata"
|
||||
, "stats"
|
||||
, wu->priority + 5 + i
|
||||
, localhost->rrd_update_every
|
||||
, RRDSET_TYPE_LINE
|
||||
);
|
||||
|
||||
wu->per_job_type[i].rd_min = rrddim_add(wu->per_job_type[i].st, "min", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE);
|
||||
wu->per_job_type[i].rd_max = rrddim_add(wu->per_job_type[i].st, "max", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE);
|
||||
wu->per_job_type[i].rd_avg = rrddim_add(wu->per_job_type[i].st, "average", NULL, 1, WORKER_CHART_DECIMAL_PRECISION, RRD_ALGORITHM_ABSOLUTE);
|
||||
}
|
||||
|
||||
rrddim_set_by_pointer(wu->per_job_type[i].st, wu->per_job_type[i].rd_min, (collected_number)(wu->per_job_type[i].min_value * WORKER_CHART_DECIMAL_PRECISION));
|
||||
rrddim_set_by_pointer(wu->per_job_type[i].st, wu->per_job_type[i].rd_max, (collected_number)(wu->per_job_type[i].max_value * WORKER_CHART_DECIMAL_PRECISION));
|
||||
rrddim_set_by_pointer(wu->per_job_type[i].st, wu->per_job_type[i].rd_avg, (collected_number)(wu->per_job_type[i].sum_value / wu->per_job_type[i].count_value * WORKER_CHART_DECIMAL_PRECISION));
|
||||
|
||||
rrdset_done(wu->per_job_type[i].st);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
static void workers_utilization_reset_statistics(struct worker_utilization *wu) {
|
||||
wu->workers_registered = 0;
|
||||
wu->workers_busy = 0;
|
||||
wu->workers_total_busy_time = 0;
|
||||
wu->workers_total_duration = 0;
|
||||
wu->workers_total_jobs_started = 0;
|
||||
wu->workers_min_busy_time = WORKERS_MIN_PERCENT_DEFAULT;
|
||||
wu->workers_max_busy_time = 0;
|
||||
|
||||
wu->workers_cpu_registered = 0;
|
||||
wu->workers_cpu_min = WORKERS_MIN_PERCENT_DEFAULT;
|
||||
wu->workers_cpu_max = 0;
|
||||
wu->workers_cpu_total = 0;
|
||||
|
||||
size_t i;
|
||||
for(i = 0; i < WORKER_UTILIZATION_MAX_JOB_TYPES ;i++) {
|
||||
if(unlikely(!wu->name_lowercase)) {
|
||||
wu->name_lowercase = strdupz(wu->name);
|
||||
char *s = wu->name_lowercase;
|
||||
for( ; *s ; s++) *s = tolower(*s);
|
||||
}
|
||||
|
||||
wu->per_job_type[i].jobs_started = 0;
|
||||
wu->per_job_type[i].busy_time = 0;
|
||||
|
||||
wu->per_job_type[i].min_value = NAN;
|
||||
wu->per_job_type[i].max_value = NAN;
|
||||
wu->per_job_type[i].sum_value = NAN;
|
||||
wu->per_job_type[i].count_value = 0;
|
||||
}
|
||||
|
||||
struct worker_thread *wt;
|
||||
for(wt = wu->threads; wt ; wt = wt->next) {
|
||||
wt->enabled = false;
|
||||
wt->cpu_enabled = false;
|
||||
}
|
||||
}
|
||||
|
||||
#define TASK_STAT_PREFIX "/proc/self/task/"
|
||||
#define TASK_STAT_SUFFIX "/stat"
|
||||
|
||||
static int read_thread_cpu_time_from_proc_stat(pid_t pid __maybe_unused, kernel_uint_t *utime __maybe_unused, kernel_uint_t *stime __maybe_unused) {
|
||||
#ifdef __linux__
|
||||
static char filename[sizeof(TASK_STAT_PREFIX) + sizeof(TASK_STAT_SUFFIX) + 20] = TASK_STAT_PREFIX;
|
||||
static size_t start_pos = sizeof(TASK_STAT_PREFIX) - 1;
|
||||
static procfile *ff = NULL;
|
||||
|
||||
// construct the filename
|
||||
size_t end_pos = snprintfz(&filename[start_pos], 20, "%d", pid);
|
||||
strcpy(&filename[start_pos + end_pos], TASK_STAT_SUFFIX);
|
||||
|
||||
// (re)open the procfile to the new filename
|
||||
bool set_quotes = (ff == NULL) ? true : false;
|
||||
ff = procfile_reopen(ff, filename, NULL, PROCFILE_FLAG_ERROR_ON_ERROR_LOG);
|
||||
if(unlikely(!ff)) return -1;
|
||||
|
||||
if(set_quotes)
|
||||
procfile_set_open_close(ff, "(", ")");
|
||||
|
||||
// read the entire file and split it to lines and words
|
||||
ff = procfile_readall(ff);
|
||||
if(unlikely(!ff)) return -1;
|
||||
|
||||
// parse the numbers we are interested
|
||||
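// NOTE (editor): with the comm field collapsed by procfile_set_open_close("(", ")") above,
// words 13 and 14 of the single /proc/.../stat line are utime and stime, in clock ticks.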
*utime = str2kernel_uint_t(procfile_lineword(ff, 0, 13));
|
||||
*stime = str2kernel_uint_t(procfile_lineword(ff, 0, 14));
|
||||
|
||||
// leave the file open for the next iteration
|
||||
|
||||
return 0;
|
||||
#else
|
||||
// TODO: add here cpu time detection per thread, for FreeBSD and MacOS
|
||||
*utime = 0;
|
||||
*stime = 0;
|
||||
return 1;
|
||||
#endif
|
||||
}
|
||||
|
||||
static Pvoid_t workers_by_pid_JudyL_array = NULL;
|
||||
|
||||
static void workers_threads_cleanup(struct worker_utilization *wu) {
|
||||
struct worker_thread *t = wu->threads;
|
||||
while(t) {
|
||||
struct worker_thread *next = t->next;
|
||||
|
||||
if(!t->enabled) {
|
||||
JudyLDel(&workers_by_pid_JudyL_array, t->pid, PJE0);
|
||||
DOUBLE_LINKED_LIST_REMOVE_ITEM_UNSAFE(wu->threads, t, prev, next);
|
||||
freez(t);
|
||||
}
|
||||
t = next;
|
||||
}
|
||||
}
|
||||
|
||||
static struct worker_thread *worker_thread_find(struct worker_utilization *wu __maybe_unused, pid_t pid) {
|
||||
struct worker_thread *wt = NULL;
|
||||
|
||||
Pvoid_t *PValue = JudyLGet(workers_by_pid_JudyL_array, pid, PJE0);
|
||||
if(PValue)
|
||||
wt = *PValue;
|
||||
|
||||
return wt;
|
||||
}
|
||||
|
||||
static struct worker_thread *worker_thread_create(struct worker_utilization *wu, pid_t pid) {
|
||||
struct worker_thread *wt;
|
||||
|
||||
wt = (struct worker_thread *)callocz(1, sizeof(struct worker_thread));
|
||||
wt->pid = pid;
|
||||
|
||||
Pvoid_t *PValue = JudyLIns(&workers_by_pid_JudyL_array, pid, PJE0);
|
||||
*PValue = wt;
|
||||
|
||||
// link it
|
||||
DOUBLE_LINKED_LIST_APPEND_ITEM_UNSAFE(wu->threads, wt, prev, next);
|
||||
|
||||
return wt;
|
||||
}
|
||||
|
||||
static struct worker_thread *worker_thread_find_or_create(struct worker_utilization *wu, pid_t pid) {
|
||||
struct worker_thread *wt;
|
||||
wt = worker_thread_find(wu, pid);
|
||||
if(!wt) wt = worker_thread_create(wu, pid);
|
||||
|
||||
return wt;
|
||||
}
|
||||
|
||||
static void worker_utilization_charts_callback(void *ptr
|
||||
, pid_t pid __maybe_unused
|
||||
, const char *thread_tag __maybe_unused
|
||||
, size_t max_job_id __maybe_unused
|
||||
, size_t utilization_usec __maybe_unused
|
||||
, size_t duration_usec __maybe_unused
|
||||
, size_t jobs_started __maybe_unused
|
||||
, size_t is_running __maybe_unused
|
||||
, STRING **job_types_names __maybe_unused
|
||||
, STRING **job_types_units __maybe_unused
|
||||
, WORKER_METRIC_TYPE *job_types_metric_types __maybe_unused
|
||||
, size_t *job_types_jobs_started __maybe_unused
|
||||
, usec_t *job_types_busy_time __maybe_unused
|
||||
, NETDATA_DOUBLE *job_types_custom_metrics __maybe_unused
|
||||
) {
|
||||
struct worker_utilization *wu = (struct worker_utilization *)ptr;
|
||||
|
||||
// find the worker_thread in the list
|
||||
struct worker_thread *wt = worker_thread_find_or_create(wu, pid);
|
||||
|
||||
if(utilization_usec > duration_usec)
|
||||
utilization_usec = duration_usec;
|
||||
|
||||
wt->enabled = true;
|
||||
wt->busy_time = utilization_usec;
|
||||
wt->jobs_started = jobs_started;
|
||||
|
||||
wt->utime_old = wt->utime;
|
||||
wt->stime_old = wt->stime;
|
||||
wt->collected_time_old = wt->collected_time;
|
||||
|
||||
if(max_job_id > wu->workers_max_job_id)
|
||||
wu->workers_max_job_id = max_job_id;
|
||||
|
||||
wu->workers_total_busy_time += utilization_usec;
|
||||
wu->workers_total_duration += duration_usec;
|
||||
wu->workers_total_jobs_started += jobs_started;
|
||||
wu->workers_busy += is_running;
|
||||
wu->workers_registered++;
|
||||
|
||||
double util = (double)utilization_usec * 100.0 / (double)duration_usec;
|
||||
if(util > wu->workers_max_busy_time)
|
||||
wu->workers_max_busy_time = util;
|
||||
|
||||
if(util < wu->workers_min_busy_time)
|
||||
wu->workers_min_busy_time = util;
|
||||
|
||||
// accumulate per job type statistics
|
||||
size_t i;
|
||||
for(i = 0; i <= max_job_id ;i++) {
|
||||
if(!wu->per_job_type[i].name && job_types_names[i])
|
||||
wu->per_job_type[i].name = string_dup(job_types_names[i]);
|
||||
|
||||
if(!wu->per_job_type[i].units && job_types_units[i])
|
||||
wu->per_job_type[i].units = string_dup(job_types_units[i]);
|
||||
|
||||
wu->per_job_type[i].type = job_types_metric_types[i];
|
||||
|
||||
wu->per_job_type[i].jobs_started += job_types_jobs_started[i];
|
||||
wu->per_job_type[i].busy_time += job_types_busy_time[i];
|
||||
|
||||
NETDATA_DOUBLE value = job_types_custom_metrics[i];
|
||||
if(netdata_double_isnumber(value)) {
|
||||
if(!wu->per_job_type[i].count_value) {
|
||||
wu->per_job_type[i].count_value = 1;
|
||||
wu->per_job_type[i].min_value = value;
|
||||
wu->per_job_type[i].max_value = value;
|
||||
wu->per_job_type[i].sum_value = value;
|
||||
}
|
||||
else {
|
||||
wu->per_job_type[i].count_value++;
|
||||
wu->per_job_type[i].sum_value += value;
|
||||
if(value < wu->per_job_type[i].min_value) wu->per_job_type[i].min_value = value;
|
||||
if(value > wu->per_job_type[i].max_value) wu->per_job_type[i].max_value = value;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// find its CPU utilization
|
||||
if((!read_thread_cpu_time_from_proc_stat(pid, &wt->utime, &wt->stime))) {
|
||||
wt->collected_time = now_realtime_usec();
|
||||
usec_t delta = wt->collected_time - wt->collected_time_old;
|
||||
|
||||
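// NOTE (editor): (ticks / system_hz) is CPU seconds; dividing by the wall-clock delta
// (in usec, hence the USEC_PER_SEC factor) and multiplying by 100 gives percent of one core.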
double utime = (double)(wt->utime - wt->utime_old) / (double)system_hz * 100.0 * (double)USEC_PER_SEC / (double)delta;
|
||||
double stime = (double)(wt->stime - wt->stime_old) / (double)system_hz * 100.0 * (double)USEC_PER_SEC / (double)delta;
|
||||
double cpu = utime + stime;
|
||||
wt->cpu = cpu;
|
||||
wt->cpu_enabled = true;
|
||||
|
||||
wu->workers_cpu_total += cpu;
|
||||
if(cpu < wu->workers_cpu_min) wu->workers_cpu_min = cpu;
|
||||
if(cpu > wu->workers_cpu_max) wu->workers_cpu_max = cpu;
|
||||
}
|
||||
wu->workers_cpu_registered += (wt->cpu_enabled) ? 1 : 0;
|
||||
}
|
||||
|
||||
void telemetry_workers_cleanup(void) {
|
||||
int i, j;
|
||||
for(i = 0; all_workers_utilization[i].name ;i++) {
|
||||
struct worker_utilization *wu = &all_workers_utilization[i];
|
||||
|
||||
if(wu->name_lowercase) {
|
||||
freez(wu->name_lowercase);
|
||||
wu->name_lowercase = NULL;
|
||||
}
|
||||
|
||||
for(j = 0; j < WORKER_UTILIZATION_MAX_JOB_TYPES ;j++) {
|
||||
string_freez(wu->per_job_type[j].name);
|
||||
wu->per_job_type[j].name = NULL;
|
||||
|
||||
string_freez(wu->per_job_type[j].units);
|
||||
wu->per_job_type[j].units = NULL;
|
||||
}
|
||||
|
||||
// mark all threads as not enabled
|
||||
struct worker_thread *t;
|
||||
for(t = wu->threads; t ; t = t->next)
|
||||
t->enabled = false;
|
||||
|
||||
// let the cleanup job free them
|
||||
workers_threads_cleanup(wu);
|
||||
}
|
||||
}
|
||||
|
||||
void telemetry_workers_do(bool extended) {
|
||||
if(!extended) return;
|
||||
|
||||
static size_t iterations = 0;
|
||||
iterations++;
|
||||
|
||||
for(int i = 0; all_workers_utilization[i].name ;i++) {
|
||||
workers_utilization_reset_statistics(&all_workers_utilization[i]);
|
||||
|
||||
workers_foreach(all_workers_utilization[i].name, worker_utilization_charts_callback, &all_workers_utilization[i]);
|
||||
|
||||
// skip the first iteration, so that we don't accumulate startup utilization to our charts
|
||||
if(likely(iterations > 1))
|
||||
workers_utilization_update_chart(&all_workers_utilization[i]);
|
||||
|
||||
workers_threads_cleanup(&all_workers_utilization[i]);
|
||||
}
|
||||
|
||||
workers_total_cpu_utilization_chart();
|
||||
}
|
src/daemon/telemetry/telemetry-workers.h (new file, 13 lines)
@@ -0,0 +1,13 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_TELEMETRY_WORKERS_H
#define NETDATA_TELEMETRY_WORKERS_H

#include "daemon/common.h"

#if defined(TELEMETRY_INTERNALS)
void telemetry_workers_do(bool extended);
void telemetry_workers_cleanup(void);
#endif

#endif //NETDATA_TELEMETRY_WORKERS_H
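Editor's note: the charts produced by telemetry-workers.c are fed by threads that instrument
themselves with the existing worker API (the same calls telemetry.c below uses). A minimal
sketch of such a thread, assuming the usual workers.h helpers; the job id, names and thread
are hypothetical, and the registered name only gets charted if it appears in
all_workers_utilization[] above.

#define MYJOB_WORK 0

void *my_worker_thread(void *ptr __maybe_unused) {
    worker_register("MYWORKER");                    // must match an all_workers_utilization[] entry
    worker_register_job_name(MYJOB_WORK, "work");

    while (service_running(SERVICE_COLLECTORS)) {
        worker_is_idle();                           // accounted as free time
        // ... wait for something to do ...

        worker_is_busy(MYJOB_WORK);                 // accounted as busy time for the "work" job
        // ... do the work ...
    }

    worker_unregister();
    return NULL;
}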
src/daemon/telemetry/telemetry.c (new file, 198 lines)
@@ -0,0 +1,198 @@
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#define TELEMETRY_INTERNALS 1
|
||||
#include "daemon/common.h"
|
||||
|
||||
#define WORKER_JOB_TELEMETRY_DAEMON 0
|
||||
#define WORKER_JOB_SQLITE3 1
|
||||
#define WORKER_JOB_TELEMETRY_HTTP_API 2
|
||||
#define WORKER_JOB_TELEMETRY_QUERIES 3
|
||||
#define WORKER_JOB_TELEMETRY_INGESTION 4
|
||||
#define WORKER_JOB_DBENGINE 5
|
||||
#define WORKER_JOB_STRINGS 6
|
||||
#define WORKER_JOB_DICTIONARIES 7
|
||||
#define WORKER_JOB_TELEMETRY_ML 8
|
||||
#define WORKER_JOB_TELEMETRY_GORILLA 9
|
||||
#define WORKER_JOB_HEARTBEAT 10
|
||||
#define WORKER_JOB_WORKERS 11
|
||||
#define WORKER_JOB_MALLOC_TRACE 12
|
||||
#define WORKER_JOB_REGISTRY 13
|
||||
#define WORKER_JOB_ARAL 14
|
||||
|
||||
#if WORKER_UTILIZATION_MAX_JOB_TYPES < 15
|
||||
#error "WORKER_UTILIZATION_MAX_JOB_TYPES has to be at least 14"
|
||||
#endif
|
||||
|
||||
bool telemetry_enabled = true;
|
||||
bool telemetry_extended_enabled = false;
|
||||
|
||||
static void telemetry_register_workers(void) {
|
||||
worker_register("STATS");
|
||||
|
||||
worker_register_job_name(WORKER_JOB_TELEMETRY_DAEMON, "daemon");
|
||||
worker_register_job_name(WORKER_JOB_SQLITE3, "sqlite3");
|
||||
worker_register_job_name(WORKER_JOB_TELEMETRY_HTTP_API, "http-api");
|
||||
worker_register_job_name(WORKER_JOB_TELEMETRY_QUERIES, "queries");
|
||||
worker_register_job_name(WORKER_JOB_TELEMETRY_INGESTION, "ingestion");
|
||||
worker_register_job_name(WORKER_JOB_DBENGINE, "dbengine");
|
||||
worker_register_job_name(WORKER_JOB_STRINGS, "strings");
|
||||
worker_register_job_name(WORKER_JOB_DICTIONARIES, "dictionaries");
|
||||
worker_register_job_name(WORKER_JOB_TELEMETRY_ML, "ML");
|
||||
worker_register_job_name(WORKER_JOB_TELEMETRY_GORILLA, "gorilla");
|
||||
worker_register_job_name(WORKER_JOB_HEARTBEAT, "heartbeat");
|
||||
worker_register_job_name(WORKER_JOB_WORKERS, "workers");
|
||||
worker_register_job_name(WORKER_JOB_MALLOC_TRACE, "malloc_trace");
|
||||
worker_register_job_name(WORKER_JOB_REGISTRY, "registry");
|
||||
worker_register_job_name(WORKER_JOB_ARAL, "aral");
|
||||
}
|
||||
|
||||
static void telemetry_cleanup(void *pptr)
|
||||
{
|
||||
struct netdata_static_thread *static_thread = CLEANUP_FUNCTION_GET_PTR(pptr);
|
||||
if(!static_thread) return;
|
||||
|
||||
static_thread->enabled = NETDATA_MAIN_THREAD_EXITING;
|
||||
|
||||
telemetry_workers_cleanup();
|
||||
worker_unregister();
|
||||
netdata_log_info("cleaning up...");
|
||||
|
||||
static_thread->enabled = NETDATA_MAIN_THREAD_EXITED;
|
||||
}
|
||||
|
||||
void *telemetry_thread_main(void *ptr) {
|
||||
CLEANUP_FUNCTION_REGISTER(telemetry_cleanup) cleanup_ptr = ptr;
|
||||
telemetry_register_workers();
|
||||
|
||||
int update_every =
|
||||
(int)config_get_duration_seconds(CONFIG_SECTION_TELEMETRY, "update every", localhost->rrd_update_every);
|
||||
if (update_every < localhost->rrd_update_every) {
|
||||
update_every = localhost->rrd_update_every;
|
||||
config_set_duration_seconds(CONFIG_SECTION_TELEMETRY, "update every", update_every);
|
||||
}
|
||||
|
||||
telemerty_aral_init();
|
||||
|
||||
usec_t step = update_every * USEC_PER_SEC;
|
||||
heartbeat_t hb;
|
||||
heartbeat_init(&hb, USEC_PER_SEC);
|
||||
usec_t real_step = USEC_PER_SEC;
|
||||
|
||||
// keep the randomness at zero
|
||||
// to make sure we are not close to any other thread
|
||||
hb.randomness = 0;
|
||||
|
||||
while (service_running(SERVICE_COLLECTORS)) {
|
||||
worker_is_idle();
|
||||
heartbeat_next(&hb);
|
||||
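// NOTE (editor): the heartbeat fires every second; real_step accumulates the elapsed
// seconds so the telemetry charts are only updated once every "update every" seconds.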
if (real_step < step) {
|
||||
real_step += USEC_PER_SEC;
|
||||
continue;
|
||||
}
|
||||
real_step = USEC_PER_SEC;
|
||||
|
||||
worker_is_busy(WORKER_JOB_TELEMETRY_INGESTION);
|
||||
telemetry_ingestion_do(telemetry_extended_enabled);
|
||||
|
||||
worker_is_busy(WORKER_JOB_TELEMETRY_HTTP_API);
|
||||
telemetry_web_do(telemetry_extended_enabled);
|
||||
|
||||
worker_is_busy(WORKER_JOB_TELEMETRY_QUERIES);
|
||||
telemetry_queries_do(telemetry_extended_enabled);
|
||||
|
||||
worker_is_busy(WORKER_JOB_TELEMETRY_ML);
|
||||
telemetry_ml_do(telemetry_extended_enabled);
|
||||
|
||||
worker_is_busy(WORKER_JOB_TELEMETRY_GORILLA);
|
||||
telemetry_gorilla_do(telemetry_extended_enabled);
|
||||
|
||||
worker_is_busy(WORKER_JOB_HEARTBEAT);
|
||||
telemetry_heartbeat_do(telemetry_extended_enabled);
|
||||
|
||||
#ifdef ENABLE_DBENGINE
|
||||
if(dbengine_enabled) {
|
||||
worker_is_busy(WORKER_JOB_DBENGINE);
|
||||
telemetry_dbengine_do(telemetry_extended_enabled);
|
||||
}
|
||||
#endif
|
||||
|
||||
worker_is_busy(WORKER_JOB_REGISTRY);
|
||||
registry_statistics();
|
||||
|
||||
worker_is_busy(WORKER_JOB_STRINGS);
|
||||
telemetry_string_do(telemetry_extended_enabled);
|
||||
|
||||
#ifdef DICT_WITH_STATS
|
||||
worker_is_busy(WORKER_JOB_DICTIONARIES);
|
||||
telemetry_dictionary_do(telemetry_extended_enabled);
|
||||
#endif
|
||||
|
||||
#ifdef NETDATA_TRACE_ALLOCATIONS
|
||||
worker_is_busy(WORKER_JOB_MALLOC_TRACE);
|
||||
telemetry_trace_allocations_do(telemetry_extended_enabled);
|
||||
#endif
|
||||
|
||||
worker_is_busy(WORKER_JOB_WORKERS);
|
||||
telemetry_workers_do(telemetry_extended_enabled);
|
||||
|
||||
worker_is_busy(WORKER_JOB_ARAL);
|
||||
telemetry_aral_do(telemetry_extended_enabled);
|
||||
|
||||
// keep this last to have access to the memory counters
|
||||
// exposed by everyone else
|
||||
worker_is_busy(WORKER_JOB_TELEMETRY_DAEMON);
|
||||
telemetry_daemon_do(telemetry_extended_enabled);
|
||||
}
|
||||
|
||||
return NULL;
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------------------------------------------------
|
||||
// telemetry extended thread
|
||||
|
||||
static void telemetry_thread_sqlite3_cleanup(void *pptr)
|
||||
{
|
||||
struct netdata_static_thread *static_thread = CLEANUP_FUNCTION_GET_PTR(pptr);
|
||||
if (!static_thread)
|
||||
return;
|
||||
|
||||
static_thread->enabled = NETDATA_MAIN_THREAD_EXITING;
|
||||
|
||||
netdata_log_info("cleaning up...");
|
||||
|
||||
worker_unregister();
|
||||
|
||||
static_thread->enabled = NETDATA_MAIN_THREAD_EXITED;
|
||||
}
|
||||
|
||||
void *telemetry_thread_sqlite3_main(void *ptr) {
|
||||
CLEANUP_FUNCTION_REGISTER(telemetry_thread_sqlite3_cleanup) cleanup_ptr = ptr;
|
||||
telemetry_register_workers();
|
||||
|
||||
int update_every =
|
||||
(int)config_get_duration_seconds(CONFIG_SECTION_TELEMETRY, "update every", localhost->rrd_update_every);
|
||||
if (update_every < localhost->rrd_update_every) {
|
||||
update_every = localhost->rrd_update_every;
|
||||
config_set_duration_seconds(CONFIG_SECTION_TELEMETRY, "update every", update_every);
|
||||
}
|
||||
|
||||
usec_t step = update_every * USEC_PER_SEC;
|
||||
heartbeat_t hb;
|
||||
heartbeat_init(&hb, USEC_PER_SEC);
|
||||
usec_t real_step = USEC_PER_SEC;
|
||||
|
||||
while (service_running(SERVICE_COLLECTORS)) {
|
||||
worker_is_idle();
|
||||
heartbeat_next(&hb);
|
||||
if (real_step < step) {
|
||||
real_step += USEC_PER_SEC;
|
||||
continue;
|
||||
}
|
||||
real_step = USEC_PER_SEC;
|
||||
|
||||
worker_is_busy(WORKER_JOB_SQLITE3);
|
||||
telemetry_sqlite3_do(telemetry_extended_enabled);
|
||||
}
|
||||
|
||||
return NULL;
|
||||
}
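Both telemetry threads above use the same throttling idiom: wake once per second via the heartbeat, but only do the real work every `update every` seconds, accumulating elapsed time in `real_step`. Below is a minimal, self-contained sketch of that idiom; the netdata `heartbeat_*()`, `worker_*()` and `service_running()` helpers are replaced with plain standard-library calls purely for illustration, and only the `real_step`/`step` accounting mirrors the code in this diff.

```c
#include <stdio.h>
#include <unistd.h>

#define USEC_PER_SEC 1000000ULL

int main(void) {
    unsigned long long update_every = 5;                 /* seconds between real work (illustrative) */
    unsigned long long step = update_every * USEC_PER_SEC;
    unsigned long long real_step = USEC_PER_SEC;

    for (int tick = 0; tick < 12; tick++) {              /* stand-in for while(service_running(...)) */
        sleep(1);                                        /* stand-in for heartbeat_next(&hb) */

        if (real_step < step) {                          /* not enough time accumulated yet */
            real_step += USEC_PER_SEC;
            continue;
        }
        real_step = USEC_PER_SEC;                        /* reset the accumulator */

        printf("tick %d: running the telemetry collection\n", tick);
    }
    return 0;
}
```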
|
30
src/daemon/telemetry/telemetry.h
Normal file
|
@ -0,0 +1,30 @@
|
|||
// SPDX-License-Identifier: GPL-3.0-or-later
|
||||
|
||||
#ifndef NETDATA_TELEMETRY_H
|
||||
#define NETDATA_TELEMETRY_H 1
|
||||
|
||||
#include "database/rrd.h"
|
||||
|
||||
extern bool telemetry_enabled;
|
||||
extern bool telemetry_extended_enabled;
|
||||
|
||||
#include "telemetry-http-api.h"
|
||||
#include "telemetry-queries.h"
|
||||
#include "telemetry-ingestion.h"
|
||||
#include "telemetry-ml.h"
|
||||
#include "telemetry-gorilla.h"
|
||||
#include "telemetry-daemon.h"
|
||||
#include "telemetry-daemon-memory.h"
|
||||
#include "telemetry-sqlite3.h"
|
||||
#include "telemetry-dbengine.h"
|
||||
#include "telemetry-string.h"
|
||||
#include "telemetry-heartbeat.h"
|
||||
#include "telemetry-dictionary.h"
|
||||
#include "telemetry-workers.h"
|
||||
#include "telemetry-trace-allocations.h"
|
||||
#include "telemetry-aral.h"
|
||||
|
||||
void *telemetry_thread_main(void *ptr);
|
||||
void *telemetry_thread_sqlite3_main(void *ptr);
|
||||
|
||||
#endif /* NETDATA_TELEMETRY_H */
|
|
@ -215,7 +215,7 @@ static void rrdhost_receiver_to_json(BUFFER *wb, RRDHOST_STATUS *s, const char *
|
|||
buffer_json_member_add_object(wb, key);
|
||||
{
|
||||
buffer_json_member_add_uint64(wb, "id", s->ingest.id);
|
||||
buffer_json_member_add_uint64(wb, "hops", s->ingest.hops);
|
||||
buffer_json_member_add_int64(wb, "hops", s->ingest.hops);
|
||||
buffer_json_member_add_string(wb, "type", rrdhost_ingest_type_to_string(s->ingest.type));
|
||||
buffer_json_member_add_string(wb, "status", rrdhost_ingest_status_to_string(s->ingest.status));
|
||||
buffer_json_member_add_time_t(wb, "since", s->ingest.since);
|
||||
|
@ -272,15 +272,13 @@ static void rrdhost_sender_to_json(BUFFER *wb, RRDHOST_STATUS *s, const char *ke
|
|||
if (s->stream.status == RRDHOST_STREAM_STATUS_OFFLINE)
|
||||
buffer_json_member_add_string(wb, "reason", stream_handshake_error_to_string(s->stream.reason));
|
||||
|
||||
if (s->stream.status == RRDHOST_STREAM_STATUS_REPLICATING) {
|
||||
buffer_json_member_add_object(wb, "replication");
|
||||
{
|
||||
buffer_json_member_add_boolean(wb, "in_progress", s->stream.replication.in_progress);
|
||||
buffer_json_member_add_double(wb, "completion", s->stream.replication.completion);
|
||||
buffer_json_member_add_uint64(wb, "instances", s->stream.replication.instances);
|
||||
}
|
||||
buffer_json_object_close(wb);
|
||||
buffer_json_member_add_object(wb, "replication");
|
||||
{
|
||||
buffer_json_member_add_boolean(wb, "in_progress", s->stream.replication.in_progress);
|
||||
buffer_json_member_add_double(wb, "completion", s->stream.replication.completion);
|
||||
buffer_json_member_add_uint64(wb, "instances", s->stream.replication.instances);
|
||||
}
|
||||
buffer_json_object_close(wb); // replication
|
||||
|
||||
buffer_json_member_add_object(wb, "destination");
|
||||
{
|
||||
|
@ -300,35 +298,14 @@ static void rrdhost_sender_to_json(BUFFER *wb, RRDHOST_STATUS *s, const char *ke
|
|||
buffer_json_member_add_uint64(wb, "metadata", s->stream.sent_bytes_on_this_connection_per_type[STREAM_TRAFFIC_TYPE_METADATA]);
|
||||
buffer_json_member_add_uint64(wb, "functions", s->stream.sent_bytes_on_this_connection_per_type[STREAM_TRAFFIC_TYPE_FUNCTIONS]);
|
||||
buffer_json_member_add_uint64(wb, "replication", s->stream.sent_bytes_on_this_connection_per_type[STREAM_TRAFFIC_TYPE_REPLICATION]);
|
||||
buffer_json_member_add_uint64(wb, "dyncfg", s->stream.sent_bytes_on_this_connection_per_type[STREAM_TRAFFIC_TYPE_DYNCFG]);
|
||||
}
|
||||
buffer_json_object_close(wb); // traffic
|
||||
|
||||
buffer_json_member_add_array(wb, "candidates");
|
||||
struct rrdpush_destinations *d;
|
||||
for (d = s->host->destinations; d; d = d->next) {
|
||||
buffer_json_add_array_item_object(wb);
|
||||
buffer_json_member_add_uint64(wb, "attempts", d->attempts);
|
||||
{
|
||||
buffer_json_member_add_array(wb, "parents");
|
||||
rrdhost_stream_parents_to_json(wb, s);
|
||||
buffer_json_array_close(wb); // parents
|
||||
|
||||
if (d->ssl) {
|
||||
snprintfz(buf, sizeof(buf) - 1, "%s:SSL", string2str(d->destination));
|
||||
buffer_json_member_add_string(wb, "destination", buf);
|
||||
}
|
||||
else
|
||||
buffer_json_member_add_string(wb, "destination", string2str(d->destination));
|
||||
|
||||
buffer_json_member_add_time_t(wb, "since", d->since);
|
||||
buffer_json_member_add_time_t(wb, "age", s->now - d->since);
|
||||
buffer_json_member_add_string(wb, "last_handshake", stream_handshake_error_to_string(d->reason));
|
||||
if(d->postpone_reconnection_until > s->now) {
|
||||
buffer_json_member_add_time_t(wb, "next_check", d->postpone_reconnection_until);
|
||||
buffer_json_member_add_time_t(wb, "next_in", d->postpone_reconnection_until - s->now);
|
||||
}
|
||||
}
|
||||
buffer_json_object_close(wb); // each candidate
|
||||
}
|
||||
buffer_json_array_close(wb); // candidates
|
||||
rrdhost_stream_path_to_json(wb, s->host, STREAM_PATH_JSON_MEMBER, false);
|
||||
}
|
||||
buffer_json_object_close(wb); // destination
|
||||
}
|
||||
|
@ -365,7 +342,7 @@ static inline void rrdhost_health_to_json_v2(BUFFER *wb, const char *key, RRDHOS
|
|||
buffer_json_member_add_object(wb, key);
|
||||
{
|
||||
buffer_json_member_add_string(wb, "status", rrdhost_health_status_to_string(s->health.status));
|
||||
if (s->health.status == RRDHOST_HEALTH_STATUS_RUNNING) {
|
||||
if (s->health.status == RRDHOST_HEALTH_STATUS_RUNNING || s->health.status == RRDHOST_HEALTH_STATUS_INITIALIZING) {
|
||||
buffer_json_member_add_object(wb, "alerts");
|
||||
{
|
||||
buffer_json_member_add_uint64(wb, "critical", s->health.alerts.critical);
|
||||
|
|
|
@ -49,7 +49,7 @@ void buffer_json_agents_v2(BUFFER *wb, struct query_timings *timings, time_t now
|
|||
available_instances += __atomic_load_n(&host->rrdctx.instances_count, __ATOMIC_RELAXED);
|
||||
available_contexts += __atomic_load_n(&host->rrdctx.contexts_count, __ATOMIC_RELAXED);
|
||||
|
||||
if(rrdhost_flag_check(host, RRDHOST_FLAG_RRDPUSH_SENDER_CONNECTED))
|
||||
if(rrdhost_flag_check(host, RRDHOST_FLAG_STREAM_SENDER_CONNECTED))
|
||||
sending++;
|
||||
|
||||
if (rrdhost_is_online(host)) {
|
||||
|
@ -103,12 +103,12 @@ void buffer_json_agents_v2(BUFFER *wb, struct query_timings *timings, time_t now
|
|||
buffer_json_object_close(wb); // api
|
||||
|
||||
buffer_json_member_add_array(wb, "db_size");
|
||||
size_t group_seconds = localhost->rrd_update_every;
|
||||
size_t group_seconds;
|
||||
for (size_t tier = 0; tier < storage_tiers; tier++) {
|
||||
STORAGE_ENGINE *eng = localhost->db[tier].eng;
|
||||
if (!eng) continue;
|
||||
|
||||
group_seconds *= storage_tiers_grouping_iterations[tier];
|
||||
group_seconds = get_tier_grouping(tier) * localhost->rrd_update_every;
|
||||
uint64_t max = storage_engine_disk_space_max(eng->seb, localhost->db[tier].si);
|
||||
uint64_t used = storage_engine_disk_space_used(eng->seb, localhost->db[tier].si);
|
||||
#ifdef ENABLE_DBENGINE
|
||||
|
|
|
@ -81,7 +81,7 @@ void contexts_v2_alert_config_to_json_from_sql_alert_config_data(struct sql_aler
|
|||
{
|
||||
buffer_json_member_add_string(wb, "type", "agent");
|
||||
buffer_json_member_add_string(wb, "exec", t->notification.exec ? t->notification.exec : NULL);
|
||||
buffer_json_member_add_string(wb, "to", t->notification.to_key ? t->notification.to_key : string2str(localhost->health.health_default_recipient));
|
||||
buffer_json_member_add_string(wb, "to", t->notification.to_key ? t->notification.to_key : string2str(localhost->health.default_recipient));
|
||||
buffer_json_member_add_string(wb, "delay", t->notification.delay);
|
||||
buffer_json_member_add_string(wb, "repeat", t->notification.repeat);
|
||||
buffer_json_member_add_string(wb, "options", t->notification.options);
|
||||
|
|
|
@ -240,7 +240,7 @@ static void contexts_v2_alert_transition_callback(struct sql_alert_transition_da
|
|||
[ATF_CLASS] = t->classification,
|
||||
[ATF_TYPE] = t->type,
|
||||
[ATF_COMPONENT] = t->component,
|
||||
[ATF_ROLE] = t->recipient && *t->recipient ? t->recipient : string2str(localhost->health.health_default_recipient),
|
||||
[ATF_ROLE] = t->recipient && *t->recipient ? t->recipient : string2str(localhost->health.default_recipient),
|
||||
[ATF_NODE] = machine_guid,
|
||||
[ATF_ALERT_NAME] = t->alert_name,
|
||||
[ATF_CHART_NAME] = t->chart_name,
|
||||
|
@ -411,9 +411,9 @@ void contexts_v2_alert_transitions_to_json(BUFFER *wb, struct rrdcontext_to_json
|
|||
buffer_json_member_add_time_t(wb, "delay", t->delay);
|
||||
buffer_json_member_add_time_t(wb, "delay_up_to_time", t->delay_up_to_timestamp);
|
||||
health_entry_flags_to_json_array(wb, "flags", t->flags);
|
||||
buffer_json_member_add_string(wb, "exec", *t->exec ? t->exec : string2str(localhost->health.health_default_exec));
|
||||
buffer_json_member_add_string(wb, "exec", *t->exec ? t->exec : string2str(localhost->health.default_exec));
|
||||
buffer_json_member_add_uint64(wb, "exec_code", t->exec_code);
|
||||
buffer_json_member_add_string(wb, "to", *t->recipient ? t->recipient : string2str(localhost->health.health_default_recipient));
|
||||
buffer_json_member_add_string(wb, "to", *t->recipient ? t->recipient : string2str(localhost->health.default_recipient));
|
||||
}
|
||||
buffer_json_object_close(wb); // notification
|
||||
}
|
||||
|
|
|
@ -159,7 +159,7 @@ Then `x 2` is the worst case estimate for the dirty queue. If all collected metr
|
|||
|
||||
The memory we saved with the above is used to improve the LRU cache. So, although we reserved 32MiB for the LRU, in bigger setups (Netdata Parents) the LRU grows a lot more, within the limits of the equation.
|
||||
|
||||
In practice, the main cache sizes itself with `hot x 1.5` instead of `host x 2`. The reason is that 5% of the main cache is reserved for expanding open cache, 5% for expanding extent cache, and we need Room for the extensive buffers that are allocated in these setups. When the main cache exceeds `hot x 1.5` it enters a mode of critical evictions, and aggressively frees pages from the LRU to maintain a healthy memory footprint within its design limits.
|
||||
In practice, the main cache sizes itself with `hot x 1.5` instead of `hot x 2`. The reason is that 5% of the main cache is reserved for expanding open cache, 5% for expanding extent cache, and we need room for the extensive buffers that are allocated in these setups. When the main cache exceeds `hot x 1.5` it enters a mode of critical evictions, and aggressively frees pages from the LRU to maintain a healthy memory footprint within its design limits.
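To make the sizing rule concrete, here is a back-of-the-envelope sketch. The 1000 MiB hot size and the flat 5% reservations are made-up illustration values; the authoritative figures are whatever `dynamic_open_cache_size()` and `dynamic_extent_cache_size()` compute in this PR (which also apply minimum floors not shown here).

```c
#include <stdio.h>

int main(void) {
    double hot_mib = 1000.0;                             /* assumed: memory wanted by hot (collected) pages */

    double main_target_mib    = hot_mib * 1.5;           /* main cache aims at hot x 1.5, not hot x 2 */
    double open_reserve_mib   = main_target_mib * 0.05;  /* ~5% earmarked for growing the open cache */
    double extent_reserve_mib = main_target_mib * 0.05;  /* ~5% earmarked for growing the extent cache */

    printf("main cache target        : %.0f MiB\n", main_target_mib);     /* 1500 MiB */
    printf("open cache reservation   : %.0f MiB\n", open_reserve_mib);    /*   75 MiB */
    printf("extent cache reservation : %.0f MiB\n", extent_reserve_mib);  /*   75 MiB */
    printf("critical evictions start above %.0f MiB of main cache use\n", main_target_mib);
    return 0;
}
```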
|
||||
|
||||
#### Open Cache
|
||||
|
||||
|
|
File diff suppressed because it is too large
|
@ -14,12 +14,12 @@ typedef struct pgc_page PGC_PAGE;
|
|||
|
||||
typedef enum __attribute__ ((__packed__)) {
|
||||
PGC_OPTIONS_NONE = 0,
|
||||
PGC_OPTIONS_EVICT_PAGES_INLINE = (1 << 0),
|
||||
PGC_OPTIONS_FLUSH_PAGES_INLINE = (1 << 1),
|
||||
PGC_OPTIONS_AUTOSCALE = (1 << 2),
|
||||
PGC_OPTIONS_EVICT_PAGES_NO_INLINE = (1 << 0),
|
||||
PGC_OPTIONS_FLUSH_PAGES_NO_INLINE = (1 << 1),
|
||||
PGC_OPTIONS_AUTOSCALE = (1 << 2),
|
||||
} PGC_OPTIONS;
|
||||
|
||||
#define PGC_OPTIONS_DEFAULT (PGC_OPTIONS_EVICT_PAGES_INLINE | PGC_OPTIONS_FLUSH_PAGES_INLINE | PGC_OPTIONS_AUTOSCALE)
|
||||
#define PGC_OPTIONS_DEFAULT (PGC_OPTIONS_EVICT_PAGES_NO_INLINE | PGC_OPTIONS_AUTOSCALE)
|
||||
|
||||
typedef struct pgc_entry {
|
||||
Word_t section; // the section this belongs to
|
||||
|
@ -33,137 +33,138 @@ typedef struct pgc_entry {
|
|||
uint8_t *custom_data;
|
||||
} PGC_ENTRY;
|
||||
|
||||
#define PGC_CACHE_LINE_PADDING(x) uint8_t padding##x[64]
|
||||
struct pgc_size_histogram_entry {
|
||||
size_t upto;
|
||||
size_t count;
|
||||
};
|
||||
|
||||
#define PGC_SIZE_HISTOGRAM_ENTRIES 15
|
||||
#define PGC_QUEUE_HOT 0
|
||||
#define PGC_QUEUE_DIRTY 1
|
||||
#define PGC_QUEUE_CLEAN 2
|
||||
|
||||
struct pgc_size_histogram {
|
||||
struct pgc_size_histogram_entry array[PGC_SIZE_HISTOGRAM_ENTRIES];
|
||||
};
|
||||
|
||||
struct pgc_queue_statistics {
|
||||
size_t entries;
|
||||
size_t size;
|
||||
struct pgc_size_histogram size_histogram;
|
||||
|
||||
PGC_CACHE_LINE_PADDING(1);
|
||||
alignas(64) size_t entries;
|
||||
alignas(64) size_t size;
|
||||
|
||||
size_t max_entries;
|
||||
size_t max_size;
|
||||
alignas(64) size_t max_entries;
|
||||
alignas(64) size_t max_size;
|
||||
|
||||
PGC_CACHE_LINE_PADDING(2);
|
||||
alignas(64) size_t added_entries;
|
||||
alignas(64) size_t added_size;
|
||||
|
||||
size_t added_entries;
|
||||
size_t added_size;
|
||||
|
||||
PGC_CACHE_LINE_PADDING(3);
|
||||
|
||||
size_t removed_entries;
|
||||
size_t removed_size;
|
||||
|
||||
PGC_CACHE_LINE_PADDING(4);
|
||||
alignas(64) size_t removed_entries;
|
||||
alignas(64) size_t removed_size;
|
||||
};
|
||||
|
||||
struct pgc_statistics {
|
||||
size_t wanted_cache_size;
|
||||
size_t current_cache_size;
|
||||
alignas(64) size_t wanted_cache_size;
|
||||
alignas(64) size_t current_cache_size;
|
||||
|
||||
PGC_CACHE_LINE_PADDING(1);
|
||||
// ----------------------------------------------------------------------------------------------------------------
|
||||
// volume
|
||||
|
||||
size_t added_entries;
|
||||
size_t added_size;
|
||||
alignas(64) size_t entries; // all the entries (includes clean, dirty, hot)
|
||||
alignas(64) size_t size; // all the entries (includes clean, dirty, hot)
|
||||
|
||||
PGC_CACHE_LINE_PADDING(2);
|
||||
alignas(64) size_t referenced_entries; // all the entries currently referenced
|
||||
alignas(64) size_t referenced_size; // all the entries currently referenced
|
||||
|
||||
size_t removed_entries;
|
||||
size_t removed_size;
|
||||
alignas(64) size_t added_entries;
|
||||
alignas(64) size_t added_size;
|
||||
|
||||
PGC_CACHE_LINE_PADDING(3);
|
||||
|
||||
size_t entries; // all the entries (includes clean, dirty, hot)
|
||||
size_t size; // all the entries (includes clean, dirty, hot)
|
||||
|
||||
size_t evicting_entries;
|
||||
size_t evicting_size;
|
||||
|
||||
size_t flushing_entries;
|
||||
size_t flushing_size;
|
||||
|
||||
size_t hot2dirty_entries;
|
||||
size_t hot2dirty_size;
|
||||
|
||||
PGC_CACHE_LINE_PADDING(4);
|
||||
|
||||
size_t acquires;
|
||||
PGC_CACHE_LINE_PADDING(4a);
|
||||
size_t releases;
|
||||
PGC_CACHE_LINE_PADDING(4b);
|
||||
size_t acquires_for_deletion;
|
||||
PGC_CACHE_LINE_PADDING(4c);
|
||||
|
||||
size_t referenced_entries; // all the entries currently referenced
|
||||
size_t referenced_size; // all the entries currently referenced
|
||||
|
||||
PGC_CACHE_LINE_PADDING(5);
|
||||
|
||||
size_t searches_exact;
|
||||
size_t searches_exact_hits;
|
||||
size_t searches_exact_misses;
|
||||
|
||||
PGC_CACHE_LINE_PADDING(6);
|
||||
|
||||
size_t searches_closest;
|
||||
size_t searches_closest_hits;
|
||||
size_t searches_closest_misses;
|
||||
|
||||
PGC_CACHE_LINE_PADDING(7);
|
||||
|
||||
size_t flushes_completed;
|
||||
size_t flushes_completed_size;
|
||||
size_t flushes_cancelled;
|
||||
size_t flushes_cancelled_size;
|
||||
alignas(64) size_t removed_entries;
|
||||
alignas(64) size_t removed_size;
|
||||
|
||||
#ifdef PGC_COUNT_POINTS_COLLECTED
|
||||
PGC_CACHE_LINE_PADDING(8);
|
||||
size_t points_collected;
|
||||
alignas(64) size_t points_collected;
|
||||
#endif
|
||||
|
||||
PGC_CACHE_LINE_PADDING(9);
|
||||
// ----------------------------------------------------------------------------------------------------------------
|
||||
// migrations
|
||||
|
||||
size_t insert_spins;
|
||||
size_t evict_spins;
|
||||
size_t release_spins;
|
||||
size_t acquire_spins;
|
||||
size_t delete_spins;
|
||||
size_t flush_spins;
|
||||
alignas(64) size_t evicting_entries;
|
||||
alignas(64) size_t evicting_size;
|
||||
|
||||
PGC_CACHE_LINE_PADDING(10);
|
||||
alignas(64) size_t flushing_entries;
|
||||
alignas(64) size_t flushing_size;
|
||||
|
||||
size_t workers_search;
|
||||
size_t workers_add;
|
||||
size_t workers_evict;
|
||||
size_t workers_flush;
|
||||
size_t workers_jv2_flush;
|
||||
size_t workers_hot2dirty;
|
||||
alignas(64) size_t hot2dirty_entries;
|
||||
alignas(64) size_t hot2dirty_size;
|
||||
|
||||
size_t evict_skipped;
|
||||
size_t hot_empty_pages_evicted_immediately;
|
||||
size_t hot_empty_pages_evicted_later;
|
||||
alignas(64) size_t hot_empty_pages_evicted_immediately;
|
||||
alignas(64) size_t hot_empty_pages_evicted_later;
|
||||
|
||||
PGC_CACHE_LINE_PADDING(11);
|
||||
// ----------------------------------------------------------------------------------------------------------------
|
||||
// workload
|
||||
|
||||
// events
|
||||
size_t events_cache_under_severe_pressure;
|
||||
size_t events_cache_needs_space_aggressively;
|
||||
size_t events_flush_critical;
|
||||
alignas(64) size_t acquires;
|
||||
alignas(64) size_t releases;
|
||||
|
||||
PGC_CACHE_LINE_PADDING(12);
|
||||
alignas(64) size_t acquires_for_deletion;
|
||||
|
||||
struct {
|
||||
PGC_CACHE_LINE_PADDING(0);
|
||||
struct pgc_queue_statistics hot;
|
||||
PGC_CACHE_LINE_PADDING(1);
|
||||
struct pgc_queue_statistics dirty;
|
||||
PGC_CACHE_LINE_PADDING(2);
|
||||
struct pgc_queue_statistics clean;
|
||||
PGC_CACHE_LINE_PADDING(3);
|
||||
} queues;
|
||||
alignas(64) size_t searches_exact;
|
||||
alignas(64) size_t searches_exact_hits;
|
||||
alignas(64) size_t searches_exact_misses;
|
||||
|
||||
alignas(64) size_t searches_closest;
|
||||
alignas(64) size_t searches_closest_hits;
|
||||
alignas(64) size_t searches_closest_misses;
|
||||
|
||||
alignas(64) size_t flushes_completed;
|
||||
alignas(64) size_t flushes_completed_size;
|
||||
alignas(64) size_t flushes_cancelled_size;
|
||||
|
||||
// ----------------------------------------------------------------------------------------------------------------
|
||||
// critical events
|
||||
|
||||
alignas(64) size_t events_cache_under_severe_pressure;
|
||||
alignas(64) size_t events_cache_needs_space_aggressively;
|
||||
alignas(64) size_t events_flush_critical;
|
||||
|
||||
// ----------------------------------------------------------------------------------------------------------------
|
||||
// worker threads
|
||||
|
||||
alignas(64) size_t workers_search;
|
||||
alignas(64) size_t workers_add;
|
||||
alignas(64) size_t workers_evict;
|
||||
alignas(64) size_t workers_flush;
|
||||
alignas(64) size_t workers_jv2_flush;
|
||||
alignas(64) size_t workers_hot2dirty;
|
||||
|
||||
// ----------------------------------------------------------------------------------------------------------------
|
||||
// waste events
|
||||
|
||||
// waste events - spins
|
||||
alignas(64) size_t waste_insert_spins;
|
||||
alignas(64) size_t waste_evict_useless_spins;
|
||||
alignas(64) size_t waste_release_spins;
|
||||
alignas(64) size_t waste_acquire_spins;
|
||||
alignas(64) size_t waste_delete_spins;
|
||||
|
||||
// waste events - eviction
|
||||
alignas(64) size_t waste_evict_relocated;
|
||||
alignas(64) size_t waste_evict_thread_signals;
|
||||
alignas(64) size_t waste_evictions_inline_on_add;
|
||||
alignas(64) size_t waste_evictions_inline_on_release;
|
||||
|
||||
// waste events - flushing
|
||||
alignas(64) size_t waste_flush_on_add;
|
||||
alignas(64) size_t waste_flush_on_release;
|
||||
alignas(64) size_t waste_flushes_cancelled;
|
||||
|
||||
// ----------------------------------------------------------------------------------------------------------------
|
||||
// per queue statistics
|
||||
|
||||
struct pgc_queue_statistics queues[3];
|
||||
};
|
||||
|
||||
|
||||
typedef void (*free_clean_page_callback)(PGC *cache, PGC_ENTRY entry);
|
||||
typedef void (*save_dirty_page_callback)(PGC *cache, PGC_ENTRY *entries_array, PGC_PAGE **pages_array, size_t entries);
|
||||
typedef void (*save_dirty_init_callback)(PGC *cache, Word_t section);
|
||||
|
@ -225,7 +226,7 @@ size_t pgc_get_current_cache_size(PGC *cache);
|
|||
size_t pgc_get_wanted_cache_size(PGC *cache);
|
||||
|
||||
// resetting the end time of a hot page
|
||||
void pgc_page_hot_set_end_time_s(PGC *cache, PGC_PAGE *page, time_t end_time_s);
|
||||
void pgc_page_hot_set_end_time_s(PGC *cache, PGC_PAGE *page, time_t end_time_s, size_t additional_bytes);
|
||||
bool pgc_page_to_clean_evict_or_release(PGC *cache, PGC_PAGE *page);
|
||||
|
||||
typedef void (*migrate_to_v2_callback)(Word_t section, unsigned datafile_fileno, uint8_t type, Pvoid_t JudyL_metrics, Pvoid_t JudyL_extents_pos, size_t count_of_unique_extents, size_t count_of_unique_metrics, size_t count_of_unique_pages, void *data);
|
||||
|
@ -237,26 +238,33 @@ size_t pgc_count_hot_pages_having_data_ptr(PGC *cache, Word_t section, void *ptr
|
|||
typedef size_t (*dynamic_target_cache_size_callback)(void);
|
||||
void pgc_set_dynamic_target_cache_size_callback(PGC *cache, dynamic_target_cache_size_callback callback);
|
||||
|
||||
typedef size_t (*nominal_page_size_callback)(void *);
|
||||
void pgc_set_nominal_page_size_callback(PGC *cache, nominal_page_size_callback callback);
|
||||
|
||||
// return true when there is more work to do
|
||||
bool pgc_evict_pages(PGC *cache, size_t max_skip, size_t max_evict);
|
||||
bool pgc_flush_pages(PGC *cache, size_t max_flushes);
|
||||
bool pgc_flush_pages(PGC *cache);
|
||||
|
||||
struct pgc_statistics pgc_get_statistics(PGC *cache);
|
||||
size_t pgc_hot_and_dirty_entries(PGC *cache);
|
||||
|
||||
struct aral_statistics *pgc_aral_statistics(void);
|
||||
size_t pgc_aral_structures(void);
|
||||
size_t pgc_aral_overhead(void);
|
||||
|
||||
static inline size_t indexing_partition(Word_t ptr, Word_t modulo) __attribute__((const));
|
||||
static inline size_t indexing_partition(Word_t ptr, Word_t modulo) {
|
||||
#ifdef ENV64BIT
|
||||
uint64_t hash = murmur64(ptr);
|
||||
XXH64_hash_t hash = XXH3_64bits(&ptr, sizeof(ptr));
|
||||
return hash % modulo;
|
||||
#else
|
||||
uint32_t hash = murmur32(ptr);
|
||||
return hash % modulo;
|
||||
#endif
|
||||
}
|
||||
|
||||
long get_netdata_cpus(void);
|
||||
|
||||
static inline size_t pgc_max_evictors(void) {
|
||||
return 1 + get_netdata_cpus() / 2;
|
||||
}
|
||||
|
||||
static inline size_t pgc_max_flushers(void) {
|
||||
return get_netdata_cpus();
|
||||
}
|
||||
|
||||
#endif // DBENGINE_CACHE_H
|
||||
|
|
|
@ -251,7 +251,7 @@ int create_data_file(struct rrdengine_datafile *datafile)
|
|||
char path[RRDENG_PATH_MAX];
|
||||
|
||||
generate_datafilepath(datafile, path, sizeof(path));
|
||||
fd = open_file_for_io(path, O_CREAT | O_RDWR | O_TRUNC, &file, use_direct_io);
|
||||
fd = open_file_for_io(path, O_CREAT | O_RDWR | O_TRUNC, &file, dbengine_use_direct_io);
|
||||
if (fd < 0) {
|
||||
ctx_fs_error(ctx);
|
||||
return fd;
|
||||
|
@ -334,7 +334,7 @@ static int load_data_file(struct rrdengine_datafile *datafile)
|
|||
char path[RRDENG_PATH_MAX];
|
||||
|
||||
generate_datafilepath(datafile, path, sizeof(path));
|
||||
fd = open_file_for_io(path, O_RDWR, &file, use_direct_io);
|
||||
fd = open_file_for_io(path, O_RDWR, &file, dbengine_use_direct_io);
|
||||
if (fd < 0) {
|
||||
ctx_fs_error(ctx);
|
||||
return fd;
|
||||
|
|
|
@ -22,13 +22,13 @@ static RRDHOST *dbengine_rrdhost_find_or_create(char *name) {
|
|||
default_rrd_history_entries,
|
||||
RRD_MEMORY_MODE_DBENGINE,
|
||||
health_plugin_enabled(),
|
||||
stream_conf_send_enabled,
|
||||
stream_conf_send_destination,
|
||||
stream_conf_send_api_key,
|
||||
stream_conf_send_charts_matching,
|
||||
stream_conf_replication_enabled,
|
||||
stream_conf_replication_period,
|
||||
stream_conf_replication_step,
|
||||
stream_send.enabled,
|
||||
stream_send.parents.destination,
|
||||
stream_send.api_key,
|
||||
stream_send.send_charts_matching,
|
||||
stream_receive.replication.enabled,
|
||||
stream_receive.replication.period,
|
||||
stream_receive.replication.step,
|
||||
NULL,
|
||||
0
|
||||
);
|
||||
|
|
|
@ -108,13 +108,13 @@ static RRDHOST *dbengine_rrdhost_find_or_create(char *name) {
|
|||
default_rrd_history_entries,
|
||||
RRD_MEMORY_MODE_DBENGINE,
|
||||
health_plugin_enabled(),
|
||||
stream_conf_send_enabled,
|
||||
stream_conf_send_destination,
|
||||
stream_conf_send_api_key,
|
||||
stream_conf_send_charts_matching,
|
||||
stream_conf_replication_enabled,
|
||||
stream_conf_replication_period,
|
||||
stream_conf_replication_step,
|
||||
stream_send.enabled,
|
||||
stream_send.parents.destination,
|
||||
stream_send.api_key,
|
||||
stream_send.send_charts_matching,
|
||||
stream_receive.replication.enabled,
|
||||
stream_receive.replication.period,
|
||||
stream_receive.replication.step,
|
||||
NULL,
|
||||
0
|
||||
);
|
||||
|
|
|
@ -577,7 +577,7 @@ int journalfile_create(struct rrdengine_journalfile *journalfile, struct rrdengi
|
|||
char path[RRDENG_PATH_MAX];
|
||||
|
||||
journalfile_v1_generate_path(datafile, path, sizeof(path));
|
||||
fd = open_file_for_io(path, O_CREAT | O_RDWR | O_TRUNC, &file, use_direct_io);
|
||||
fd = open_file_for_io(path, O_CREAT | O_RDWR | O_TRUNC, &file, dbengine_use_direct_io);
|
||||
if (fd < 0) {
|
||||
ctx_fs_error(ctx);
|
||||
return fd;
|
||||
|
@ -1522,7 +1522,7 @@ int journalfile_load(struct rrdengine_instance *ctx, struct rrdengine_journalfil
|
|||
|
||||
journalfile_v1_generate_path(datafile, path, sizeof(path));
|
||||
|
||||
fd = open_file_for_io(path, O_RDWR, &file, use_direct_io);
|
||||
fd = open_file_for_io(path, O_RDWR, &file, dbengine_use_direct_io);
|
||||
if (fd < 0) {
|
||||
ctx_fs_error(ctx);
|
||||
|
||||
|
|
|
@ -375,6 +375,7 @@ inline MRG *mrg_create(ssize_t partitions) {
|
|||
|
||||
mrg->index[i].aral = aral_create(buf, sizeof(METRIC), 0, 16384, &mrg_aral_statistics, NULL, NULL, false, false);
|
||||
}
|
||||
telemetry_aral_register(mrg->index[0].aral, "mrg");
|
||||
|
||||
return mrg;
|
||||
}
|
||||
|
@ -394,7 +395,7 @@ inline void mrg_destroy(MRG *mrg __maybe_unused) {
|
|||
// to delete entries, the caller needs to keep pointers to them
|
||||
// and delete them one by one
|
||||
|
||||
;
|
||||
telemetry_aral_unregister(mrg->index[0].aral);
|
||||
}
|
||||
|
||||
inline METRIC *mrg_metric_add_and_acquire(MRG *mrg, MRG_ENTRY entry, bool *ret) {
|
||||
|
|
|
@ -6,6 +6,8 @@
|
|||
|
||||
typedef enum __attribute__((packed)) {
|
||||
PAGE_OPTION_ALL_VALUES_EMPTY = (1 << 0),
|
||||
PAGE_OPTION_ARAL_MARKED = (1 << 1),
|
||||
PAGE_OPTION_ARAL_UNMARKED = (1 << 2),
|
||||
} PAGE_OPTIONS;
|
||||
|
||||
typedef enum __attribute__((packed)) {
|
||||
|
@ -17,31 +19,32 @@ typedef enum __attribute__((packed)) {
|
|||
|
||||
typedef struct {
|
||||
uint8_t *data;
|
||||
uint32_t size;
|
||||
uint16_t size;
|
||||
} page_raw_t;
|
||||
|
||||
|
||||
typedef struct {
|
||||
size_t num_buffers;
|
||||
gorilla_writer_t *writer;
|
||||
int aral_index;
|
||||
uint16_t num_buffers;
|
||||
} page_gorilla_t;
|
||||
|
||||
struct pgd {
|
||||
// the used number of slots in the page
|
||||
uint16_t used;
|
||||
|
||||
// the total number of slots available in the page
|
||||
uint16_t slots;
|
||||
|
||||
// the page type
|
||||
uint8_t type;
|
||||
|
||||
// options related to the page
|
||||
// the partition this pgd was allocated from
|
||||
uint8_t partition;
|
||||
|
||||
// options related to the page
|
||||
PAGE_OPTIONS options;
|
||||
|
||||
PGD_STATES states;
|
||||
|
||||
// the used number of slots in the page
|
||||
uint32_t used;
|
||||
|
||||
// the total number of slots available in the page
|
||||
uint32_t slots;
|
||||
|
||||
union {
|
||||
page_raw_t raw;
|
||||
page_gorilla_t gorilla;
|
||||
|
@ -51,118 +54,241 @@ struct pgd {
|
|||
// ----------------------------------------------------------------------------
|
||||
// memory management
|
||||
|
||||
#define ARAL_TOLERANCE_TO_DEDUP 7 // deduplicate aral sizes, if the delta is below this number of bytes
|
||||
#define PGD_ARAL_PARTITIONS 4
|
||||
|
||||
struct {
|
||||
ARAL *aral_pgd;
|
||||
ARAL *aral_data[RRD_STORAGE_TIERS];
|
||||
ARAL *aral_gorilla_buffer[4];
|
||||
ARAL *aral_gorilla_writer[4];
|
||||
size_t sizeof_pgd;
|
||||
size_t sizeof_gorilla_writer_t;
|
||||
size_t sizeof_gorilla_buffer_32bit;
|
||||
|
||||
ARAL *aral_pgd[PGD_ARAL_PARTITIONS];
|
||||
ARAL *aral_gorilla_buffer[PGD_ARAL_PARTITIONS];
|
||||
ARAL *aral_gorilla_writer[PGD_ARAL_PARTITIONS];
|
||||
} pgd_alloc_globals = {};
|
||||
|
||||
static ARAL *pgd_aral_data_lookup(size_t size)
|
||||
{
|
||||
for (size_t tier = 0; tier < storage_tiers; tier++)
|
||||
if (size == tier_page_size[tier])
|
||||
return pgd_alloc_globals.aral_data[tier];
|
||||
#if RRD_STORAGE_TIERS != 5
|
||||
#error "You need to update the slots reserved for storage tiers"
|
||||
#endif
|
||||
|
||||
return NULL;
|
||||
static struct aral_statistics aral_statistics_for_pgd = { 0 };
|
||||
|
||||
static size_t aral_sizes_delta;
|
||||
static size_t aral_sizes_count;
|
||||
static size_t aral_sizes[] = {
|
||||
// // leave space for the storage tier page sizes
|
||||
[RRD_STORAGE_TIERS - 5] = 0,
|
||||
[RRD_STORAGE_TIERS - 4] = 0,
|
||||
[RRD_STORAGE_TIERS - 3] = 0,
|
||||
[RRD_STORAGE_TIERS - 2] = 0,
|
||||
[RRD_STORAGE_TIERS - 1] = 0,
|
||||
|
||||
// gorilla buffer size
|
||||
RRDENG_GORILLA_32BIT_BUFFER_SIZE,
|
||||
|
||||
// our structures
|
||||
sizeof(gorilla_writer_t),
|
||||
sizeof(PGD),
|
||||
};
|
||||
static ARAL **arals = NULL;
|
||||
|
||||
#define arals_slot(slot, partition) ((partition) * aral_sizes_count + (slot))
|
||||
static ARAL *pgd_get_aral_by_size_and_partition(size_t size, size_t partition);
|
||||
|
||||
size_t pgd_aral_structures(void) {
|
||||
return aral_structures(pgd_alloc_globals.aral_pgd[0]);
|
||||
}
|
||||
|
||||
void pgd_init_arals(void)
|
||||
{
|
||||
// pgd aral
|
||||
{
|
||||
char buf[20 + 1];
|
||||
snprintfz(buf, sizeof(buf) - 1, "pgd");
|
||||
size_t pgd_aral_overhead(void) {
|
||||
return aral_overhead(pgd_alloc_globals.aral_pgd[0]);
|
||||
}
|
||||
|
||||
// FIXME: add stats
|
||||
pgd_alloc_globals.aral_pgd = aral_create(
|
||||
buf,
|
||||
sizeof(struct pgd),
|
||||
64,
|
||||
512 * (sizeof(struct pgd)),
|
||||
pgc_aral_statistics(),
|
||||
NULL, NULL, false, false);
|
||||
int aral_size_sort_compare(const void *a, const void *b) {
|
||||
size_t size_a = *(const size_t *)a;
|
||||
size_t size_b = *(const size_t *)b;
|
||||
return (size_a > size_b) - (size_a < size_b);
|
||||
}
|
||||
|
||||
void pgd_init_arals(void) {
|
||||
aral_sizes_count = _countof(aral_sizes);
|
||||
|
||||
for(size_t i = 0; i < RRD_STORAGE_TIERS ;i++)
|
||||
aral_sizes[i] = tier_page_size[i];
|
||||
|
||||
size_t max_delta = 0;
|
||||
for(size_t i = 0; i < aral_sizes_count ;i++) {
|
||||
size_t wanted = aral_sizes[i];
|
||||
size_t usable = aral_sizes[i]; /* aral_allocation_slot_size(wanted, true);*/
|
||||
internal_fatal(usable < wanted, "usable cannot be less than wanted");
|
||||
if(usable > wanted && usable - wanted > max_delta)
|
||||
max_delta = usable - wanted;
|
||||
|
||||
aral_sizes[i] = usable;
|
||||
}
|
||||
aral_sizes_delta = max_delta + ARAL_TOLERANCE_TO_DEDUP;
|
||||
|
||||
// tier page aral
|
||||
{
|
||||
for (size_t i = storage_tiers; i > 0 ;i--)
|
||||
{
|
||||
size_t tier = storage_tiers - i;
|
||||
// sort the array
|
||||
qsort(aral_sizes, aral_sizes_count, sizeof(size_t), aral_size_sort_compare);
|
||||
|
||||
char buf[20 + 1];
|
||||
snprintfz(buf, sizeof(buf) - 1, "tier%zu-pages", tier);
|
||||
// deduplicate (with some tolerance)
|
||||
size_t unique_count = 1;
|
||||
for (size_t i = 1; i < aral_sizes_count; ++i) {
|
||||
if (aral_sizes[i] > aral_sizes[unique_count - 1] + aral_sizes_delta)
|
||||
aral_sizes[unique_count++] = aral_sizes[i];
|
||||
else
|
||||
aral_sizes[unique_count - 1] = aral_sizes[i];
|
||||
}
|
||||
aral_sizes_count = unique_count;
|
||||
|
||||
pgd_alloc_globals.aral_data[tier] = aral_create(
|
||||
buf,
|
||||
tier_page_size[tier],
|
||||
64,
|
||||
512 * (tier_page_size[tier]),
|
||||
pgc_aral_statistics(),
|
||||
NULL, NULL, false, false);
|
||||
// clear the rest
|
||||
for(size_t i = unique_count; i < _countof(aral_sizes) ;i++)
|
||||
aral_sizes[i] = 0;
|
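Because the old and new lines of `pgd_init_arals()` are interleaved above, here is a distilled, self-contained sketch of its sort-and-dedup-with-tolerance step. The sizes in the array are made up; in the real code they come from `tier_page_size[]`, `RRDENG_GORILLA_32BIT_BUFFER_SIZE`, `sizeof(gorilla_writer_t)` and `sizeof(PGD)`, and the tolerance is `ARAL_TOLERANCE_TO_DEDUP` bytes.

```c
#include <stdio.h>
#include <stdlib.h>

static int size_cmp(const void *a, const void *b) {
    size_t sa = *(const size_t *)a, sb = *(const size_t *)b;
    return (sa > sb) - (sa < sb);
}

int main(void) {
    size_t sizes[] = { 4096, 120, 128, 2048, 2050, 4096 };   /* made-up allocation sizes */
    size_t count = sizeof(sizes) / sizeof(sizes[0]);
    size_t tolerance = 7;                                    /* mirrors ARAL_TOLERANCE_TO_DEDUP */

    qsort(sizes, count, sizeof(size_t), size_cmp);

    /* keep a size only if it is more than `tolerance` bytes above the last kept one;
       otherwise widen the last kept slot to cover it (same logic as the loop above) */
    size_t unique = 1;
    for (size_t i = 1; i < count; i++) {
        if (sizes[i] > sizes[unique - 1] + tolerance)
            sizes[unique++] = sizes[i];
        else
            sizes[unique - 1] = sizes[i];
    }

    for (size_t i = 0; i < unique; i++)
        printf("aral size class: %zu bytes\n", sizes[i]);    /* 120, 128, 2050, 4096 */
    return 0;
}
```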
||||
|
||||
// allocate all the arals
|
||||
arals = callocz(aral_sizes_count * PGD_ARAL_PARTITIONS, sizeof(ARAL *));
|
||||
for(size_t slot = 0; slot < aral_sizes_count ; slot++) {
|
||||
for(size_t partition = 0; partition < PGD_ARAL_PARTITIONS; partition++) {
|
||||
|
||||
if(partition > 0 && aral_sizes[slot] > 128) {
|
||||
// do not create partitions for sizes above 128 bytes
|
||||
// use the first partition for all of them
|
||||
arals[arals_slot(slot, partition)] = arals[arals_slot(slot, 0)];
|
||||
continue;
|
||||
}
|
||||
|
||||
char buf[32];
|
||||
snprintfz(buf, sizeof(buf), "pgd-%zu-%zu", aral_sizes[slot], partition);
|
||||
|
||||
arals[arals_slot(slot, partition)] = aral_create(
|
||||
buf,
|
||||
aral_sizes[slot],
|
||||
0,
|
||||
0,
|
||||
&aral_statistics_for_pgd,
|
||||
NULL, NULL, false, false);
|
||||
}
|
||||
}
|
||||
|
||||
// gorilla buffers aral
|
||||
for (size_t i = 0; i != 4; i++) {
|
||||
char buf[20 + 1];
|
||||
snprintfz(buf, sizeof(buf) - 1, "gbuffer-%zu", i);
|
||||
for(size_t p = 0; p < PGD_ARAL_PARTITIONS ;p++) {
|
||||
pgd_alloc_globals.aral_pgd[p] = pgd_get_aral_by_size_and_partition(sizeof(PGD), p);
|
||||
pgd_alloc_globals.aral_gorilla_writer[p] = pgd_get_aral_by_size_and_partition(sizeof(gorilla_writer_t), p);
|
||||
pgd_alloc_globals.aral_gorilla_buffer[p] = pgd_get_aral_by_size_and_partition(RRDENG_GORILLA_32BIT_BUFFER_SIZE, p);
|
||||
|
||||
// FIXME: add stats
|
||||
pgd_alloc_globals.aral_gorilla_buffer[i] = aral_create(
|
||||
buf,
|
||||
RRDENG_GORILLA_32BIT_BUFFER_SIZE,
|
||||
64,
|
||||
512 * RRDENG_GORILLA_32BIT_BUFFER_SIZE,
|
||||
pgc_aral_statistics(),
|
||||
NULL, NULL, false, false);
|
||||
internal_fatal(!pgd_alloc_globals.aral_pgd[p] ||
|
||||
!pgd_alloc_globals.aral_gorilla_writer[p] ||
|
||||
!pgd_alloc_globals.aral_gorilla_buffer[p]
|
||||
, "required PGD aral sizes not found");
|
||||
}
|
||||
|
||||
// gorilla writers aral
|
||||
for (size_t i = 0; i != 4; i++) {
|
||||
char buf[20 + 1];
|
||||
snprintfz(buf, sizeof(buf) - 1, "gwriter-%zu", i);
|
||||
pgd_alloc_globals.sizeof_pgd = aral_actual_element_size(pgd_alloc_globals.aral_pgd[0]);
|
||||
pgd_alloc_globals.sizeof_gorilla_writer_t = aral_actual_element_size(pgd_alloc_globals.aral_gorilla_writer[0]);
|
||||
pgd_alloc_globals.sizeof_gorilla_buffer_32bit = aral_actual_element_size(pgd_alloc_globals.aral_gorilla_buffer[0]);
|
||||
|
||||
// FIXME: add stats
|
||||
pgd_alloc_globals.aral_gorilla_writer[i] = aral_create(
|
||||
buf,
|
||||
sizeof(gorilla_writer_t),
|
||||
64,
|
||||
512 * sizeof(gorilla_writer_t),
|
||||
pgc_aral_statistics(),
|
||||
NULL, NULL, false, false);
|
||||
}
|
||||
telemetry_aral_register(pgd_alloc_globals.aral_pgd[0], "pgd");
|
||||
}
|
||||
|
||||
static void *pgd_data_aral_alloc(size_t size)
|
||||
{
|
||||
ARAL *ar = pgd_aral_data_lookup(size);
|
||||
if (!ar)
|
||||
static ARAL *pgd_get_aral_by_size_and_partition(size_t size, size_t partition) {
|
||||
internal_fatal(partition >= PGD_ARAL_PARTITIONS, "Wrong partition %zu", partition);
|
||||
|
||||
size_t slot;
|
||||
|
||||
if (size <= aral_sizes[0])
|
||||
slot = 0;
|
||||
|
||||
else if (size > aral_sizes[aral_sizes_count - 1])
|
||||
return NULL;
|
||||
|
||||
else {
|
||||
// binary search for the smallest size >= requested size
|
||||
size_t low = 0, high = aral_sizes_count - 1;
|
||||
while (low < high) {
|
||||
size_t mid = low + (high - low) / 2;
|
||||
if (aral_sizes[mid] >= size)
|
||||
high = mid;
|
||||
else
|
||||
low = mid + 1;
|
||||
}
|
||||
slot = low; // This is the smallest index where aral_sizes[slot] >= size
|
||||
}
|
||||
internal_fatal(slot >= aral_sizes_count || aral_sizes[slot] < size, "Invalid PGD size binary search");
|
||||
|
||||
ARAL *ar = arals[arals_slot(slot, partition)];
|
||||
internal_fatal(!ar || aral_requested_element_size(ar) < size, "Invalid PGD aral lookup");
|
||||
return ar;
|
||||
}
|
||||
|
||||
static inline gorilla_writer_t *pgd_gorilla_writer_alloc(size_t partition) {
|
||||
internal_fatal(partition >= PGD_ARAL_PARTITIONS, "invalid gorilla writer partition %zu", partition);
|
||||
return aral_mallocz_marked(pgd_alloc_globals.aral_gorilla_writer[partition]);
|
||||
}
|
||||
|
||||
static inline gorilla_buffer_t *pgd_gorilla_buffer_alloc(size_t partition) {
|
||||
internal_fatal(partition >= PGD_ARAL_PARTITIONS, "invalid gorilla buffer partition %zu", partition);
|
||||
return aral_mallocz_marked(pgd_alloc_globals.aral_gorilla_buffer[partition]);
|
||||
}
|
||||
|
||||
static inline PGD *pgd_alloc(bool for_collector) {
|
||||
size_t partition = gettid_cached() % PGD_ARAL_PARTITIONS;
|
||||
PGD *pgd;
|
||||
|
||||
if(for_collector)
|
||||
pgd = aral_mallocz_marked(pgd_alloc_globals.aral_pgd[partition]);
|
||||
else
|
||||
pgd = aral_mallocz(pgd_alloc_globals.aral_pgd[partition]);
|
||||
|
||||
pgd->partition = partition;
|
||||
return pgd;
|
||||
}
|
||||
|
||||
static inline void *pgd_data_alloc(size_t size, size_t partition, bool for_collector) {
|
||||
ARAL *ar = pgd_get_aral_by_size_and_partition(size, partition);
|
||||
if(ar) {
|
||||
if(for_collector)
|
||||
return aral_mallocz_marked(ar);
|
||||
else
|
||||
return aral_mallocz(ar);
|
||||
}
|
||||
else
|
||||
return mallocz(size);
|
||||
else
|
||||
return aral_mallocz(ar);
|
||||
}
|
||||
|
||||
static void pgd_data_aral_free(void *page, size_t size)
|
||||
{
|
||||
ARAL *ar = pgd_aral_data_lookup(size);
|
||||
if (!ar)
|
||||
freez(page);
|
||||
else
|
||||
static void pgd_data_free(void *page, size_t size, size_t partition) {
|
||||
ARAL *ar = pgd_get_aral_by_size_and_partition(size, partition);
|
||||
if(ar)
|
||||
aral_freez(ar, page);
|
||||
else
|
||||
freez(page);
|
||||
timing_dbengine_evict_step(TIMING_STEP_DBENGINE_EVICT_FREE_MAIN_PGD_TIER1_ARAL);
|
||||
}
|
||||
|
||||
static void pgd_data_unmark(void *page, size_t size, size_t partition) {
|
||||
if(!page) return;
|
||||
|
||||
ARAL *ar = pgd_get_aral_by_size_and_partition(size, partition);
|
||||
if(ar)
|
||||
aral_unmark_allocation(ar, page);
|
||||
}
|
||||
|
||||
static size_t pgd_data_footprint(size_t size, size_t partition) {
|
||||
ARAL *ar = pgd_get_aral_by_size_and_partition(size, partition);
|
||||
if(ar)
|
||||
return aral_actual_element_size(ar);
|
||||
else
|
||||
return size;
|
||||
}
|
||||
|
||||
// ----------------------------------------------------------------------------
|
||||
// management api
|
||||
|
||||
PGD *pgd_create(uint8_t type, uint32_t slots)
|
||||
{
|
||||
PGD *pg = aral_mallocz(pgd_alloc_globals.aral_pgd);
|
||||
PGD *pgd_create(uint8_t type, uint32_t slots) {
|
||||
|
||||
PGD *pg = pgd_alloc(true); // this is malloc'd !
|
||||
pg->type = type;
|
||||
pg->states = PGD_STATE_CREATED_FROM_COLLECTOR;
|
||||
pg->options = PAGE_OPTION_ALL_VALUES_EMPTY | PAGE_OPTION_ARAL_MARKED;
|
||||
|
||||
pg->used = 0;
|
||||
pg->slots = slots;
|
||||
pg->options = PAGE_OPTION_ALL_VALUES_EMPTY;
|
||||
pg->states = PGD_STATE_CREATED_FROM_COLLECTOR;
|
||||
|
||||
switch (type) {
|
||||
case RRDENG_PAGE_TYPE_ARRAY_32BIT:
|
||||
|
@ -173,23 +299,20 @@ PGD *pgd_create(uint8_t type, uint32_t slots)
|
|||
"DBENGINE: invalid number of slots (%u) or page type (%u)", slots, type);
|
||||
|
||||
pg->raw.size = size;
|
||||
pg->raw.data = pgd_data_aral_alloc(size);
|
||||
pg->raw.data = pgd_data_alloc(size, pg->partition, true);
|
||||
break;
|
||||
}
|
||||
case RRDENG_PAGE_TYPE_GORILLA_32BIT: {
|
||||
internal_fatal(slots == 1,
|
||||
"DBENGINE: invalid number of slots (%u) or page type (%u)", slots, type);
|
||||
|
||||
pg->slots = 8 * RRDENG_GORILLA_32BIT_BUFFER_SLOTS;
|
||||
|
||||
// allocate new gorilla writer
|
||||
pg->gorilla.aral_index = gettid_cached() % 4;
|
||||
pg->gorilla.writer = aral_mallocz(pgd_alloc_globals.aral_gorilla_writer[pg->gorilla.aral_index]);
|
||||
pg->gorilla.writer = pgd_gorilla_writer_alloc(pg->partition);
|
||||
|
||||
// allocate new gorilla buffer
|
||||
gorilla_buffer_t *gbuf = aral_mallocz(pgd_alloc_globals.aral_gorilla_buffer[pg->gorilla.aral_index]);
|
||||
gorilla_buffer_t *gbuf = pgd_gorilla_buffer_alloc(pg->partition);
|
||||
memset(gbuf, 0, RRDENG_GORILLA_32BIT_BUFFER_SIZE);
|
||||
global_statistics_gorilla_buffer_add_hot();
|
||||
telemetry_gorilla_hot_buffer_added();
|
||||
|
||||
*pg->gorilla.writer = gorilla_writer_init(gbuf, RRDENG_GORILLA_32BIT_BUFFER_SLOTS);
|
||||
pg->gorilla.num_buffers = 1;
|
||||
|
@ -198,7 +321,7 @@ PGD *pgd_create(uint8_t type, uint32_t slots)
|
|||
}
|
||||
default:
|
||||
netdata_log_error("%s() - Unknown page type: %uc", __FUNCTION__, type);
|
||||
aral_freez(pgd_alloc_globals.aral_pgd, pg);
|
||||
aral_freez(pgd_alloc_globals.aral_pgd[pg->partition], pg);
|
||||
pg = PGD_EMPTY;
|
||||
break;
|
||||
}
|
||||
|
@ -206,52 +329,47 @@ PGD *pgd_create(uint8_t type, uint32_t slots)
|
|||
return pg;
|
||||
}
|
||||
|
||||
PGD *pgd_create_from_disk_data(uint8_t type, void *base, uint32_t size)
|
||||
{
|
||||
if (!size)
|
||||
PGD *pgd_create_from_disk_data(uint8_t type, void *base, uint32_t size) {
|
||||
|
||||
if (!size || size < page_type_size[type])
|
||||
return PGD_EMPTY;
|
||||
|
||||
if (size < page_type_size[type])
|
||||
return PGD_EMPTY;
|
||||
|
||||
PGD *pg = aral_mallocz(pgd_alloc_globals.aral_pgd);
|
||||
|
||||
PGD *pg = pgd_alloc(false); // this is malloc'd !
|
||||
pg->type = type;
|
||||
pg->states = PGD_STATE_CREATED_FROM_DISK;
|
||||
pg->options = ~PAGE_OPTION_ALL_VALUES_EMPTY;
|
||||
pg->options = PAGE_OPTION_ARAL_UNMARKED;
|
||||
|
||||
switch (type)
|
||||
{
|
||||
case RRDENG_PAGE_TYPE_ARRAY_32BIT:
|
||||
case RRDENG_PAGE_TYPE_ARRAY_TIER1:
|
||||
pg->raw.size = size;
|
||||
pg->used = size / page_type_size[type];
|
||||
pg->slots = pg->used;
|
||||
|
||||
pg->raw.data = pgd_data_aral_alloc(size);
|
||||
pg->raw.size = size;
|
||||
pg->raw.data = pgd_data_alloc(size, pg->partition, false);
|
||||
memcpy(pg->raw.data, base, size);
|
||||
break;
|
||||
|
||||
case RRDENG_PAGE_TYPE_GORILLA_32BIT:
|
||||
internal_fatal(size == 0, "Asked to create page with 0 data!!!");
|
||||
internal_fatal(size % sizeof(uint32_t), "Unaligned gorilla buffer size");
|
||||
internal_fatal(size % RRDENG_GORILLA_32BIT_BUFFER_SIZE, "Expected size to be a multiple of %zu-bytes",
|
||||
RRDENG_GORILLA_32BIT_BUFFER_SIZE);
|
||||
|
||||
pg->raw.data = mallocz(size);
|
||||
pg->raw.data = (void *)pgd_data_alloc(size, pg->partition, false);
|
||||
pg->raw.size = size;
|
||||
|
||||
// TODO: rm this
|
||||
memset(pg->raw.data, 0, size);
|
||||
memcpy(pg->raw.data, base, size);
|
||||
memcpy(pg->raw.data, base, pg->raw.size);
|
||||
|
||||
uint32_t total_entries = gorilla_buffer_patch((void *) pg->raw.data);
|
||||
|
||||
pg->used = total_entries;
|
||||
pg->slots = pg->used;
|
||||
break;
|
||||
|
||||
default:
|
||||
netdata_log_error("%s() - Unknown page type: %uc", __FUNCTION__, type);
|
||||
aral_freez(pgd_alloc_globals.aral_pgd, pg);
|
||||
aral_freez(pgd_alloc_globals.aral_pgd[pg->partition], pg);
|
||||
pg = PGD_EMPTY;
|
||||
break;
|
||||
}
|
||||
|
@ -259,26 +377,30 @@ PGD *pgd_create_from_disk_data(uint8_t type, void *base, uint32_t size)
|
|||
return pg;
|
||||
}
|
||||
|
||||
void pgd_free(PGD *pg)
|
||||
{
|
||||
if (!pg)
|
||||
void pgd_free(PGD *pg) {
|
||||
if (!pg || pg == PGD_EMPTY)
|
||||
return;
|
||||
|
||||
if (pg == PGD_EMPTY)
|
||||
return;
|
||||
internal_fatal(pg->partition >= PGD_ARAL_PARTITIONS,
|
||||
"PGD partition is invalid %u", pg->partition);
|
||||
|
||||
switch (pg->type)
|
||||
{
|
||||
case RRDENG_PAGE_TYPE_ARRAY_32BIT:
|
||||
case RRDENG_PAGE_TYPE_ARRAY_TIER1:
|
||||
pgd_data_aral_free(pg->raw.data, pg->raw.size);
|
||||
pgd_data_free(pg->raw.data, pg->raw.size, pg->partition);
|
||||
break;
|
||||
|
||||
case RRDENG_PAGE_TYPE_GORILLA_32BIT: {
|
||||
if (pg->states & PGD_STATE_CREATED_FROM_DISK)
|
||||
{
|
||||
internal_fatal(pg->raw.data == NULL, "Tried to free gorilla PGD loaded from disk with NULL data");
|
||||
freez(pg->raw.data);
|
||||
|
||||
pgd_data_free(pg->raw.data, pg->raw.size, pg->partition);
|
||||
timing_dbengine_evict_step(TIMING_STEP_DBENGINE_EVICT_FREE_MAIN_PGD_ARAL);
|
||||
|
||||
pg->raw.data = NULL;
|
||||
pg->raw.size = 0;
|
||||
}
|
||||
else if ((pg->states & PGD_STATE_CREATED_FROM_COLLECTOR) ||
|
||||
(pg->states & PGD_STATE_SCHEDULED_FOR_FLUSHING) ||
|
||||
|
@ -294,15 +416,19 @@ void pgd_free(PGD *pg)
|
|||
gorilla_buffer_t *gbuf = gorilla_writer_drop_head_buffer(pg->gorilla.writer);
|
||||
if (!gbuf)
|
||||
break;
|
||||
aral_freez(pgd_alloc_globals.aral_gorilla_buffer[pg->gorilla.aral_index], gbuf);
|
||||
aral_freez(pgd_alloc_globals.aral_gorilla_buffer[pg->partition], gbuf);
|
||||
pg->gorilla.num_buffers -= 1;
|
||||
}
|
||||
|
||||
timing_dbengine_evict_step(TIMING_STEP_DBENGINE_EVICT_FREE_MAIN_PGD_GLIVE);
|
||||
|
||||
internal_fatal(pg->gorilla.num_buffers != 0,
|
||||
"Could not free all gorilla writer buffers");
|
||||
|
||||
aral_freez(pgd_alloc_globals.aral_gorilla_writer[pg->gorilla.aral_index], pg->gorilla.writer);
|
||||
aral_freez(pgd_alloc_globals.aral_gorilla_writer[pg->partition], pg->gorilla.writer);
|
||||
pg->gorilla.writer = NULL;
|
||||
|
||||
timing_dbengine_evict_step(TIMING_STEP_DBENGINE_EVICT_FREE_MAIN_PGD_GWORKER);
|
||||
} else {
|
||||
fatal("pgd_free() called on gorilla page with unsupported state");
|
||||
// TODO: should we support any other states?
|
||||
|
@ -317,7 +443,62 @@ void pgd_free(PGD *pg)
|
|||
break;
|
||||
}
|
||||
|
||||
aral_freez(pgd_alloc_globals.aral_pgd, pg);
|
||||
timing_dbengine_evict_step(TIMING_STEP_DBENGINE_EVICT_FREE_MAIN_PGD_DATA);
|
||||
|
||||
aral_freez(pgd_alloc_globals.aral_pgd[pg->partition], pg);
|
||||
|
||||
timing_dbengine_evict_step(TIMING_STEP_DBENGINE_EVICT_FREE_MAIN_PGD_ARAL);
|
||||
}
|
||||
|
||||
static void pgd_aral_unmark(PGD *pg) {
|
||||
if (!pg ||
|
||||
pg == PGD_EMPTY ||
|
||||
(pg->options & PAGE_OPTION_ARAL_UNMARKED) ||
|
||||
!(pg->options & PAGE_OPTION_ARAL_MARKED))
|
||||
return;
|
||||
|
||||
internal_fatal(pg->partition >= PGD_ARAL_PARTITIONS,
|
||||
"PGD partition is invalid %u", pg->partition);
|
||||
|
||||
switch (pg->type)
|
||||
{
|
||||
case RRDENG_PAGE_TYPE_ARRAY_32BIT:
|
||||
case RRDENG_PAGE_TYPE_ARRAY_TIER1:
|
||||
pgd_data_unmark(pg->raw.data, pg->raw.size, pg->partition);
|
||||
break;
|
||||
|
||||
case RRDENG_PAGE_TYPE_GORILLA_32BIT: {
|
||||
if (pg->states & PGD_STATE_CREATED_FROM_DISK)
|
||||
pgd_data_unmark(pg->raw.data, pg->raw.size, pg->partition);
|
||||
|
||||
else if ((pg->states & PGD_STATE_CREATED_FROM_COLLECTOR) ||
|
||||
(pg->states & PGD_STATE_SCHEDULED_FOR_FLUSHING) ||
|
||||
(pg->states & PGD_STATE_FLUSHED_TO_DISK))
|
||||
{
|
||||
internal_fatal(pg->gorilla.writer == NULL, "PGD does not have an active gorilla writer");
|
||||
internal_fatal(pg->gorilla.num_buffers == 0, "PGD does not have any gorilla buffers allocated");
|
||||
|
||||
gorilla_writer_aral_unmark(pg->gorilla.writer, pgd_alloc_globals.aral_gorilla_buffer[pg->partition]);
|
||||
aral_unmark_allocation(pgd_alloc_globals.aral_gorilla_writer[pg->partition], pg->gorilla.writer);
|
||||
}
|
||||
else {
|
||||
fatal("pgd_free() called on gorilla page with unsupported state");
|
||||
// TODO: should we support any other states?
|
||||
// if (!(pg->states & PGD_STATE_FLUSHED_TO_DISK))
|
||||
// fatal("pgd_free() is not supported yet for pages flushed to disk");
|
||||
}
|
||||
|
||||
break;
|
||||
}
|
||||
default:
|
||||
netdata_log_error("%s() - Unknown page type: %uc", __FUNCTION__, pg->type);
|
||||
break;
|
||||
}
|
||||
|
||||
aral_unmark_allocation(pgd_alloc_globals.aral_pgd[pg->partition], pg);
|
||||
|
||||
// make sure we will not do this again
|
||||
pg->options |= PAGE_OPTION_ARAL_UNMARKED;
|
||||
}
|
||||
|
||||
// ----------------------------------------------------------------------------
|
||||
|
@ -356,6 +537,17 @@ uint32_t pgd_slots_used(PGD *pg)
|
|||
return pg->used;
|
||||
}
|
||||
|
||||
uint32_t pgd_capacity(PGD *pg) {
|
||||
if (!pg)
|
||||
return 0;
|
||||
|
||||
if (pg == PGD_EMPTY)
|
||||
return 0;
|
||||
|
||||
return pg->slots;
|
||||
}
|
||||
|
||||
// return the overall memory footprint of the page, including all its structures and overheads
|
||||
uint32_t pgd_memory_footprint(PGD *pg)
|
||||
{
|
||||
if (!pg)
|
||||
|
@ -364,20 +556,59 @@ uint32_t pgd_memory_footprint(PGD *pg)
|
|||
if (pg == PGD_EMPTY)
|
||||
return 0;
|
||||
|
||||
size_t footprint = 0;
|
||||
size_t footprint = pgd_alloc_globals.sizeof_pgd;
|
||||
|
||||
switch (pg->type) {
|
||||
case RRDENG_PAGE_TYPE_ARRAY_32BIT:
|
||||
case RRDENG_PAGE_TYPE_ARRAY_TIER1:
|
||||
footprint = sizeof(PGD) + pg->raw.size;
|
||||
footprint += pgd_data_footprint(pg->raw.size, pg->partition);
|
||||
break;
|
||||
|
||||
case RRDENG_PAGE_TYPE_GORILLA_32BIT: {
|
||||
if (pg->states & PGD_STATE_CREATED_FROM_DISK)
|
||||
footprint = sizeof(PGD) + pg->raw.size;
|
||||
else
|
||||
footprint = sizeof(PGD) + sizeof(gorilla_writer_t) + (pg->gorilla.num_buffers * RRDENG_GORILLA_32BIT_BUFFER_SIZE);
|
||||
footprint += pgd_data_footprint(pg->raw.size, pg->partition);
|
||||
|
||||
else {
|
||||
footprint += pgd_alloc_globals.sizeof_gorilla_writer_t;
|
||||
footprint += pg->gorilla.num_buffers * pgd_alloc_globals.sizeof_gorilla_buffer_32bit;
|
||||
}
|
||||
break;
|
||||
}
|
||||
|
||||
default:
|
||||
netdata_log_error("%s() - Unknown page type: %uc", __FUNCTION__, pg->type);
|
||||
break;
|
||||
}
|
||||
|
||||
return footprint;
|
||||
}
|
||||
|
||||
// return the nominal buffer size depending on the page type - used by the PGC histogram
|
||||
uint32_t pgd_buffer_memory_footprint(PGD *pg)
|
||||
{
|
||||
if (!pg)
|
||||
return 0;
|
||||
|
||||
if (pg == PGD_EMPTY)
|
||||
return 0;
|
||||
|
||||
size_t footprint = 0;
|
||||
|
||||
switch (pg->type) {
|
||||
case RRDENG_PAGE_TYPE_ARRAY_32BIT:
|
||||
case RRDENG_PAGE_TYPE_ARRAY_TIER1:
|
||||
footprint = pg->raw.size;
|
||||
break;
|
||||
|
||||
case RRDENG_PAGE_TYPE_GORILLA_32BIT: {
|
||||
if (pg->states & PGD_STATE_CREATED_FROM_DISK)
|
||||
footprint = pg->raw.size;
|
||||
|
||||
else
|
||||
footprint = pg->gorilla.num_buffers * RRDENG_GORILLA_32BIT_BUFFER_SIZE;
|
||||
break;
|
||||
}
|
||||
|
||||
default:
|
||||
netdata_log_error("%s() - Unknown page type: %uc", __FUNCTION__, pg->type);
|
||||
break;
|
||||
|
@ -393,6 +624,9 @@ uint32_t pgd_disk_footprint(PGD *pg)
|
|||
|
||||
size_t size = 0;
|
||||
|
||||
// since the page is ready for flushing, let's unmark its pages to ARAL
|
||||
pgd_aral_unmark(pg);
|
||||
|
||||
switch (pg->type) {
|
||||
case RRDENG_PAGE_TYPE_ARRAY_32BIT:
|
||||
case RRDENG_PAGE_TYPE_ARRAY_TIER1: {
|
||||
|
@ -415,10 +649,12 @@ uint32_t pgd_disk_footprint(PGD *pg)
|
|||
|
||||
size = pg->gorilla.num_buffers * RRDENG_GORILLA_32BIT_BUFFER_SIZE;
|
||||
|
||||
if (pg->states & PGD_STATE_CREATED_FROM_COLLECTOR) {
|
||||
global_statistics_tier0_disk_compressed_bytes(gorilla_writer_nbytes(pg->gorilla.writer));
|
||||
global_statistics_tier0_disk_uncompressed_bytes(gorilla_writer_entries(pg->gorilla.writer) * sizeof(storage_number));
|
||||
}
|
||||
if (pg->states & PGD_STATE_CREATED_FROM_COLLECTOR)
|
||||
telemetry_gorilla_tier0_page_flush(
|
||||
gorilla_writer_actual_nbytes(pg->gorilla.writer),
|
||||
gorilla_writer_optimal_nbytes(pg->gorilla.writer),
|
||||
tier_page_size[0]);
|
||||
|
||||
} else if (pg->states & PGD_STATE_CREATED_FROM_DISK) {
|
||||
size = pg->raw.size;
|
||||
} else {
|
||||
|
@ -434,6 +670,7 @@ uint32_t pgd_disk_footprint(PGD *pg)
|
|||
|
||||
internal_fatal(pg->states & PGD_STATE_CREATED_FROM_DISK,
|
||||
"Disk footprint asked for page created from disk.");
|
||||
|
||||
pg->states = PGD_STATE_SCHEDULED_FOR_FLUSHING;
|
||||
return size;
|
||||
}
|
||||
|
@ -461,7 +698,7 @@ void pgd_copy_to_extent(PGD *pg, uint8_t *dst, uint32_t dst_size)
|
|||
bool ok = gorilla_writer_serialize(pg->gorilla.writer, dst, dst_size);
|
||||
UNUSED(ok);
|
||||
internal_fatal(!ok,
|
||||
"pgd_copy_to_extent() tried to serialize pg=%p, gw=%p (with dst_size=%u bytes, num_buffers=%zu)",
|
||||
"pgd_copy_to_extent() tried to serialize pg=%p, gw=%p (with dst_size=%u bytes, num_buffers=%u)",
|
||||
pg, pg->gorilla.writer, dst_size, pg->gorilla.num_buffers);
|
||||
break;
|
||||
}
|
||||
|
@ -476,7 +713,8 @@ void pgd_copy_to_extent(PGD *pg, uint8_t *dst, uint32_t dst_size)
|
|||
// ----------------------------------------------------------------------------
|
||||
// data collection
|
||||
|
||||
void pgd_append_point(PGD *pg,
|
||||
// returns additional memory that may have been allocated to store this point
|
||||
size_t pgd_append_point(PGD *pg,
|
||||
usec_t point_in_time_ut __maybe_unused,
|
||||
NETDATA_DOUBLE n,
|
||||
NETDATA_DOUBLE min_value,
|
||||
|
@ -535,22 +773,27 @@ void pgd_append_point(PGD *pg,
|
|||
|
||||
bool ok = gorilla_writer_write(pg->gorilla.writer, t);
|
||||
if (!ok) {
|
||||
gorilla_buffer_t *new_buffer = aral_mallocz(pgd_alloc_globals.aral_gorilla_buffer[pg->gorilla.aral_index]);
|
||||
gorilla_buffer_t *new_buffer = pgd_gorilla_buffer_alloc(pg->partition);
|
||||
memset(new_buffer, 0, RRDENG_GORILLA_32BIT_BUFFER_SIZE);
|
||||
|
||||
gorilla_writer_add_buffer(pg->gorilla.writer, new_buffer, RRDENG_GORILLA_32BIT_BUFFER_SLOTS);
|
||||
pg->gorilla.num_buffers += 1;
|
||||
global_statistics_gorilla_buffer_add_hot();
|
||||
telemetry_gorilla_hot_buffer_added();
|
||||
|
||||
ok = gorilla_writer_write(pg->gorilla.writer, t);
|
||||
internal_fatal(ok == false, "Failed to write value in newly allocated gorilla buffer.");
|
||||
|
||||
return RRDENG_GORILLA_32BIT_BUFFER_SIZE;
|
||||
}
|
||||
|
||||
break;
|
||||
}
|
||||
default:
|
||||
netdata_log_error("%s() - Unknown page type: %uc", __FUNCTION__, pg->type);
|
||||
break;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
// ----------------------------------------------------------------------------
|
||||
|
@ -589,7 +832,6 @@ static void pgdc_seek(PGDC *pgdc, uint32_t position)
|
|||
uint32_t value;
|
||||
|
||||
bool ok = gorilla_reader_read(&pgdc->gr, &value);
|
||||
|
||||
if (!ok) {
|
||||
// this is fine, the reader will return empty points
|
||||
break;
|
||||
|
@ -664,6 +906,7 @@ bool pgdc_get_next_point(PGDC *pgdc, uint32_t expected_position __maybe_unused,
|
|||
|
||||
uint32_t n = 666666666;
|
||||
bool ok = gorilla_reader_read(&pgdc->gr, &n);
|
||||
|
||||
if (ok) {
|
||||
sp->min = sp->max = sp->sum = unpack_storage_number(n);
|
||||
sp->flags = (SN_FLAGS)(n & SN_USER_FLAGS);
|
||||
|
|
|
@ -33,12 +33,17 @@ uint32_t pgd_type(PGD *pg);
|
|||
bool pgd_is_empty(PGD *pg);
|
||||
uint32_t pgd_slots_used(PGD *pg);
|
||||
|
||||
uint32_t pgd_buffer_memory_footprint(PGD *pg);
|
||||
uint32_t pgd_memory_footprint(PGD *pg);
|
||||
uint32_t pgd_capacity(PGD *pg);
|
||||
uint32_t pgd_disk_footprint(PGD *pg);
|
||||
|
||||
size_t pgd_aral_structures(void);
|
||||
size_t pgd_aral_overhead(void);
|
||||
|
||||
void pgd_copy_to_extent(PGD *pg, uint8_t *dst, uint32_t dst_size);
|
||||
|
||||
void pgd_append_point(PGD *pg,
|
||||
size_t pgd_append_point(PGD *pg,
|
||||
usec_t point_in_time_ut,
|
||||
NETDATA_DOUBLE n,
|
||||
NETDATA_DOUBLE min_value,
|
||||
|
|
|
@@ -63,6 +63,7 @@ static void open_cache_free_clean_page_callback(PGC *cache __maybe_unused, PGC_E
{
struct rrdengine_datafile *datafile = entry.data;
datafile_release(datafile, DATAFILE_ACQUIRE_OPEN_CACHE);
timing_dbengine_evict_step(TIMING_STEP_DBENGINE_EVICT_FREE_OPEN);
}

static void open_cache_flush_dirty_page_callback(PGC *cache __maybe_unused, PGC_ENTRY *entries_array __maybe_unused, PGC_PAGE **pages_array __maybe_unused, size_t entries __maybe_unused)

@@ -73,6 +74,7 @@ static void open_cache_flush_dirty_page_callback(PGC *cache __maybe_unused, PGC_
static void extent_cache_free_clean_page_callback(PGC *cache __maybe_unused, PGC_ENTRY entry __maybe_unused)
{
dbengine_extent_free(entry.data, entry.size);
timing_dbengine_evict_step(TIMING_STEP_DBENGINE_EVICT_FREE_EXTENT);
}

static void extent_cache_flush_dirty_page_callback(PGC *cache __maybe_unused, PGC_ENTRY *entries_array __maybe_unused, PGC_PAGE **pages_array __maybe_unused, size_t entries __maybe_unused)

@@ -1030,8 +1032,21 @@ void pgc_open_add_hot_page(Word_t section, Word_t metric_id, time_t start_time_s
}

size_t dynamic_open_cache_size(void) {
size_t main_cache_size = pgc_get_wanted_cache_size(main_cache);
size_t target_size = main_cache_size / 100 * 5;
size_t main_wanted_cache_size = pgc_get_wanted_cache_size(main_cache);
size_t target_size = main_wanted_cache_size / 100 * 5; // 5%

// static bool query_current_size = true;
// if(query_current_size) {
// size_t main_current_cache_size = pgc_get_current_cache_size(main_cache);
//
// size_t main_free_cache_size = (main_wanted_cache_size > main_current_cache_size) ?
// main_wanted_cache_size - main_current_cache_size : 0;
//
// if(main_free_cache_size > target_size)
// target_size = main_free_cache_size;
// else
// query_current_size = false;
// }

if(target_size < 2 * 1024 * 1024)
target_size = 2 * 1024 * 1024;

@@ -1040,15 +1055,33 @@ size_t dynamic_open_cache_size(void) {
}

size_t dynamic_extent_cache_size(void) {
size_t main_cache_size = pgc_get_wanted_cache_size(main_cache);
size_t target_size = main_cache_size / 100 * 5;
size_t main_wanted_cache_size = pgc_get_wanted_cache_size(main_cache);

if(target_size < 3 * 1024 * 1024)
target_size = 3 * 1024 * 1024;
size_t target_size = main_wanted_cache_size / 100 * 10; // 10%

// static bool query_current_size = true;
// if(query_current_size) {
// size_t main_current_cache_size = pgc_get_current_cache_size(main_cache);
//
// size_t main_free_cache_size = (main_wanted_cache_size > main_current_cache_size) ?
// main_wanted_cache_size - main_current_cache_size : 0;
//
// if(main_free_cache_size > target_size)
// target_size = main_free_cache_size;
// else
// query_current_size = false;
// }

if(target_size < 5 * 1024 * 1024)
target_size = 5 * 1024 * 1024;

return target_size;
}

size_t pgc_main_nominal_page_size(void *data) {
return pgd_buffer_memory_footprint(data);
}

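The two callbacks above size the open and extent caches as a percentage of the main cache's wanted size (5% and 10% respectively), each with a small absolute floor. A stand-alone sketch of that calculation (not part of this diff; wanted_main_cache_size() is an invented stand-in for pgc_get_wanted_cache_size(main_cache)):

#include <stddef.h>

/* stand-in for pgc_get_wanted_cache_size(main_cache) in this sketch */
static size_t wanted_main_cache_size(void) { return 512 * 1024 * 1024; }

static size_t derived_cache_target(size_t percent, size_t floor_bytes) {
    size_t target = wanted_main_cache_size() / 100 * percent;
    if (target < floor_bytes)
        target = floor_bytes;      /* never shrink below the absolute minimum */
    return target;
}

/* open cache: 5% of main, never below 2 MiB; extent cache: 10%, never below 5 MiB */
static size_t open_cache_target(void)   { return derived_cache_target(5,  2 * 1024 * 1024); }
static size_t extent_cache_target(void) { return derived_cache_target(10, 5 * 1024 * 1024); }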
void pgc_and_mrg_initialize(void)
{
main_mrg = mrg_create(0);

@@ -1066,51 +1099,52 @@ void pgc_and_mrg_initialize(void)
extent_cache_size += (size_t)(default_rrdeng_extent_cache_mb * 1024ULL * 1024ULL);

main_cache = pgc_create(
"main_cache",
"MAIN_PGC",
main_cache_size,
main_cache_free_clean_page_callback,
(size_t) rrdeng_pages_per_extent,
main_cache_flush_dirty_page_init_callback,
main_cache_flush_dirty_page_callback,
10,
10240, // if there are that many threads, evict so many at once!
1000, //
5, // don't delay too much other threads
PGC_OPTIONS_AUTOSCALE, // AUTOSCALE = 2x max hot pages
0, // 0 = as many as the system cpus
2,
pgc_max_evictors(),
1000,
1,
PGC_OPTIONS_AUTOSCALE,
0,
0
);
pgc_set_nominal_page_size_callback(main_cache, pgc_main_nominal_page_size);

open_cache = pgc_create(
"open_cache",
open_cache_size, // the default is 1MB
"OPEN_PGC",
open_cache_size,
open_cache_free_clean_page_callback,
1,
2,
NULL,
open_cache_flush_dirty_page_callback,
10,
10240, // if there are that many threads, evict that many at once!
1000, //
3, // don't delay too much other threads
PGC_OPTIONS_AUTOSCALE | PGC_OPTIONS_EVICT_PAGES_INLINE | PGC_OPTIONS_FLUSH_PAGES_INLINE,
0, // 0 = as many as the system cpus
1,
pgc_max_evictors(),
1000,
1,
PGC_OPTIONS_AUTOSCALE, // flushing inline: all dirty pages are just converted to clean
0,
sizeof(struct extent_io_data)
);
pgc_set_dynamic_target_cache_size_callback(open_cache, dynamic_open_cache_size);

extent_cache = pgc_create(
"extent_cache",
"EXTENT_PGC",
extent_cache_size,
extent_cache_free_clean_page_callback,
1,
2,
NULL,
extent_cache_flush_dirty_page_callback,
5,
10, // it will lose up to that extents at once!
100, //
2, // don't delay too much other threads
PGC_OPTIONS_AUTOSCALE | PGC_OPTIONS_EVICT_PAGES_INLINE | PGC_OPTIONS_FLUSH_PAGES_INLINE,
0, // 0 = as many as the system cpus
1,
pgc_max_evictors(),
1000,
1,
PGC_OPTIONS_AUTOSCALE | PGC_OPTIONS_FLUSH_PAGES_NO_INLINE, // no flushing needed
0,
0
);
pgc_set_dynamic_target_cache_size_callback(extent_cache, dynamic_extent_cache_size);

@@ -53,10 +53,12 @@ void pdc_init(void) {
"dbengine-pdc",
sizeof(PDC),
0,
65536,
0,
NULL,
NULL, NULL, false, false
);

telemetry_aral_register(pdc_globals.pdc.ar, "pdc");
}

PDC *pdc_get(void) {

@@ -81,10 +83,11 @@ void page_details_init(void) {
"dbengine-pd",
sizeof(struct page_details),
0,
65536,
0,
NULL,
NULL, NULL, false, false
);
telemetry_aral_register(pdc_globals.pd.ar, "pd");
}

struct page_details *page_details_get(void) {

@@ -109,10 +112,11 @@ void epdl_init(void) {
"dbengine-epdl",
sizeof(EPDL),
0,
65536,
0,
NULL,
NULL, NULL, false, false
);
telemetry_aral_register(pdc_globals.epdl.ar, "epdl");
}

static EPDL *epdl_get(void) {

@@ -137,10 +141,12 @@ void deol_init(void) {
"dbengine-deol",
sizeof(DEOL),
0,
65536,
0,
NULL,
NULL, NULL, false, false
);

telemetry_aral_register(pdc_globals.deol.ar, "deol");
}

static DEOL *deol_get(void) {

@@ -1126,6 +1132,7 @@ static bool epdl_populate_pages_from_extent_data(
PGC_PAGE *page = pgc_page_add_and_acquire(main_cache, page_entry, &added);
if (false == added) {
pgd_free(pgd);
pgd = pgc_page_data(page);
stats_cache_hit_while_inserting++;
stats_data_from_main_cache++;
}

@@ -1256,9 +1263,12 @@ void epdl_find_extent_and_populate_pages(struct rrdengine_instance *ctx, EPDL *e
void *extent_data = datafile_extent_read(ctx, epdl->file, epdl->extent_offset, epdl->extent_size);
if(extent_data != NULL) {

void *copied_extent_compressed_data = dbengine_extent_alloc(epdl->extent_size);
memcpy(copied_extent_compressed_data, extent_data, epdl->extent_size);
#if defined(NETDATA_TRACE_ALLOCATIONS)
void *tmp = dbengine_extent_alloc(epdl->extent_size);
memcpy(tmp, extent_data, epdl->extent_size);
datafile_extent_read_free(extent_data);
extent_data = tmp;
#endif

if(worker)
worker_is_busy(UV_EVENT_DBENGINE_EXTENT_CACHE_LOOKUP);

@@ -1272,11 +1282,11 @@ void epdl_find_extent_and_populate_pages(struct rrdengine_instance *ctx, EPDL *e
.size = epdl->extent_size,
.end_time_s = 0,
.update_every_s = 0,
.data = copied_extent_compressed_data,
.data = extent_data,
}, &added);

if (!added) {
dbengine_extent_free(copied_extent_compressed_data, epdl->extent_size);
dbengine_extent_free(extent_data, epdl->extent_size);
internal_fatal(epdl->extent_size != pgc_page_data_size(extent_cache, extent_cache_page),
"DBENGINE: cache size does not match the expected size");
}

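The change above stops copying the freshly read extent: the buffer is handed to the extent cache as-is, and it is freed only when another thread has already inserted the same extent. A runnable toy sketch of that ownership hand-off (not part of this diff; the one-slot cache stands in for pgc_page_add_and_acquire() on the extent cache):

#include <stdbool.h>
#include <stdlib.h>

/* a one-slot stand-in for the extent cache, just to make the sketch runnable */
static void *cached_data = NULL;

static void *cache_add_or_get(void *buf, bool *added) {
    if (!cached_data) { cached_data = buf; *added = true; }
    else *added = false;
    return cached_data;
}

static void *insert_or_reuse(void *buf) {
    bool added = false;
    void *data = cache_add_or_get(buf, &added);
    if (!added)
        free(buf);   /* lost the race: drop our private copy, use the cached one */
    /* when added == true, the cache now owns buf and no extra copy was ever made */
    return data;
}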
@@ -45,7 +45,9 @@ struct rrdeng_main {
bool shutdown;

size_t flushes_running;
size_t evictions_running;
size_t evict_main_running;
size_t evict_open_running;
size_t evict_extent_running;
size_t cleanup_running;

struct {

@@ -86,8 +88,9 @@ struct rrdeng_main {
.loop = {},
.async = {},
.timer = {},
.retention_timer = {},
.flushes_running = 0,
.evictions_running = 0,
.evict_main_running = 0,
.cleanup_running = 0,

.cmd_queue = {

@@ -138,12 +141,15 @@ struct rrdeng_work {

static void work_request_init(void) {
rrdeng_main.work_cmd.ar = aral_create(
"dbengine-work-cmd",
sizeof(struct rrdeng_work),
0,
65536, NULL,
NULL, NULL, false, false
"dbengine-work-cmd",
sizeof(struct rrdeng_work),
0,
0,
NULL,
NULL, NULL, false, false
);

telemetry_aral_register(rrdeng_main.work_cmd.ar, "workers");
}

enum LIBUV_WORKERS_STATUS {
@@ -259,9 +265,11 @@ void page_descriptors_init(void) {
"dbengine-descriptors",
sizeof(struct page_descr_with_data),
0,
65536 * 4,
0,
NULL,
NULL, NULL, false, false);

telemetry_aral_register(rrdeng_main.xt_io_descr.ar, "descriptors");
}

struct page_descr_with_data *page_descriptor_get(void) {

@@ -282,10 +290,12 @@ static void extent_io_descriptor_init(void) {
"dbengine-extent-io",
sizeof(struct extent_io_descriptor),
0,
65536,
0,
NULL,
NULL, NULL, false, false
);

telemetry_aral_register(rrdeng_main.xt_io_descr.ar, "extent io");
}

static struct extent_io_descriptor *extent_io_descriptor_get(void) {

@@ -306,9 +316,11 @@ void rrdeng_query_handle_init(void) {
"dbengine-query-handles",
sizeof(struct rrdeng_query_handle),
0,
65536,
0,
NULL,
NULL, NULL, false, false);

telemetry_aral_register(rrdeng_main.handles.ar, "query handles");
}

struct rrdeng_query_handle *rrdeng_query_handle_get(void) {

@@ -426,9 +438,11 @@ static void rrdeng_cmd_queue_init(void) {
rrdeng_main.cmd_queue.ar = aral_create("dbengine-opcodes",
sizeof(struct rrdeng_cmd),
0,
65536,
0,
NULL,
NULL, NULL, false, false);

telemetry_aral_register(rrdeng_main.cmd_queue.ar, "opcodes");
}

static inline STORAGE_PRIORITY rrdeng_enq_cmd_map_opcode_to_priority(enum rrdeng_opcode opcode, STORAGE_PRIORITY priority) {
@@ -1376,17 +1390,41 @@ static void *cache_flush_tp_worker(struct rrdengine_instance *ctx __maybe_unused
return data;

worker_is_busy(UV_EVENT_DBENGINE_FLUSH_MAIN_CACHE);
pgc_flush_pages(main_cache, 0);
while (pgc_flush_pages(main_cache))
yield_the_processor();

return data;
}

static void *cache_evict_tp_worker(struct rrdengine_instance *ctx __maybe_unused, void *data __maybe_unused, struct completion *completion __maybe_unused, uv_work_t *req __maybe_unused) {
static void *cache_evict_main_tp_worker(struct rrdengine_instance *ctx __maybe_unused, void *data __maybe_unused, struct completion *completion __maybe_unused, uv_work_t *req __maybe_unused) {
if (!main_cache)
return data;

worker_is_busy(UV_EVENT_DBENGINE_EVICT_MAIN_CACHE);
pgc_evict_pages(main_cache, 0, 0);
while (pgc_evict_pages(main_cache, 0, 0))
yield_the_processor();

return data;
}

static void *cache_evict_open_tp_worker(struct rrdengine_instance *ctx __maybe_unused, void *data __maybe_unused, struct completion *completion __maybe_unused, uv_work_t *req __maybe_unused) {
if (!open_cache)
return data;

worker_is_busy(UV_EVENT_DBENGINE_EVICT_OPEN_CACHE);
while (pgc_evict_pages(open_cache, 0, 0))
yield_the_processor();

return data;
}

static void *cache_evict_extent_tp_worker(struct rrdengine_instance *ctx __maybe_unused, void *data __maybe_unused, struct completion *completion __maybe_unused, uv_work_t *req __maybe_unused) {
if (!extent_cache)
return data;

worker_is_busy(UV_EVENT_DBENGINE_EVICT_EXTENT_CACHE);
while (pgc_evict_pages(extent_cache, 0, 0))
yield_the_processor();

return data;
}
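The reworked flush and evict workers above loop in small batches and yield between batches instead of doing one blocking pass. A minimal sketch of that loop (not part of this diff; evict_one_batch() is a placeholder for pgc_evict_pages()/pgc_flush_pages() returning true while work remains, and POSIX sched_yield() stands in for yield_the_processor()):

#include <sched.h>
#include <stdbool.h>

/* placeholder: evicts one small batch, returns true while more pages are left */
static bool evict_one_batch(void) {
    static int batches_left = 5;   /* pretend there are 5 batches to do */
    return batches_left-- > 0;
}

static void evictor_worker(void) {
    while (evict_one_batch())
        sched_yield();             /* be fair to other threads between batches */
}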
@@ -1532,8 +1570,16 @@ static void after_do_cache_flush(struct rrdengine_instance *ctx __maybe_unused,
rrdeng_main.flushes_running--;
}

static void after_do_cache_evict(struct rrdengine_instance *ctx __maybe_unused, void *data __maybe_unused, struct completion *completion __maybe_unused, uv_work_t* req __maybe_unused, int status __maybe_unused) {
rrdeng_main.evictions_running--;
static void after_do_main_cache_evict(struct rrdengine_instance *ctx __maybe_unused, void *data __maybe_unused, struct completion *completion __maybe_unused, uv_work_t* req __maybe_unused, int status __maybe_unused) {
rrdeng_main.evict_main_running--;
}

static void after_do_open_cache_evict(struct rrdengine_instance *ctx __maybe_unused, void *data __maybe_unused, struct completion *completion __maybe_unused, uv_work_t* req __maybe_unused, int status __maybe_unused) {
rrdeng_main.evict_open_running--;
}

static void after_do_extent_cache_evict(struct rrdengine_instance *ctx __maybe_unused, void *data __maybe_unused, struct completion *completion __maybe_unused, uv_work_t* req __maybe_unused, int status __maybe_unused) {
rrdeng_main.evict_extent_running--;
}

static void after_journal_v2_indexing(struct rrdengine_instance *ctx __maybe_unused, void *data __maybe_unused, struct completion *completion __maybe_unused, uv_work_t* req __maybe_unused, int status __maybe_unused) {

@@ -1544,6 +1590,7 @@ static void after_journal_v2_indexing(struct rrdengine_instance *ctx __maybe_unu
struct rrdeng_buffer_sizes rrdeng_get_buffer_sizes(void) {
return (struct rrdeng_buffer_sizes) {
.pgc = pgc_aral_overhead() + pgc_aral_structures(),
.pgd = pgd_aral_overhead() + pgd_aral_structures(),
.mrg = mrg_aral_overhead() + mrg_aral_structures(),
.opcodes = aral_overhead(rrdeng_main.cmd_queue.ar) + aral_structures(rrdeng_main.cmd_queue.ar),
.handles = aral_overhead(rrdeng_main.handles.ar) + aral_structures(rrdeng_main.handles.ar),
@@ -1642,8 +1689,7 @@ bool rrdeng_ctx_tier_cap_exceeded(struct rrdengine_instance *ctx)
return false;
}

void retention_timer_cb(uv_timer_t *handle)
{
static void retention_timer_cb(uv_timer_t *handle) {
if (!localhost)
return;

@@ -1663,7 +1709,7 @@ void retention_timer_cb(uv_timer_t *handle)
worker_is_idle();
}

void timer_cb(uv_timer_t* handle) {
static void timer_per_sec_cb(uv_timer_t* handle) {
worker_is_busy(RRDENG_TIMER_CB);
uv_stop(handle->loop);
uv_update_time(handle->loop);

@@ -1672,14 +1718,17 @@ void timer_cb(uv_timer_t* handle) {
worker_set_metric(RRDENG_WORKS_DISPATCHED, (NETDATA_DOUBLE)__atomic_load_n(&rrdeng_main.work_cmd.atomics.dispatched, __ATOMIC_RELAXED));
worker_set_metric(RRDENG_WORKS_EXECUTING, (NETDATA_DOUBLE)__atomic_load_n(&rrdeng_main.work_cmd.atomics.executing, __ATOMIC_RELAXED));

rrdeng_enq_cmd(NULL, RRDENG_OPCODE_FLUSH_INIT, NULL, NULL, STORAGE_PRIORITY_INTERNAL_DBENGINE, NULL, NULL);
rrdeng_enq_cmd(NULL, RRDENG_OPCODE_EVICT_INIT, NULL, NULL, STORAGE_PRIORITY_INTERNAL_DBENGINE, NULL, NULL);
// rrdeng_enq_cmd(NULL, RRDENG_OPCODE_EVICT_MAIN, NULL, NULL, STORAGE_PRIORITY_INTERNAL_DBENGINE, NULL, NULL);
// rrdeng_enq_cmd(NULL, RRDENG_OPCODE_EVICT_OPEN, NULL, NULL, STORAGE_PRIORITY_INTERNAL_DBENGINE, NULL, NULL);
// rrdeng_enq_cmd(NULL, RRDENG_OPCODE_EVICT_EXTENT, NULL, NULL, STORAGE_PRIORITY_INTERNAL_DBENGINE, NULL, NULL);
rrdeng_enq_cmd(NULL, RRDENG_OPCODE_FLUSH_MAIN, NULL, NULL, STORAGE_PRIORITY_INTERNAL_DBENGINE, NULL, NULL);
rrdeng_enq_cmd(NULL, RRDENG_OPCODE_CLEANUP, NULL, NULL, STORAGE_PRIORITY_INTERNAL_DBENGINE, NULL, NULL);

worker_is_idle();
}

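For reference, the timer setup in this file follows the usual libuv pattern: two repeating timers on the same loop, one firing every tick period and one sixty times slower for retention work. A self-contained sketch with generic callbacks (not netdata's; assumes only the public libuv API):

#include <stdio.h>
#include <uv.h>

#define TIMER_PERIOD_MS 1000

static void per_second_cb(uv_timer_t *handle)  { (void)handle; printf("per-second tick\n"); }
static void retention_cb(uv_timer_t *handle)   { (void)handle; printf("retention tick\n"); }

int main(void) {
    uv_loop_t loop;
    uv_timer_t timer, retention_timer;

    uv_loop_init(&loop);
    uv_timer_init(&loop, &timer);
    uv_timer_init(&loop, &retention_timer);

    /* same shape as the dbengine event loop: a fast timer and a 60x slower retention timer */
    uv_timer_start(&timer, per_second_cb, TIMER_PERIOD_MS, TIMER_PERIOD_MS);
    uv_timer_start(&retention_timer, retention_cb, TIMER_PERIOD_MS * 60, TIMER_PERIOD_MS * 60);

    return uv_run(&loop, UV_RUN_DEFAULT);
}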
static void dbengine_initialize_structures(void) {
pgd_init_arals();
pgc_and_mrg_initialize();

pdc_init();

@@ -1691,7 +1740,6 @@ static void dbengine_initialize_structures(void) {
rrdeng_query_handle_init();
page_descriptors_init();
extent_buffer_init();
pgd_init_arals();
extent_io_descriptor_init();
}

@@ -1900,8 +1948,8 @@ void dbengine_event_loop(void* arg) {
worker_register_job_name(RRDENG_OPCODE_FLUSHED_TO_OPEN, "flushed to open");
worker_register_job_name(RRDENG_OPCODE_DATABASE_ROTATE, "db rotate");
worker_register_job_name(RRDENG_OPCODE_JOURNAL_INDEX, "journal index");
worker_register_job_name(RRDENG_OPCODE_FLUSH_INIT, "flush init");
worker_register_job_name(RRDENG_OPCODE_EVICT_INIT, "evict init");
worker_register_job_name(RRDENG_OPCODE_FLUSH_MAIN, "flush init");
worker_register_job_name(RRDENG_OPCODE_EVICT_MAIN, "evict init");
worker_register_job_name(RRDENG_OPCODE_CTX_SHUTDOWN, "ctx shutdown");
worker_register_job_name(RRDENG_OPCODE_CTX_QUIESCE, "ctx quiesce");
worker_register_job_name(RRDENG_OPCODE_SHUTDOWN_EVLOOP, "dbengine shutdown");

@@ -1914,8 +1962,8 @@ void dbengine_event_loop(void* arg) {
worker_register_job_name(RRDENG_OPCODE_MAX + RRDENG_OPCODE_FLUSHED_TO_OPEN, "flushed to open cb");
worker_register_job_name(RRDENG_OPCODE_MAX + RRDENG_OPCODE_DATABASE_ROTATE, "db rotate cb");
worker_register_job_name(RRDENG_OPCODE_MAX + RRDENG_OPCODE_JOURNAL_INDEX, "journal index cb");
worker_register_job_name(RRDENG_OPCODE_MAX + RRDENG_OPCODE_FLUSH_INIT, "flush init cb");
worker_register_job_name(RRDENG_OPCODE_MAX + RRDENG_OPCODE_EVICT_INIT, "evict init cb");
worker_register_job_name(RRDENG_OPCODE_MAX + RRDENG_OPCODE_FLUSH_MAIN, "flush init cb");
worker_register_job_name(RRDENG_OPCODE_MAX + RRDENG_OPCODE_EVICT_MAIN, "evict init cb");
worker_register_job_name(RRDENG_OPCODE_MAX + RRDENG_OPCODE_CTX_SHUTDOWN, "ctx shutdown cb");
worker_register_job_name(RRDENG_OPCODE_MAX + RRDENG_OPCODE_CTX_QUIESCE, "ctx quiesce cb");

@@ -1932,7 +1980,7 @@ void dbengine_event_loop(void* arg) {
struct rrdeng_cmd cmd;
main->tid = gettid_cached();

fatal_assert(0 == uv_timer_start(&main->timer, timer_cb, TIMER_PERIOD_MS, TIMER_PERIOD_MS));
fatal_assert(0 == uv_timer_start(&main->timer, timer_per_sec_cb, TIMER_PERIOD_MS, TIMER_PERIOD_MS));
fatal_assert(0 == uv_timer_start(&main->retention_timer, retention_timer_cb, TIMER_PERIOD_MS * 60, TIMER_PERIOD_MS * 60));

bool shutdown = false;
@@ -1974,18 +2022,34 @@ void dbengine_event_loop(void* arg) {
break;
}

case RRDENG_OPCODE_FLUSH_INIT: {
if(rrdeng_main.flushes_running < (size_t)(libuv_worker_threads / 4)) {
case RRDENG_OPCODE_FLUSH_MAIN: {
if(rrdeng_main.flushes_running < pgc_max_flushers()) {
rrdeng_main.flushes_running++;
work_dispatch(NULL, NULL, NULL, opcode, cache_flush_tp_worker, after_do_cache_flush);
}
break;
}

case RRDENG_OPCODE_EVICT_INIT: {
if(!rrdeng_main.evictions_running) {
rrdeng_main.evictions_running++;
work_dispatch(NULL, NULL, NULL, opcode, cache_evict_tp_worker, after_do_cache_evict);
case RRDENG_OPCODE_EVICT_MAIN: {
if(rrdeng_main.evict_main_running < pgc_max_evictors()) {
rrdeng_main.evict_main_running++;
work_dispatch(NULL, NULL, NULL, opcode, cache_evict_main_tp_worker, after_do_main_cache_evict);
}
break;
}

case RRDENG_OPCODE_EVICT_OPEN: {
if(rrdeng_main.evict_open_running < pgc_max_evictors()) {
rrdeng_main.evict_open_running++;
work_dispatch(NULL, NULL, NULL, opcode, cache_evict_open_tp_worker, after_do_open_cache_evict);
}
break;
}

case RRDENG_OPCODE_EVICT_EXTENT: {
if(rrdeng_main.evict_extent_running < pgc_max_evictors()) {
rrdeng_main.evict_extent_running++;
work_dispatch(NULL, NULL, NULL, opcode, cache_evict_extent_tp_worker, after_do_extent_cache_evict);
}
break;
}
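Each cache now has its own in-flight counter, and a new eviction job is dispatched only while that counter is below the evictor limit; the matching after_do_*_cache_evict() callback decrements it when the worker finishes. A simplified sketch of that throttling (not part of this diff; dispatch_job() and max_evictors() are stand-ins for work_dispatch() and pgc_max_evictors()):

#include <stddef.h>
#include <stdio.h>

static size_t evict_main_running = 0;

static size_t max_evictors(void) { return 2; }      /* stand-in for pgc_max_evictors() */

static void after_evict_main(void) {                /* completion callback */
    evict_main_running--;
}

static void dispatch_job(const char *name) {        /* stand-in for work_dispatch() */
    printf("dispatching %s\n", name);
}

static void on_evict_main_opcode(void) {
    if (evict_main_running < max_evictors()) {
        evict_main_running++;
        dispatch_job("evict main cache");
        /* after_evict_main() runs when the worker completes */
    }
    /* otherwise: drop the request - enough evictors are already running */
}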
@@ -2049,6 +2113,7 @@ void dbengine_event_loop(void* arg) {

case RRDENG_OPCODE_SHUTDOWN_EVLOOP: {
uv_close((uv_handle_t *)&main->async, NULL);

(void) uv_timer_stop(&main->timer);
uv_close((uv_handle_t *)&main->timer, NULL);

@@ -201,7 +201,6 @@ struct rrdeng_collect_handle {
struct metric *metric;
struct pgc_page *pgc_page;
struct pgd *page_data;
size_t page_data_size;
struct pg_alignment *alignment;
uint32_t page_entries_max;
uint32_t page_position; // keep track of the current page size, to make sure we don't exceed it

@@ -249,8 +248,10 @@ enum rrdeng_opcode {
RRDENG_OPCODE_FLUSHED_TO_OPEN,
RRDENG_OPCODE_DATABASE_ROTATE,
RRDENG_OPCODE_JOURNAL_INDEX,
RRDENG_OPCODE_FLUSH_INIT,
RRDENG_OPCODE_EVICT_INIT,
RRDENG_OPCODE_FLUSH_MAIN,
RRDENG_OPCODE_EVICT_MAIN,
RRDENG_OPCODE_EVICT_OPEN,
RRDENG_OPCODE_EVICT_EXTENT,
RRDENG_OPCODE_CTX_SHUTDOWN,
RRDENG_OPCODE_CTX_QUIESCE,
RRDENG_OPCODE_CTX_POPULATE_MRG,

@@ -59,6 +59,8 @@ __attribute__((constructor)) void initialize_multidb_ctx(void) {
initialize_single_ctx(multidb_ctx[i]);
}

uint64_t dbengine_out_of_memory_protection = 0;
bool dbengine_use_all_ram_for_caches = false;
int db_engine_journal_check = 0;
bool new_dbengine_defaults = false;
bool legacy_multihost_db_space = false;

@@ -279,8 +281,7 @@ STORAGE_COLLECT_HANDLE *rrdeng_store_metric_init(STORAGE_METRIC_HANDLE *smh, uin

handle->pgc_page = NULL;
handle->page_data = NULL;
handle->page_data_size = 0;

handle->page_position = 0;
handle->page_entries_max = 0;
handle->update_every_ut = (usec_t)update_every * USEC_PER_SEC;

@@ -339,7 +340,6 @@ void rrdeng_store_metric_flush_current_page(STORAGE_COLLECT_HANDLE *sch) {
handle->page_position = 0;
handle->page_entries_max = 0;
handle->page_data = NULL;
handle->page_data_size = 0;

// important!
// we should never zero page end time ut, because this will allow

@@ -355,8 +355,7 @@ void rrdeng_store_metric_flush_current_page(STORAGE_COLLECT_HANDLE *sch) {
static void rrdeng_store_metric_create_new_page(struct rrdeng_collect_handle *handle,
struct rrdengine_instance *ctx,
usec_t point_in_time_ut,
PGD *data,
size_t data_size) {
PGD *data) {
time_t point_in_time_s = (time_t)(point_in_time_ut / USEC_PER_SEC);
const uint32_t update_every_s = (uint32_t)(handle->update_every_ut / USEC_PER_SEC);

@@ -365,7 +364,7 @@ static void rrdeng_store_metric_create_new_page(struct rrdeng_collect_handle *ha
.metric_id = mrg_metric_id(main_mrg, handle->metric),
.start_time_s = point_in_time_s,
.end_time_s = point_in_time_s,
.size = data_size,
.size = pgd_memory_footprint(data),
.data = data,
.update_every_s = update_every_s,
.hot = true

@@ -405,7 +404,7 @@ static void rrdeng_store_metric_create_new_page(struct rrdeng_collect_handle *ha
pgc_page = pgc_page_add_and_acquire(main_cache, page_entry, &added);
}

handle->page_entries_max = data_size / CTX_POINT_SIZE_BYTES(ctx);
handle->page_entries_max = pgd_capacity(data);
handle->page_start_time_ut = point_in_time_ut;
handle->page_end_time_ut = point_in_time_ut;
handle->page_position = 1; // zero is already in our data

@@ -436,7 +435,7 @@ static size_t aligned_allocation_entries(size_t max_slots, size_t target_slot, t
return slots;
}

static PGD *rrdeng_alloc_new_page_data(struct rrdeng_collect_handle *handle, size_t *data_size, usec_t point_in_time_ut) {
static PGD *rrdeng_alloc_new_page_data(struct rrdeng_collect_handle *handle, usec_t point_in_time_ut) {
struct rrdengine_instance *ctx = mrg_metric_ctx(handle->metric);

PGD *d = NULL;

@@ -463,17 +462,11 @@ static PGD *rrdeng_alloc_new_page_data(struct rrdeng_collect_handle *handle, siz
internal_fatal(slots < 3 || slots > max_slots, "ooops! wrong distribution of metrics across time");
internal_fatal(size > tier_page_size[ctx->config.tier] || size < CTX_POINT_SIZE_BYTES(ctx) * 2, "ooops! wrong page size");

*data_size = size;

switch (ctx->config.page_type) {
case RRDENG_PAGE_TYPE_ARRAY_32BIT:
case RRDENG_PAGE_TYPE_ARRAY_TIER1:
d = pgd_create(ctx->config.page_type, slots);
break;
case RRDENG_PAGE_TYPE_GORILLA_32BIT:
// ignore slots, and use the fixed number of slots per gorilla buffer.
// gorilla will automatically add more buffers if needed.
d = pgd_create(ctx->config.page_type, RRDENG_GORILLA_32BIT_BUFFER_SLOTS);
d = pgd_create(ctx->config.page_type, slots);
break;
default:
fatal("Unknown page type: %uc\n", ctx->config.page_type);

@@ -496,24 +489,25 @@ static void rrdeng_store_metric_append_point(STORAGE_COLLECT_HANDLE *sch,
struct rrdengine_instance *ctx = mrg_metric_ctx(handle->metric);

if(unlikely(!handle->page_data))
handle->page_data = rrdeng_alloc_new_page_data(handle, &handle->page_data_size, point_in_time_ut);
handle->page_data = rrdeng_alloc_new_page_data(handle, point_in_time_ut);

timing_step(TIMING_STEP_DBENGINE_CHECK_DATA);

pgd_append_point(handle->page_data,
point_in_time_ut,
n, min_value, max_value, count, anomaly_count, flags,
handle->page_position);
size_t additional_bytes = pgd_append_point(handle->page_data,
point_in_time_ut,
n, min_value, max_value, count, anomaly_count, flags,
handle->page_position);

timing_step(TIMING_STEP_DBENGINE_PACK);

if(unlikely(!handle->pgc_page)) {
rrdeng_store_metric_create_new_page(handle, ctx, point_in_time_ut, handle->page_data, handle->page_data_size);
rrdeng_store_metric_create_new_page(handle, ctx, point_in_time_ut, handle->page_data);
// handle->position is set to 1 already
}
else {
// update an existing page
pgc_page_hot_set_end_time_s(main_cache, handle->pgc_page, (time_t) (point_in_time_ut / USEC_PER_SEC));
pgc_page_hot_set_end_time_s(main_cache, handle->pgc_page,
(time_t) (point_in_time_ut / USEC_PER_SEC), additional_bytes);
handle->page_end_time_ut = point_in_time_ut;

if(unlikely(++handle->page_position >= handle->page_entries_max)) {

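With pgd_append_point() now returning how many extra bytes the page grew by, the collector forwards that delta to pgc_page_hot_set_end_time_s() so the cache's size accounting for the hot page stays accurate. A toy sketch of the idea (not part of this diff; page_append() and cache_update_hot_page() are invented stand-ins):

#include <stddef.h>
#include <time.h>

/* stand-in for pgd_append_point(): returns extra bytes allocated, usually 0 */
static size_t page_append(double value) {
    (void)value;
    static unsigned points = 0;
    /* pretend every 1024th point forces a new 4 KiB buffer to be chained */
    return (++points % 1024 == 0) ? 4096 : 0;
}

/* stand-in for the cache call: it adds additional_bytes to the hot page's accounted size */
static void cache_update_hot_page(time_t end_time_s, size_t additional_bytes) {
    (void)end_time_s; (void)additional_bytes;
}

static void store_point(double value, time_t now) {
    size_t additional_bytes = page_append(value);
    cache_update_hot_page(now, additional_bytes);   /* keep accounting in sync with the page that grew */
}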
@@ -13,6 +13,9 @@

#define RRDENG_FD_BUDGET_PER_INSTANCE (50)

extern uint64_t dbengine_out_of_memory_protection;
extern bool dbengine_use_all_ram_for_caches;

extern int default_rrdeng_page_cache_mb;
extern int default_rrdeng_extent_cache_mb;
extern int db_engine_journal_check;

@@ -218,6 +221,7 @@ struct rrdeng_buffer_sizes {
size_t deol;
size_t pd;
size_t pgc;
size_t pgd;
size_t mrg;
#ifdef PDC_USE_JULYL
size_t julyl;

src/database/rrd-database-mode.c (new file, 34 lines)
@@ -0,0 +1,34 @@
#include "rrd.h"
|
||||
|
||||
inline const char *rrd_memory_mode_name(RRD_MEMORY_MODE id) {
|
||||
switch(id) {
|
||||
case RRD_MEMORY_MODE_RAM:
|
||||
return RRD_MEMORY_MODE_RAM_NAME;
|
||||
|
||||
case RRD_MEMORY_MODE_NONE:
|
||||
return RRD_MEMORY_MODE_NONE_NAME;
|
||||
|
||||
case RRD_MEMORY_MODE_ALLOC:
|
||||
return RRD_MEMORY_MODE_ALLOC_NAME;
|
||||
|
||||
case RRD_MEMORY_MODE_DBENGINE:
|
||||
return RRD_MEMORY_MODE_DBENGINE_NAME;
|
||||
}
|
||||
|
||||
STORAGE_ENGINE* eng = storage_engine_get(id);
|
||||
if (eng) {
|
||||
return eng->name;
|
||||
}
|
||||
|
||||
return RRD_MEMORY_MODE_RAM_NAME;
|
||||
}
|
||||
|
||||
RRD_MEMORY_MODE rrd_memory_mode_id(const char *name) {
|
||||
STORAGE_ENGINE* eng = storage_engine_find(name);
|
||||
if (eng) {
|
||||
return eng->id;
|
||||
}
|
||||
|
||||
return RRD_MEMORY_MODE_RAM;
|
||||
}
|
Some files were not shown because too many files have changed in this diff.