* rrdset - in progress
* rrdset optimal constructor; rrdset conflict
* rrdset final touches
* re-organization of rrdset object members
* prevent use-after-free
* dictionary dfe supports also counting of iterations
* rrddim managed by dictionary
* rrd.h cleanup
* DICTIONARY_ITEM now is referencing actual dictionary items in the code
* removed rrdset linked list
* Revert "removed rrdset linked list"
This reverts commit 690d6a588b4b99619c2c5e10f84e8f868ae6def5.
* removed rrdset linked list
* added comments
* Switch chart uuid to static allocation in rrdset
Remove unused functions
* rrdset_archive() and friends...
* always create rrdfamily
* enable ml_free_dimension
* rrddim_foreach done with dfe
* most custom rrddim loops replaced with rrddim_foreach
* removed accesses to rrddim->dimensions
* removed locks that are no longer needed
* rrdsetvar is now managed by the dictionary
* set rrdset in rrdsetvar, fixes https://github.com/netdata/netdata/pull/13646#issuecomment-1242574853
* conflict callback of rrdsetvar now properly checks if it has to reset the variable
* dictionary registered callbacks accept as first parameter the DICTIONARY_ITEM
* dictionary dfe now uses internal counter to report; avoided excess variables defined with dfe
* dictionary walkthrough callbacks get dictionary acquired items
* dictionary reference counters that can be dupped from zero
* added advanced functions for get and del
* rrdvar managed by dictionaries
* thread safety for rrdsetvar
* faster rrdvar initialization
* rrdvar string lengths should match in all add, del, get functions
* rrdvar internals hidden from the rest of the world
* rrdvar is now acquired throughout netdata
* hide the internal structures of rrdsetvar
* rrdsetvar is now acquired throughout netdata
* rrddimvar managed by dictionary; rrddimvar linked list removed; rrddimvar structures hidden from the rest of netdata
* better error handling
* don't create variables if not initialized for health
* don't create variables if not initialized for health again
* rrdfamily is now managed by dictionaries; references of it are acquired dictionary items
* type checking on acquired objects
* rrdcalc renaming of functions
* type checking for rrdfamily_acquired
* rrdcalc managed by dictionaries
* rrdcalc double free fix
* host rrdvars is always needed
* attempt to fix deadlock 1
* attempt to fix deadlock 2
* Remove unused variable
* attempt to fix deadlock 3
* snprintfz
* rrdcalc index in rrdset fix
* Stop storing active charts and computing chart hashes
* Remove store active chart function
* Remove compute chart hash function
* Remove sql_store_chart_hash function
* Remove store_active_dimension function
* dictionary delayed destruction
* formatting and cleanup
* zero dictionary base on rrdsetvar
* added internal error to log delayed destructions of dictionaries
* typo in rrddimvar
* added debugging info to dictionary
* debug info
* fix for rrdcalc keys being empty
* remove forgotten unlock
* remove deadlock
* Switch to metadata version 5 and drop chart_hash, chart_hash_map, chart_active, dimension_active, v_chart_hash
* SQL cosmetic changes
* do not busy wait while destroying a referenced dictionary
* remove deadlock
* code cleanup; re-organization;
* fast cleanup and flushing of dictionaries
* number formatting fixes
* do not delete configured alerts when archiving a chart
* rrddim obsolete linked list management outside dictionaries
* removed duplicate contexts call
* fix crash when rrdfamily is not initialized
* don't keep rrddimvar referenced
* properly cleanup rrdvar
* removed some locks
* Do not attempt to cleanup chart_hash / chart_hash_map
* rrdcalctemplate managed by dictionary
* register callbacks on the right dictionary
* removed some more locks
* rrdcalc secondary index replaced with linked-list; rrdcalc labels updates are now executed by health thread
* when looking up an alarm, use both chart id and chart name
* host initialization a bit more modular
* init rrdlabels on host update
* preparation for dictionary views
* improved comment
* unused variables without internal checks
* service threads isolation and worker info
* more worker info in service thread
* thread cancelability debugging with internal checks
* strings data races addressed; fixes https://github.com/netdata/netdata/issues/13647
* dictionary modularization
* Remove unused SQL statement definition
* unit-tested thread safety of dictionaries; removed data race conditions on dictionaries and strings; dictionaries can now detect if the caller holds a write lock, and all calls automatically become their unsafe versions; all direct calls to unsafe versions are eliminated
* remove worker_is_idle() from the exit of service functions, because we lose the lock time between loops
* rewritten dictionary to have 2 separate locks, one for indexing and another for traversal
* Update collectors/cgroups.plugin/sys_fs_cgroup.c
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
* Update collectors/cgroups.plugin/sys_fs_cgroup.c
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
* Update collectors/proc.plugin/proc_net_dev.c
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
* fix memory leak in rrdset cache_dir
* minor dictionary changes
* don't use index locks in single threaded
* obsolete dict option
* rrddim options and flags separation; rrdset_done() optimization to keep array of reference pointers to rrddim;
* fix jump on uninitialized value in dictionary; remove double free of cache_dir
* addressed codacy findings
* removed debugging code
* use the private refcount on dictionaries
* make dictionary item destructors work on dictionary destruction; stricter control on dictionary API; proper cleanup sequence on rrddim
* more dictionary statistics
* global statistics about dictionary operations, memory, items, callbacks
* dictionary support for views - missing the public API
* removed warning about unused parameter
* chart and context name for cloud
* chart and context name for cloud, again
* dictionary statistics fixed; first implementation of dictionary views - not currently used
* only the master can globally delete an item
* context needs netdata prefix
* fix context and chart id of spins
* fix for host variables when health is not enabled
* run garbage collector on item insert too
* Fix info message; remove extra "using"
* update dict unittest for new placement of garbage collector
* we need RRDHOST->rrdvars for maintaining custom host variables
* Health initialization needs the host->host_uuid
* split STRING to its own files; no code changes other than that
* initialize health unconditionally
* unit tests do not pollute the global scope with their variables
* Skip initialization when creating archived hosts on startup. When a child connects it will initialize properly
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
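
The reference-counter entries in the list above (counters that can be dupped from zero, use-after-free prevention, deletion of referenced items) can be illustrated with a minimal C11 sketch. This is not the netdata dictionary code: `item_t`, `item_acquire()`, `item_release()`, `item_mark_for_deletion()` and `REFCOUNT_DELETING` are hypothetical names, and the scheme shown (a negative counter marks an item that is being deleted) is only an assumption about how such a counter could work.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical item with an atomic reference counter (illustration only). */
typedef struct {
    atomic_int refcount;
} item_t;

/* Assumed sentinel: any negative value means "being deleted". */
#define REFCOUNT_DELETING (-100)

/* Acquire a reference. Works from zero (0 -> 1), but fails once the
 * item has been marked for deletion, so callers never touch freed data. */
static bool item_acquire(item_t *it) {
    int expected = atomic_load(&it->refcount);
    do {
        if (expected < 0)
            return false;               /* deletion already in progress */
    } while (!atomic_compare_exchange_weak(&it->refcount, &expected, expected + 1));
    return true;
}

/* Drop a reference previously obtained with item_acquire(). */
static void item_release(item_t *it) {
    int previous = atomic_fetch_sub(&it->refcount, 1);
    assert(previous > 0);               /* releasing more than acquired is a bug */
    (void)previous;
}

/* Deletion succeeds only when nobody holds a reference (0 -> sentinel). */
static bool item_mark_for_deletion(item_t *it) {
    int expected = 0;
    return atomic_compare_exchange_strong(&it->refcount, &expected, REFCOUNT_DELETING);
}
```

With this layout, a deleter that loses the race simply retries later (or defers destruction), while readers that lose the race get `false` and behave as if the item was not found.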
* move chart_labels to rrdset
* rename chart_labels to rrdlabels
* renamed hash_id to uuid
* turned is_ar_chart into an rrdset flag
* removed rrdset state
* removed unused senders_connected member of rrdhost
* removed unused host flag RRDHOST_FLAG_MULTIHOST
* renamed rrdhost host_labels to rrdlabels
* Update exporting unit tests
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
* rrdfamily
* rrddim
* rrdset plugin and module names
* rrdset units
* rrdset type
* rrdset family
* rrdset title
* rrdset title more
* rrdset context
* rrdcalctemplate context and removal of context hash from rrdset
* strings statistics
* rrdset name
* rearranged members of rrdset
* eliminate rrdset name hash; rrdcalc chart converted to STRING
* rrdset id, eliminated rrdset hash
* rrdcalc, alarm_entry, alert_config and some of rrdcalctemplate
* rrdcalctemplate
* rrdvar
* eval_variable
* rrddimvar and rrdsetvar
* rrdhost hostname, os and tags
* fix master commits
* added thread cache; implemented string_dup without locks
* faster thread cache
* rrdset and rrddim now use dictionaries for indexing
* rrdhost now uses dictionary
* rrdfamily now uses DICTIONARY
* rrdvar using dictionary instead of AVL
* allocate the right size to rrdvar flag members
* rrdhost remaining char * members to STRING *
* better error handling on indexing
* strings now use a read/write lock to allow parallel searches to the index
* removed AVL support from dictionaries; implemented STRING with native Judy calls
* string releases should be negative
* only 31 bits are allowed for enum flags
* proper locking on strings
* string threading unittest and fixes
* fix lgtm finding
* fixed naming
* stream chart/dimension definitions at the beginning of a streaming session
* thread stack variable is undefined on thread cancel
* rrdcontext garbage collect per host on startup
* worker control in garbage collection
* relaxed deletion of rrdmetrics
* type checking on dictfe
* netdata chart to monitor rrdcontext triggers
* Group chart label updates
* rrdcontext better handling of collected rrdsets
* rrdpush incremental transmission of definitions should use as much buffer as possible
* require 1MB per chart
* empty the sender buffer before enabling metrics streaming
* fill up to 50% of buffer
* reset signaling metrics sending
* use the shared variable for status
* use separate host flag for enabling streaming of metrics
* make sure the flag is clear
* add logging for streaming
* add logging for streaming on buffer overflow
* circular_buffer proper sizing
* removed obsolete logs
* do not execute worker jobs if not necessary
* better messages about compression disabling
* proper use of flags and updating rrdset last access time every time the obsoletion flag is flipped
* monitor stream sender used buffer ratio
* Update exporting unit tests
* no need to compare label value with strcmp
* streaming send workers now monitor bandwidth
* workers now use strings
* streaming receiver monitors incoming bandwidth
* parser shift of worker ids
* minor fixes
* Group chart label updates
* Populate context with dimensions that have data
* Fix chart id
* better shift of parser worker ids
* fix for streaming compression
* properly count received bytes
* ensure LZ4 compression ring buffer does not wrap prematurely
* do not stream empty charts; do not process empty instances in rrdcontext
* need_to_send_chart_definition() does not need an rrdset lock any more
* rrdcontext objects are collected, after data have been written to the db
* better logging of RRDCONTEXT transitions
* always set all variables needed by the worker utilization charts
* implemented double linked list for most objects; eliminated alarm indexes from rrdhost; and many more fixes
* lockless strings design - string_dup() and string_freez() are totally lockless when they dont need to touch Judy - only Judy is protected with a read/write lock
* STRING code re-organization for clarity
* thread_cache improvements; double numbers precision on worker threads
* STRING_ENTRY now shadows STRING, so no duplicate definition is required; string_length() renamed to string_strlen() to follow the paradigm of all other functions; STRING internal statistics are now only compiled with NETDATA_INTERNAL_CHECKS
* rrdhost index by hostname now cleans up; aclk queries of archived hosts do not index hosts
* Add index to speed up database context searches
* Removed last_updated optimization (was also buggy after latest merge with master)
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
* Tier part 1
* Tier part 2
* Tier part 3
* Tier part 4
* Tier part 5
* Fix some ML compilation errors
* fix more conflicts
* pass proper tier
* move metric_uuid from state to RRDDIM
* move aclk_live_status from state to RRDDIM
* move ml_dimension from state to RRDDIM
* abstracted the data collection interface
* support flushing for mem db too
* abstracted the query api
* abstracted latest/oldest time per metric
* cleanup
* store_metric for tier1
* fix for store_metric
* allow multiple tiers, more than 2
* state to tier
* Change storage type in db. Query param to request min, max, sum or average
* Store tier data correctly
* Fix skipping tier page type
* Add tier grouping in the tier
* Fix to handle archived charts (part 1)
* Temp fix for query granularity when requesting tier1 data
* Fix parameters in the correct order and calculate the anomaly based on the anomaly count
* Proper tiering grouping
* Anomaly calculation based on anomaly count
* force type checking on storage handles
* update cmocka tests
* fully dynamic number of storage tiers
* fix static allocation
* configure grouping for all tiers; disable tiers for unittest; disable statsd configuration for private charts mode
* use default page dt using the tiering info
* automatic selection of tier
* fix for automatic selection of tier
* working prototype of dynamic tier selection
* automatic selection of tier done right (I hope)
* ask for the proper tier value, based on the grouping function
* fixes for unittests and load_metric_next()
* fixes for lgtm findings
* minor renames
* add dbengine to page cache size setting
* add dbengine to page cache with malloc
* query engine optimized to loop as little as required based on the view_update_every
* query engine grouping methods now do not assume a constant number of points per group and they allocate memory with OWA
* report db points per tier in jsonwrap
* query planner that switches database tiers on the fly to satisfy the query for the entire timeframe
* dbengine statistics and documentation (in progress)
* calculate average point duration in db
* handle single point pages the best we can
* handle single point pages even better
* Keep page type in the rrdeng_page_descr
* updated doc
* handle future backwards compatibility - improved statistics
* support &tier=X in queries
* enforce increasing iterations on tiers
* tier 1 is always 1 iteration
* backfilling higher tiers on first data collection
* reversed anomaly bit
* set up to 5 tiers
* natural points should only be offered on tier 0, unless a specific tier is selected
* do not allow more than 65535 points of tier0 to be aggregated on any tier
* Work only on actually activated tiers
* fix query interpolation
* fix query interpolation again
* fix lgtm finding
* Activate one tier for now
* backfilling of higher tiers using raw metrics from lower tiers
* fix for crash on start when storage tiers is increased from the default
* more statistics on exit
* fix bug that prevented higher tiers from getting any values; added backfilling options
* fixed the statistics log line
* removed limit of 255 iterations per tier; moved the code of freezing rd->tiers[x]->db_metric_handle
* fixed division by zero on zero points_wanted
* removed dead code
* Decide on the descr->type for the type of metric
* don't store metrics on unknown page types
* free db_metric_handle on sql based context queries
* Disable STORAGE_POINT value check in the exporting engine unit tests
* fix for db modes other than dbengine
* fix for aclk archived chart queries destroying db_metric_handles of valid rrddims
* fix left-over freez() instead of OWA freez on median queries
Co-authored-by: Costa Tsaousis <costa@netdata.cloud>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
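
The tiering entries above mention a per-tier grouping ("iterations") and a cap of 65535 tier 0 points aggregated into any single point of a higher tier. The arithmetic can be sketched as follows. The function name `tier0_points_per_tier()`, the array layout, and the choice to clamp (rather than reject the configuration) are assumptions made for illustration, not the dbengine implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Cap taken from the log entry above: at most 65535 tier 0 points per point. */
#define MAX_TIER0_POINTS_PER_POINT 65535u

/* grouping[i] is how many points of tier i-1 form one point of tier i
 * (grouping[0] is ignored, tier 0 stores raw points).
 * On return, tier0_points[i] holds how many tier 0 points one point of
 * tier i represents, clamped to the cap (clamping is an assumption). */
static void tier0_points_per_tier(const uint32_t *grouping, size_t tiers,
                                  uint32_t *tier0_points) {
    uint64_t product = 1;
    for (size_t i = 0; i < tiers; i++) {
        uint32_t g = (i == 0 || grouping[i] == 0) ? 1 : grouping[i];
        product *= g;                       /* cannot overflow: product <= 65535 here */
        if (product > MAX_TIER0_POINTS_PER_POINT)
            product = MAX_TIER0_POINTS_PER_POINT;
        tier0_points[i] = (uint32_t)product;
    }
}
```

For example, with a grouping of 60 on every higher tier, tier 1 covers 60 tier 0 points, tier 2 covers 3600, and tier 3 would cover 216000 but is limited to 65535.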
* netdata doubles
* fix cmocka test
* fix cmocka test again
* fix left-overs of long double to NETDATA_DOUBLE
* RRDDIM detached from disk representation; db settings in [db] section of netdata.conf
* update the memory before saving
* rrdset is now detached from file structures too
* on memory mode map, update the memory mapped structures on every iteration
* allow RRD_ID_LENGTH_MAX to be changed
* granularity secs, back to update every
* fix formatting
* more formatting
* set grouping functions
* storage engine should check the validity of timestamps, not the query engine
* calculate and store in RRDR anomaly rates for every query
* anomaly rate used by volume metric correlations
* mc volume should use absolute data, to avoid cancelling effect
* return anomaly-rates in jsonwrap with jw-anomaly-rates option to data queries
* don't return null on anomaly rates
* allow passing group query options from the URL
* added countif to the query engine and used it in metric correlations
* fix configure
* fix countif and anomaly rate percentages
* added group_options to metric correlations; updated swagger
* added newline at the end of yaml file
* always check the time the highlighted window was above/below the highlighted window
* properly track time in memory queries
* error for internal checks only
* moved pack_storage_number() into the storage engines
* moved unpack_storage_number() inside the storage engines
* remove old comment
* pass unit tests
* properly detect zero or subnormal values in pack_storage_number()
* fill nulls before the value, not after
* make sure math.h is included
* workaround for isfinite()
* fix for isfinite()
* faster isfinite() alternative
* fix for faster isfinite() alternative
* next_metric() now returns end_time too
* variable step implemented in a generic way
* remove left-over variables
* ensure we always complete the wanted number of points
* fixes
* ensure no infinite loop
* mc-volume-improvements: Add information about invalid condition
* points should have a duration in the past
* removed unneeded info() line
* Fix unit tests for exporting engine
* new_point should only be checked when it is fetched from the db; better comment about the premature breaking of the main query loop
Co-authored-by: Thiago Marques <thiagoftsm@gmail.com>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
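
The "faster isfinite() alternative" entries above do not say which technique was used. One common approach, shown here only as a hedged sketch, is to inspect the exponent bits of the value directly. It assumes `double` is an IEEE 754 binary64 value; `is_finite_bits()` and `is_finite_matches_libm()` are hypothetical names, not netdata functions.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* In IEEE 754 binary64, infinities and NaNs are exactly the values whose
 * 11 exponent bits are all ones. Everything else is finite. */
static inline bool is_finite_bits(double value) {
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);     /* well-defined type punning */
    return (bits & 0x7FF0000000000000ULL) != 0x7FF0000000000000ULL;
}

/* Helper for sanity checks: compares the bit test with the standard macro. */
static inline bool is_finite_matches_libm(double value) {
    return is_finite_bits(value) == (isfinite(value) != 0);
}
```

The bit test avoids any dependency on how the standard `isfinite()` macro is implemented on a given platform, at the cost of assuming the IEEE 754 layout.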
* faster correlations
* 4x faster correlations
* a little bit more help
* 10x faster metrics correlations
* 6 digits precision; better comments
* enabled metrics correlations by default
* abstracted DIFFS_NUMBER to allow easily changing it
* reworked the entire logic to have more accuracy and support a baseline that is power of two multiple of highlight
* properly calculate shifts
* even more improved version
* added support for timeout; fixed another memory leak; skipped hidden dimensions
* default timeout 1min
* reduce memory even further
* use dictionary for the list of charts and optimize locks
* return 403 forbidden, when mc is not enabled
* added query options
* don't process zero dimensions
* added volume method as an option to metric correlations; now metric correlations can support multiple implementations
* make sure we will never crash
* spread results evenly for both kstwo and volume
* fixed bug in query engine that was missing misaligned queries when a single point was requested from the db; improved comments; improved query flags
* updated swagger and added sane defaults; query options are now supported, including anomaly-bit
* added "raw" option to allow cross node correlations; added "group" option to allow different time aggregations; allowed calling metric correlations without any parameters; allowed calling metric correlations with relative timestamps; added timeout to volume method; properly handled timeout on ks2 method; json output now sends all parameters back - same for json_wrap; modified query engine to use present time for relative timestamps; modified "allow_past" to mean both past backwards and forwards
* emulate the old behaviour about zero points
* 100% accuracy against python ks_2samp(); now the default is volume and the default points are 500
* added config option to change default metric correlations method
* removed work-arounds now that rrdlabels are merged
* squashed and rebased to master
* fix overflow and single character bug in sanitize; include rrd.h instead of node_info.h
* added unittest for UTF-8 multibyte sanitization
* Fix unit test compilation
* Fix CMake build
* remove double sanitizer for opentsdb; cleanup sanitize_json_string()
* rename error_description to error_message to avoid conflict with json-c
* revert last and undef error_description from json-c
* more unittests; attempt to fix protobuf map issue
* get rid of rrdlabels_get() and replace it with a safe version that writes the value to a buffer
* added dictionary sorting unittest; rrdlabels_to_buffer() now is sorted
* better sorted dictionary checking
* proper unittesting for sorted dictionaries
* call dictionary deletion callback when destroying the dictionary
* remove obsolete variable
* Fix exporting unit tests
* Fix k8s label parsing test
* workaround for cmocka and strdupz()
* Bypass cmocka memory allocation check
* Revert "Bypass cmocka memory allocation check"
This reverts commit 4c49923839.
* Revert "workaround for cmocka and strdupz()"
This reverts commit 7bebee0480.
* Bypass cmocka memory allocation checks
* respect json formatting for chart labels
* cloud sends colons
* print the value only once
* allow parenthesis in values and spaces; make stream sender send quotes for values
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
* dictionary internals isolation
* more dictionary cleanups
* added unit test
* we should use DICT internally
* disable cups in cmake
* implement DICTIONARY with Judy arrays
* operational JUDY implementation
* JUDY cleanup
* JUDY summary added
* JudyHS implementation with double linked list
* test negative searches too
* optimize destruction
* optimize set to insert first without lookup
* updated stats
* code cleanup; better organization; updated info
* more code cleanup and commenting
* more cleanup, renames and comments
* fix rename
* more cleanups
* use Judy.h from system paths
* added foreach traversal; added flag to add item in front; isolated locks to their own functions; destruction returns the number of bytes freed
* more comments; flags are now 16-bit
* completed unittesting
* addressed comments and added reference counters maintenance
* added unittest in main; tested removal of items in front, back and middle
* added read/write walkthrough and foreach; allowed walkthrough and foreach in write mode to delete the current element (used by cups.plugin); referenced counters removed from the API
* DICTFE.name should be const too
* added API calls for exposing all statistics
* dictionary flags as enum and reference counters as atomic operations
* more comments; improved error handling at unit tests
* added functions to allow unsafe access while traversing the dictionary with locks in place
* check for libcups in cmake
* added delete callback; implemented statsd with this dictionary
* added missing dfe_done()
* added alternative implementation with AVL
* added documentation
* added comments and warning about AVL
* dictionary walkthrough on new code
* simplified foreach; updated docs
* updated docs
* AVL is much faster without hashes
* AVL should follow DBENGINE
* Consolidate query params
* Add new option to show full dimensions in the json header (this will include dimensions, charts and chart labels)
* Group and pass parameters with query_params
Initial work on host labels from the dedicated branch. Includes work for issues #7096, #7400, #7411, #7369, #7410, #7458, #7459, #7412 and #7408 by @vlvkobal, @thiagoftsm, @cakrit and @amoss.
* Database engine prototype version 0
* Database engine initial integration with netdata POC
* Scalable database engine with file and memory management.
* Database engine integration with netdata
* Added MIN MAX definitions to fix alpine build of travis CI
* Bugfix for backends and new DB engine, remove useless rrdset_time2slot() calls and erroneous checks
* DB engine disk protocol correction
* Moved DB engine storage file location to /var/cache/netdata/{host}/dbengine
* Fix configure to require openSSL for DB engine
* Fix netdata daemon health not holding read lock when iterating chart dimensions
* Optimized query API for new DB engine and old netdata DB fallback code-path
* netdata database internal query API improvements and cleanup
* Bugfix for DB engine queries returning empty values
* Added netdata internal check for data queries for old and new DB
* Added statistics to DB engine and fixed memory corruption bug
* Added preliminary charts for DB engine statistics
* Changed DB engine ratio statistics to incremental
* Added netdata statistics charts for DB engine internal statistics
* Fix for netdata not compiling successfully when missing dbengine dependencies
* Added DB engine functional test to netdata unittest command parameter
* Implemented DB engine dataset generator based on example.random chart
* Fix build error in CI
* Support older versions of libuv1
* Fixes segmentation fault when using multiple DB engine instances concurrently
* Fix memory corruption bug
* Fixed createdataset advanced option not exiting
* Fix for DB engine not working on FreeBSD
* Support FreeBSD library paths of new dependencies
* Workaround for unsupported O_DIRECT in OS X
* Fix unittest crashing during cleanup
* Disable DB engine FS caching in Apple OS X since O_DIRECT is not available
* Fix segfault when unittest and DB engine dataset generator don't have permissions to create temporary host
* Modified DB engine dataset generator to create multiple files
* Toned down overzealous page cache prefetcher
* Reduce internal memory fragmentation for page-cache data pages
* Added documentation describing the DB engine
* Documentation bugfixes
* Fixed unit tests compilation errors since last rebase
* Added note to back-up the DB engine files in documentation
* Added codacy fix.
* Support old gcc versions for atomic counters in DB engine
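
The last entry does not describe how older compilers were supported. A common pattern, given here as a sketch under that assumption, is to use the `__atomic` builtins where available (GCC 4.7 and later, and clang) and fall back to the legacy `__sync` builtins otherwise. The macro name `counter_add` is hypothetical.

```c
/* Pick the atomic builtin family the compiler supports.
 * __atomic_* builtins exist since GCC 4.7; older GCC only has __sync_*. */
#if defined(__clang__) || \
    (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7)))
#define counter_add(ptr, n) __atomic_add_fetch((ptr), (n), __ATOMIC_SEQ_CST)
#else
#define counter_add(ptr, n) __sync_add_and_fetch((ptr), (n))
#endif
```

Both variants add `n` to the counter atomically and return the new value, so call sites stay identical regardless of compiler version.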