* rrdset - in progress
* rrdset optimal constructor; rrdset conflict
* rrdset final touches
* re-organization of rrdset object members
* prevent use-after-free
* dictionary dfe supports also counting of iterations
* rrddim managed by dictionary
* rrd.h cleanup
* DICTIONARY_ITEM now is referencing actual dictionary items in the code
* removed rrdset linked list
* Revert "removed rrdset linked list"
This reverts commit 690d6a588b4b99619c2c5e10f84e8f868ae6def5.
* removed rrdset linked list
* added comments
* Switch chart uuid to static allocation in rrdset
Remove unused functions
* rrdset_archive() and friends...
* always create rrdfamily
* enable ml_free_dimension
* rrddim_foreach done with dfe
* most custom rrddim loops replaced with rrddim_foreach
* removed accesses to rrddim->dimensions
* removed locks that are no longer needed
* rrdsetvar is now managed by the dictionary
* set rrdset in rrdsetvar, fixes https://github.com/netdata/netdata/pull/13646#issuecomment-1242574853
* conflict callback of rrdsetvar now properly checks if it has to reset the variable
* dictionary registered callbacks accept as first parameter the DICTIONARY_ITEM
* dictionary dfe now uses internal counter to report; avoided excess variables defined with dfe
* dictionary walkthrough callbacks get dictionary acquired items
* dictionary reference counters that can be dupped from zero
* added advanced functions for get and del
* rrdvar managed by dictionaries
* thread safety for rrdsetvar
* faster rrdvar initialization
* rrdvar string lengths should match in all add, del, get functions
* rrdvar internals hidden from the rest of the world
* rrdvar is now acquired throughout netdata
* hide the internal structures of rrdsetvar
* rrdsetvar is now acquired throughout netdata
* rrddimvar managed by dictionary; rrddimvar linked list removed; rrddimvar structures hidden from the rest of netdata
* better error handling
* don't create variables if not initialized for health
* don't create variables if not initialized for health again
* rrdfamily is now managed by dictionaries; references of it are acquired dictionary items
* type checking on acquired objects
* rrdcalc renaming of functions
* type checking for rrdfamily_acquired
* rrdcalc managed by dictionaries
* rrdcalc double free fix
* host rrdvars is always needed
* attempt to fix deadlock 1
* attempt to fix deadlock 2
* Remove unused variable
* attempt to fix deadlock 3
* snprintfz
* rrdcalc index in rrdset fix
* Stop storing active charts and computing chart hashes
* Remove store active chart function
* Remove compute chart hash function
* Remove sql_store_chart_hash function
* Remove store_active_dimension function
* dictionary delayed destruction
* formatting and cleanup
* zero dictionary base on rrdsetvar
* added internal error to log delayed destructions of dictionaries
* typo in rrddimvar
* added debugging info to dictionary
* debug info
* fix for rrdcalc keys being empty
* remove forgotten unlock
* remove deadlock
* Switch to metadata version 5 and drop
chart_hash
chart_hash_map
chart_active
dimension_active
v_chart_hash
* SQL cosmetic changes
* do not busy wait while destroying a referenced dictionary
* remove deadlock
* code cleanup; re-organization;
* fast cleanup and flushing of dictionaries
* number formatting fixes
* do not delete configured alerts when archiving a chart
* rrddim obsolete linked list management outside dictionaries
* removed duplicate contexts call
* fix crash when rrdfamily is not initialized
* don't keep rrddimvar referenced
* properly cleanup rrdvar
* removed some locks
* Do not attempt to cleanup chart_hash / chart_hash_map
* rrdcalctemplate managed by dictionary
* register callbacks on the right dictionary
* removed some more locks
* rrdcalc secondary index replaced with linked-list; rrdcalc labels updates are now executed by health thread
* when looking up an alarm, search using both chart id and chart name
* host initialization a bit more modular
* init rrdlabels on host update
* preparation for dictionary views
* improved comment
* unused variables without internal checks
* service threads isolation and worker info
* more worker info in service thread
* thread cancelability debugging with internal checks
* strings data races addressed; fixes https://github.com/netdata/netdata/issues/13647
* dictionary modularization
* Remove unused SQL statement definition
* unit-tested thread safety of dictionaries; removed data race conditions on dictionaries and strings; dictionaries can now detect if the caller holds a write lock and automatically all the calls become their unsafe versions; all direct calls to unsafe versions are eliminated
* remove worker_is_idle() from the exit of service functions, because we lose the lock time between loops
* rewritten dictionary to have 2 separate locks, one for indexing and another for traversal
* Update collectors/cgroups.plugin/sys_fs_cgroup.c
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
* Update collectors/cgroups.plugin/sys_fs_cgroup.c
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
* Update collectors/proc.plugin/proc_net_dev.c
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
* fix memory leak in rrdset cache_dir
* minor dictionary changes
* don't use index locks in single threaded
* obsolete dict option
* rrddim options and flags separation; rrdset_done() optimization to keep array of reference pointers to rrddim;
* fix jump on uninitialized value in dictionary; remove double free of cache_dir
* addressed codacy findings
* removed debugging code
* use the private refcount on dictionaries
* make dictionary item destructors work on dictionary destruction; stricter control on dictionary API; proper cleanup sequence on rrddim;
* more dictionary statistics
* global statistics about dictionary operations, memory, items, callbacks
* dictionary support for views - missing the public API
* removed warning about unused parameter
* chart and context name for cloud
* chart and context name for cloud, again
* dictionary statistics fixed; first implementation of dictionary views - not currently used
* only the master can globally delete an item
* context needs netdata prefix
* fix context and chart id of spins
* fix for host variables when health is not enabled
* run garbage collector on item insert too
* Fix info message; remove extra "using"
* update dict unittest for new placement of garbage collector
* we need RRDHOST->rrdvars for maintaining custom host variables
* Health initialization needs the host->host_uuid
* split STRING to its own files; no code changes other than that
* initialize health unconditionally
* unit tests do not pollute the global scope with their variables
* Skip initialization when creating archived hosts on startup. When a child connects it will initialize properly
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
* rrdfamily
* rrddim
* rrdset plugin and module names
* rrdset units
* rrdset type
* rrdset family
* rrdset title
* rrdset title more
* rrdset context
* rrdcalctemplate context and removal of context hash from rrdset
* strings statistics
* rrdset name
* rearranged members of rrdset
* eliminate rrdset name hash; rrdcalc chart converted to STRING
* rrdset id, eliminated rrdset hash
* rrdcalc, alarm_entry, alert_config and some of rrdcalctemplate
* rrdcalctemplate
* rrdvar
* eval_variable
* rrddimvar and rrdsetvar
* rrdhost hostname, os and tags
* fix master commits
* added thread cache; implemented string_dup without locks
* faster thread cache
* rrdset and rrddim now use dictionaries for indexing
* rrdhost now uses dictionary
* rrdfamily now uses DICTIONARY
* rrdvar using dictionary instead of AVL
* allocate the right size to rrdvar flag members
* rrdhost remaining char * members to STRING *
* better error handling on indexing
* strings now use a read/write lock to allow parallel searches to the index
* removed AVL support from dictionaries; implemented STRING with native Judy calls
* string releases should be negative
* only 31 bits are allowed for enum flags
* proper locking on strings
* string threading unittest and fixes
* fix lgtm finding
* fixed naming
* stream chart/dimension definitions at the beginning of a streaming session
* thread stack variable is undefined on thread cancel
* rrdcontext garbage collect per host on startup
* worker control in garbage collection
* relaxed deletion of rrdmetrics
* type checking on dictfe
* netdata chart to monitor rrdcontext triggers
* Group chart label updates
* rrdcontext better handling of collected rrdsets
* rrdpush incremental transmission of definitions should use as much buffer as possible
* require 1MB per chart
* empty the sender buffer before enabling metrics streaming
* fill up to 50% of buffer
* reset signaling metrics sending
* use the shared variable for status
* use separate host flag for enabling streaming of metrics
* make sure the flag is clear
* add logging for streaming
* add logging for streaming on buffer overflow
* circular_buffer proper sizing
* removed obsolete logs
* do not execute worker jobs if not necessary
* better messages about compression disabling
* proper use of flags and updating rrdset last access time every time the obsoletion flag is flipped
* monitor stream sender used buffer ratio
* Update exporting unit tests
* no need to compare label value with strcmp
* streaming send workers now monitor bandwidth
* workers now use strings
* streaming receiver monitors incoming bandwidth
* parser shift of worker ids
* minor fixes
* Group chart label updates
* Populate context with dimensions that have data
* Fix chart id
* better shift of parser worker ids
* fix for streaming compression
* properly count received bytes
* ensure LZ4 compression ring buffer does not wrap prematurely
* do not stream empty charts; do not process empty instances in rrdcontext
* need_to_send_chart_definition() does not need an rrdset lock any more
* rrdcontext objects are collected, after data have been written to the db
* better logging of RRDCONTEXT transitions
* always set all variables needed by the worker utilization charts
* implemented doubly linked list for most objects; eliminated alarm indexes from rrdhost; and many more fixes
* lockless strings design - string_dup() and string_freez() are totally lockless when they don't need to touch Judy - only Judy is protected with a read/write lock
* STRING code re-organization for clarity
* thread_cache improvements; double numbers precision on worker threads
* STRING_ENTRY now shadows STRING, so no duplicate definition is required; string_length() renamed to string_strlen() to follow the paradigm of all other functions; STRING internal statistics are now only compiled with NETDATA_INTERNAL_CHECKS
* rrdhost index by hostname now cleans up; aclk queries of archived hosts do not index hosts
* Add index to speed up database context searches
* Removed last_updated optimization (was also buggy after latest merge with master)
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
* /api/v1/weights endpoints
* high resolution anomaly rate in parallel with queries; points and options in /api/v1/weights reflect the truth
* context printing
* merged metric_correlations with weights API; added parameter tier to select the tier to run the query; weight api now returns points per tier; added swagger info about weights api
* moved metric_correlations files to web/api/queries as weights
* added contexts filtering; renamed correlated_dimensions; weights API is always enabled; code cleanup
* allow returning zero results
* Tier part 1
* Tier part 2
* Tier part 3
* Tier part 4
* Tier part 5
* Fix some ML compilation errors
* fix more conflicts
* pass proper tier
* move metric_uuid from state to RRDDIM
* move aclk_live_status from state to RRDDIM
* move ml_dimension from state to RRDDIM
* abstracted the data collection interface
* support flushing for mem db too
* abstracted the query api
* abstracted latest/oldest time per metric
* cleanup
* store_metric for tier1
* fix for store_metric
* allow multiple tiers, more than 2
* state to tier
* Change storage type in db. Query param to request min, max, sum or average
* Store tier data correctly
* Fix skipping tier page type
* Add tier grouping in the tier
* Fix to handle archived charts (part 1)
* Temp fix for query granularity when requesting tier1 data
* Fix parameters in the correct order and calculate the anomaly based on the anomaly count
* Proper tiering grouping
* Anomaly calculation based on anomaly count
* force type checking on storage handles
* update cmocka tests
* fully dynamic number of storage tiers
* fix static allocation
* configure grouping for all tiers; disable tiers for unittest; disable statsd configuration for private charts mode
* use default page dt using the tiering info
* automatic selection of tier
* fix for automatic selection of tier
* working prototype of dynamic tier selection
* automatic selection of tier done right (I hope)
* ask for the proper tier value, based on the grouping function
* fixes for unittests and load_metric_next()
* fixes for lgtm findings
* minor renames
* add dbengine to page cache size setting
* add dbengine to page cache with malloc
* query engine optimized to loop as little as required based on the view_update_every
* query engine grouping methods now do not assume a constant number of points per group and they allocate memory with OWA
* report db points per tier in jsonwrap
* query planner that switches database tiers on the fly to satisfy the query for the entire timeframe
* dbengine statistics and documentation (in progress)
* calculate average point duration in db
* handle single point pages the best we can
* handle single point pages even better
* Keep page type in the rrdeng_page_descr
* updated doc
* handle future backwards compatibility - improved statistics
* support &tier=X in queries
* enforce increasing iterations on tiers
* tier 1 is always 1 iteration
* backfilling higher tiers on first data collection
* reversed anomaly bit
* set up to 5 tiers
* natural points should only be offered on tier 0, unless a specific tier is selected
* do not allow more than 65535 points of tier0 to be aggregated on any tier
* Work only on actually activated tiers
* fix query interpolation
* fix query interpolation again
* fix lgtm finding
* Activate one tier for now
* backfilling of higher tiers using raw metrics from lower tiers
* fix for crash on start when storage tiers is increased from the default
* more statistics on exit
* fix bug that prevented higher tiers from getting any values; added backfilling options
* fixed the statistics log line
* removed limit of 255 iterations per tier; moved the code of freezing rd->tiers[x]->db_metric_handle
* fixed division by zero on zero points_wanted
* removed dead code
* Decide on the descr->type for the type of metric
* don't store metrics on unknown page types
* free db_metric_handle on sql based context queries
* Disable STORAGE_POINT value check in the exporting engine unit tests
* fix for db modes other than dbengine
* fix for aclk archived chart queries destroying db_metric_handles of valid rrddims
* fix left-over freez() instead of OWA freez on median queries
Co-authored-by: Costa Tsaousis <costa@netdata.cloud>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
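
The tier entries above mention storing min, max, sum and an anomaly count per higher-tier point, and capping aggregation at 65535 tier 0 points. A minimal Python sketch of that idea follows, assuming a hypothetical `TierPoint` record and `aggregate()` helper; the names and layout are illustrative and are not taken from the dbengine code.

```python
from dataclasses import dataclass

# Cap quoted from the log entry above; everything else here is illustrative.
MAX_TIER0_POINTS = 65535


@dataclass
class TierPoint:
    """Hypothetical higher-tier point summarising several tier 0 samples."""
    min: float
    max: float
    sum: float
    count: int
    anomaly_count: int


def aggregate(samples):
    """Fold (value, is_anomalous) tier 0 samples into one TierPoint."""
    samples = list(samples)
    if not samples:
        raise ValueError("nothing to aggregate")
    if len(samples) > MAX_TIER0_POINTS:
        raise ValueError("too many tier 0 points for a single tier point")
    values = [value for value, _ in samples]
    return TierPoint(
        min=min(values),
        max=max(values),
        sum=sum(values),
        count=len(values),
        anomaly_count=sum(1 for _, anomalous in samples if anomalous),
    )


def average(point):
    """Average is derived at query time from sum and count."""
    return point.sum / point.count


def anomaly_rate(point):
    """Anomaly rate as a percentage of the aggregated samples."""
    return 100.0 * point.anomaly_count / point.count
```

Keeping sum and count separately, rather than a precomputed average, is what lets a query ask for min, max, sum or average from the same stored point.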
* netdata doubles
* fix cmocka test
* fix cmocka test again
* fix left-overs of long double to NETDATA_DOUBLE
* RRDDIM detached from disk representation; db settings in [db] section of netdata.conf
* update the memory before saving
* rrdset is now detached from file structures too
* on memory mode map, update the memory mapped structures on every iteration
* allow RRD_ID_LENGTH_MAX to be changed
* granularity secs, back to update every
* fix formatting
* more formatting
* set grouping functions
* storage engine should check the validity of timestamps, not the query engine
* calculate and store in RRDR anomaly rates for every query
* anomaly rate used by volume metric correlations
* mc volume should use absolute data, to avoid cancelling effect
* return anomaly-rates in jsonwrap with jw-anomaly-rates option to data queries
* don't return null on anomaly rates
* allow passing group query options from the URL
* added countif to the query engine and used it in metric correlations
* fix configure
* fix countif and anomaly rate percentages
* added group_options to metric correlations; updated swagger
* added newline at the end of yaml file
* always check the time the highlighted window was above/below the highlighted window
* properly track time in memory queries
* error for internal checks only
* moved pack_storage_number() into the storage engines
* moved unpack_storage_number() inside the storage engines
* remove old comment
* pass unit tests
* properly detect zero or subnormal values in pack_storage_number()
* fill nulls before the value, not after
* make sure math.h is included
* workaround for isfinite()
* fix for isfinite()
* faster isfinite() alternative
* fix for faster isfinite() alternative
* next_metric() now returns end_time too
* variable step implemented in a generic way
* remove left-over variables
* ensure we always complete the wanted number of points
* fixes
* ensure no infinite loop
* mc-volume-improvements: Add information about invalid condition
* points should have a duration in the past
* removed unneeded info() line
* Fix unit tests for exporting engine
* new_point should only be checked when it is fetched from the db; better comment about the premature breaking of the main query loop
Co-authored-by: Thiago Marques <thiagoftsm@gmail.com>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
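
The entry "mc volume should use absolute data, to avoid cancelling effect" can be illustrated with a small Python sketch. The `volume_shift()` helper below is hypothetical and compares plain means; it is not the formula used by the metric correlations code, only a demonstration of why signed values can hide a change.

```python
def mean(points):
    """Arithmetic mean of a non-empty sequence."""
    return sum(points) / len(points)


def volume_shift(baseline, highlight, absolute=True):
    """Difference between the mean of the highlight and the baseline.

    With absolute=True each point contributes its magnitude, so positive
    and negative values cannot cancel each other out.
    """
    if absolute:
        baseline = [abs(p) for p in baseline]
        highlight = [abs(p) for p in highlight]
    return mean(highlight) - mean(baseline)


# A metric that swings around zero: the swing grows five times larger
# in the highlighted window.
baseline = [1.0, -1.0, 1.0, -1.0]
highlight = [5.0, -5.0, 5.0, -5.0]

signed = volume_shift(baseline, highlight, absolute=False)
unsigned = volume_shift(baseline, highlight, absolute=True)
```

With signed data both windows average to zero and the shift disappears; with absolute data the larger swing is visible.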
* faster correlations
* 4x faster correlations
* a little bit more help
* 10x faster metrics correlations
* 6 digits precision; better comments
* enabled metrics correlations by default
* abstracted DIFFS_NUMBER to allow easily changing it
* reworked the entire logic to have more accuracy and support a baseline that is power of two multiple of highlight
* properly calculate shifts
* even more improved version
* added support for timeout; fixed another memory leak; skipped hidden dimensions
* default timeout 1min
* reduce memory even further
* use dictionary for the list of charts and optimize locks
* return 403 forbidden, when mc is not enabled
* added query options
* don't process zero dimensions
* added volume method as an option to metric correlations ; now metric correlations can support multiple implementations
* make sure we will never crash
* spread results evenly for both kstwo and volume
* fixed bug in query engine that was missing misaligned queries when a single point was requested from the db; improved comments; improved query flags
* updated swagger and added sane defaults; query options are now supported, including anomaly-bit
* added "raw" option to allow cross node correlations; added "group" option to allow different time aggregations; allowed calling metric correlations without any parameters; allowed calling metric correlations with relative timestamps; added timeout to volume method; properly handled timeout on ks2 method; json output now sends all parameters back - same for json_wrap; modified query engine to use present time for relative timestamps; modified "allow_past" to mean both past backwards and forwards
* emulate the old behaviour about zero points
* 100% accuracy against python ks_2samp(); now the default is volume and the default points are 500
* added config option to change default metric correlations method
* removed work-arounds now that rrdlabels are merged
* Add timeout parameter in queries and in calling functions
* Add CANCEL flag in RRDR and code to cancel a query
* Update swagger
* Format swagger file properly
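
The kstwo (ks2) method referenced above is compared against Python's ks_2samp(), presumably `scipy.stats.ks_2samp`. As a rough illustration, the following pure-Python sketch computes only the two-sample Kolmogorov-Smirnov D statistic (the largest gap between the two empirical CDFs); it does not compute the p-value, and the function name is chosen here for illustration.

```python
def ks_2samp_statistic(a, b):
    """Two-sample KS statistic: max |ECDF_a(x) - ECDF_b(x)| over all x."""
    a = sorted(a)
    b = sorted(b)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")

    i = j = 0
    d = 0.0
    while i < n and j < m:
        # Advance both samples past the next smallest value, so ties
        # are consumed together before the CDFs are compared.
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    # Once one sample is exhausted its ECDF is 1, and the gap can only
    # shrink from here, so the maximum has already been recorded.
    return d
```

Identical samples give 0.0 and fully separated samples give 1.0, which is the range a correlation score built on this statistic has to work with.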
- add possibility to change badge font color
- unify color param syntax and fix where not working
Previously, in conflict with the documentation, custom HTML colors were not
possible as label color. This fixes that and makes all `color` parameter values
behave in the same way
* issue #3925 implement optional fixed size badges
* add docu related to #3925
* explicit cast (thiagoftsm) and split to multiline
* implement clipping of text when fixed_width_label enabled
* Update web/api/badges/README.md
improved docu based on suggestion from @joelhans
Co-Authored-By: Joel Hans <joel.g.hans@gmail.com>
* update docu per req. of @thiagoftsm
* health_connection: http_error_pattern
This commit introduces a single pattern for Netdata web server errors;
Netdata now uses a define for every web error
* http_error_pattern: API v1
This PR also applies the pattern to web_api_v1.c
#### Summary
Fixes #3117
Additionally it adds support for UTF-8 in URL parser (as it should).
Label sizes are now updated by the browser with JavaScript (although the initial guess is still calculated by verdana11_widths, with minor improvements)
#### Component Name
API/Badges, LibNetData/URL
#### Additional Information
It was found that not only does verdana11_widths need to be updated, but the URL parser also replaces international characters with spaces (one space per byte of a multibyte character).
Therefore I updated both to support international characters.
This reverts commit 58b7d95a7e.
---
As agreed with @thiago and @cakrit, we revert the URL parser changes
to buy time for a more detailed investigation
---
* modularized exporters
* modularized API data queries
* optimized queries
* modularized API data reduction methods
* modularized api queries
* added new directories in makefiles
* added median db query
* moved all RRDR_GROUPING related to query.h
* added stddev query
* operational median and stddev
* working simple exponential smoothing
* too complex to do it right
* fixed ses
* fixed ses
* rewrote query engine
* fix double-exponential-smoothing
* cleanup
* fixed bug identified by @vlvkobal at rrdset_first_slot()
* enable freeipmi on systems with libipmimonitoring; #4440
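
For the "working simple exponential smoothing" and "fixed ses" entries above, a minimal Python sketch of textbook simple exponential smoothing is shown below. How the query engine seeds the level or chooses alpha is not stated in this log, so seeding with the first value and passing `alpha` explicitly are assumptions made for illustration.

```python
def ses(values, alpha):
    """Simple exponential smoothing.

    level[0] = values[0]
    level[t] = alpha * values[t] + (1 - alpha) * level[t - 1]
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")

    smoothed = []
    level = None
    for value in values:
        if level is None:
            # Assumption: seed the level with the first observation.
            level = value
        else:
            level = alpha * value + (1.0 - alpha) * level
        smoothed.append(level)
    return smoothed
```

A higher `alpha` tracks recent points more closely, while a lower one smooths more aggressively; `alpha = 1` returns the input unchanged.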