mirror of https://github.com/netdata/netdata.git synced 2025-04-22 20:42:33 +00:00
Commit graph

422 commits

Author SHA1 Message Date
Costa Tsaousis
cb7af25c09
RRD structures managed by dictionaries ()
* rrdset - in progress

* rrdset optimal constructor; rrdset conflict

* rrdset final touches

* re-organization of rrdset object members

* prevent use-after-free

* dictionary dfe supports also counting of iterations

* rrddim managed by dictionary

* rrd.h cleanup

* DICTIONARY_ITEM now is referencing actual dictionary items in the code

* removed rrdset linked list

* Revert "removed rrdset linked list"

This reverts commit 690d6a588b4b99619c2c5e10f84e8f868ae6def5.

* removed rrdset linked list

* added comments

* Switch chart uuid to static allocation in rrdset
Remove unused functions

* rrdset_archive() and friends...

* always create rrdfamily

* enable ml_free_dimension

* rrddim_foreach done with dfe

* most custom rrddim loops replaced with rrddim_foreach

* removed accesses to rrddim->dimensions

* removed locks that are no longer needed

* rrdsetvar is now managed by the dictionary

* set rrdset is rrdsetvar, fixes https://github.com/netdata/netdata/pull/13646#issuecomment-1242574853

* conflict callback of rrdsetvar now properly checks if it has to reset the variable

* dictionary registered callbacks accept as first parameter the DICTIONARY_ITEM

* dictionary dfe now uses internal counter to report; avoided excess variables defined with dfe

* dictionary walkthrough callbacks get dictionary acquired items

* dictionary reference counters that can be dupped from zero

* added advanced functions for get and del

* rrdvar managed by dictionaries

* thread safety for rrdsetvar

* faster rrdvar initialization

* rrdvar string lengths should match in all add, del, get functions

* rrdvar internals hidden from the rest of the world

* rrdvar is now acquired throughout netdata

* hide the internal structures of rrdsetvar

* rrdsetvar is now acquired throughout netdata

* rrddimvar managed by dictionary; rrddimvar linked list removed; rrddimvar structures hidden from the rest of netdata

* better error handling

* dont create variables if not initialized for health

* dont create variables if not initialized for health again

* rrdfamily is now managed by dictionaries; references of it are acquired dictionary items

* type checking on acquired objects

* rrdcalc renaming of functions

* type checking for rrdfamily_acquired

* rrdcalc managed by dictionaries

* rrdcalc double free fix

* host rrdvars is always needed

* attempt to fix deadlock 1

* attempt to fix deadlock 2

* Remove unused variable

* attempt to fix deadlock 3

* snprintfz

* rrdcalc index in rrdset fix

* Stop storing active charts and computing chart hashes

* Remove store active chart function

* Remove compute chart hash function

* Remove sql_store_chart_hash function

* Remove store_active_dimension function

* dictionary delayed destruction

* formatting and cleanup

* zero dictionary base on rrdsetvar

* added internal error to log delayed destructions of dictionaries

* typo in rrddimvar

* added debugging info to dictionary

* debug info

* fix for rrdcalc keys being empty

* remove forgotten unlock

* remove deadlock

* Switch to metadata version 5 and drop
  chart_hash
  chart_hash_map
  chart_active
  dimension_active
  v_chart_hash

* SQL cosmetic changes

* do not busy wait while destroying a referenced dictionary

* remove deadlock

* code cleanup; re-organization;

* fast cleanup and flushing of dictionaries

* number formatting fixes

* do not delete configured alerts when archiving a chart

* rrddim obsolete linked list management outside dictionaries

* removed duplicate contexts call

* fix crash when rrdfamily is not initialized

* dont keep rrddimvar referenced

* properly cleanup rrdvar

* removed some locks

* Do not attempt to cleanup chart_hash / chart_hash_map

* rrdcalctemplate managed by dictionary

* register callbacks on the right dictionary

* removed some more locks

* rrdcalc secondary index replaced with linked-list; rrdcalc labels updates are now executed by health thread

* when looking up an alarm, use both chart id and chart name

* host initialization a bit more modular

* init rrdlabels on host update

* preparation for dictionary views

* improved comment

* unused variables without internal checks

* service threads isolation and worker info

* more worker info in service thread

* thread cancelability debugging with internal checks

* strings data races addressed; fixes https://github.com/netdata/netdata/issues/13647

* dictionary modularization

* Remove unused SQL statement definition

* unit-tested thread safety of dictionaries; removed data race conditions on dictionaries and strings; dictionaries can now detect if the caller holds a write lock, in which case all calls automatically become their unsafe versions; all direct calls to the unsafe versions are eliminated

* remove worker_is_idle() from the exit of service functions, because we lose the lock time between loops

* rewritten dictionary to have 2 separate locks, one for indexing and another for traversal

* Update collectors/cgroups.plugin/sys_fs_cgroup.c

Co-authored-by: Vladimir Kobal <vlad@prokk.net>

* Update collectors/cgroups.plugin/sys_fs_cgroup.c

Co-authored-by: Vladimir Kobal <vlad@prokk.net>

* Update collectors/proc.plugin/proc_net_dev.c

Co-authored-by: Vladimir Kobal <vlad@prokk.net>

* fix memory leak in rrdset cache_dir

* minor dictionary changes

* dont use index locks in single threaded

* obsolete dict option

* rrddim options and flags separation; rrdset_done() optimization to keep array of reference pointers to rrddim;

* fix jump on uninitialized value in dictionary; remove double free of cache_dir

* addressed codacy findings

* removed debugging code

* use the private refcount on dictionaries

* make dictionary item destructors work on dictionary destruction; stricter control on dictionary API; proper cleanup sequence on rrddim;

* more dictionary statistics

* global statistics about dictionary operations, memory, items, callbacks

* dictionary support for views - missing the public API

* removed warning about unused parameter

* chart and context name for cloud

* chart and context name for cloud, again

* dictionary statistics fixed; first implementation of dictionary views - not currently used

* only the master can globally delete an item

* context needs netdata prefix

* fix context and chart id of spins

* fix for host variables when health is not enabled

* run garbage collector on item insert too

* Fix info message; remove extra "using"

* update dict unittest for new placement of garbage collector

* we need RRDHOST->rrdvars for maintaining custom host variables

* Health initialization needs the host->host_uuid

* split STRING to its own files; no code changes other than that

* initialize health unconditionally

* unit tests do not pollute the global scope with their variables

* Skip initialization when creating archived hosts on startup. When a child connects it will initialize properly

Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-09-19 23:46:13 +03:00
Ilya Mashchenko
28f55f6614
remove _instance_family label () 2022-09-18 19:44:44 +03:00
Costa Tsaousis
6a82d7bd25
fix typo not deleting collected flag; force removing collected flag on child disconnect () 2022-09-16 16:45:59 +03:00
Stelios Fragkakis
b629d3e310
Advance the buffer properly to scan the journal file () 2022-09-14 19:24:32 +03:00
Stelios Fragkakis
466b1fcc56
Add sqlite page cache hit and miss statistics ()
* Add sqlite page cache hit/ miss statistics

* Proper function definition

Co-authored-by: Vladimir Kobal <vlad@prokk.net>

* Proper function calls

Co-authored-by: Vladimir Kobal <vlad@prokk.net>

Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-09-14 19:22:25 +03:00
Stelios Fragkakis
3b4fc558e1
Use mmap if possible during startup for journal replay ()
Try to mmap the journal files during replay. If the mmap fails, fallback to the old way
2022-09-14 13:00:31 +03:00
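
The mmap-with-fallback approach described in commit 3b4fc558e1 can be illustrated with a minimal sketch. This is not the code from that commit: `struct file_view`, `file_view_open()` and `file_view_close()` are hypothetical names, and the fallback here simply reads the file into a heap buffer, which is only an assumption about what "the old way" might look like.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for loaded file contents (not a netdata structure). */
struct file_view {
    void  *data;    /* start of the file contents */
    size_t size;    /* number of valid bytes */
    int    mapped;  /* 1 if data came from mmap(), 0 if from malloc() + read() */
};

/* Load a whole file read-only: try mmap() first and, if that fails,
 * fall back to reading it into a heap buffer.
 * Returns 0 on success, -1 on failure (empty files are rejected). */
int file_view_open(const char *path, struct file_view *fv) {
    fv->data = NULL;
    fv->size = 0;
    fv->mapped = 0;

    int fd = open(path, O_RDONLY);
    if (fd == -1) return -1;

    struct stat st;
    if (fstat(fd, &st) == -1 || st.st_size <= 0) { close(fd); return -1; }
    size_t size = (size_t)st.st_size;

    void *p = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p != MAP_FAILED) {
        close(fd);               /* the mapping stays valid after close() */
        fv->data = p;
        fv->size = size;
        fv->mapped = 1;
        return 0;
    }

    /* Fallback path: plain read() loop into a heap buffer.
     * For brevity, any read error (including EINTR) is treated as failure. */
    char *buf = malloc(size);
    if (!buf) { close(fd); return -1; }

    size_t done = 0;
    while (done < size) {
        ssize_t n = read(fd, buf + done, size - done);
        if (n <= 0) { free(buf); close(fd); return -1; }
        done += (size_t)n;
    }
    close(fd);

    fv->data = buf;
    fv->size = size;
    return 0;
}

/* Release whichever resource file_view_open() acquired. */
void file_view_close(struct file_view *fv) {
    if (!fv->data) return;
    if (fv->mapped) munmap(fv->data, fv->size);
    else free(fv->data);
    fv->data = NULL;
    fv->size = 0;
    fv->mapped = 0;
}
```

Recording which path succeeded (`mapped`) lets the caller use the same pointer either way while the cleanup still picks the matching release call.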
Stelios Fragkakis
e6870f0e8f
Improve agent shutdown time ()
* Remove dbengine statistics on shutdown since this is now provided by the /api/v1/dbengine_stats endpoint
Skip the memory cleanup if the agent is shutting down and not compiled with NETDATA_INTERNAL_CHECKS

* Report database tier as the event loop is shutting down

* Remove not-useful pointer report in the logfile
2022-09-08 23:45:05 +03:00
Stelios Fragkakis
f0b19a4a22
Fix a memory leak on archived host creation ()
When creating an archived host do not initialize labels (it is done already after host creation)
2022-09-07 23:13:03 +03:00
Costa Tsaousis
642e00348d
fix rrdcontexts left in the post-processing queue from the garbage collector ()
* fix rrdcontexts left in the post-processing queue from the garbage collector

* set the queuing flags atomically, using the dictionary callbacks
2022-09-07 23:02:09 +03:00
Costa Tsaousis
3f6a75250d
Obsolete RRDSET state ()
* move chart_labels to rrdset

* rename chart_labels to rrdlabels

* renamed hash_id to uuid

* turned is_ar_chart into an rrdset flag

* removed rrdset state

* removed unused senders_connected member of rrdhost

* removed unused host flag RRDHOST_FLAG_MULTIHOST

* renamed rrdhost host_labels to rrdlabels

* Update exporting unit tests

Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-09-07 15:28:30 +03:00
Costa Tsaousis
c7d2732ebf
remove forgotten avl structure from rrdcalc () 2022-09-06 23:38:48 +03:00
Costa Tsaousis
58c79fd329
Faster rrdcontext ()
* moved rrdcontexts processing to worker thread

* added loggings

* check for aclk deeper in the code

* removed unnecessary logs

* code re-organization; cleanup; more comments; better error handling; rrdcontext locks optimization; more clarity

* updated 2 comments

* make instances walkthrough reentrant; move context lock to the place it is really needed

* created macro for reentrant dictionary walkthrough

* incremental updates on instances and metrics

* renamed family of rrdcontext workers

* prevent crash in case RRDINSTANCE or RRDMETRIC is freed during shutdown

* prevent crash during rrddim save, on out of memory fatal()

* always post-process contexts

* added tracing for tracking the caller that trigger updates

* more details on tracing info

* fix for charts that are collected without metrics
2022-09-06 19:02:39 +03:00
Costa Tsaousis
5e1b95cf92
Deduplicate all netdata strings ()
* rrdfamily

* rrddim

* rrdset plugin and module names

* rrdset units

* rrdset type

* rrdset family

* rrdset title

* rrdset title more

* rrdset context

* rrdcalctemplate context and removal of context hash from rrdset

* strings statistics

* rrdset name

* rearranged members of rrdset

* eliminate rrdset name hash; rrdcalc chart converted to STRING

* rrdset id, eliminated rrdset hash

* rrdcalc, alarm_entry, alert_config and some of rrdcalctemplate

* rrdcalctemplate

* rrdvar

* eval_variable

* rrddimvar and rrdsetvar

* rrdhost hostname, os and tags

* fix master commits

* added thread cache; implemented string_dup without locks

* faster thread cache

* rrdset and rrddim now use dictionaries for indexing

* rrdhost now uses dictionary

* rrdfamily now uses DICTIONARY

* rrdvar using dictionary instead of AVL

* allocate the right size to rrdvar flag members

* rrdhost remaining char * members to STRING *

* better error handling on indexing

* strings now use a read/write lock to allow parallel searches to the index

* removed AVL support from dictionaries; implemented STRING with native Judy calls

* string releases should be negative

* only 31 bits are allowed for enum flags

* proper locking on strings

* string threading unittest and fixes

* fix lgtm finding

* fixed naming

* stream chart/dimension definitions at the beginning of a streaming session

* thread stack variable is undefined on thread cancel

* rrdcontext garbage collect per host on startup

* worker control in garbage collection

* relaxed deletion of rrdmetrics

* type checking on dictfe

* netdata chart to monitor rrdcontext triggers

* Group chart label updates

* rrdcontext better handling of collected rrdsets

* rrdpush incremental transmission of definitions should use as much buffer as possible

* require 1MB per chart

* empty the sender buffer before enabling metrics streaming

* fill up to 50% of buffer

* reset signaling metrics sending

* use the shared variable for status

* use separate host flag for enabling streaming of metrics

* make sure the flag is clear

* add logging for streaming

* add logging for streaming on buffer overflow

* circular_buffer proper sizing

* removed obsolete logs

* do not execute worker jobs if not necessary

* better messages about compression disabling

* proper use of flags and updating rrdset last access time every time the obsoletion flag is flipped

* monitor stream sender used buffer ratio

* Update exporting unit tests

* no need to compare label value with strcmp

* streaming send workers now monitor bandwidth

* workers now use strings

* streaming receiver monitors incoming bandwidth

* parser shift of worker ids

* minor fixes

* Group chart label updates

* Populate context with dimensions that have data

* Fix chart id

* better shift of parser worker ids

* fix for streaming compression

* properly count received bytes

* ensure LZ4 compression ring buffer does not wrap prematurely

* do not stream empty charts; do not process empty instances in rrdcontext

* need_to_send_chart_definition() does not need an rrdset lock any more

* rrdcontext objects are collected, after data have been written to the db

* better logging of RRDCONTEXT transitions

* always set all variables needed by the worker utilization charts

* implemented double linked list for most objects; eliminated alarm indexes from rrdhost; and many more fixes

* lockless strings design - string_dup() and string_freez() are totally lockless when they dont need to touch Judy - only Judy is protected with a read/write lock

* STRING code re-organization for clarity

* thread_cache improvements; double numbers precision on worker threads

* STRING_ENTRY now shadows STRING, so no duplicate definition is required; string_length() renamed to string_strlen() to follow the paradigm of all other functions; STRING internal statistics are now only compiled with NETDATA_INTERNAL_CHECKS

* rrdhost index by hostname now cleans up; aclk queries of archived hosts do not index hosts

* Add index to speed up database context searches

* Removed last_updated optimization (was also buggy after latest merge with master)

Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-09-05 19:31:06 +03:00
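
Commit 5e1b95cf92 mentions a lockless design in which duplicating and releasing a string only needs the index lock when the index itself must change. One common way to get that property is an atomic reference count, sketched below with C11 atomics. This is an illustration only, not netdata's STRING implementation: `struct refcounted`, `ref_acquire()` and `ref_release()` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object (stands in for an interned string). */
struct refcounted {
    atomic_int refcount;
};

/* Take one more reference without any lock, but only while the object
 * is still alive (count > 0). Returns false if the count already reached
 * zero, meaning the caller must go through the locked index instead. */
bool ref_acquire(struct refcounted *o) {
    int cur = atomic_load(&o->refcount);
    while (cur > 0) {
        /* On failure, cur is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&o->refcount, &cur, cur + 1))
            return true;
    }
    return false;
}

/* Drop one reference without any lock. Returns true when the caller
 * released the last reference and is therefore responsible for removing
 * the object from the index (the only step that would need the lock). */
bool ref_release(struct refcounted *o) {
    return atomic_fetch_sub(&o->refcount, 1) == 1;
}
```

In this sketch the compare-and-swap loop is what prevents a reference from being taken on an object whose count has already dropped to zero.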
Stelios Fragkakis
544aef1fde
Clean chart hash map ()
Clean chart hash map after 7 days (eventually to be removed)
2022-09-05 10:22:57 +03:00
Stelios Fragkakis
aacba0b597
Use prepared statements for context related queries () 2022-09-01 16:23:21 +03:00
Emmanuel Vasilakis
dbfc1c2779
Don't try to load db rows when chart_id or dim_id is null ()
dont try to load rows with chart_id or dim_id missing
2022-09-01 14:02:59 +03:00
Costa Tsaousis
77b0e7bccd
sqlite3 global statistics () 2022-08-31 10:04:14 +03:00
Costa Tsaousis
84b07e6e28
prevent crash on rrdcontext apis when rrdcontexts is not initialized ()
prevent crash
2022-08-27 18:50:48 +03:00
Ilya Mashchenko
142e2ee23d
chore: removing logging that a chart collection in the same interpolation point () 2022-08-25 15:29:31 +03:00
Timotej S
971fe35547
Remove aclk_api.[ch] ()
* get rid of aclk_starter middleman
* get rid of aclk_api.[ch]
2022-08-24 10:41:14 +02:00
Emmanuel Vasilakis
b955368a90
Prefer context attributes from non archived charts ()
prefer context attributes from non archived charts
2022-08-23 18:14:53 +03:00
Emmanuel Vasilakis
c414aca77a
Fix coverity 380387 ()
check rc->rrdset
2022-08-22 16:11:16 +03:00
Emmanuel Vasilakis
46ad4ff727
Schedule next rotation based on absolute time ()
schedule rotation_after based on now
2022-08-18 16:46:32 +03:00
Emmanuel Vasilakis
708efb41bd
Support chart labels in alerts ()
* chart labels for alerts

* proper termination

* use strchr

* change if statement

* change label variable. add docs

* change doc

* assign buf to temp

* use new dictionary functions

* reduce variable scope

* reduce line length

* make sure rrdcalc updates labels after inserted

* reduce var scope

* add rrdcalc.c for cmocka tests

* Revert "add rrdcalc.c for cmocka tests"

This reverts commit 5fe122adcf.

* Fix cmocka unit tests

* valgrind errors

Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-08-16 10:33:08 +03:00
Emmanuel Vasilakis
21327b9f1f
Print rrdcontexts versions with PRIu64 ()
print with PRIu64
2022-08-12 13:09:24 +03:00
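
For context on commit 21327b9f1f: `PRIu64` from `<inttypes.h>` expands to the correct `printf` conversion for `uint64_t` on each platform, which avoids mismatches such as `%lu` versus `%llu`. A minimal sketch follows; `format_version()` is a hypothetical helper, not a netdata function.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit version number portably.
 * The PRIu64 macro is concatenated into the format string at compile time. */
int format_version(char *dst, size_t len, uint64_t version) {
    return snprintf(dst, len, "version %" PRIu64, version);
}
```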
Costa Tsaousis
37ba5c1a29
rrdcontexts allow not linked dimensions and charts () 2022-08-09 16:29:24 +03:00
Emmanuel Vasilakis
a314f637de
Add chart_context to alert snapshots () 2022-08-06 18:31:56 +03:00
Ilya Mashchenko
4b115f20b4
docs: fix unresolved file references () 2022-08-05 14:30:55 +03:00
Emmanuel Vasilakis
2fd2607475
Send chart context with alert events to the cloud ()
* add chart context to alert events

* migrate health log tables to add chart_context

* send it via proto message

* add from v3 to v4

* free table

* free chart_context
2022-08-04 10:18:53 +03:00
Stelios Fragkakis
bf9c180746
Enable rrdcontexts by default () 2022-08-03 14:46:58 +03:00
Stelios Fragkakis
e3f1535053
Fix tests so that the actual metadata database is not accessed ()
* Add simple ctx_unittest under -W unittest

* Skip unneeded initialization if running unittests -- make sure the context database is initialized in memory mode

* Remove tests (no metadata is available at this point)
2022-08-02 18:38:42 +03:00
Stelios Fragkakis
6d2fd88861
Handle cases where entries were stored as text (with strftime("%s")) () 2022-08-02 18:12:41 +03:00
Stelios Fragkakis
af1badd3cd
Remove the single threaded arrayallocator optimization during agent startup ()
Needs multithreading if data rotation needs to happen during startup
2022-08-02 17:17:13 +03:00
Stelios Fragkakis
e22dff4b1f
Load host labels for archived hosts ()
* Add function to load labels from the database for a host and return a DICTIONARY

* Link labels to the newly created archived host
2022-08-02 17:16:19 +03:00
Costa Tsaousis
ccf0f6b6f4
/api/v1/weights endpoint ()
* /api/v1/weights endpoints

* high resolution anomaly rate in parallel with queries; points and options in /api/v1/weights reflect the truth

* context printing

* merged metric_correlations with weights API; added parameter tier to select the tier to run the query; weight api now returns points per tier; added swagger info about weights api

* moved metric_correlations files to web/api/queries as weights

* added contexts filtering; renamed correlated_dimensions; weights API is always enabled; code cleanup

* allow returning zero results
2022-08-01 21:47:14 +03:00
Costa Tsaousis
bfe964bfbb
rrdcontext support for hidden charts ()
* rrdcontext support for hidden charts

* support unhiding charts
2022-08-01 12:17:38 +03:00
Stelios Fragkakis
9a6fd6366f
Store host label information in the metadata database ()
* Create new host_label table

* Add generic function to store chart and host labels
Add function to store host labels
Cleanup old host labels

* Store labels from localhost and children (streaming)
Remove the host label info if the host is deleted

* Delete host labels before insert
2022-07-28 21:53:05 +03:00
Costa Tsaousis
68888d403f
additional stats () 2022-07-28 15:29:38 +03:00
Emmanuel Vasilakis
fcd5748293
Delete aclk_alert table on start streaming from seq 1 batch 1 ()
delete aclk_alert table on start streaming from seq 1 batch 1
2022-07-28 11:10:26 +03:00
Stelios Fragkakis
50bf6c18b4
Fix agent crash when archived host has not been registered to the cloud ()
If archived host has no node id, do not crash
2022-07-27 13:48:09 +03:00
Costa Tsaousis
e73df78a06
Tiering statistics API endpoint ()
* calculator statistics

* added metrics and metrics_pages counters

* implemented API

* updates to match sheet

* updates to match sheet No2

* fix update every calculation for single point pages

* fix lgtm finding
2022-07-26 12:05:21 +03:00
Emmanuel Vasilakis
98e77284cb
Set value to SN_EMPTY_SLOT if flags is SN_EMPTY_SLOT ()
* set value to SN_EMPTY_SLOT if flags is SN_EMPTY_SLOT

* SN_EMPTY_SLOT should be SN_ANOMALOUS_ZERO

* added the const attribute to pack_storage_number()

* tier1 uses floats

* zero should not be empty slot

* add unlikely

* rename all SN flags to be more meaningful

* proper check for zero double value

* properly check if pages are full with empty points

Co-authored-by: Costa Tsaousis <costa@netdata.cloud>
2022-07-26 11:37:38 +03:00
Costa Tsaousis
291b978282
Rrdcontext ()
* type checking on dictionary return values

* first STRING implementation, used by DICTIONARY and RRDLABEL

* enable AVL compilation of STRING

* Initial functions to store context info

* Call simple test functions

* Add host_id when getting charts

* Allow host to be null and in this case it will process the localhost

* Simplify init
Do not use strdupz - link directly to sqlite result set

* Init the database during startup

* make it compile - no functionality yet

* intermediate commit

* intermediate

* first interface to sql

* loading instances

* check if we need to update cloud

* comparison of rrdcontext on conflict

* merge context titles

* rrdcontext public interface; statistics on STRING; scratchpad on DICTIONARY

* dictionaries maintain version numbers; rrdcontext api

* cascading changes

* first operational cleanup

* string unittest

* proper cleanup of referenced dictionaries

* added rrdmetrics

* rrdmetric starting retention

* Add fields to context
Adjust context creation and delete

* Memory cleanup

* Fix get context list
Fix memory double free in tests
Store context with two hosts

* calculated retention

* rrdcontext retention with collection

* Persist database and shutdown

* loading all from sql

* Get chart list and dimension list changes

* fully working attempt 1

* fully working attempt 2

* missing archived flag from log

* fixed archived / collected

* operational

* proper cleanup

* cleanup - implemented all interface functions - dictionary react callback triggers after the dictionary is unlocked

* track all reasons for changes

* proper tracking of reasons of changes

* fully working thread

* better versioning of contexts

* fix string indexing with AVL

* running version per context vs hub version; ifdef dbengine

* added option to disable rrdmetrics

* release old context when a chart changes context

* cleanup properly

* renamed config

* cleanup contexts; general cleanup;

* deletion inline with dequeue; lots of cleanup; child connected/disconnected

* ml should start after rrdcontext

* added missing NULL to ri->rrdset; rrdcontext flags are now only changed under a mutex lock

* fix buggy STRING under AVL

* Rework database initialization
Add migration logic to the context database

* fix data race conditions during context deletion

* added version hash algorithm

* fix string over AVL

* update aclk-schemas

* compile new ctx related protos

* add ctx stream message utils

* add context messages

* add dummy rx message handlers

* add the new topics

* add ctx capability

* add helper functions to send the new messages

* update cmake build to not fail

* update topic names

* handle rrdcontext_enabled

* add more functions

* fatal on OOM cases instead of return NULL

* silence unknown query type error

* fully working attempt 1

* fully working attempt 2

* allow compiling without ACLK

* added family to the context

* removed excess character in UUID

* smarter merging of titles and families

* Database migration code to add family
Add family to SQL_CHART_DATA and VERSIONED_CONTEXT_DATA

* add family to context message

* enable ctx in communication

* hardcoded enabled contexts

* Add hard code for CTX

* add update node collectors to json

* add context message log

* fix log about last_time_t

* fix collected flags for queued items

* prevent crash on charts cleanup

* fix bug in AVL indexing of dictionaries; make sure react callback of dictionaries has a reference counter, which is acquired while the dictionary is locked

* fixed dictionary unittest

* strict policy to cleanup and garbage collector

* fix db rotation and garbage collection timings

* remove deadlock

* proper garbage collection - a lot faster retention recalculation

* Added not NULL in database columns
Remove migration code for context -- we will ship with version 1 of the table schema
Added define for query in tests to detect localhost

* Use UUID_STR_LEN instead of GUID_LEN + 1
Use realistic timestamps when adding test data in the database

* Add NULL checks for passed parameters

* Log deleted context when compiled with NETDATA_INTERNAL_CHECKS

* Error checking for null host id

* add missing ContextsCheckpoint log convertor

* Fix spelling in VACCUM

* Hold additional information for host -- prepare to load archived hosts on startup

* Make sure claim id is valid

* is_get_claimed is actually get the current claim id

* Simplify ctx get chart list query

* remove env negotiation

* fix string unittest when there are some strings already in the index

* propagate live-retention flag upstream; cleanup all update reasons; updated instances logging; automated attaching started/stopped collecting flags;

* first implementation of /api/v1/contexts

* full contexts API; updated swagger

* disabled debugging; rrdcontext enabled by default

* final cleanup and renaming of global variables

* return current time on currently collected contexts, charts and dimensions

* added option "deepscan" to the API to have the server refresh the retention and recalculate the contexts on the fly

* fixed indentation of yaml

* Add constraints to the host table

* host->node_id may not be available

* new capabilities

* lock the context while rendering json

* update aclk-schemas

* added permanent labels to all charts about plugin, module and family; added labels to all proc plugin modules

* always add the labels

* allow merging of families down to [x]

* dont show uuids by default, added option to enable them; response is now accepting after,before to show only data for a specific timeframe; deleted items are only shown when "deleted" is requested; hub version is now shown when "queue" is requested

* Use the localhost claim id

* Fix to handle host constraints better

* cgroups: add "k8s." prefix to chart context in k8s

* Improve sqlite metadata version migration check

* empty values set to "[none]"; fix labels unit test to reflect that

* Check if we reached the version we want first (address CODACY report re: Array index 'i' is used before limits check)

* Rewrite condition to address CODACY report (Redundant condition: t->filter_callback. '!A || (A && B)' is equivalent to '!A || B')

* Properly unlock context

* fixed memory leak on rrdcontexts - it was not freeing all dictionaries in rrdhost; added wait of up to 100ms on dictionary_destroy() to give time to dictionaries to release their items before destroying them

* fixed memory leak on rrdlabels not freed on rrdinstances

* fixed leak when dimensions and charts are redefined

* Mark entries for charts and dimensions as submitted to the cloud 3600 seconds after their creation
Mark entries for charts and dimensions as updated (confirmed by the cloud) 1800 seconds after their submission

* renamed struct string

* update cgroups alarms

* fixed codacy suggestions

* update dashboard info

* fix k8s_cgroup_10s_received_packets_storm alarm

* added filtering options to /api/v1/contexts and /api/v1/context

* fix eslint

* fix eslint

* Fix pointer binding for host / chart uuids

* Fix cgroups unit tests

* fixed non-retention updates not propagated upstream

* removed non-fatal fatals

* Remove context from 2 way string merge.

* Move string_2way_merge to dictionary.c

* Add 2-way string merge tests.

* split long lines

* fix indentation in netdata-swagger.yaml

* update netdata-swagger.json

* yamllint please

* remove the deleted flag when a context is collected

* fix yaml warning in swagger

* removed non-fatal fatals

* charts should now be able to switch contexts

* allow deletion of unused metrics, instances and contexts

* keep the queued flag

* cleanup old rrdinstance labels

* dont hide objects when there is no filter; mark objects as deleted when there are no sub-objects

* delete old instances once they changed context

* delete all instances and contexts that do not have sub-objects

* more precise transitions

* Load archived hosts on startup (part 1)

* update the queued time every time

* disable by default; dedup deleted dimensions after snapshot

* Load archived hosts on startup (part 2)

* delayed processing of events until charts are being collected

* remove dont-trigger flag when object is collected

* polish all triggers given the new dont_process flag

* Remove always true condition
Enums for readability / create_host_callback only if ACLK is enabled (for now)

* Skip retention message if context streaming is enabled
Add messages in the access log if context streaming is enabled

* Check for node id being a UUID that can be parsed
Improve error check / reporting when loading archived hosts and creating ACLK sync threads

* collected, archived, deleted are now mutually exclusive

* Enable the "orphan" handling for now
Remove dead code
Fix memory leak on free host

* Queue charts and dimensions will be no-op if host is set to stream contexts

* removed unused parameter and made sure flags are set on rrdcontext insert

* make the rrdcontext thread abort mid-work when exiting

* Skip chart hash computation and storage if contexts streaming is enabled

Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
Co-authored-by: Timo <timotej@netdata.cloud>
Co-authored-by: ilyam8 <ilya@netdata.cloud>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
Co-authored-by: Vasilis Kalintiris <vasilis@netdata.cloud>
2022-07-24 22:33:09 +03:00
Stelios Fragkakis
fccfc02d2c
Add missing comma (handle coverity warning CID 379360) ()
Add missing comma (handle coverity warning) *** CID 379360:  Incorrect expression  (MISSING_COMMA)
2022-07-21 10:14:19 +03:00
Stelios Fragkakis
754b542242
Store host system information in the database ()
* Hold host info in the database
Add functions to store / fetch

* Store the system info

* Delete the relevant host_info where the host is deleted

* Remove redundant call to store system info
Use const in parameters to sql_store_host_system_info_key_value and sql_store_host_system_info

* Do not crash if no system info is given

* Fix missing finalize
2022-07-20 13:08:52 +03:00
Tasos Katsoulas
bc5ba4f891
Update docs on metric storage ()
This PR 

- Explains the new tiering mechanism.
- Housekeeping docs about Agent's database options.
- Updates all the configuration options for the `dbengine`.
- Provide a new way for the users to calculate the space they need for their 
  metric storage needs (via a spreadsheet)

Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
Co-authored-by: DShreve2 <david@netdata.cloud>
2022-07-14 17:16:12 +03:00
Stelios Fragkakis
87e9700b2f
Detect stored metric size by page type ()
* Report unknown page only once
Get metric storage size by the page type
Verify validity of the page and skip problematic ones

* Change PAGE_SIZE to PAGE_POINT_SIZE_BYTES

* Add bitmap256 and unittests

* Fix unit test
tier_page_type array
page_type_size arrays

* Add another counter to not rely on uint8_t overflow to stop the test loop
2022-07-11 20:40:26 +03:00
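
The last bullet of commit 87e9700b2f concerns a loop over all 256 values of a `uint8_t` index: a condition such as `i <= 255` is always true for that type, so the loop can only stop by relying on wraparound. A separate, wider counter bounds the loop explicitly. The sketch below illustrates the idea with a hypothetical 256-bit bitmap; `struct bitmap256_sketch` and its helpers are made-up names and not the bitmap256 code added by the commit.

```c
#include <stdint.h>

/* Hypothetical 256-bit bitmap stored as four 64-bit words. */
struct bitmap256_sketch {
    uint64_t data[4];
};

static void bitmap_set(struct bitmap256_sketch *b, uint8_t bit) {
    b->data[bit / 64] |= (uint64_t)1 << (bit % 64);
}

static int bitmap_get(const struct bitmap256_sketch *b, uint8_t bit) {
    return (int)((b->data[bit / 64] >> (bit % 64)) & 1);
}

/* Set every bit, then count how many are set.
 * `count` is an int, so `count < 256` terminates the loops; `bit` is
 * allowed to wrap from 255 back to 0, which is well defined for
 * unsigned types but cannot be used as a stop condition. */
int set_and_count_all_bits(void) {
    struct bitmap256_sketch b = { {0, 0, 0, 0} };
    uint8_t bit = 0;
    int found = 0;

    for (int count = 0; count < 256; count++, bit++)
        bitmap_set(&b, bit);

    bit = 0;
    for (int count = 0; count < 256; count++, bit++)
        found += bitmap_get(&b, bit);

    return found;
}
```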
Emmanuel Vasilakis
7a67355e15
Send node info message sooner ()
send node info sooner
2022-07-11 12:30:54 +03:00
Costa Tsaousis
329ef5ebef
fix crash on start on slow disks because ml is initialized before dbengine starts () 2022-07-08 20:18:07 +03:00
Emmanuel Vasilakis
fc8affaabd
Fix coverity 379241 ()
fix coverity 379241
2022-07-08 10:41:38 +03:00