netdata_netdata

mirror of https://github.com/netdata/netdata.git synced 2025-05-15 22:10:42 +00:00

Author	SHA1	Message	Date
Costa Tsaousis	c3d70ffcb4	WEBRTC for communication between agents and browsers (#14874 ) * initial webrtc setup * missing files * rewrite of webrtc integration * initialization and cleanup of webrtc connections * make it compile without libdatachannel * add missing webrtc_initialize() function when webrtc is not enabled * make c++17 optional * add build/m4/ax_compiler_vendor.m4 * add ax_cxx_compile_stdcxx.m4 * added new m4 files to makefile.am * id all webrtc connections * show warning when webrtc is disabled * fixed message * moved all webrtc error checking inside webrtc.cpp * working webrtc connection establishment and cleanup * remove obsolete code * rewrote webrtc code in C to remove dependency for c++17 * fixed left-over reference * detect binary and text messages * minor fix * naming of webrtc threads * added webrtc configuration * fix for thread_get_name_np() * smaller web_client memory footprint * universal web clients cache * free web clients every 100 uses * webrtc is now enabled by default only when compiled with internal checks * webrtc responses to /api/ requests, including LZ4 compression * fix for binary and text messages * web_client_cache is now global * unification of the internal web server API, for web requests, aclk request, webrtc requests * more cleanup and unification of web client timings * fixed compiler warnings * update sent and received bytes * eliminated of almost all big buffers in web client * registry now uses the new json generation * cookies are now an array; fixed redirects * fix redirects, again * write cookies directly to the header buffer, eliminating the need for cookie structures in web client * reset the has_cookies flag * gathered all web client cleanup to one function * fixes redirects * added summary.globals in /api/v2/data response * ars to arc in /api/v2/data * properly handle host impersonation * set the context of mem.numa_nodes	2023-04-20 20:49:06 +03:00
Costa Tsaousis	4c82a15651	/api/v2 part 8 (#14885 ) * configure extent cache size * /api/v2/data responsed with the id of dimensions in result.labels * provide the original units and sts to db.dimensions * updated swagger for the changes	2023-04-10 16:24:45 +03:00
Costa Tsaousis	204dd9ae27	Boost dbengine (#14832 ) * configure extent cache size * workers can now execute up to 10 jobs in a run, boosting query prep and extent reads * fix dispatched and executing counters * boost to the max * increase libuv worker threads * query prep always get more prio than extent reads; stop processing in batch when dbengine is queue is critical * fix accounting of query prep * inlining of time-grouping functions, to speed up queries with billions of points * make switching based on a local const variable * print one pending contexts loading message per iteration * inlined store engine query API * inlined storage engine data collection api * inlined all storage engine query ops * eliminate and inline data collection ops * simplified query group-by * more error handling * optimized partial trimming of group-by queries * preparative work to support multiple passes of group-by * more preparative work to support multiple passes of group-by (accepts multiple group-by params) * unified query timings * unified query timings - weights endpoint * query target is no longer a static thread variable - there is a list of cached query targets, each of which of freed every 1000 queries * fix query memory accounting * added summary.dimension[].pri and sorted summary.dimensions based on priority and then name * limit max ACLK WEB response size to 30MB * the response type should be text/plain * more preparative work for multiple group-by passes * create functions for generating group by keys, ids and names * multiple group-by passes are now supported * parse group-by options array also with an index * implemented percentage-of-instance group by function * family is now merged in multi-node contexts * prevent uninitialized use	2023-04-07 21:25:01 +03:00
Costa Tsaousis	8a036f0b24	/api/v2/X part 7 (#14797 ) * /api/v2/weights, points key renamed to result * /api/v2/weights, add node ids in response * /api/v2/data remove NONZERO flag when all dimensions are zero and fix MIN/MAX grouping and statistics * /api/v2/data expose view.dimensions.sts{} * /api/v2 endpoints expose agents and additional info per node, that is needed to unify cloud responses * /api/v2 nodes output now includes the duration of time spent per node * jsonwrap view object renames and cleanup * rework of the statistics returned by the query engine * swagger work * swagger work * more swagger work * updated swagger json * added the remaining of the /api/v2 endpoints to swagger * point.ar has been renamed point.arp * updated weights endpoint * fix compilation warnings	2023-03-28 15:23:03 +03:00
Costa Tsaousis	5eed0545d4	/api/v2/X part 5 (#14718 ) * query timestamps are now pre-determined and alignment on timestamps is guarranteed * turn internal_fatal() to internal_error() to investigate the issue * handle query when no data exist in the db * check for non NULL dict when running dictionary garbage collect * support API v2 requests via ACLK * add nodes detailed information to /api/v2/nodes * fixed keys and added dummy nodes for completeness * added nodes_hard_hash, alerts_hard_hash, alerts_soft_hash; started building a nodes status object to reflect the current status of a node * make sure replication does not double count charts that are already being replicated * expose min and max in sts structures * added view_minimum_value and view_maximum_value; percentage calculation is now an additional pass on the data, removed from formatters; absolute value calculation is now done at the query level, removed from formatters * respect trimming in percentage calculation; updated swagger * api/v2/weights preparative work to support multi-node queries - still single node though * multi-node /api/v2/weights endpoint, supporting all the filtering parameters of /api/v2/data * when passing the raw option, the query exposes the hidden dimensions * fix compilation issues on older systems * the query engine now calculates per dimension min, max, sum, count, anomaly count * use the macro to calculate storage point anomaly rate * weights endpoint exposing version hashes * weights method=value shows min, max, average, sum, count, anomaly count, anomaly rate * query: expose RESET flag; do not add the same point multiple times to the aggregated point * weights: more compact output * weights requests can be interrupted * all /api/v2 requests can be interrupted and timeout * allow relative timestamps in weights * fix macos compilation warnings * Revert "fix macos compilation warnings" This reverts commit `8a1d24e41e`. * /api/v2/data group-by now works on dimension names, not ids * /api/v2/weights does not query metrics without retention and new output format * /api/v2/weights value and anomaly queries do context queries when contexts are filtered; query timeout is now always in ms	2023-03-21 21:53:47 +02:00
Costa Tsaousis	cd50bf4236	/api/v2 part 4 (#14706 ) * expose the order of group by * key renames in json wrapper v2 * added group by context and group by units * added view_average_values * fix for view_average_values when percentage is specified * option group-by-labels is enabling the exposure of all the labels that are used for each of the final grouped dimensions * when executing group by queries, allocate one dimension data at a time - not all of them * respect hidden dimensions * cancel running data query on socket error * use poll to detect socket errors * use POLLRDHUP to detect half closed connections * make sure POLLRDHUP is available * do not destroy aral-by-size arals * completed documentation of /api/v2/data. * moved min, max back to view; updated swagger yaml and json * default format for /api/v2/data is json2	2023-03-13 23:39:06 +02:00
Costa Tsaousis	cf85c3b0e9	/api/v2/X improvements part 3 (#14665 ) * max web request size to 64KB * fix the request too big message * increase max request reading tries to 100 * support for bigger web requests * add "avg" as a shortcut for "average" to both group by aggregation and time aggregation; discard the last partial points of a query in play mode, up to max update every; group by hidden dimensions too * better implementation for partial data trimming * added group_by=selected to return only one dimension for all selected metrics * fix acceptance of group_by=selected * passing option "raw" disables partial data trimming * remove obsolete option "plan"; use "debug" * fix view.min and view.max calculation - there were 2 bugs: a) min and max were reset for every row and b) min and max were corrupted by GBC and AR printing * per row annotations * added time column to point annotations * disable caching for /api/v2/contexts responses * added api format json2 that returns an array for each points, having all the point values and annotations in them * work on swagger about /api/v2 * prevent infinite loop * cleanup and swagger work * allow negative simple pattern expressions to work as expected * do not lookup in the dictionary empty names * garbage collect dictionaries * make query_target allocate less aggressively; queries fill the remaining points with nulls * reusable query ops to save memory on huge queries * move parts of query plans into query ops to save query target memory * remove storage engine from query metric tiers, to save memory, and recalculate it when it is needed	2023-03-10 12:41:14 +02:00
Costa Tsaousis	021e252fc5	/api/v2/contexts (#14592 ) * preparation for /api/v2/contexts * working /api/v2/contexts * add anomaly rate information in all statistics; when sum-count is requested, return sums and counts instead of averages * minor fix * query targegt now accurately counts hosts, contexts, instances, dimensions, metrics * cleanup /api/v2/contexts * full text search with /api/v2/contexts * simple patterns now support the option to search ignoring case * full text search API with /api/v2/q * simple pattern execution optimization * do not show q when not given * full text search accounting * separated /api/v2/nodes from /api/v2/contexts * fix ssv queries for group_by * count query instances queried and failed per context and host * split rrdcontext.c to multiple files * add query totals * fix anomaly rate calculation; provide "ni" for indexing hosts * do not generate zero valued members * faster calculation of anomaly rate; by just summing integers for each db points and doing math once for every generated point * fix typo when printing dimensions totals * added option minify to remove spaces and newlines fron JSON output * send instance ids and names when they differ * do not add in query target dimensions, instances, contexts and hosts for which there is no retention in the current timeframe * fix for the previous + renames and code cleanup * when a dimension is filtered, include in the response all the other dimensions that are selectable * do not add nodes that do not have retention in the current window * move selection of dimensions to query_dimension_add(), instead of query_metric_add() * increase the pre-processing capacity of queries * generate instance fqdn ids and names only when they are needed * provide detailed statistics about tiers retention, queries, points, update_every * late allocation of query dimensions * cleanup * more cleanup * support for annotations per displayed point, RESET and PARTIAL * new type annotations * if a chart is not linked to contexts and it is collected, link it when it is collected * make ML run reentrant * make ML rrdr query synchronous * optimize replication memory allocation of replication_sort_entry * change units to percentage, when requesting a coefficinet of variation, or a percentage query * initialize replication before starting main threads * properly decrement no room requests counter * propagate the non-zero flag to group-by * the same by avoiding the extra loop * respect non-zero in all dimension arrays * remove dictionary garbage collection from dictionary_entries() and dictionary_version() * be more verbose when jv2 indexing is postponed * prevent infinite loop * use hidden dimensions even when dimensions pattern is unset * traverse hosts using dictionaries * fix dictionary unittests	2023-03-02 22:50:48 +02:00
Costa Tsaousis	1cfad181a8	/api/v2/data - multi-host/context/instance/dimension/label queries (#14564 ) * fundamentals for having /api/v2/ working * use an atomic to prevent writing to internal pipe too much * first attempt of multi-node, multi-context, multi-chart, multi-dimension queries * v2 jsonwrap * first attempt for group by * cleaned up RRDR and fixed group by * improvements to /api/v2/api * query instance may be realloced, so pointers to it get invalid; solved memory leaks * count of quried metrics in summary information * provide detailed information about selected, excluded, queried and failed metrics for each entity * select instances by fqdn too * add timing information to json output * link charts to rrdcontexts, if a query comes in and it is found unlinked * calculate min, max, sum, average, volume, count per metric * api v2 parameters naming * renders alerts and units * render machine_guid and node_id in all sections it is relevant * unified keys * group by now takes into account units and when there are multiple units involved, it creates a dimension per unit * request and detailed are hidden behind an option * summary includes only a flattened list of alerts * alert counts per host and instance * count of grouped metrics per dimension * added contexts to summary * added chart title * added dimension priorities and chart type * support for multiple group by at the same time * minor fixes * labels are now a tree * keys uniformity * filtering by alerts, both having a specific alert and having a specific alert in a specific status * added scope of hosts and contexts * count of instances on contexts and hosts * make the api return valid responses even when the response contains no data * calculate average and contribution % for every item in the summary * fix compilation warnings * fix compilation warnings - again	2023-02-22 22:30:40 +02:00
Costa Tsaousis	d2daa19bf5	JSON internal API, IEEE754 base64/hex streaming, weights endpoint optimization (#14493 ) * first work on standardizing json formatting * renamed old grouping to time_grouping and added group_by * add dummy functions to enable compilation * buffer json api work * jsonwrap opening with buffer_json_X() functions * cleanup * storage for quotes * optimize buffer printing for both numbers and strings * removed ; from define * contexts json generation using the new json functions * fix buffer overflow at unit test * weights endpoint using new json api * fixes to weights endpoint * check buffer overflow on all buffer functions * do synchronous queries for weights * buffer_flush() now resets json state too * content type typedef * print double values that are above the max 64-bit value * str2ndd() can now parse values above UINT64_MAX * faster number parsing by avoiding double calculations as much as possible * faster number parsing * faster hex parsing * accurate printing and parsing of double values, even for very large numbers that cannot fit in 64bit integers * full printing and parsing without using library functions - and related unit tests * added IEEE754 streaming capability to enable streaming of double values in hex * streaming and replication to transfer all values in hex * use our own str2ndd for set2 * remove subnormal check from ieee * base64 encoding for numbers, instead of hex * when increasing double precision, also make sure the fractional number printed is aligned to the wanted precision * str2ndd_encoded() parses all encoding formats, including integers * prevent uninitialized use * /api/v1/info using the new json API * Fix error when compiling with --disable-ml * Remove redundant 'buffer_unittest' declaration * Fix formatting * Fix formatting * Fix formatting * fix buffer unit test * apps.plugin using the new JSON API * make sure the metrics registry does not accept negative timestamps * do not allow pages with negative timestamps to be loaded from db files; do not accept pages with negative timestamps in the cache * Fix more formatting --------- Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>	2023-02-15 21:16:29 +02:00
Costa Tsaousis	82150596e7	do not report dimensions that failed to be queried (#14447 ) * do not report dimensions that failed to be queried * renamed SELECTED to QUERIED to have clarity on what it means * fix wrong placement of continue	2023-02-07 11:25:41 +02:00
Costa Tsaousis	c83abcfb9d	DBENGINE v2 - improvements part 1 (#14251 ) * allow running multiple evictors and flushers * flipped aggressive and critical evictions * dont run more than 1 evictor * switch to batch evictions when the size of the cache is critical * remove batching of evictions * dedup extent load pending requests * accounting for merged extents * always use double linked list * add extent merging to the overall cache hit ratio * support requeuing merged extents to higher priorities * fix function name * query planner now prefers higher tiers even when they miss some data at the end, which it fills from lower tiers; adding the option "plan" to jsonwrap now renders the query plan * update statistics after every dimension completes * use the retention of all tiers to calculate coverage per tier * use the original window of the query for the planner * give 2.5% befenit for each higher tier * update cmd->priority so that it be requeued multiple times * merged extent pages is a cache hit * fixed dbegnine cache hit stats	2023-01-12 23:10:12 +02:00
Costa Tsaousis	368a26cfee	DBENGINE v2 (#14125 ) * count open cache pages refering to datafile * eliminate waste flush attempts * remove eliminated variable * journal v2 scanning split functions * avoid locking open cache for a long time while migrating to journal v2 * dont acquire datafile for the loop; disable thread cancelability while a query is running * work on datafile acquiring * work on datafile deletion * work on datafile deletion again * logs of dbengine should start with DBENGINE * thread specific key for queries to check if a query finishes without a finalize * page_uuid is not used anymore * Cleanup judy traversal when building new v2 Remove not needed calls to metric registry * metric is 8 bytes smaller; timestamps are protected with a spinlock; timestamps in metric are now always coherent * disable checks for invalid time-ranges * Remove type from page details * report scanning time * remove infinite loop from datafile acquire for deletion * remove infinite loop from datafile acquire for deletion again * trace query handles * properly allocate array of dimensions in replication * metrics cleanup * metrics registry uses arrayalloc * arrayalloc free should be protected by lock * use array alloc in page cache * journal v2 scanning fix * datafile reference leaking hunding * do not load metrics of future timestamps * initialize reasons * fix datafile reference leak * do not load pages that are entirely overlapped by others * expand metric retention atomically * split replication logic in initialization and execution * replication prepare ahead queries * replication prepare ahead queries fixed * fix replication workers accounting * add router active queries chart * restore accounting of pages metadata sources; cleanup replication * dont count skipped pages as unroutable * notes on services shutdown * do not migrate to journal v2 too early, while it has pending dirty pages in the main cache for the specific journal file * do not add pages we dont need to pdc * time in range re-work to provide info about past and future matches * finner control on the pages selected for processing; accounting of page related issues * fix invalid reference to handle->page * eliminate data collection handle of pg_lookup_next * accounting for queries with gaps * query preprocessing the same way the processing is done; cache now supports all operations on Judy * dynamic libuv workers based on number of processors; minimum libuv workers 8; replication query init ahead uses libuv workers - reserved ones (3) * get into pdc all matching pages from main cache and open cache; do not do v2 scan if main cache and open cache can satisfy the query * finner gaps calculation; accounting of overlapping pages in queries * fix gaps accounting * move datafile deletion to worker thread * tune libuv workers and thread stack size * stop netdata threads gradually * run indexing together with cache flush/evict * more work on clean shutdown * limit the number of pages to evict per run * do not lock the clean queue for accesses if it is not possible at that time - the page will be moved to the back of the list during eviction * economies on flags for smaller page footprint; cleanup and renames * eviction moves referenced pages to the end of the queue * use murmur hash for indexing partition * murmur should be static * use more indexing partitions * revert number of partitions to number of cpus * cancel threads first, then stop services * revert default thread stack size * dont execute replication requests of disconnected senders * wait more time for services that are exiting gradually * fixed last commit * finer control on page selection algorithm * default stacksize of 1MB * fix formatting * fix worker utilization going crazy when the number is rotating * avoid buffer full due to replication preprocessing of requests * support query priorities * add count of spins in spinlock when compiled with netdata internal checks * remove prioritization from dbengine queries; cache now uses mutexes for the queues * hot pages are now in sections judy arrays, like dirty * align replication queries to optimal page size * during flushing add to clean and evict in batches * Revert "during flushing add to clean and evict in batches" This reverts commit `8fb2b69d06`. * dont lock clean while evicting pages during flushing * Revert "dont lock clean while evicting pages during flushing" This reverts commit `d6c82b5f40`. * Revert "Revert "during flushing add to clean and evict in batches"" This reverts commit `ca7a187537`. * dont cross locks during flushing, for the fastest flushes possible * low-priority queries load pages synchronously * Revert "low-priority queries load pages synchronously" This reverts commit `1ef2662ddc`. * cache uses spinlock again * during flushing, dont lock the clean queue at all; each item is added atomically * do smaller eviction runs * evict one page at a time to minimize lock contention on the clean queue * fix eviction statistics * fix last commit * plain should be main cache * event loop cleanup; evictions and flushes can now happen concurrently * run flush and evictions from tier0 only * remove not needed variables * flushing open cache is not needed; flushing protection is irrelevant since flushing is global for all tiers; added protection to datafiles so that only one flusher can run per datafile at any given time * added worker jobs in timer to find the slow part of it * support fast eviction of pages when all_of_them is set * revert default thread stack size * bypass event loop for dispatching read extent commands to workers - send them directly * Revert "bypass event loop for dispatching read extent commands to workers - send them directly" This reverts commit `2c08bc5bab`. * cache work requests * minimize memory operations during flushing; caching of extent_io_descriptors and page_descriptors * publish flushed pages to open cache in the thread pool * prevent eventloop requests from getting stacked in the event loop * single threaded dbengine controller; support priorities for all queries; major cleanup and restructuring of rrdengine.c * more rrdengine.c cleanup * enable db rotation * do not log when there is a filter * do not run multiple migration to journal v2 * load all extents async * fix wrong paste * report opcodes waiting, works dispatched, works executing * cleanup event loop memory every 10 minutes * dont dispatch more work requests than the number of threads available * use the dispatched counter instead of the executing counter to check if the worker thread pool is full * remove UV_RUN_NOWAIT * replication to fill the queues * caching of extent buffers; code cleanup * caching of pdc and pd; rework on journal v2 indexing, datafile creation, database rotation * single transaction wal * synchronous flushing * first cancel the threads, then signal them to exit * caching of rrdeng query handles; added priority to query target; health is now low prio * add priority to the missing points; do not allow critical priority in queries * offload query preparation and routing to libuv thread pool * updated timing charts for the offloaded query preparation * caching of WALs * accounting for struct caches (buffers); do not load extents with invalid sizes * protection against memory booming during replication due to the optimal alignment of pages; sender thread buffer is now also reset when the circular buffer is reset * also check if the expanded before is not the chart later updated time * also check if the expanded before is not after the wall clock time of when the query started * Remove unused variable * replication to queue less queries; cleanup of internal fatals * Mark dimension to be updated async * caching of extent_page_details_list (epdl) and datafile_extent_offset_list (deol) * disable pgc stress test, under an ifdef * disable mrg stress test under an ifdef * Mark chart and host labels, host info for async check and store in the database * dictionary items use arrayalloc * cache section pages structure is allocated with arrayalloc * Add function to wakeup the aclk query threads and check for exit Register function to be called during shutdown after signaling the service to exit * parallel preparation of all dimensions of queries * be more sensitive to enable streaming after replication * atomically finish chart replication * fix last commit * fix last commit again * fix last commit again again * fix last commit again again again * unify the normalization of retention calculation for collected charts; do not enable streaming if more than 60 points are to be transferred; eliminate an allocation during replication * do not cancel start streaming; use high priority queries when we have locked chart data collection * prevent starvation on opcodes execution, by allowing 2% of the requests to be re-ordered * opcode now uses 2 spinlocks one for the caching of allocations and one for the waiting queue * Remove check locks and NETDATA_VERIFY_LOCKS as it is not needed anymore * Fix bad memory allocation / cleanup * Cleanup ACLK sync initialization (part 1) * Don't update metric registry during shutdown (part 1) * Prevent crash when dashboard is refreshed and host goes away * Mark ctx that is shutting down. Test not adding flushed pages to open cache as hot if we are shutting down * make ML work * Fix compile without NETDATA_INTERNAL_CHECKS * shutdown each ctx independently * fix completion of quiesce * do not update shared ML charts * Create ML charts on child hosts. When a parent runs a ML for a child, the relevant-ML charts should be created on the child host. These charts should use the parent's hostname to differentiate multiple parents that might run ML for a child. The only exception to this rule is the training/prediction resource usage charts. These are created on the localhost of the parent host, because they provide information specific to said host. * check new ml code * first save the database, then free all memory * dbengine prep exit before freeing all memory; fixed deadlock in cache hot to dirty; added missing check to query engine about metrics without any data in the db * Cleanup metadata thread (part 2) * increase refcount before dispatching prep command * Do not try to stop anomaly detection threads twice. A separate function call has been added to stop anomaly detection threads. This commit removes the left over function calls that were made internally when a host was being created/destroyed. * Remove allocations when smoothing samples buffer The number of dims per sample is always 1, ie. we are training and predicting only individual dimensions. * set the orphan flag when loading archived hosts * track worker dispatch callbacks and threadpool worker init * make ML threads joinable; mark ctx having flushing in progress as early as possible * fix allocation counter * Cleanup metadata thread (part 3) * Cleanup metadata thread (part 4) * Skip metadata host scan when running unittest * unittest support during init * dont use all the libuv threads for queries * break an infinite loop when sleep_usec() is interrupted * ml prediction is a collector for several charts * sleep_usec() now makes sure it will never loop if it passes the time expected; sleep_usec() now uses nanosleep() because clock_nanosleep() misses signals on netdata exit * worker_unregister() in netdata threads cleanup * moved pdc/epdl/deol/extent_buffer related code to pdc.c and pdc.h * fixed ML issues * removed engine2 directory * added dbengine2 files in CMakeLists.txt * move query plan data to query target, so that they can be exposed by in jsonwrap * uniform definition of query plan according to the other query target members * event_loop should be in daemon, not libnetdata * metric_retention_by_uuid() is now part of the storage engine abstraction * unify time_t variables to have the suffix _s (meaning: seconds) * old dbengine statistics become "dbengine io" * do not enable ML resource usage charts by default * unify ml chart families, plugins and modules * cleanup query plans from query target * cleanup all extent buffers * added debug info for rrddim slot to time * rrddim now does proper gap management * full rewrite of the mem modes * use library functions for madvise * use CHECKSUM_SZ for the checksum size * fix coverity warning about the impossible case of returning a page that is entirely in the past of the query * fix dbengine shutdown * keep the old datafile lock until a new datafile has been created, to avoid creating multiple datafiles concurrently * fine tune cache evictions * dont initialize health if the health service is not running - prevent crash on shutdown while children get connected * rename AS threads to ACLK[hostname] * prevent re-use of uninitialized memory in queries * use JulyL instead of JudyL for PDC operations - to test it first * add also JulyL files * fix July memory accounting * disable July for PDC (use Judy) * use the function to remove datafiles from linked list * fix july and event_loop * add july to libnetdata subdirs * rename time_t variables that end in _t to end in _s * replicate when there is a gap at the beginning of the replication period * reset postponing of sender connections when a receiver is connected * Adjust update every properly * fix replication infinite loop due to last change * packed enums in rrd.h and cleanup of obsolete rrd structure members * prevent deadlock in replication: replication_recalculate_buffer_used_ratio_unsafe() deadlocking with replication_sender_delete_pending_requests() * void unused variable * void unused variables * fix indentation * entries_by_time calculation in VD was wrong; restored internal checks for checking future timestamps * macros to caclulate page entries by time and size * prevent statsd cleanup crash on exit * cleanup health thread related variables Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com> Co-authored-by: vkalintiris <vasilis@netdata.cloud>	2023-01-10 19:59:21 +02:00
Emmanuel Vasilakis	bf1cb6048b	Use print macros (#13876 ) * use print macros * cast instead	2022-10-25 17:24:07 +03:00
Costa Tsaousis	00712b351b	QUERY_TARGET: new query engine for Netdata Agent (#13697 ) * initial implementation of QUERY_TARGET * rrd2rrdr() interface * rrddim_find_best_tier_for_timeframe() ported * added dimension filtering * added db object in query target * rrd2rrdr() ported * working on formatters * working on jsonwrapper * finally, it compiles... * 1st run without crashes * query planer working * cleanup old code * review changes * fix also changing data collection frequency * fix signess * fix rrdlabels and dimension ordering * fixes * remove unused variable * ml should accept NULL response from rrd2rrdr() * number formatting fixes * more number formatting fixes * more number formatting fixes * support mc parallel queries * formatting and cleanup * added rrd2rrdr_legacy() as a simplified interface to run a query * make sure rrdset_find_natural_update_every_for_timeframe() returns a value * make signed comparisons * weights endpoint using rrdcontexts * fix for legacy db modes and cleanup * fix for chart_ids and remove AR chart from weights endpoint * Ignore command if not initialized yet * remove unused members * properly initialize window * code cleanup - rrddim linked list is gone; rrdset rwlock is gone too * reviewed RRDR.internal members * eliminate unnecessary members of QUERY_TARGET * more complete query ids; more detailed information on aborted queries * properly terminate option strings * query id contains group_options which is controlled by users, so escaping is necessary * tense in query id * tense in query id - again * added the remaining query options to the query id * Expose hidden option to the dimension * use the hidden flag when loading context dimensions * Specify table alias for option * dont update chart last access time, unless at least a dimension of the chart will be queried Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>	2022-10-23 23:46:43 +03:00
Costa Tsaousis	8fc3b351a2	Allow netdata plugins to expose functions for querying more information about specific charts (#13720 ) * function renames and code cleanup in popen.c; no actual code changes * netdata popen() now opens both child process stdin and stdout and returns FILE * for both * pass both input and output to parser structures * updated rrdset to call custom functions * RRDSET FUNCTION leading calls for both sync and async operation * put RRDSET functions to a separate file * added format and timeout at function definition * support for synchronous (internal plugins) and asynchronous (external plugins and children) functions * /api/v1/function endpoint * functions are now attached to the host and there is a dictionary view per chart * functions implemented at plugins.d * remove the defer until keyword hook from plugins.d when it is done * stream sender implementation of functions * sanitization of all functions so that certain characters are only allowed * strictier sanitization * common max size * 1st working plugins.d example * always init inflight dictionary * properly destroy dictionaries to avoid parallel insertion of items * add more debugging on disconnection reasons * add more debugging on disconnection reasons again * streaming receiver respects newlines * dont use the same fp for both streaming receive and send * dont free dbengine memory with internal checks * make sender proceed in the buffer * added timing info and garbage collection at plugins.d * added info about routing nodes * added info about routing nodes with delay * added more info about delays * added more info about delays again * signal sending thread to wake up * streaming version labeling and commented code to support capabilities * added functions to /api/v1/data, /api/v1/charts, /api/v1/chart, /api/v1/info * redirect top output to stdout * address coverity findings * fix resource leaks of popen * log attempts to connect to individual destinations * better messages * properly parse destinations * try to find a function from the most matching to the least matching * log added streaming destinations * rotate destinations bypassing a node in the middle that does not accept our connection * break the loops properly * use typedef to define callbacks * capabilities negotiation during streaming * functions exposed upstream based on capabilities; compression disabled per node persisting reconnects; always try to connect with all capabilities * restore functionality to lookup functions * better logging of capabilities * remove old versions from capabilities when a newer version is there * fix formatting * optimization for plugins.d rrdlabels to avoid creating and destructing dictionaries all the time * delayed health initialization for rrddim and rrdset * cleanup health initialization * fix for popen() not returning the right value * add health worker jobs for initializing rrdset and rrddim * added content type support for functions; apps.plugin permanent function to display all the processes * fixes for functions parameters parsing in apps.plugin * fix for process matching in apps.plugiin * first working function for apps.plugin * Dashboard ACL is disabled for functions; Function errors are all in JSON format * apps.plugin function processes returns json table * use json_escape_string() to escape message * fix formatting * apps.plugin exposes all its metrics to function processes * fix json formatting when filtering out some rows * reopen the internal pipe of rrdpush in case of errors * misplaced statement * do not use buffer->len * support for GLOBAL functions (functions that are not linked to a chart * added /api/v1/functions endpoint; removed format from the FUNCTIONS api; * swagger documentation about the new api end points * added plugins.d documentation about functions * never re-close a file * remove uncessesary ifdef * fixed issues identified by codacy * fix for null label value * make edit-config copy-and-paste friendly * Revert "make edit-config copy-and-paste friendly" This reverts commit `54500c0e0a`. * reworked sender handshake to fix coverity findings * timeout is zero, for both send_timeout() and recv_timeout() * properly detect that parent closed the socket * support caching of function responses; limit function response to 10MB; added protection from malformed function responses * disabled excessive logging * added units to apps.plugin function processes and normalized all values to be human readable * shorter field names * fixed issues reported * fixed apps.plugin error response; tested that pluginsd can properly handle faulty responses * use double linked list macros for double linked list management * faster apps.plugin function printing by minimizing file operations * added memory percentage * fix compatibility issues with older compilers and FreeBSD * rrdpush sender code cleanup; rrhost structure cleanup from sender flags and variables; * fix letftover variable in ifdef * apps.plugin: do not call detach from the thread; exit immediately when input is broken * exclude AR charts from health * flush cleaner; prefer sender output * clarity * do not fill the cbuffer if not connected * fix * dont enabled host->sender if streaming is not enabled; send host label updates to parent; * functions are only available through ACLK * Prepared statement reports only in dev mode * fix AR chart detection * fix for streaming not being enabling itself * more cleanup of sender and receiver structures * moved read-only flags and configuration options to rrdhost->options * fixed merge with master * fix for incomplete rename * prevent service thread from working on charts that are being collected Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>	2022-10-05 14:13:46 +03:00
Costa Tsaousis	cb7af25c09	RRD structures managed by dictionaries (#13646 ) * rrdset - in progress * rrdset optimal constructor; rrdset conflict * rrdset final touches * re-organization of rrdset object members * prevent use-after-free * dictionary dfe supports also counting of iterations * rrddim managed by dictionary * rrd.h cleanup * DICTIONARY_ITEM now is referencing actual dictionary items in the code * removed rrdset linked list * Revert "removed rrdset linked list" This reverts commit 690d6a588b4b99619c2c5e10f84e8f868ae6def5. * removed rrdset linked list * added comments * Switch chart uuid to static allocation in rrdset Remove unused functions * rrdset_archive() and friends... * always create rrdfamily * enable ml_free_dimension * rrddim_foreach done with dfe * most custom rrddim loops replaced with rrddim_foreach * removed accesses to rrddim->dimensions * removed locks that are no longer needed * rrdsetvar is now managed by the dictionary * set rrdset is rrdsetvar, fixes https://github.com/netdata/netdata/pull/13646#issuecomment-1242574853 * conflict callback of rrdsetvar now properly checks if it has to reset the variable * dictionary registered callbacks accept as first parameter the DICTIONARY_ITEM * dictionary dfe now uses internal counter to report; avoided excess variables defined with dfe * dictionary walkthrough callbacks get dictionary acquired items * dictionary reference counters that can be dupped from zero * added advanced functions for get and del * rrdvar managed by dictionaries * thread safety for rrdsetvar * faster rrdvar initialization * rrdvar string lengths should match in all add, del, get functions * rrdvar internals hidden from the rest of the world * rrdvar is now acquired throughout netdata * hide the internal structures of rrdsetvar * rrdsetvar is now acquired through out netdata * rrddimvar managed by dictionary; rrddimvar linked list removed; rrddimvar structures hidden from the rest of netdata * better error handling * dont create variables if not initialized for health * dont create variables if not initialized for health again * rrdfamily is now managed by dictionaries; references of it are acquired dictionary items * type checking on acquired objects * rrdcalc renaming of functions * type checking for rrdfamily_acquired * rrdcalc managed by dictionaries * rrdcalc double free fix * host rrdvars is always needed * attempt to fix deadlock 1 * attempt to fix deadlock 2 * Remove unused variable * attempt to fix deadlock 3 * snprintfz * rrdcalc index in rrdset fix * Stop storing active charts and computing chart hashes * Remove store active chart function * Remove compute chart hash function * Remove sql_store_chart_hash function * Remove store_active_dimension function * dictionary delayed destruction * formatting and cleanup * zero dictionary base on rrdsetvar * added internal error to log delayed destructions of dictionaries * typo in rrddimvar * added debugging info to dictionary * debug info * fix for rrdcalc keys being empty * remove forgotten unlock * remove deadlock * Switch to metadata version 5 and drop chart_hash chart_hash_map chart_active dimension_active v_chart_hash * SQL cosmetic changes * do not busy wait while destroying a referenced dictionary * remove deadlock * code cleanup; re-organization; * fast cleanup and flushing of dictionaries * number formatting fixes * do not delete configured alerts when archiving a chart * rrddim obsolete linked list management outside dictionaries * removed duplicate contexts call * fix crash when rrdfamily is not initialized * dont keep rrddimvar referenced * properly cleanup rrdvar * removed some locks * Do not attempt to cleanup chart_hash / chart_hash_map * rrdcalctemplate managed by dictionary * register callbacks on the right dictionary * removed some more locks * rrdcalc secondary index replaced with linked-list; rrdcalc labels updates are now executed by health thread * when looking up for an alarm look using both chart id and chart name * host initialization a bit more modular * init rrdlabels on host update * preparation for dictionary views * improved comment * unused variables without internal checks * service threads isolation and worker info * more worker info in service thread * thread cancelability debugging with internal checks * strings data races addressed; fixes https://github.com/netdata/netdata/issues/13647 * dictionary modularization * Remove unused SQL statement definition * unit-tested thread safety of dictionaries; removed data race conditions on dictionaries and strings; dictionaries now can detect if the caller is holds a write lock and automatically all the calls become their unsafe versions; all direct calls to unsafe version is eliminated * remove worker_is_idle() from the exit of service functions, because we lose the lock time between loops * rewritten dictionary to have 2 separate locks, one for indexing and another for traversal * Update collectors/cgroups.plugin/sys_fs_cgroup.c Co-authored-by: Vladimir Kobal <vlad@prokk.net> * Update collectors/cgroups.plugin/sys_fs_cgroup.c Co-authored-by: Vladimir Kobal <vlad@prokk.net> * Update collectors/proc.plugin/proc_net_dev.c Co-authored-by: Vladimir Kobal <vlad@prokk.net> * fix memory leak in rrdset cache_dir * minor dictionary changes * dont use index locks in single threaded * obsolete dict option * rrddim options and flags separation; rrdset_done() optimization to keep array of reference pointers to rrddim; * fix jump on uninitialized value in dictionary; remove double free of cache_dir * addressed codacy findings * removed debugging code * use the private refcount on dictionaries * make dictionary item desctructors work on dictionary destruction; strictier control on dictionary API; proper cleanup sequence on rrddim; * more dictionary statistics * global statistics about dictionary operations, memory, items, callbacks * dictionary support for views - missing the public API * removed warning about unused parameter * chart and context name for cloud * chart and context name for cloud, again * dictionary statistics fixed; first implementation of dictionary views - not currently used * only the master can globally delete an item * context needs netdata prefix * fix context and chart it of spins * fix for host variables when health is not enabled * run garbage collector on item insert too * Fix info message; remove extra "using" * update dict unittest for new placement of garbage collector * we need RRDHOST->rrdvars for maintaining custom host variables * Health initialization needs the host->host_uuid * split STRING to its own files; no code changes other than that * initialize health unconditionally * unit tests do not pollute the global scope with their variables * Skip initialization when creating archived hosts on startup. When a child connects it will initialize properly Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com> Co-authored-by: Vladimir Kobal <vlad@prokk.net>	2022-09-19 23:46:13 +03:00
Costa Tsaousis	3f6a75250d	Obsolete RRDSET state (#13635 ) * move chart_labels to rrdset * rename chart_labels to rrdlabels * renamed hash_id to uuid * turned is_ar_chart into an rrdset flag * removed rrdset state * removed unused senders_connected member of rrdhost * removed unused host flag RRDHOST_FLAG_MULTIHOST * renamed rrdhost host_labels to rrdlabels * Update exporting unit tests Co-authored-by: Vladimir Kobal <vlad@prokk.net>	2022-09-07 15:28:30 +03:00
Costa Tsaousis	5e1b95cf92	Deduplicate all netdata strings (#13570 ) * rrdfamily * rrddim * rrdset plugin and module names * rrdset units * rrdset type * rrdset family * rrdset title * rrdset title more * rrdset context * rrdcalctemplate context and removal of context hash from rrdset * strings statistics * rrdset name * rearranged members of rrdset * eliminate rrdset name hash; rrdcalc chart converted to STRING * rrdset id, eliminated rrdset hash * rrdcalc, alarm_entry, alert_config and some of rrdcalctemplate * rrdcalctemplate * rrdvar * eval_variable * rrddimvar and rrdsetvar * rrdhost hostname, os and tags * fix master commits * added thread cache; implemented string_dup without locks * faster thread cache * rrdset and rrddim now use dictionaries for indexing * rrdhost now uses dictionary * rrdfamily now uses DICTIONARY * rrdvar using dictionary instead of AVL * allocate the right size to rrdvar flag members * rrdhost remaining char * members to STRING * * better error handling on indexing * strings now use a read/write lock to allow parallel searches to the index * removed AVL support from dictionaries; implemented STRING with native Judy calls * string releases should be negative * only 31 bits are allowed for enum flags * proper locking on strings * string threading unittest and fixes * fix lgtm finding * fixed naming * stream chart/dimension definitions at the beginning of a streaming session * thread stack variable is undefined on thread cancel * rrdcontext garbage collect per host on startup * worker control in garbage collection * relaxed deletion of rrdmetrics * type checking on dictfe * netdata chart to monitor rrdcontext triggers * Group chart label updates * rrdcontext better handling of collected rrdsets * rrdpush incremental transmition of definitions should use as much buffer as possible * require 1MB per chart * empty the sender buffer before enabling metrics streaming * fill up to 50% of buffer * reset signaling metrics sending * use the shared variable for status * use separate host flag for enabling streaming of metrics * make sure the flag is clear * add logging for streaming * add logging for streaming on buffer overflow * circular_buffer proper sizing * removed obsolete logs * do not execute worker jobs if not necessary * better messages about compression disabling * proper use of flags and updating rrdset last access time every time the obsoletion flag is flipped * monitor stream sender used buffer ratio * Update exporting unit tests * no need to compare label value with strcmp * streaming send workers now monitor bandwidth * workers now use strings * streaming receiver monitors incoming bandwidth * parser shift of worker ids * minor fixes * Group chart label updates * Populate context with dimensions that have data * Fix chart id * better shift of parser worker ids * fix for streaming compression * properly count received bytes * ensure LZ4 compression ring buffer does not wrap prematurely * do not stream empty charts; do not process empty instances in rrdcontext * need_to_send_chart_definition() does not need an rrdset lock any more * rrdcontext objects are collected, after data have been written to the db * better logging of RRDCONTEXT transitions * always set all variables needed by the worker utilization charts * implemented double linked list for most objects; eliminated alarm indexes from rrdhost; and many more fixes * lockless strings design - string_dup() and string_freez() are totally lockless when they dont need to touch Judy - only Judy is protected with a read/write lock * STRING code re-organization for clarity * thread_cache improvements; double numbers precision on worker threads * STRING_ENTRY now shadown STRING, so no duplicate definition is required; string_length() renamed to string_strlen() to follow the paradigm of all other functions, STRING internal statistics are now only compiled with NETDATA_INTERNAL_CHECKS * rrdhost index by hostname now cleans up; aclk queries of archieved hosts do not index hosts * Add index to speed up database context searches * Removed last_updated optimization (was also buggy after latest merge with master) Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com> Co-authored-by: Vladimir Kobal <vlad@prokk.net>	2022-09-05 19:31:06 +03:00
Stelios Fragkakis	49234f23de	Multi-Tier database backend for long term metrics storage (#13263 ) * Tier part 1 * Tier part 2 * Tier part 3 * Tier part 4 * Tier part 5 * Fix some ML compilation errors * fix more conflicts * pass proper tier * move metric_uuid from state to RRDDIM * move aclk_live_status from state to RRDDIM * move ml_dimension from state to RRDDIM * abstracted the data collection interface * support flushing for mem db too * abstracted the query api * abstracted latest/oldest time per metric * cleanup * store_metric for tier1 * fix for store_metric * allow multiple tiers, more than 2 * state to tier * Change storage type in db. Query param to request min, max, sum or average * Store tier data correctly * Fix skipping tier page type * Add tier grouping in the tier * Fix to handle archived charts (part 1) * Temp fix for query granularity when requesting tier1 data * Fix parameters in the correct order and calculate the anomaly based on the anomaly count * Proper tiering grouping * Anomaly calculation based on anomaly count * force type checking on storage handles * update cmocka tests * fully dynamic number of storage tiers * fix static allocation * configure grouping for all tiers; disable tiers for unittest; disable statsd configuration for private charts mode * use default page dt using the tiering info * automatic selection of tier * fix for automatic selection of tier * working prototype of dynamic tier selection * automatic selection of tier done right (I hope) * ask for the proper tier value, based on the grouping function * fixes for unittests and load_metric_next() * fixes for lgtm findings * minor renames * add dbengine to page cache size setting * add dbengine to page cache with malloc * query engine optimized to loop as little are required based on the view_update_every * query engine grouping methods now do not assume a constant number of points per group and they allocate memory with OWA * report db points per tier in jsonwrap * query planer that switches database tiers on the fly to satisfy the query for the entire timeframe * dbegnine statistics and documentation (in progress) * calculate average point duration in db * handle single point pages the best we can * handle single point pages even better * Keep page type in the rrdeng_page_descr * updated doc * handle future backwards compatibility - improved statistics * support &tier=X in queries * enfore increasing iterations on tiers * tier 1 is always 1 iteration * backfilling higher tiers on first data collection * reversed anomaly bit * set up to 5 tiers * natural points should only be offered on tier 0, except a specific tier is selected * do not allow more than 65535 points of tier0 to be aggregated on any tier * Work only on actually activated tiers * fix query interpolation * fix query interpolation again * fix lgtm finding * Activate one tier for now * backfilling of higher tiers using raw metrics from lower tiers * fix for crash on start when storage tiers is increased from the default * more statistics on exit * fix bug that prevented higher tiers to get any values; added backfilling options * fixed the statistics log line * removed limit of 255 iterations per tier; moved the code of freezing rd->tiers[x]->db_metric_handle * fixed division by zero on zero points_wanted * removed dead code * Decide on the descr->type for the type of metric * dont store metrics on unknown page types * free db_metric_handle on sql based context queries * Disable STORAGE_POINT value check in the exporting engine unit tests * fix for db modes other than dbengine * fix for aclk archived chart queries destroying db_metric_handles of valid rrddims * fix left-over freez() instead of OWA freez on median queries Co-authored-by: Costa Tsaousis <costa@netdata.cloud> Co-authored-by: Vladimir Kobal <vlad@prokk.net>	2022-07-06 14:01:53 +03:00
Costa Tsaousis	c3dfbe52a6	netdata doubles (#13217 ) * netdata doubles * fix cmocka test * fix cmocka test again * fix left-overs of long double to NETDATA_DOUBLE * RRDDIM detached from disk representation; db settings in [db] section of netdata.conf * update the memory before saving * rrdset is now detached from file structures too * on memory mode map, update the memory mapped structures on every iteration * allow RRD_ID_LENGTH_MAX to be changed * granularity secs, back to update every * fix formatting * more formatting	2022-06-28 17:04:37 +03:00
Costa Tsaousis	b32ca44319	Query Engine multi-granularity support (and MC improvements) (#13155 ) * set grouping functions * storage engine should check the validity of timestamps, not the query engine * calculate and store in RRDR anomaly rates for every query * anomaly rate used by volume metric correlations * mc volume should use absolute data, to avoid cancelling effect * return anomaly-rates in jasonwrap with jw-anomaly-rates option to data queries * dont return null on anomaly rates * allow passing group query options from the URL * added countif to the query engine and used it in metric correlations * fix configure * fix countif and anomaly rate percentages * added group_options to metric correlations; updated swagger * added newline at the end of yaml file * always check the time the highlighted window was above/below the highlighted window * properly track time in memory queries * error for internal checks only * moved pack_storage_number() into the storage engines * moved unpack_storage_number() inside the storage engines * remove old comment * pass unit tests * properly detect zero or subnormal values in pack_storage_number() * fill nulls before the value, not after * make sure math.h is included * workaround for isfinite() * fix for isfinite() * faster isfinite() alternative * fix for faster isfinite() alternative * next_metric() now returns end_time too * variable step implemented in a generic way * remove left-over variables * ensure we always complete the wanted number of points * fixes * ensure no infinite loop * mc-volume-improvements: Add information about invalid condition * points should have a duration in the past * removed unneeded info() line * Fix unit tests for exporting engine * new_point should only be checked when it is fetched from the db; better comment about the premature breaking of the main query loop Co-authored-by: Thiago Marques <thiagoftsm@gmail.com> Co-authored-by: Vladimir Kobal <vlad@prokk.net>	2022-06-22 11:19:08 +03:00
Costa Tsaousis	986a1abf68	73x times faster metrics correlations at the agent (#13107 ) * faster correlations * 4x times faster correlations * a little bit more help * 10x times faster metrics correlations * 6 digits precision; better comments * enabled metrics correlations by default * abstracted DIFFS_NUMBER to allow easily changing it * reworked the entire logic to have more accuracy and support a baseline that is power of two multiple of highlight * properly calculate shifts * even more improved version * added support for timeout; fixed another memory leak; skipped hidden dimensions * default timeout 1min * reduce memory even further * use dictionary for the list of charts and optimize locks * return 403 forbidden, when mc is not enabled * added query options * dont process zero dimensions * added volume method as an option to metric correlations ; now metric correlations can support multiple implementations * make sure we will never crash * spread results evenly for both kstwo and volume * fixed bug in query engine that was missing misaligned queries when a single point was requested from the db; improved comments; improved query flags * updated swagger and added sane defaults; query options are now supported, including anomaly-bit * added "raw" option to allow cross node correlations; added "group" option to allow different time aggregations; allowed calling metric correlations without any parameters; allowed calling metric correlations with relative timestamps; added timeout to volume method; properly handled timeout on ks2 method; json output now sends all parameters back - same for json_wrap; modified query engine to use present time for relative timestamps; modified "allow_past" to mean both past backwards and forwards * emulate the old behaviour about zero points * 100% accuracy against python ks_2samp(); now the default is volume and the default points are 500 * added config option to change default metric correlations method * removed work-arounds now that rrdlabels are merged	2022-06-13 21:31:52 +03:00
Costa Tsaousis	1b0f6c6b22	Labels with dictionary (#13070 ) * squashed and rebased to master * fix overflow and single character bug in sanitize; include rrd.h instead of node_info.h * added unittest for UTF-8 multibyte sanitization * Fix unit test compilation * Fix CMake build * remove double sanitizer for opentsdb; cleanup sanitize_json_string() * rename error_description to error_message to avoid conflict with json-c * revert last and undef error_description from json-c * more unittests; attempt to fix protobuf map issue * get rid of rrdlabels_get() and replace it with a safe version that writes the value to a buffer * added dictionary sorting unittest; rrdlabels_to_buffer() now is sorted * better sorted dictionary checking * proper unittesting for sorted dictionaries * call dictionary deletion callback when destroying the dictionary * remove obsolete variable * Fix exporting unit tests * Fix k8s label parsing test * workaround for cmocka and strdupz() * Bypass cmocka memory allocation check * Revert "Bypass cmocka memory allocation check" This reverts commit `4c49923839`. * Revert "workaround for cmocka and strdupz()" This reverts commit `7bebee0480`. * Bypass cmocka memory allocation checks * respect json formatting for chart labels * cloud sends colons * print the value only once * allow parenthesis in values and spaces; make stream sender send quotes for values Co-authored-by: Vladimir Kobal <vlad@prokk.net>	2022-06-13 20:35:45 +03:00
Costa Tsaousis	7784a16cc7	Dictionary with JudyHS and double linked list (#13032 ) * dictionary internals isolation * more dictionary cleanups * added unit test * we should use DICT internally * disable cups in cmake * implement DICTIONARY with Judy arrays * operational JUDY implementation * JUDY cleanup * JUDY summary added * JudyHS implementation with double linked list * test negative searches too * optimize destruction * optimize set to insert first without lookup * updated stats * code cleanup; better organization; updated info * more code cleanup and commenting * more cleanup, renames and comments * fix rename * more cleanups * use Judy.h from system paths * added foreach traversal; added flag to add item in front; isolated locks to their own functions; destruction returns the number of bytes freed * more comments; flags are now 16-bit * completed unittesting * addressed comments and added reference counters maintainance * added unittest in main; tested removal of items in front, back and middle * added read/write walkthrough and foreach; allowed walkthrough and foreach in write mode to delete the current element (used by cups.plugin); referenced counters removed from the API * DICTFE.name should be const too * added API calls for exposing all statistics * dictionary flags as enum and reference counters as atomic operations * more comments; improved error handling at unit tests * added functions to allow unsafe access while traversing the dictionary with locks in place * check for libcups in cmake * added delete callback; implemented statsd with this dictionary * added missing dfe_done() * added alternative implementation with AVL * added documentation * added comments and warning about AVL * dictionary walktrhough on new code * simplified foreach; updated docs * updated docs * AVL is much faster without hashes * AVL should follow DBENGINE	2022-06-01 20:01:52 +03:00
Stelios Fragkakis	3071aa055c	Add additional metadata to the data response (#13036 ) * Consolidate query params * Add new option to show full dimensions in the json header (this will include dimensions, charts and chart labels) * Group and pass parameters with query_params	2022-05-31 21:46:44 +03:00
Stelios Fragkakis	881a1b9e13	Use the chart id instead of chart name in response to incoming cloud context queries (#11898 )	2021-12-13 16:03:38 +02:00
Stelios Fragkakis	7ebb0a4da2	Fix memory leak when archived data is requested (#10837 )	2021-03-23 18:09:34 +02:00
Stelios Fragkakis	65bc43d9cb	Add data query support for archived charts (#10771 )	2021-03-22 09:47:22 +02:00
Stelios Fragkakis	cd443de780	Support multiple chart label keys in data queries (#10483 )	2021-01-14 18:50:33 +02:00
Ilya Mashchenko	0f8175dd30	Kubernetes labels (#10107 ) Co-authored-by: Markos Fountoulakis <markos.fountoulakis.senior@gmail.com> Co-authored-by: Vladimir Kobal <vlad@prokk.net>	2020-12-14 17:27:55 +03:00
Markos Fountoulakis	5ffba490e3	Fix race condition in rrdset_first_entry_t() and rrdset_last_entry_t() (#10276 )	2020-11-28 15:53:12 +02:00
Stelios Fragkakis	8f6f1baf9a	Added context parameter to the data endpoint (#9931 ) Added functionality to support composite charts	2020-09-15 19:41:39 +03:00
Vladimir Kobal	8cf5889194	Clean up host labels in API responses (#7616 ) * Remove host labels from the Swagger specification * Remove host labels from the api responses	2020-01-06 17:34:49 +02:00
Andrew Moss	c8c72f18a6	Labels issues (#7515 ) Initial work on host labels from the dedicated branch. Includes work for issues #7096, #7400, #7411, #7369, #7410, #7458, #7459, #7412 and #7408 by @vlvkobal, @thiagoftsm, @cakrit and @amoss.	2019-12-16 15:12:00 +01:00
Valentin Rakush	92642269f1	Add alarm variables to the response of chart and data (#6615 ) ##### Summary Implements feature #6054 Now requests like http://localhost:19999/api/v1/chart?chart=example.random http://localhost:19999/api/v1/data?chart=example.random&options=jsonwrap&options=showcustomvars - return chart variables in their responses. Chart variables include only those with options set to RRDVAR_OPTION_CUSTOM_CHART_VAR - for /api/v1/data requests chart variables are returned when parameter options=jsonwrap and options=showcustomvars ##### Component Name [/database](https://github.com/netdata/netdata/tree/master/database/) [/web/api/formatters](https://github.com/netdata/netdata/tree/master/web/api/formatters)	2019-08-20 11:11:43 +02:00
Markos Fountoulakis	6ca6d840dd	Database engine (#5282 ) * Database engine prototype version 0 * Database engine initial integration with netdata POC * Scalable database engine with file and memory management. * Database engine integration with netdata * Added MIN MAX definitions to fix alpine build of travis CI * Bugfix for backends and new DB engine, remove useless rrdset_time2slot() calls and erroneous checks * DB engine disk protocol correction * Moved DB engine storage file location to /var/cache/netdata/{host}/dbengine * Fix configure to require openSSL for DB engine * Fix netdata daemon health not holding read lock when iterating chart dimensions * Optimized query API for new DB engine and old netdata DB fallback code-path * netdata database internal query API improvements and cleanup * Bugfix for DB engine queries returning empty values * Added netdata internal check for data queries for old and new DB * Added statistics to DB engine and fixed memory corruption bug * Added preliminary charts for DB engine statistics * Changed DB engine ratio statistics to incremental * Added netdata statistics charts for DB engine internal statistics * Fix for netdata not compiling successfully when missing dbengine dependencies * Added DB engine functional test to netdata unittest command parameter * Implemented DB engine dataset generator based on example.random chart * Fix build error in CI * Support older versions of libuv1 * Fixes segmentation fault when using multiple DB engine instances concurrently * Fix memory corruption bug * Fixed createdataset advanced option not exiting * Fix for DB engine not working on FreeBSD * Support FreeBSD library paths of new dependencies * Workaround for unsupported O_DIRECT in OS X * Fix unittest crashing during cleanup * Disable DB engine FS caching in Apple OS X since O_DIRECT is not available * Fix segfault when unittest and DB engine dataset generator don't have permissions to create temporary host * Modified DB engine dataset generator to create multiple files * Toned down overzealous page cache prefetcher * Reduce internal memory fragmentation for page-cache data pages * Added documentation describing the DB engine * Documentation bugfixes * Fixed unit tests compilation errors since last rebase * Added note to back-up the DB engine files in documentation * Added codacy fix. * Support old gcc versions for atomic counters in DB engine	2019-05-15 08:28:06 +03:00
Costa Tsaousis	798c141c49	Split the API formatters in modules (#4504 ) * split all API formatters in modules * added markdown formatting * updated csv readme * updated csv readme * more documentation * added more documentation * updated documentation * fixed typo * fixed typo	2018-10-27 19:44:27 +03:00

38 commits