mirror of https://github.com/netdata/netdata.git synced 2025-04-09 23:57:55 +00:00
Commit graph

43 commits

Author SHA1 Message Date
Costa Tsaousis
cb7af25c09
RRD structures managed by dictionaries ()
* rrdset - in progress

* rrdset optimal constructor; rrdset conflict

* rrdset final touches

* re-organization of rrdset object members

* prevent use-after-free

* dictionary dfe supports also counting of iterations

* rrddim managed by dictionary

* rrd.h cleanup

* DICTIONARY_ITEM now is referencing actual dictionary items in the code

* removed rrdset linked list

* Revert "removed rrdset linked list"

This reverts commit 690d6a588b4b99619c2c5e10f84e8f868ae6def5.

* removed rrdset linked list

* added comments

* Switch chart uuid to static allocation in rrdset
Remove unused functions

* rrdset_archive() and friends...

* always create rrdfamily

* enable ml_free_dimension

* rrddim_foreach done with dfe

* most custom rrddim loops replaced with rrddim_foreach

* removed accesses to rrddim->dimensions

* removed locks that are no longer needed

* rrdsetvar is now managed by the dictionary

* set rrdset is rrdsetvar, fixes https://github.com/netdata/netdata/pull/13646#issuecomment-1242574853

* conflict callback of rrdsetvar now properly checks if it has to reset the variable

* dictionary registered callbacks accept as first parameter the DICTIONARY_ITEM

* dictionary dfe now uses internal counter to report; avoided excess variables defined with dfe

* dictionary walkthrough callbacks get dictionary acquired items

* dictionary reference counters that can be dupped from zero

* added advanced functions for get and del

* rrdvar managed by dictionaries

* thread safety for rrdsetvar

* faster rrdvar initialization

* rrdvar string lengths should match in all add, del, get functions

* rrdvar internals hidden from the rest of the world

* rrdvar is now acquired throughout netdata

* hide the internal structures of rrdsetvar

* rrdsetvar is now acquired throughout netdata

* rrddimvar managed by dictionary; rrddimvar linked list removed; rrddimvar structures hidden from the rest of netdata

* better error handling

* dont create variables if not initialized for health

* dont create variables if not initialized for health again

* rrdfamily is now managed by dictionaries; references of it are acquired dictionary items

* type checking on acquired objects

* rrdcalc renaming of functions

* type checking for rrdfamily_acquired

* rrdcalc managed by dictionaries

* rrdcalc double free fix

* host rrdvars is always needed

* attempt to fix deadlock 1

* attempt to fix deadlock 2

* Remove unused variable

* attempt to fix deadlock 3

* snprintfz

* rrdcalc index in rrdset fix

* Stop storing active charts and computing chart hashes

* Remove store active chart function

* Remove compute chart hash function

* Remove sql_store_chart_hash function

* Remove store_active_dimension function

* dictionary delayed destruction

* formatting and cleanup

* zero dictionary base on rrdsetvar

* added internal error to log delayed destructions of dictionaries

* typo in rrddimvar

* added debugging info to dictionary

* debug info

* fix for rrdcalc keys being empty

* remove forgotten unlock

* remove deadlock

* Switch to metadata version 5 and drop
  chart_hash
  chart_hash_map
  chart_active
  dimension_active
  v_chart_hash

* SQL cosmetic changes

* do not busy wait while destroying a referenced dictionary

* remove deadlock

* code cleanup; re-organization;

* fast cleanup and flushing of dictionaries

* number formatting fixes

* do not delete configured alerts when archiving a chart

* rrddim obsolete linked list management outside dictionaries

* removed duplicate contexts call

* fix crash when rrdfamily is not initialized

* dont keep rrddimvar referenced

* properly cleanup rrdvar

* removed some locks

* Do not attempt to cleanup chart_hash / chart_hash_map

* rrdcalctemplate managed by dictionary

* register callbacks on the right dictionary

* removed some more locks

* rrdcalc secondary index replaced with linked-list; rrdcalc labels updates are now executed by health thread

* when looking up for an alarm look using both chart id and chart name

* host initialization a bit more modular

* init rrdlabels on host update

* preparation for dictionary views

* improved comment

* unused variables without internal checks

* service threads isolation and worker info

* more worker info in service thread

* thread cancelability debugging with internal checks

* strings data races addressed; fixes https://github.com/netdata/netdata/issues/13647

* dictionary modularization

* Remove unused SQL statement definition

* unit-tested thread safety of dictionaries; removed data race conditions on dictionaries and strings; dictionaries can now detect if the caller holds a write lock and automatically all the calls become their unsafe versions; all direct calls to the unsafe versions are eliminated

* remove worker_is_idle() from the exit of service functions, because we lose the lock time between loops

* rewritten dictionary to have 2 separate locks, one for indexing and another for traversal

* Update collectors/cgroups.plugin/sys_fs_cgroup.c

Co-authored-by: Vladimir Kobal <vlad@prokk.net>

* Update collectors/cgroups.plugin/sys_fs_cgroup.c

Co-authored-by: Vladimir Kobal <vlad@prokk.net>

* Update collectors/proc.plugin/proc_net_dev.c

Co-authored-by: Vladimir Kobal <vlad@prokk.net>

* fix memory leak in rrdset cache_dir

* minor dictionary changes

* dont use index locks in single threaded

* obsolete dict option

* rrddim options and flags separation; rrdset_done() optimization to keep array of reference pointers to rrddim;

* fix jump on uninitialized value in dictionary; remove double free of cache_dir

* addressed codacy findings

* removed debugging code

* use the private refcount on dictionaries

* make dictionary item destructors work on dictionary destruction; stricter control on dictionary API; proper cleanup sequence on rrddim;

* more dictionary statistics

* global statistics about dictionary operations, memory, items, callbacks

* dictionary support for views - missing the public API

* removed warning about unused parameter

* chart and context name for cloud

* chart and context name for cloud, again

* dictionary statistics fixed; first implementation of dictionary views - not currently used

* only the master can globally delete an item

* context needs netdata prefix

* fix context and chart id of spins

* fix for host variables when health is not enabled

* run garbage collector on item insert too

* Fix info message; remove extra "using"

* update dict unittest for new placement of garbage collector

* we need RRDHOST->rrdvars for maintaining custom host variables

* Health initialization needs the host->host_uuid

* split STRING to its own files; no code changes other than that

* initialize health unconditionally

* unit tests do not pollute the global scope with their variables

* Skip initialization when creating archived hosts on startup. When a child connects it will initialize properly

Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-09-19 23:46:13 +03:00
Costa Tsaousis
5e1b95cf92
Deduplicate all netdata strings ()
* rrdfamily

* rrddim

* rrdset plugin and module names

* rrdset units

* rrdset type

* rrdset family

* rrdset title

* rrdset title more

* rrdset context

* rrdcalctemplate context and removal of context hash from rrdset

* strings statistics

* rrdset name

* rearranged members of rrdset

* eliminate rrdset name hash; rrdcalc chart converted to STRING

* rrdset id, eliminated rrdset hash

* rrdcalc, alarm_entry, alert_config and some of rrdcalctemplate

* rrdcalctemplate

* rrdvar

* eval_variable

* rrddimvar and rrdsetvar

* rrdhost hostname, os and tags

* fix master commits

* added thread cache; implemented string_dup without locks

* faster thread cache

* rrdset and rrddim now use dictionaries for indexing

* rrdhost now uses dictionary

* rrdfamily now uses DICTIONARY

* rrdvar using dictionary instead of AVL

* allocate the right size to rrdvar flag members

* rrdhost remaining char * members to STRING *

* better error handling on indexing

* strings now use a read/write lock to allow parallel searches to the index

* removed AVL support from dictionaries; implemented STRING with native Judy calls

* string releases should be negative

* only 31 bits are allowed for enum flags

* proper locking on strings

* string threading unittest and fixes

* fix lgtm finding

* fixed naming

* stream chart/dimension definitions at the beginning of a streaming session

* thread stack variable is undefined on thread cancel

* rrdcontext garbage collect per host on startup

* worker control in garbage collection

* relaxed deletion of rrdmetrics

* type checking on dictfe

* netdata chart to monitor rrdcontext triggers

* Group chart label updates

* rrdcontext better handling of collected rrdsets

* rrdpush incremental transmission of definitions should use as much buffer as possible

* require 1MB per chart

* empty the sender buffer before enabling metrics streaming

* fill up to 50% of buffer

* reset signaling metrics sending

* use the shared variable for status

* use separate host flag for enabling streaming of metrics

* make sure the flag is clear

* add logging for streaming

* add logging for streaming on buffer overflow

* circular_buffer proper sizing

* removed obsolete logs

* do not execute worker jobs if not necessary

* better messages about compression disabling

* proper use of flags and updating rrdset last access time every time the obsoletion flag is flipped

* monitor stream sender used buffer ratio

* Update exporting unit tests

* no need to compare label value with strcmp

* streaming send workers now monitor bandwidth

* workers now use strings

* streaming receiver monitors incoming bandwidth

* parser shift of worker ids

* minor fixes

* Group chart label updates

* Populate context with dimensions that have data

* Fix chart id

* better shift of parser worker ids

* fix for streaming compression

* properly count received bytes

* ensure LZ4 compression ring buffer does not wrap prematurely

* do not stream empty charts; do not process empty instances in rrdcontext

* need_to_send_chart_definition() does not need an rrdset lock any more

* rrdcontext objects are collected, after data have been written to the db

* better logging of RRDCONTEXT transitions

* always set all variables needed by the worker utilization charts

* implemented double linked list for most objects; eliminated alarm indexes from rrdhost; and many more fixes

* lockless strings design - string_dup() and string_freez() are totally lockless when they dont need to touch Judy - only Judy is protected with a read/write lock

* STRING code re-organization for clarity

* thread_cache improvements; double numbers precision on worker threads

* STRING_ENTRY now shadows STRING, so no duplicate definition is required; string_length() renamed to string_strlen() to follow the paradigm of all other functions; STRING internal statistics are now only compiled with NETDATA_INTERNAL_CHECKS

* rrdhost index by hostname now cleans up; aclk queries of archived hosts do not index hosts

* Add index to speed up database context searches

* Removed last_updated optimization (was also buggy after latest merge with master)

Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-09-05 19:31:06 +03:00
Costa Tsaousis
e73df78a06
Tiering statistics API endpoint ()
* calculator statistics

* added metrics and metrics_pages counters

* implemented API

* updates to match sheet

* updates to match sheet No2

* fix update every calculation for single point pages

* fix lgtm finding
2022-07-26 12:05:21 +03:00
Stelios Fragkakis
87e9700b2f
Detect stored metric size by page type ()
* Report unknown page only once
Get metric storage size by the page type
Verify validity of the page and skip problematic ones

* Change PAGE_SIZE to PAGE_POINT_SIZE_BYTES

* Add bitmap256 and unittests

* Fix unit test
tier_page_type array
page_type_size arrays

* Add another counter to not rely on uint8_t overflow to stop the test loop
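
The bitmap and the loop-counter items above can be illustrated with a minimal sketch. This is not the code from the commit: the names `bitmap256_sketch`, `bitmap256_sketch_set`, `bitmap256_sketch_get` and `count_set_bits_sketch` are hypothetical, and the layout (four 64-bit words) is an assumption. It shows why a `uint8_t` index alone cannot terminate a loop over all 256 bit positions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 256-bit bitmap: four 64-bit words, one bit per uint8_t value. */
typedef struct {
    uint64_t words[4];
} bitmap256_sketch;

static void bitmap256_sketch_set(bitmap256_sketch *b, uint8_t bit, bool value) {
    uint64_t mask = (uint64_t)1 << (bit & 63);
    if (value)
        b->words[bit >> 6] |= mask;
    else
        b->words[bit >> 6] &= ~mask;
}

static bool bitmap256_sketch_get(const bitmap256_sketch *b, uint8_t bit) {
    return (b->words[bit >> 6] >> (bit & 63)) & 1;
}

/* Visit all 256 bit positions. A uint8_t index can never reach 256, so a
 * condition like `i < 256` would always hold; the wider counter `n` is
 * what ends the loop, while `i` simply wraps back to 0 at the end. */
static unsigned count_set_bits_sketch(const bitmap256_sketch *b) {
    unsigned found = 0;
    uint8_t i = 0;
    for (unsigned n = 0; n < 256; n++, i++) {
        if (bitmap256_sketch_get(b, i))
            found++;
    }
    return found;
}
```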
2022-07-11 20:40:26 +03:00
Costa Tsaousis
a6da6beb71
array allocator for dbengine page descriptors ()
* array allocator for dbengine page descriptors

* full implementation of array allocator with cleanup

* faster deallocations

* eliminate entirely the need for loops during free

* addressed comments

* lower the min number of elements to 10
2022-07-08 00:09:33 +03:00
Stelios Fragkakis
49234f23de
Multi-Tier database backend for long term metrics storage ()
* Tier part 1

* Tier part 2

* Tier part 3

* Tier part 4

* Tier part 5

* Fix some ML compilation errors

* fix more conflicts

* pass proper tier

* move metric_uuid from state to RRDDIM

* move aclk_live_status from state to RRDDIM

* move ml_dimension from state to RRDDIM

* abstracted the data collection interface

* support flushing for mem db too

* abstracted the query api

* abstracted latest/oldest time per metric

* cleanup

* store_metric for tier1

* fix for store_metric

* allow multiple tiers, more than 2

* state to tier

* Change storage type in db. Query param to request min, max, sum or average

* Store tier data correctly

* Fix skipping tier page type

* Add tier grouping in the tier

* Fix to handle archived charts (part 1)

* Temp fix for query granularity when requesting tier1 data

* Fix parameters in the correct order and calculate the anomaly based on the anomaly count

* Proper tiering grouping

* Anomaly calculation based on anomaly count

* force type checking on storage handles

* update cmocka tests

* fully dynamic number of storage tiers

* fix static allocation

* configure grouping for all tiers; disable tiers for unittest; disable statsd configuration for private charts mode

* use default page dt using the tiering info

* automatic selection of tier

* fix for automatic selection of tier

* working prototype of dynamic tier selection

* automatic selection of tier done right (I hope)

* ask for the proper tier value, based on the grouping function

* fixes for unittests and load_metric_next()

* fixes for lgtm findings

* minor renames

* add dbengine to page cache size setting

* add dbengine to page cache with malloc

* query engine optimized to loop as little as required based on the view_update_every

* query engine grouping methods now do not assume a constant number of points per group and they allocate memory with OWA

* report db points per tier in jsonwrap

* query planner that switches database tiers on the fly to satisfy the query for the entire timeframe

* dbengine statistics and documentation (in progress)

* calculate average point duration in db

* handle single point pages the best we can

* handle single point pages even better

* Keep page type in the rrdeng_page_descr

* updated doc

* handle future backwards compatibility - improved statistics

* support &tier=X in queries

* enforce increasing iterations on tiers

* tier 1 is always 1 iteration

* backfilling higher tiers on first data collection

* reversed anomaly bit

* set up to 5 tiers

* natural points should only be offered on tier 0, unless a specific tier is selected

* do not allow more than 65535 points of tier0 to be aggregated on any tier

* Work only on actually activated tiers

* fix query interpolation

* fix query interpolation again

* fix lgtm finding

* Activate one tier for now

* backfilling of higher tiers using raw metrics from lower tiers

* fix for crash on start when storage tiers is increased from the default

* more statistics on exit

* fix bug that prevented higher tiers to get any values; added backfilling options

* fixed the statistics log line

* removed limit of 255 iterations per tier; moved the code of freezing rd->tiers[x]->db_metric_handle

* fixed division by zero on zero points_wanted

* removed dead code

* Decide on the descr->type for the type of metric

* dont store metrics on unknown page types

* free db_metric_handle on sql based context queries

* Disable STORAGE_POINT value check in the exporting engine unit tests

* fix for db modes other than dbengine

* fix for aclk archived chart queries destroying db_metric_handles of valid rrddims

* fix left-over freez() instead of OWA freez on median queries

Co-authored-by: Costa Tsaousis <costa@netdata.cloud>
Co-authored-by: Vladimir Kobal <vlad@prokk.net>
2022-07-06 14:01:53 +03:00
Costa Tsaousis
eb216a1f4b
Workers utilization charts ()
* initial version of worker utilization

* working example

* without mutexes

* monitoring DBENGINE, ACLKSYNC, WEB workers

* added charts to monitor worker usage

* fixed charts units

* updated contexts

* updated priorities

* added documentation

* converted threads to stacked chart

* One query per query thread

* Revert "One query per query thread"

This reverts commit 6aeb391f5987c3c6ba2864b559fd7f0cd64b14d3.

* fixed priority for web charts

* read worker cpu utilization from proc

* read workers cpu utilization via /proc/self/task/PID/stat, so that we have cpu utilization even when the jobs are too long to finish within our update_every frequency

* disabled web server cpu utilization monitoring - it is now monitored by worker utilization

* tight integration of worker utilization to web server

* monitoring statsd worker threads

* code cleanup and renaming of variables

* constrained worker and statistics conflict to just one variable

* support for rendering jobs per type

* better priorities and removed the total jobs chart

* added busy time in ms per job type

* added proc.plugin monitoring, switch clock to MONOTONIC_RAW if available, global statistics now cleans up old worker threads

* isolated worker thread families

* added cgroups.plugin workers

* remove unneeded dimensions when the expected worker is just one

* plugins.d and streaming monitoring

* rebased; support worker_is_busy() to be called one after another

* added diskspace plugin monitoring

* added tc.plugin monitoring

* added ML threads monitoring

* dont create dimensions and charts that are not needed

* fix crash when job types are added on the fly

* added timex and idlejitter plugins; collected heartbeat statistics; reworked heartbeat according to POSIX

* the right name is heartbeat for this chart

* monitor streaming senders

* added streaming senders to global stats

* prevent division by zero

* added clock_init() to external C plugins

* added freebsd and macos plugins

* added freebsd and macos to global statistics

* dont use new as a variable; address compiler warnings on FreeBSD and MacOS

* refactored contexts to be unique; added health threads monitoring

Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
2022-05-09 16:34:31 +03:00
Costa Tsaousis
87c0cc2d60
One way allocator to double the speed of parallel context queries ()
* one way allocator to speed up context queries

* fixed a bug while expanding memory pages

* reworked for clarity and finally fixed the bug of allocating memory beyond the page size

* further optimize allocation step to minimize the number of allocations made

* implement strdup with memcpy instead of strcpy

* added documentation

* prevent an uninitialized use of owa

* added callocz() interface

* integrate onewayalloc everywhere - apart from sql queries

* one way allocator is now used in context queries using archived charts in sql

* align on the size of pointers

* forgotten freez()

* removed not needed memcpys

* give unique names to global variables to avoid conflicts with system definitions
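
The items above describe a one-way allocator: memory is only handed out and is released all at once. The following is a minimal sketch of that idea under stated assumptions, not netdata's implementation. All names (`owa_sketch` and its functions) are hypothetical, and this version uses a single fixed page and returns NULL when it is full, whereas the commit mentions expanding memory pages. It does show pointer-size alignment, the guard against allocating beyond the page, and a strdup built on memcpy.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical one-way (bump) allocator: memory is only handed out,
 * never returned piecemeal; everything is released at once. */
typedef struct {
    unsigned char *page;
    size_t size;
    size_t used;
} owa_sketch;

static owa_sketch *owa_sketch_create(size_t size) {
    owa_sketch *owa = malloc(sizeof(*owa));
    if (!owa) return NULL;
    owa->page = malloc(size);
    if (!owa->page) { free(owa); return NULL; }
    owa->size = size;
    owa->used = 0;
    return owa;
}

/* Round each request up to a multiple of the pointer size. */
static size_t owa_sketch_align(size_t n) {
    size_t a = sizeof(void *);
    return (n + a - 1) / a * a;
}

static void *owa_sketch_malloc(owa_sketch *owa, size_t n) {
    size_t need = owa_sketch_align(n);
    if (need < n || need > owa->size - owa->used)
        return NULL; /* this sketch does not grow; never go past the page */
    void *p = owa->page + owa->used;
    owa->used += need;
    return p;
}

/* strdup built on memcpy: the length is computed once, then copied. */
static char *owa_sketch_strdup(owa_sketch *owa, const char *s) {
    size_t len = strlen(s) + 1;
    char *copy = owa_sketch_malloc(owa, len);
    if (copy) memcpy(copy, s, len);
    return copy;
}

/* A single call frees every allocation made from this allocator. */
static void owa_sketch_destroy(owa_sketch *owa) {
    free(owa->page);
    free(owa);
}
```

Because individual allocations are never freed, a query can allocate many small objects and drop them with one `owa_sketch_destroy()` call when it completes.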
2022-05-03 00:31:19 +03:00
Costa Tsaousis
43b9fdc213
procfile: more comfortable initial settings and faster/fewer reallocs () 2022-05-02 14:33:41 +03:00
Ilya Mashchenko
091540d59a
feat(dbengine): make dbengine page cache undumpable and dedupable ()
* make netdata more awesome

* reworked on-madvise and mmap to provide clarity
2022-04-28 11:19:15 +03:00
vkalintiris
1ce39d11f4
Remove SIZEOF_VOIDP and ENVIRONMENT{32,64} macros. () 2022-02-22 14:24:57 +02:00
Vladimir Kobal
d8b7b6a25f
Fix compilation warnings on macOS () 2022-02-21 12:23:48 +02:00
vkalintiris
37082fcbc1
Compute platform-specific list of static_threads at runtime. ()
Compute array of static threads at runtime.
2022-01-19 08:54:37 +02:00
vkalintiris
e91d1110e5
Do not use dbengine headers when dbengine is disabled. ()
Prior to this commit both daemon/commands.c and spawn/spawn.c used to
include database/engine/rrdenginelib.h, ie. a header file that is available
only when enabling the dbengine feature.
2022-01-18 10:30:36 +02:00
vkalintiris
9ed4cea590
Anomaly Detection MVP ()
* Add support for feature extraction and K-Means clustering.

This patch adds support for performing feature extraction and running the
K-Means clustering algorithm on the extracted features.

We use the open-source dlib library to compute the K-Means clustering
centers, which has been added as a new git submodule.

The build system has been updated to recognize two new options:

    1) --enable-ml: build an agent with ml functionality, and
    2) --enable-ml-tests: support running tests with the `-W mltest`
       option in netdata.

The second flag is meant only for internal use. To build tests successfully,
you need to install the GoogleTest framework on your machine.

* Boilerplate code to track hosts/dims and init ML config options.

A new opaque pointer field is added to the database's host and dimension
data structures. The fields point to C++ wrapper classes that will be used
to store ML-related information in follow-up patches.

The ML functionality needs to iterate all tracked dimensions twice per
second. To avoid locking the entire DB multiple times, we use a
separate dictionary to add/remove dimensions as they are created/deleted
by the database.

A global configuration object is initialized during the startup of the
agent. It will allow our users to specify ML-related configuration
options, eg. hosts/charts to skip from training, etc.

* Add support for training and prediction of dimensions.

Every new host spawns a training thread which is used to train the model
of each dimension.

Training of dimensions is done in a non-batching mode in order to avoid
impacting the generated ML model by the CPU, RAM and disk utilization of
the training code itself.

For performance reasons, prediction is done at the time a new value
is pushed in the database. The alternative option, ie. maintaining a
separate thread for prediction, would be ~3-4x slower and would
increase locking contention considerably.

For similar reasons, we use a custom function to unpack storage_numbers
into doubles, instead of long doubles.

* Add data structures required by the anomaly detector.

This patch adds two data structures that will be used by the anomaly
detector in follow-up patches.

The first data structure is a circular bit buffer which is being used to
count the number of set bits over time.

The second data structure represents an expandable, rolling window that
tracks set/unset bits. It is explicitly modeled as a finite-state
machine in order to make the anomaly detector's behaviour easier to test
and reason about.
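
As an illustration of the first data structure, here is a minimal sketch of a circular bit buffer that keeps a running count of set bits. It is not the patch's C++ code: the names `bitbuf_sketch` and `bitbuf_sketch_insert` and the capacity of 8 are assumptions, and it stores one `bool` per bit for readability rather than packing bits.

```c
#include <stdbool.h>
#include <stddef.h>

#define BITBUF_SKETCH_CAPACITY 8

/* Hypothetical circular bit buffer that keeps a running count of set bits. */
typedef struct {
    bool bits[BITBUF_SKETCH_CAPACITY];
    size_t next;      /* slot that the next insert overwrites */
    size_t length;    /* number of valid bits, up to the capacity */
    size_t set_bits;  /* how many of the valid bits are set */
} bitbuf_sketch;

static void bitbuf_sketch_insert(bitbuf_sketch *bb, bool bit) {
    if (bb->length == BITBUF_SKETCH_CAPACITY) {
        /* Full: the oldest bit is about to be overwritten, so uncount it. */
        if (bb->bits[bb->next])
            bb->set_bits--;
    } else {
        bb->length++;
    }
    bb->bits[bb->next] = bit;
    if (bit)
        bb->set_bits++;
    bb->next = (bb->next + 1) % BITBUF_SKETCH_CAPACITY;
}
```

Keeping the count up to date on every insert means the number of set bits in the window can be read in constant time, without rescanning the buffer.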

* Add anomaly detection thread.

This patch creates a new anomaly detection thread per host. Each thread
maintains a BitRateWindow which is updated every second based on the
anomaly status of the corresponding host.

Based on the updated status of the anomaly window, we can identify the
existence/absence of an anomaly event, its start/end time and the
dimensions that participate in it.

* Create/insert/query anomaly events from Sqlite DB.

* Create anomaly event endpoints.

This patch adds two endpoints to expose information about anomaly
events. The first endpoint returns the list of anomalous events within a
specified time range. The second endpoint provides detailed information
about a single anomaly event, ie. the list of anomalous dimensions in
that event along with their anomaly rate.

The `anomaly-bit` option has been added to the `/data` endpoint in order
to allow users to get the anomaly status of individual dimensions per
second.

* Fix build failures on Ubuntu 16.04 & CentOS 7.

These distros do not have toolchains with C++11 enabled by default.
Replacing nullptr with NULL should fix the build problems on these
platforms when the ML feature is not enabled.

* Fix `make dist` to include ML makefiles and dlib sources.

Currently, we add ml/kmeans/dlib to EXTRA_DIST. We might want to
generate an explicit list of source files in the future, in order to
bring down the generated archive's file size.

* Small changes to make the LGTM & Codacy bots happy.

- Cast unused result of function calls to void.
- Pass a const-ref string to Database's constructor.
- Reduce the scope of a local variable in the anomaly detector.

* Add user configuration option to enable/disable anomaly detection.

* Do not log dimension-specific operations.

Training and prediction operations happen every second for each
dimension. To make it easier to run anomaly detection for many charts
& dimensions with this PR, I've removed logs that would cause log
flooding.

* Reset dimensions' bit counter when not above anomaly rate threshold.

* Update the default config options with real values.

With this patch the default configuration options will match the ones
we want our users to use by default.

* Update conditions for creating new ML dimensions.

1. Skip dimensions with update_every != 1,
2. Skip dimensions that come from the ML charts.

With this filtering in place, any configuration value for the
relevant simple_pattern expressions will work correctly.

* Teach buildinfo{,json} about the ML feature.

* Set --enable-ml by default in the configuration options.

This patch is only meant for testing the building of the ML functionality
on Github. It will be reverted once tests pass successfully.

* Minor build system fixes.

- Add path to json header
- Enable C++ linker when ML functionality is enabled
- Rename ml/ml-dummy.cc to ml/ml-dummy.c

* Revert "Set --enable-ml by default in the configuration options."

This reverts commit 28206952a59a577675c86194f2590ec63b60506c.

We pass all Github checks when building the ML functionality, except for
those that run on CentOS 7 due to not having a C++11 toolchain.

* Check for missing dlib and nlohmann files.

We simply check the single-source files upon which our build system
depends. If they are missing, an error message notifies the user
about missing git submodules which are required for the ML
functionality.

* Allow users to specify the maximum number of KMeans iterations.

* Use dlib v19.10

v19.22 broke compatibility with CentOS 7's g++. Development of the
anomaly detection used v19.10, which is the version used by most Debian and
Ubuntu distribution versions that are not past EOL.

No observable performance improvements/regressions specific to the K-Means
algorithm occur between the two versions.

* Detect and use the -std=c++11 flag when building anomaly detection.

This patch automatically adds the -std=c++11 when building netdata
with the ML functionality, if it's supported by the user's toolchain.

With this change we are able to build the agent correctly on CentOS 7.

* Restructure configuration options.

- update default values,
- clamp values to min/max defaults,
- validate and identify conflicting values.
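
A minimal sketch of the clamping and validation steps listed above. The helper names and the specific conflict rule (a minimum that exceeds its maximum) are assumptions made for illustration; the commit message does not say which options are checked against each other.

```c
/* Hypothetical helper: force a configured value into [min, max]. */
static unsigned clamp_sketch(unsigned value, unsigned min, unsigned max) {
    if (value < min) return min;
    if (value > max) return max;
    return value;
}

/* Hypothetical conflict check (assumed rule): a "minimum" setting that is
 * larger than its matching "maximum" setting cannot be satisfied. */
static int config_conflicts_sketch(unsigned min_setting, unsigned max_setting) {
    return min_setting > max_setting;
}
```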

* Add update_every configuration option.

Considering that the MVP does not support per host configuration
options, the update_every option will be used to filter hosts to train.

With this change anomaly detection will be supported on:

    - Single nodes with update_every != 1, and
    - Children nodes with a common update_every value that might differ from
      the value of the parent node.

* Reorganize anomaly detection charts.

This follows Andrew's suggestion to have four charts to show the number
of anomalous/normal dimensions, the anomaly rate, the detector's window
length, and the events that occur in the prediction step.

Context and family values, along with the necessary information in the
dashboard_info.js file, will be updated in a follow-up commit.

* Do not dump anomaly event info in logs.

* Automatically handle low "train every secs" configuration values.

If a user specifies a very low value for the "train every secs", then
it is possible that the time it takes to train a dimension is higher
than its allotted time.

In that case, we want the training thread to:

    - Reduce its CPU usage per second, and
    - Allow the prediction thread to proceed.

We achieve this by limiting the training time of a single dimension to
be equal to half the time allotted to it. This means that the training
thread will never consume more than 50% of a single core.
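
The arithmetic behind the 50% bound can be sketched as follows. This assumes that each dimension's allotted time is the training period divided by the number of dimensions, which the message does not state explicitly; all function names are hypothetical.

```c
/* Assumption: every dimension gets an equal slot of the training period. */
static double allotted_secs_sketch(double train_every_secs, unsigned dimensions) {
    return dimensions ? train_every_secs / dimensions : train_every_secs;
}

/* The cap described above: half of the allotted slot. */
static double training_budget_secs_sketch(double allotted_secs) {
    return allotted_secs / 2.0;
}

/* Fraction of one core used if every dimension spends its full budget. */
static double worst_case_core_usage_sketch(double train_every_secs, unsigned dimensions) {
    double slot = allotted_secs_sketch(train_every_secs, dimensions);
    return dimensions * training_budget_secs_sketch(slot) / train_every_secs;
}
```

Under that assumption the worst case is `dimensions * (slot / 2) / train_every_secs`, which is one half regardless of the number of dimensions.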

* Automatically detect if ML functionality should be enabled.

With these changes, we enable ML if:

    - The user has not explicitly specified --disable-ml, and
    - Git submodules have been checked out properly, and
    - The toolchain supports C++11.

If the user has explicitly specified --enable-ml, the build fails if
git submodules are missing, or the toolchain does not support C++11.

* Disable anomaly detection by default.

* Do not update charts in locked region.

* Cleanup code reading configuration options.

* Enable C++ linker when building ML.

* Disable ML functionality for CMake builds.

* Skip LGTM for dlib and nlohmann libraries.

* Do not build ML if libuuid is missing.

* Fix dlib path in LGTM's yaml config file.

* Add chart to track duration of prediction step.

* Add chart to track duration of training step.

* Limit the number of dimensions in an anomaly event.

This will ensure our JSON results won't grow without any limit. With
the default ML configuration options, approximately 1700 dimensions are
trained in a newly-installed Netdata agent. The hard-limit is set to
2000 dimensions which:

    - Is well above the default number of dimensions we train,
    - If it is ever reached it means that the user had accidentally set
      a very low anomaly rate threshold, and
    - Considering that we sort the result by anomaly score, the cutoff
      dimensions will be the less anomalous, ie. the least important to
      investigate.
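
The sort-then-truncate behaviour described above can be sketched like this. The record type, field names, and function names are hypothetical; only the idea (order by anomaly score, keep at most a fixed number of entries) comes from the message.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record for one dimension in an anomaly event. */
typedef struct {
    const char *name;
    double anomaly_rate;
} dim_score_sketch;

/* qsort comparator: highest anomaly rate first. */
static int by_rate_desc_sketch(const void *a, const void *b) {
    double ra = ((const dim_score_sketch *)a)->anomaly_rate;
    double rb = ((const dim_score_sketch *)b)->anomaly_rate;
    return (ra < rb) - (ra > rb);
}

/* Sort in place and return how many entries to report, at most `limit`. */
static size_t top_anomalous_sketch(dim_score_sketch *dims, size_t count, size_t limit) {
    qsort(dims, count, sizeof(*dims), by_rate_desc_sketch);
    return count < limit ? count : limit;
}
```

Since the array is sorted in descending order before the cut, whatever falls beyond the limit is the least anomalous part of the result.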

* Add information about the ML charts.

* Update family value in ML charts.

This fix will allow us to show the individual charts in the RHS Anomaly
Detection submenu.

* Rename chart type

s/anomalydetection/anomaly_detection/g

* Expose ML feat in /info endpoint.

* Export ML config through /info endpoint.

* Fix CentOS 7 build.

* Reduce the critical region of a host's lock.

Before this change, each host had a single, dedicated lock to protect
its map of dimensions from adding/deleting new dimensions while training
and detecting anomalies. This was problematic because training of a
single dimension can take several seconds in nodes that are under heavy
load.

After this change, the host's lock protects only the insertion/deletion
of new dimensions, and the prediction step. For the training of dimensions
we use a dedicated lock per dimension, which is responsible for protecting
the dimension from deletion while training.

Prediction is fast enough, even on slow machines or under heavy load,
which allows us to use the host's main lock and avoid increasing the
complexity of our implementation in the anomaly detector.

* Improve the way we are tracking anomaly detector's performance.

This change allows us to:

    - track the total training time per update_every period,
    - track the maximum training time of a single dimension per
      update_every period, and
    - export the current number of total, anomalous, normal dimensions
      to the /info endpoint.

Also, now that we use dedicated locks per dimension, we can train under
heavy load continuously without having to sleep in order to yield the
training thread and allow the prediction thread to progress.

* Use samples instead of seconds in ML configuration.

This commit changes the way we are handling input ML configuration
options from the user. Instead of treating values as seconds, we
interpret all inputs as number of update_every periods. This allows
us to enable anomaly detection on hosts that have update_every != 1
second, and still produce a model for training/prediction & detection
that behaves in an expected way.

Tested by running anomaly detection on an agent with update_every = [1,
2, 4] seconds.
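
The effect of counting in samples can be illustrated with a small Python
sketch. The helper name `samples_to_seconds` and the window of 3600 samples
are hypothetical values chosen for the example, not actual configuration
defaults.

```python
def samples_to_seconds(num_samples, update_every):
    """Wall-clock span covered by `num_samples` values collected
    once every `update_every` seconds."""
    return num_samples * update_every


# A window of 3600 samples always feeds the model 3600 values; only the
# wall-clock time it spans changes with update_every.
spans = {ue: samples_to_seconds(3600, ue) for ue in (1, 2, 4)}
```

Under this interpretation the model sees the same amount of data on every
host, whereas a window expressed in seconds would shrink to half or a
quarter of the samples at update_every = 2 or 4.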

* Remove unnecessary log message in detection thread

* Move ML configuration to global section.

* Update web/gui/dashboard_info.js

Co-authored-by: Andrew Maguire <andrewm4894@gmail.com>

* Fix typo

Co-authored-by: Andrew Maguire <andrewm4894@gmail.com>

* Rebase.

* Use negative logic for anomaly bit.

* Add info for prediction_stats and training_stats charts.

* Disable ML on PPC64EL.

The CI test fails with -std=c++11 and requires -std=gnu++11 instead.
However, it's not easy to quickly append the required flag to CXXFLAGS.
For the time being, simply disable ML on PPC64EL and if any users
require this functionality we can fix it in the future.

* Add comment on why we disable ML on PPC64EL.

Co-authored-by: Andrew Maguire <andrewm4894@gmail.com>
2021-10-27 09:26:21 +03:00
Vladimir Kobal
b1ce4fa3b6
Check the version of the default cgroup mountpoint () 2021-05-06 17:02:02 +03:00
Emmanuel Vasilakis
f5bd20e60a
Provide new attributes in health conf files ()
* read and store new attributes (class, component, type) from health conf files. Replace family variable in info strings

* provide the attributes to jsons

* remove extra semicolon

* populate conf files with new attributes

* added newline

* remove extra defines from health.h

* remove empty line

* remove realloc

* use helper variables for find_and_replace. Adjust position for next strstr

* remove comments

* Add type to mysql.conf and vcsa.conf

* fix formatting

* add parenthesis

* remove extra assignment

* changes to mysql_galera_cluster_state from master

* add type Errors to unbound_request_list_overwritten

* fix indentation for info strings spanning more than one line

* check for null, replace with empty string if true

* add class, component, type to systemdunits.conf
2021-04-20 16:24:41 +03:00
Tomáš Kopal
757e418090
Rename abs to ABS to avoid clash with standard definitions. Fixes . () 2021-03-17 12:18:33 +02:00
vkalintiris
10bb023852
Skip C++ incompatible header in main libnetdata header () 2021-03-10 22:26:57 +02:00
Vladimir Kobal
720bcf495c
Move network interface speed, duplex, and operstate variables to charts () 2021-03-10 21:02:45 +02:00
Stelios Fragkakis
e9d59e37d9
Migrate metadata log to SQLite () 2020-11-24 20:00:02 +02:00
Tomáš Kopal
bcb9c86827
Make libnetdata headers compilable by C++. () 2020-11-07 00:10:50 +00:00
Vladimir Kobal
3e84239ff6
Use the libbpf library for the eBPF plugin () 2020-07-16 15:10:35 +03:00
Andrew Moss
49719a961d
Fix bugs in streaming and enable support for gap filling ()
This PR adds (inactive) support that we will use to fill the gaps on chart when a receiving agent goes offline and the sender reconnects. The streaming component has been reworked to make the connection bi-directional and fix several outstanding bugs in the area. 

* Fixed an incorrect case of version negotiation. Removed fatal() on exhaustion of fds. 
* Fixed cases that fell through to polling the socket after closing. 
* Fixed locking of data related to sender and receiver in the host structure. 
* Added fine-grained locks to reduce contention.
* Added circular buffer to sender to prevent starvation in high-latency conditions. 
* Fixed case where agent is a proxy and negotiated different streaming versions with sender and receiver. 
* Changed interface to new parser to put the buffering code in streaming. 
* Fixed the bug that stopped senders from reconnecting after their socket times out - this was part of the scaling fixes that provide an early shortcut path for rejecting connections without lock contention. 
* Uses fine-grained locking and a different approach to thread shutdown instead. 
* Added liveness detection to connections to allow selection of the best connection.
2020-06-03 08:38:25 +02:00
Andrew Moss
aa3ec552c8
Enable support for Netdata Cloud.
This PR merges the feature-branch to make the cloud live. It contains the following work:
Co-authored-by: Andrew Moss <1043609+amoss@users.noreply.github.com>
Co-authored-by: Jacek Kolasa <jacek.kolasa@gmail.com>
Co-authored-by: Austin S. Hemmelgarn <austin@netdata.cloud>
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
Co-authored-by: Markos Fountoulakis <44345837+mfundul@users.noreply.github.com>
Co-authored-by: Timotej S <6674623+underhood@users.noreply.github.com>
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
* dashboard with new navbars, v1.0-alpha.9: PR 
* dashboard v1.0.11: 
Co-authored-by: Jacek Kolasa <jacek.kolasa@gmail.com>
* Added installer code to bundle JSON-c if it's not present. PR 
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
* Fix claiming config PR 
* Adds JSON-c as hard dep. for ACLK PR 
* Fix SSL renegotiation errors in old versions of openssl. PR . Also - we have a transient problem with opensuse CI so this PR disables them with a commit from @prologic.
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
* Fix claiming error handling PR 
* Added CI to verify JSON-C bundling code in installer PR 
* Make cloud-enabled flag in web/api/v1/info be independent of ACLK build success PR 
* Reduce ACLK_STABLE_TIMEOUT from 10 to 3 seconds PR 
* remove old-cloud related UI from old dashboard (accessible now via /old suffix) PR 
* dashboard v1.0.13 PR 
* dashboard v1.0.14 PR 
* Provide feedback on proxy setting changes PR 
* Change the name of the connect message to update during an ongoing session PR 
* Fetch active alarms from alarm_log PR 
2020-05-11 16:37:27 +10:00
Markos Fountoulakis
767b7ba991
Change all https://app.netdata.cloud URLs to https://netdata.cloud to restore connectivity with netdata cloud. 2020-04-03 18:49:23 +03:00
Andrew Moss
844a2d4e03
Fix Coverity defects ()
Fix Coverity CID355287 and CID355289: technically it is a false-positive but it is easier to put a pattern in the code that they can recognise as a sanitizer. The compiler will remove it during optimization. Fix CID353973: the security condition is unlikely to occur but we can avoid it completely. Fix resource leak from CID 355286 and CID 355288. Fixing new resource leak introduced by a previous commit (CID355449)
2020-04-03 12:35:00 +02:00
Andrew Moss
fe722cb2a4
Improve the behavior of claiming ()
The default cloud url has been updated to app.netdata.cloud ready for the release. The claiming process now checks the current user executing claiming and refuses to perform the claim for the wrong user. If the current UID is 0 then claiming proceeds but the file ownership is adjusted to be the correct netdata user. The default expected user is `netdata` unless the script can identify the user from the current configuration. After the claiming script is executed the CLI is used to reload the claiming state.
2020-03-31 13:07:24 +02:00
thiagoftsm
696264006c
eBPF process plugin ()
* syscall_plugin: Compilation

This commit brings the necessary changes to the compilation files

* syscall_plugin: Collector body

This commit brings the collector body to files.

* syscall_plugin: .gitignore

This commit adds syscall.plugin to .gitignore

* syscall_plugin: Plugin adjust

Fix reference and remove message

* syscall_plugin: Remove limit

Remove call to setrlimit

* syscall: Fix start

This commit fixes problems related with start of the plugin

* syscall_plugin: Bring heartbeat

This commit removes the sleep and switches to heartbeat so that the plugin does not receive a SIGTERM

* syscall_plugin: Missing semicolon

* syscall_plugin: Fix dimension

Brings the initial value of chart for the normal dimension of the other values

* syscall_plugin: Fix dimension 2

The previous change did not give the expected results, so I am bringing one more fix

* syscall_plugin: adjust values

Rename function and adjust pid size

* syscall_plugin: Remove Chart and fix var

this commit removes a chart that will not be created and fix an error
when the bytes were calculated

* syscall_plugin: Brings error

This commit brings a new variable that will be used to identify errors

* syscall_plugin: Rename charts

This commit starts to rename the charts properly

* syscall_plugin: Rename plugin

* syscall_plugin: missing changes for rename

* syscall_plugin: fix compilation

* syscall_plugin: bring new charts

* syscall_plugin: Warnings

Remove warnings from compilation time

* vfs_plugin: Fix Error chart plot

There was an error when the chart was being displayed

* vfs_plugin: Change family

This commit changes the family of the VFS plugin

* vfs_plugin: Fix order

This PR fixes the wrong order when creating a chart

* vfs_plugin: Remove path

Remove path from structure

* vfs_plugin: From Perf to HASH

This commit converts the main source to a hash table and also splits the data collection per chart

* vfs_plugin: Adjusts and exit

This commit brings adjusts to the collect and the complete monitor to exit events

* vfs_plugin: Start process

This commit brings the monitoring of a process start and thread creation to Netdata

* vfs_plugin: Visualization and collection

Adjust variables to show and to collect data

* vfs_plugin: Connection with apps plugin

This commit starts to bring the connection with apps.

* vfs_plugin: Various

This commit brings new label for charts, fix to error chart and adjusts for new charts, I am sorry

* vfs_plugin: basis new chart

This commit brings the basis of the new charts for the plugin

* vfs_plugin: Apps plugin

This commit brings the integration with apps.plugin

* vfs_plugin:fix counter

This commit fixes the difference between apps plugin and counter

* ebpf_plugin: rename charts

This commit renames the charts

* ebpf_plugin: New charts adjusts and log start

* ebpf_plugin: Log thread

Creates the log thread that will be used to store error message

* ebpf_plugin: Rename Web Group

This commit reorganizes the charts on the dashboard

* ebpf_plugin: Restore

This commit restore the previous status of the collector where we only have a global vision of the problems

* ebpf_plugin: kretprobe

This commit brings the initial changes for the collector to work with both eBPF programs

* ebpf_plugin: New syscalls

This commit brings the new syscalls that we are monitoring

* ebpf_plugin: New charts

This commit brings new charts to the collector

* ebpf_plugin: Parse config

This commit starts the parser of the file

* ebpf_plugin: collector debug

* ebpf_plugin: Global variables from config

This commit brings the global variable update from the config file

* ebpf_plugin: Clean kprobe_events

This commit brings the clean of kprobe_events and also starts the common library for all eBPF collectors

* ebpf_plugin: Check kernel version

This function brings a check for the kernel version

* ebpf_plugin: Start documentation

This commit brings the initial documentation for the users

* ebpf_plugin: Documentation

This commit brings adjust to code and updates for the documentation

* ebpf_plugin: this commit brings the developer mode to the collector

* ebpf_plugin: Documentation

This commit brings more information to the documentation

* ebpf_plugin: Documentation

This commit brings more information to the documentation

* ebpf_plugin: errno to logs

Brings errno number to logs

* ebpf_plugin: Documentation

This commit brings fixes to the collector documentation

* ebpf_plugin: Move description

This commit move the chart description from the C code to dashboard_info.js

* ebpf_plugin: Rename files

This commit renames files to the final version

* ebpf_plugin: Continue renaming

This commit continues renaming the files to the final version

* ebpf_plugin: Renaming process

This commit renames the final plugin

* ebpf_plugin: Finish rename

This commit finishes the rename processing

* ebpf_plugin: fix entry charts

This commit removes one chart from  mode

* ebpf_plugin: Fix remove

This commit brings a new function to fix the unload of collector when the collector
is running in entry mode

* ebpf_plugin: Rename on old kernels

This commit brings fixes for syscall names

* ebpf_plugin: Timestamp to log

This commit brings the timestamp to the logs

* ebpf_plugin: Remove syscall

With the changes on the backend, we no longer monitor sys_clone

* ebpf_plugin: The syscall is important for 5.3 or newer, so I am returning

* ebpf_plugin: Remove concurrency

This commit adds variables necessary to interact with the new structure
of the eBPF program

* ebpf_plugin: Ids to dimension

This commit uses the function names as ids for the dimensions

* ebpf_plugin: Missing chart

This commit brings the missing chart for Netdata

* ebpf_plugin: Remove unnecessary message

Remove unnecessary error message from the collector

* ebpf_plugin: Rename dimension

This commit renames the dimension to something more meaningful

* ebpf_plugin: Optional log

This commit converts the developer.log into an optional feature

* redirect to stdout

This commit starts to bring the capability to redirect everything to stdout

* ebpf_plugin: Disable dev mode

This commit removes the possibility to load the dev mode file for a while

* ebpf_plugin: Disable eBPF process

By default this plugin won't be enabled

* ebpf_plugin: Update debug message

* ebpf_plugin: this commit adjusts documentation to next release.

* ebpf_plugin: documentation fix.

* ebpf_plugin: Percpu hash

This commit moves from a single hash table to several to speed up
the collector

* ebpf_plugin: Compatibility

This commit set compatibility version between kernels
2020-02-17 21:28:33 +00:00
Timotej Šiškovič
f4e1012f5f
initial MQTT over Secure Websockets support for ACLK ()
* add aclk_lws_wss_client

* shorten the thread name in case more threads are necessary

* Draft libmosquitto<->libwebsockets integration

* use ringbuffer for recvd data

* Some code cleanup

* if mqtt connection fails close lws connection and reconnect

* clear buffers on connection closed

* work on better loop integration

* move mosquitto read out of loop

* remove useless code when using websockets

* LWS - make host and port configurable

* make default port 9002 as we use MQTT over WSS now

* wait for link up before subscribing

start query thread after connection has been made

* cleanup - remove useless var

* if there is anything to write send it immediately

* cleanup: move buffers into engine instance

* allow MQTT IO from multiple threads (although preferred is MQTT IO to be done by single thread)

* add warning to future self

* add some comments for whoever reviews

* add destroy fnc - start work on cleanup

* minor - add mosquitto to .gitignore

* fix codacy errors

* do not reconnect automatically by default

* minor - remove outdated comment

* tab -> spaces

Co-Authored-By: Konstantinos Natsakis <5933427+knatsakis@users.noreply.github.com>

* address thiagoftsm valid comments

* add useful logs in case of trouble

* fix -Wall -Wextra -Wformat-signedness warnings

* log error when connection fails

* update .gitignore to match new installer

* Fwd LWS logs to Netdata logs

* minor - tabulation fixes

* fix comments from thiago

* force SSL

* move UNUSED to libnetdata.h
@thiago correctly pointed out it might be useful for others

* minor - rename function for clarity

* minor - remove commented out code

Co-authored-by: Konstantinos Natsakis <5933427+knatsakis@users.noreply.github.com>
2020-02-14 10:54:26 +01:00
Markos Fountoulakis
16f835489c Implement netdata command server and cli tool ()
* Checkpoint commit (POC)

* Implemented command server in the daemon

* Add netdatacli implementation

* Added prints in command server setup functions

* Make libuv version 1 a hard dependency for the agent

* Additional documentation

* Improved accuracy of names and documentation

* Fixed documentation

* Fixed buffer overflow

* Added support for exit status in cli. Added prefixes for exit code, stdout and stderr. Fixed parsers.

* Fix compilation errors

* Fix compile errors

* Fix compile errors

* Fix compile error

* Fix linker error for muslc
2019-12-04 14:21:22 -08:00
Marcin Niestroj
fdeac75f9c include limits.h before using LONG_MAX ()
This fixes build with musl standard C library.

Signed-off-by: Marcin Niestroj <m.niestroj@grinn-global.com>
2019-10-29 20:52:47 +02:00
Timo
19f1bd14de Utf8 Badge Fix And URL Parser International Support (initial) ()
#### Summary
Fixes 

Additionally it adds support for UTF-8 in URL parser (as it should).
Label sizes now are updated by browser with JavaScript (although guess is still calculated by verdana11_widths with minor improvements)

#### Component Name
API/Badges, LibNetData/URL

#### Additional Information
It was found that not only verdana11_widths need to be updated but the url parser replaces international characters with spaces (one space per each byte of multibyte character).
Therefore I update both to support international chars.
2019-07-24 14:32:08 +02:00
Markos Fountoulakis
0707fbaaac
Reimplemented mypopen() function family ()
* Reimplemented mypopen() family based on posix_spawn() instead of fork()
and execl().

The problem with fork() is that if the parent process has a large address
space then the fork() may fail due to insufficient free memory in the
system if memory overcommit is not enabled.

posix_spawn() does not call fork() and does not suffer from this problem.
It is also more portable than vfork() which is deprecated and clone()
which is linux only.
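
As an illustration of the approach, here is a minimal Python sketch that
starts a child with posix_spawn() and collects its exit status. It is not
the mypopen() code and does not reproduce its pipe handling; the helper
name `spawn_and_wait` is hypothetical, and the sketch assumes a POSIX
system and Python 3.8 or newer, where `os.posix_spawn` is available.

```python
import os


def spawn_and_wait(argv):
    """Start argv[0] with posix_spawn() and return its exit status.

    Returns -1 if the child did not exit normally (e.g. killed by a signal).
    """
    # No fork() is issued from Python code here; the C library creates the
    # child directly from the given path, arguments and environment.
    pid = os.posix_spawn(argv[0], argv, dict(os.environ))
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1
```

For example, `spawn_and_wait(["/bin/sh", "-c", "exit 3"])` runs a shell
that exits with status 3 and returns that status to the caller.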

* Removed dead code
2019-07-09 07:25:54 +03:00
thiagoftsm
c56e086ba3 Easily disable alarms, by persisting the silencers configuration ()
This PR was created to fix . It completes the work started by Christopher. The new features and fixes are:

* JSON inside the core - The core can now work with JSON files, using either the JSON-C library, if it is present on the system, or the JSMN library, which has been incorporated into our core. JSON-C is preferred because it is the more complete library, but JSMN is kept so that the feature is not lost when the user does not have JSON-C installed.
* Health LIST - The Health API gains one more command: with LIST it is now possible to get, in JSON format, the alarms active in Netdata.
* Health reorganized - Previously we had duplicated code in different files. This PR fixes that (thanks @cakrit!), and Health is now better organized.
* Removing memory leak - The first implementation of json.c created SILENCERS without linking it anywhere. Now it is linked properly.
* Script updated - This PR brings some changes to the script that tests Health.

This PR also fixes the race condition created by the previous new position of the SILENCERS creation. I had to move it to daemon/main.c because, after various tests, it was confirmed that the error could happen in different parts of the code if SILENCERS was not initialized before the threads start.

Component Name
health directory
health-cmd

Additional Information
Fixes  and 
2019-07-01 21:07:21 +02:00
Pavlos Emm. Katsoulakis
171d8f5d01 Revert "Easily disable alarms, by persisting the silencers configuration ()"
This reverts commit 60a73e90de.

Emergency rollback of potential culprit as per issue 
Will be re-merging the change after investigation
2019-06-28 08:27:57 +02:00
thiagoftsm
60a73e90de Easily disable alarms, by persisting the silencers configuration ()
* Alarms begin!

* Alarms web interface comments!

* Alarms web interface comments 2!

* Alarms bringing Christopher work!

* Alarms bringing Christopher work!

* Alarms commenting code that will be rewritten!

* Alarms json-c begin!

* Alarms json-c end!

* Alarms missed script!

* Alarms fix json-c parser and change script to test LIST!

* Alarms fix test script!

* Alarms documentation!

* Alarms script step 1!

* Alarms fix script!

* Alarms fix testing script and code!

* Alarms missing arguments to pkg_check_modules

* SSL_backend indentation!

* Alarms, description in Makefile

* Alarms missing extern!

* Alarms compilation!

* Alarms libnetdata/health!

* Alarms fill library!

* Alarms fill CMakeList!

* Alarm fix version!

* Alarm remove readme!

* Alarm fix readme version!
2019-06-27 13:16:28 +02:00
thiagoftsm
5182677831 netdata/daemon: SSL fix - broken compilation case when ssl library not present! ()
* SSL_fix fix the compilation case the library is not present!
2019-06-03 22:25:09 +03:00
thiagoftsm
b6088e08a7 SSL implementation for Netdata ()
* SSL implementation for Netdata

* Upload of fixes asked by @paulkatsoulakis and @cakrit

* Fix local computer

* Adding openssl to webserver

* fixing..

* HTTPS almost there

* Codacity

* HTTPS day 3

* HTTPS without Bio step 1

* HTTPS without Bio step 2

* HTTPS without Bio step 3

* HTTPS without Bio step 4

* HTTPS without Bio step 5

* HTTPS without Bio step 6

* HTTPS without Bio step 7

* HTTPS without Bio step 8

* HTTPS without Bio step 9

* HTTPS without Bio step 10

* SSL on streaming 1

* Daily pull

* HTTPS without Bio step 11

* HTTPS without Bio step 12

* HTTPS without Bio step 13

* HTTPS without Bio step 14

* SSL_Interception change documentation

* HTTPS without Bio step 15

* HTTPS without Bio step 16

* SSL_Interception fix codacity

* SSL_Interception fix doc

* SSL_Interception comments

* SSL_Interception fixing problems!

* SSL_Interception killing bugs

* SSL_Interception changing parameter

* SSL_Implementation documentation and script

* SSL_Implementation multiple fixes

* SSL_Implementation installer and cipher

* SSL_Implementation Redirect 301

* SSL_Implementation webserver doc and install-or-update.sh

* SSL_Implementation error 00000001:lib(0):func(0):reason(1)

* SSL_Implementation web server doc

* SSL_Implementation SEGFAULT on Fedora

* SSL_Implementation fix ^SSL=force|optional

* SSL_Implementation Redirect and Ciphers

* SSL_Implementation race condition 1

* SSL_Implementation Fix Location

* SSL_Implementation Fix Location 2

* SSL_Implementation Fix stream

* SSL_Implementation Fix stream 2

* SSL_Implementation Fix stream 3

* SSL_Implementation last problems!

* SSL_Implementation adjusts to commit!

* SSL_Implementation documentation permission!

* SSL_Implementation documentation permission 2!

* SSL_Implementation documentation permission 3!
2019-05-31 16:27:35 +02:00
Markos Fountoulakis
6ca6d840dd Database engine ()
* Database engine prototype version 0

* Database engine initial integration with netdata POC

* Scalable database engine with file and memory management.

* Database engine integration with netdata

* Added MIN MAX definitions to fix alpine build of travis CI

* Bugfix for backends and new DB engine, remove useless rrdset_time2slot() calls and erroneous checks

* DB engine disk protocol correction

* Moved DB engine storage file location to /var/cache/netdata/{host}/dbengine

* Fix configure to require openSSL for DB engine

* Fix netdata daemon health not holding read lock when iterating chart dimensions

* Optimized query API for new DB engine and old netdata DB fallback code-path

* netdata database internal query API improvements and cleanup

* Bugfix for DB engine queries returning empty values

* Added netdata internal check for data queries for old and new DB

* Added statistics to DB engine and fixed memory corruption bug

* Added preliminary charts for DB engine statistics

* Changed DB engine ratio statistics to incremental

* Added netdata statistics charts for DB engine internal statistics

* Fix for netdata not compiling successfully when missing dbengine dependencies

* Added DB engine functional test to netdata unittest command parameter

* Implemented DB engine dataset generator based on example.random chart

* Fix build error in CI

* Support older versions of libuv1

* Fixes segmentation fault when using multiple DB engine instances concurrently

* Fix memory corruption bug

* Fixed createdataset advanced option not exiting

* Fix for DB engine not working on FreeBSD

* Support FreeBSD library paths of new dependencies

* Workaround for unsupported O_DIRECT in OS X

* Fix unittest crashing during cleanup

* Disable DB engine FS caching in Apple OS X since O_DIRECT is not available

* Fix segfault when unittest and DB engine dataset generator don't have permissions to create temporary host

* Modified DB engine dataset generator to create multiple files

* Toned down overzealous page cache prefetcher

* Reduce internal memory fragmentation for page-cache data pages

* Added documentation describing the DB engine

* Documentation bugfixes

* Fixed unit tests compilation errors since last rebase

* Added note to back-up the DB engine files in documentation

* Added codacy fix.

* Support old gcc versions for atomic counters in DB engine
2019-05-15 08:28:06 +03:00
Chris Akritidis
2a5074ad43
Anonymous statistics ()
* Added shell and dashboard anonymous statistics

* Check for environment var NETDATA_REGISTRY_UNIQUE_ID

* Fix indentation

* Removed health-cmdapi-test

* docs/anonymous-statistics.md
2019-01-27 12:35:09 +02:00
Costa Tsaousis
f739ab110b
allow debugging memory per module ()
* debug info

* removed debug code
2018-10-30 23:14:35 +02:00
Costa Tsaousis
8fbf817ef8
modularized all source code ()
* modularized all external plugins

* added README.md in plugins

* fixed title

* fixed typo

* relative link to external plugins

* external plugins configuration README

* added plugins link

* remove plugins link

* plugin names are links

* added links to external plugins

* removed unnecessary spacing

* list to table

* added language

* fixed typo

* list to table on internal plugins

* added more documentation to internal plugins

* moved python, node, and bash code and configs into the external plugins

* added statsd README

* fix bug with corrupting config.h every 2nd compilation

* moved all config files together with their code

* more documentation

* diskspace info

* fixed broken links in apps.plugin

* added backends docs

* updated plugins readme

* move nc-backend.sh to backends

* created daemon directory

* moved all code outside src/

* fixed readme indentation

* renamed plugins.d.plugin to plugins.d

* updated readme

* removed linux- from linux plugins

* updated readme

* updated readme

* updated readme

* updated readme

* updated readme

* updated readme

* fixed README.md links

* fixed netdata tree links

* updated codacy, codeclimate and lgtm excluded paths

* update CMakeLists.txt

* updated automake options at top directory

* libnetdata split into directories

* updated READMEs

* updated READMEs

* updated ARL docs

* updated ARL docs

* moved /plugins to /collectors

* moved all external plugins outside plugins.d

* updated codacy, codeclimate, lgtm

* updated README

* updated url

* updated readme

* updated readme

* updated readme

* updated readme

* moved api and web into webserver

* web/api web/gui web/server

* modularized webserver

* removed web/gui/version.txt
2018-10-15 23:16:42 +03:00
Renamed from src/libnetdata/libnetdata.h