* Handle ephemeral hosts
* Node ephemeral removal timeout: 86400 seconds (1 day)
* Move config from health to global section
* Set a node to queryable false when it is ephemeral and is removed
* Log queryable. Send queryable=0 only when forcing host deletion (the node is ephemeral)
* Switch to "is ephemeral node"
Document stream.conf
* Unregister node id
* cleanup of logging - wip
* first working iteration
* add errno annotator
* replace old logging functions with netdata_logger()
* cleanup
* update error_limit
* fix remaining error_limit references
* work on fatal()
* started working on structured logs
* full cleanup
* default logging to files; fix all plugins initialization
* fix formatting of numbers
* cleanup and reorg
* fix coverity issues
* cleanup obsolete code
* fix formatting of numbers
* fix log rotation
* fix for older systems
* add detection of systemd journal via stderr
* finished on access.log
* remove left-over transport
* do not add empty fields to the logs
* journal get compact uuids; X-Transaction-ID header is added in web responses
* allow compiling on systems without memfd sealing
* added libnetdata/uuid directory
* move datetime formatters to libnetdata
* add missing files
* link the makefiles in libnetdata
* added uuid_parse_flexi() to parse UUIDs with and without hyphens; the web server now reads X-Transaction-ID and uses it for functions and web responses
* added stream receiver, sender, proc plugin and pluginsd log stack
* iso8601 advanced usage; line_splitter module in libnetdata; code cleanup
* add message ids to streaming inbound and outbound connections
* cleanup line_splitter between lines to avoid logging garbage; when killing children, kill them with SIGABRT if internal checks are enabled
* send SIGABRT to external plugins only if we are not shutting down
* fix cross cleanup in pluginsd parser
* fatal when there is a stack error in logs
* compile netdata with -fexceptions
* do not kill external plugins with SIGABRT
* metasync info logs to debug level
* added severity to logs
* added json output; added options per log output; added documentation; fixed issues mentioned
* allow memfd only on linux
* moved journal low level functions to journal.c/h
* move health logs to daemon.log with proper priorities
* fixed a couple of bugs; health log in journal
* updated docs
* systemd-cat-native command to push structured logs to journal from the command line
* fix makefiles
* restored NETDATA_LOG_SEVERITY_LEVEL
* fix makefiles
* systemd-cat-native can also work as the logger of Netdata scripts
* do not require a socket to systemd-journal to log-as-netdata
* alarm notify logs in native format
* properly compare log ids
* fatals log alerts; alarm-notify.sh working
* fix overflow warning
* alarm-notify.sh now logs the request (command line)
* annotate external plugins logs with the function cmd they run
* added context, component and type to alarm-notify.sh; shell sanitization removes control characters and characters that may be expanded by bash
* reformatted alarm-notify logs
* unify cgroup-network-helper.sh
* added quotes around params
* charts.d.plugin switched logging to journal native
* quotes for logfmt
* unify the status codes of streaming receivers and senders
* alarm-notify: dont log anything, if there is nothing to do
* all external plugins log to stderr when running outside netdata; alarm-notify now shows an error when notification methods are needed but are not available
* migrate cgroup-name.sh to new logging
* systemd-cat-native now supports messages with newlines
* socket.c logs use priority
* cleanup log field types
* inherit the systemd set INVOCATION_ID if found
* allow systemd-cat-native to send messages to a systemd-journal-remote URL
* log2journal command that can convert structured logs to journal export format
* various fixes and documentation of log2journal
* updated log2journal docs
* updated log2journal docs
* updated documentation of fields
* allow compiling without libcurl
* do not use socket as format string
* added version information to newly added tools
* updated documentation and help messages
* fix the namespace socket path
* print errno with error
* do not timeout
* updated docs
* updated docs
* updated docs
* log2journal updated docs and params
* when talking to a remote journal, systemd-cat-native batches the messages
* enable lz4 compression for systemd-cat-native when sending messages to a systemd-journal-remote
* Revert "enable lz4 compression for systemd-cat-native when sending messages to a systemd-journal-remote"
This reverts commit b079d53c11.
* note about uncompressed traffic
* log2journal: code reorg and cleanup to make modular
* finished rewriting log2journal
* more comments
* rewriting rules support
* increased limits
* updated docs
* updated docs
* fix old log call
* use journal only when stderr is connected to journal
* update netdata.spec for libcurl, libpcre2 and log2journal
* pcre2-devel
* do not require pcre2 in centos < 8, amazonlinux < 2023, open suse
* log2journal only on systems pcre2 is available
* ignore log2journal in .gitignore
* avoid log2journal on centos 7, amazonlinux 2 and opensuse
* add pcre2-8 to static build
* undo last commit
* Bundle to static
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* Add build deps for deb packages
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* Add dependencies; build from source
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* Test build for amazon linux and centos; expect it to fail for suse
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* fix minor oversight
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* Reorg code
* Add the install from source (deps) as a TODO
* Do not enable the build on the suse ecosystem
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
---------
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
Co-authored-by: Tasos Katsoulas <tasos@netdata.cloud>
* Switch alarm_log to use the buffer json functions
* Remove commented out code
* Fix finalize when an object is not explicitly closed
* Use buffer_json_member_add_boolean
* Send node update info only if the host has finished replication
* Log number of hosts replicating / pending to load context
* Remove prefix (thread name is enough)
* split systemd-journal.c
* split fstat caching
* split systemd-journal further
* working systemd-units function
* do not enable systemd-units when libsystemd does not provide the interface
* move the header to the right place
* mixed parentheses
* update codacy exclusions
* update codacy exclusions
* update codacy exclusions
* added option to show expanded filters by default
* keep the original extension and decode descriptions too
* updated systemd-units function to handle all known unit states
* dont show the path by default
* final touches
* remove trailing spaces
* Collectors should not be running at this point, but allow shutdown to continue after several retries (workaround)
* Proceed with shutdown after 10 attempts
* Prepare metadata sync thread cleanup earlier in the shutdown process
* Set flag for the dimensions that need ML MODEL load instead of queueing a message in the event loop
* Process the dimension ML load during the normal dimension metadata save loop
* Use spinlock for cmd queue / dequeue instead of mutex
Cleanup queue structure
* Remove old ML model load code
* Rebase and cleanup
* maintain in /tmp/stream-receiver-X.txt a copy of metadata received
* stream log metadata to /tmp/stream-sender-localhost.txt
* log the stream of all senders
* cleanup use of X_update_metadata() functions
* fix for last commit
* rrdlabel unmark/mark/delete unmarked restored
* cache ctx in collection handle
* cache rd together with rda
* do not repeatedly call rrdcontexts - cached collection status; optimize pluginsd_acquire_dimension()
* fix unit tests
* do the absolutely minimum while updating timestamps, ensure validity during reading them
* when the stream is INTERPOLATED, buffer outstanding data for up to 50ms if the buffer contains DATA only.
* remove the spinlock from mrg
* remove the metric flags that are not used any more
* mrg writers can be different threads
* update first time when latest clean is also updated
* cleanup
* set hot page with a simple atomic operation
* sender sets chart slot for every chart
* work on senders without SLOT
* enable SLOT capability
* send slot at BEGIN when SLOT is enabled
* fix slot generation and parsing
* send slot while re-streaming
* use the sender capabilities, not the receiver
* cleanup
* add slots support to all chart and dimension related plugin commands
* fix condition
* fix calculation
* check sender capabilities
* assign slots in constructors
* we need the dimension slot at the DIMENSION keyword
* more debug info in case of dimension mismatch
* ensure the RRDDIM EXPOSED flag is multi-threaded and set it after the sender buffer has been committed, so that replication will not send dimensions prematurely
* fix renumbering on child restart
* reset rda caching when receiving a chart definition
* optimize pluginsd_end_v2()
* do not do zero sized allocations
* trust the chart slot id of the child
* cleanup charts on pluginsd thread exit
* better cleanup
* find the chart and put it in the slot, if it is not already there
* move slots array to host
* initialize pluginsd slots properly
* add slots to replay begin; do not cleanup slots that dont belong to a chart
* cleanup on obsolete
* cleanup slots on obsoletions
* cleanup and renames about obsoletion
* rewrite obsoletion service code to remove race conditions
* better service obsoletion log
* added debugging
* more debug
* exposed flag now compares versions
* removed debugging messages
* resolve conflicts
* fix replication check for unsent dimensions
* move compression header to compression.h
* prototype with zstd compression
* updated capabilities
* no need for resetting compression
* left-over reset function
* use ZSTD_compressStream() instead of ZSTD_compressStream2() for backwards compatibility
* remove call to LZ4_decoderRingBufferSize()
* debug signature failures
* fix the buffers of lz4
* fix decoding of zstd
* detect compression based on initialization; prefer ZSTD over LZ4
* allow both lz4 and zstd
* initialize zstd streams
* define missing ZSTD_CLEVEL_DEFAULT
* log zero compressed size
* debug log
* flush compression buffer
* add sender compression statistics
* removed debugging messages
* do not fail if zstd is not available
* cleanup and buildinfo
* fix max message size, use zstd level 1, add compression ratio reporting
* use compression level 1
* fix ratio title
* better compression error logs
* for backwards compatibility use buffers of COMPRESSION_MAX_CHUNK
* switch to default compression level
* additional streaming error conditions detection
* do not expose compression stats when compression is not enabled
* test for the right lz4 functions
* moved lz4 and zstd to their own files
* add gzip streaming compression
* gzip error handling
* added unittest for streaming compression
* eliminate a copy of the uncompressed data during zstd compression
* eliminate not needed zstd allocations
* cleanup
* decode gzip with Z_SYNC_FLUSH
* set the decoding gzip algorithm
* user configuration for compression levels and compression algorithms order
* fix exclusion of not preferred compressions
* remove now obsolete compression define, since gzip is always available
* rename compression algorithms order in stream.conf
* move common checks in compression.c
* cleanup
* backwards compatible error checking
* Retrieve last connected timestamp from the database (host->last_connected)
* Improve context load performance
Check for agent shutdown while context load in progress
Log information about host load start and finish
* Remove check for slot as it will only reach this part when a slot is found
* Add additional checks
rrdlabels_find_label_with_key_unsafe finds a label with the specified key (even when it exists with a different value)
Quick check if label already exists (avoids JudyLIns)
* Add migration unit test
Add additional unit test to verify that adding a label with the same key will replace its value
* stop the query 250ms before the timeout, to allow sending back partial responses
* on timeout return partial responses
* give it 500ms
* give some additional timeout to plugins.d garbage collection
* define an extension to the timeout for all intermediate hops
* hunting for the crash...
* set value name and len to zero
* remove unneeded memset()
* Remove unused functions
* No need for prepare statement because the function is not used frequently
* Remove db_meta check, already assumed valid
* Remove D_ACLK_SYNC and D_METADATALOG, fix log message
* Reuse prepared statements per run to avoid sql parsing all the time
* Keep rowid in charts and dimensions
* Host and chart labels keep rowids
* Don't store internal flags
* Remove commented out code
* Formatting
* Fix algorithm when updating dimension
* remove loading and storing families from alert configs
* remove families from silencers
* remove from alarm log
* start remove from alarm-notify.sh.in
* fix test alarm
* rebase
* remove from api/v1/alarm_log
* remove from alert stream
* remove from config stream
* remove from more
* remove from swagger for health api
* revert md changes
* remove from health cmd api test
* dyncfg fncnames as constants
* add helper macros to know parser streaming/plugin
* plugins dictionary per RRDHOST
* api_request_v2_config add support for /host/
* streamify pluginsd_register_plugin
* streamify pluginsd_register_module
* streamify report_job_status
* streamify dyncfg get functions
* module_type2str
* add job type and flags
* add DYNCFG_REGISTER_JOB
* implement register job
* push all to parent at startup
* add helper function is_dyncfg_function
* forward virtual functions through streaming
* separate job2json
* add api/v2/job_statuses
* do cleanup on streaming
* streamify set functions
* support FUNCTION_PAYLOAD through streaming
* WIP tests
* dont attempt loading non-localhost configs
* move cfg persistence to proper place
* prevent race
* properly update job state at runtime
* cleanup 1
* job2json add missing reason
* add tests
* correct HTTP code
* add test
* streamify delete_job_cb
* add DELETE_JOB keyword
* job delete over streaming
* add tests for create and delete job over parent
* rrdpush common checks to macro
* add missing forwarders
* fix jobs according to test results
* more tests
* review comment 1
* codacy remove valid warning
* codacy ruby fixes
* fix wrong rc check
* minimal test plugin for child
* add test
* dict walk instead of master lock
* minor - english spelling fixes
* thiago comments 1
* minor - rename folder to dynconf
* enable only when built with -DNETDATA_TEST_DYNCFG
* minor - compiler warning
* create dir post daemonization
* stricter URL check