* Centralize cache path handling.
* Add better debugging support for static build infrastructure.
* Centralize handling of git repository fetching for static builds.
* Centralize build directory handling in the static build.
And quit using a sub-directory of packaging/makeself from the source
tree for it.
* Remove version numbers from static build jobs.
* Better organize static build jobs.
0x numbers for non-build prep-work
1x numbers for libraries that potentially impact multiple other things
we vendor.
2x numbers for libraries that are direct dependencies of Netdata
3x numbers for combined libraries and tooling used by Netdata
4x numbers for general tooling used by Netdata
5x numbers for tooling used by single specific components
6x numbers for non-build prep-work for the Netdata build
7x numbers for the actual build of Netdata
8x numbers for post-build checks
9x numbers for the actual packaging
* Clean up variable handling in Netdata build job.
* Split post-build handling steps to their own jobs.
This will make it easier to see what is actually going on in the build
process.
* Clean up CI messages.
* Split archive creation sub-steps into individual jobs.
* Disable shell tracing for archive creation job.
It’s not needed in 99.9% of cases, and should only be enabled locally if
it is needed.
* Assorted fixes for code restructuring.
* Tidy up paths for runtime check.
* Fix CI handling of artifacts.
* initial implementation of libbacktrace
* in buildinfo show the parameters of libbacktrace
* do not disable libbacktrace if threading is not supported
* Don’t install libbacktrace, only build it.
* Disable libbacktrace for 32-bit ARM builds.
* Make libunwind and libbacktrace mutually exclusive at configure time.
Instead of relying on it being mutually exclusive at build time. This
ensures we don’t waste time on libunwind when using libbacktrace.
* Only use libbacktrace on Linux and Windows
* Work around broken logic in openSUSE rpmbuild.
* Fix handling of libbacktrace for 32-bit ARM static builds.
---------
Co-authored-by: Austin S. Hemmelgarn <austin@netdata.cloud>
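The static build job numbering scheme listed under "Better organize static build jobs" can be read as a lookup on the leading digit of a job's two-digit prefix. A minimal Python sketch, assuming job files are named with a two-digit prefix followed by a hyphen; the function name and the example filename in the usage note are hypothetical, not taken from the repository:

```python
import re

# Leading digit of the job prefix -> purpose, as listed in the commit message.
JOB_CATEGORIES = {
    0: "non-build prep-work",
    1: "libraries that potentially impact multiple other things we vendor",
    2: "libraries that are direct dependencies of Netdata",
    3: "combined libraries and tooling used by Netdata",
    4: "general tooling used by Netdata",
    5: "tooling used by single specific components",
    6: "non-build prep-work for the Netdata build",
    7: "the actual build of Netdata",
    8: "post-build checks",
    9: "the actual packaging",
}


def job_category(filename):
    """Return the category for a job file named like 'NN-name.sh' (assumed layout)."""
    match = re.match(r"(\d)\d-", filename)
    if match is None:
        raise ValueError(f"no two-digit job prefix in {filename!r}")
    return JOB_CATEGORIES[int(match.group(1))]
```

For example, `job_category("70-example.sh")` falls in the 7x range and so maps to the actual build of Netdata.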
* enable libunwind in static builds
* add libunwind and backtrace to buildinfo
* add libunwind to alpine packages
* add -dev packages
* remove libunwind binary from the packages
* Vendor libunwind in static builds instead of using a copy from the build environment.
This is required to ensure that the C++ exception handling functionality
in libunwind is _disabled_, because it does not play nice with static
linking when using C++ with exception handling support enabled.
* Remove changes from local testing.
* Fix cross architecture builds.
* Disable libunwind on 64-bit POWER builds.
musl libc does not include functions that are required to build
libunwind for this platform, so just disable it there for now.
---------
Co-authored-by: Austin S. Hemmelgarn <austin@netdata.cloud>
* detect the system ca bundle at runtime
* minor fix
* fix for older libcurl versions
* added X509_get_default_cert_file()
* added validation for the certificates
* moved ssl/curl code to a separate file; it now configures both libcurl and openssl; added defaults to libcurl static install
* run the new code only in netdata static builds
* auto to check
* disable runtime ssl checks
* Switch to tonistiigi/binfmt for cross-build emulation.
It’s actually being actively updated, and it also supports hosts other
than x86-64.
* Auto-detect existing QEMU user emulation in static build.
Instead of relying on the user to explicitly ask for no emulation.
* updated copyright notices everywhere (I hope)
* Update makeself.lsm
* Update coverity-scan.sh
* make all newlines be linux, not windows
* remove copyright from all files (they take it from the repo), unless it is printed to users
* Enforce usage of specific CPU models for static build runtime checks.
* Add explicit architecture overrides for ARMv6l static builds.
* Fix handling of source paths.
* Enable tracing for static build code.
* Fix cflags and version handling.
* Restructure cflags handling and add Go architecture flags.
* Don't use symlinks when preparing static build artifacts.
* Roll back to v4.4.0 for actions/upload-artifact action.
There appears to be a bug in the latest release that is causing some
files to not be found when creating artifacts.
* Bump actions/upload-artifact to v4.4.2 which fixes the bugs.
* split claiming into multiple files; WIP claiming with api
* pidfile is now dynamically allocated
* netdata_exe_path is now dynamically allocated
* remove ENABLE_CLOUD and ENABLE_ACLK
* fix compilation
* remove ENABLE_HTTPS and ENABLE_OPENSSL
* remove the ability to disable cloud
* remove netdata_cloud_enabled variable; split rooms into a json array
* global libcurl initialization
* detect common claiming errors
* more common claiming errors
* finished claiming via API
* same as before
* same as before
* remove the old claiming logic that runs the claim script
* working claim.conf
* cleanup
* fix log message; default proxy is env
* fix log message
* remove netdata-claim.sh from run.sh
* remove netdata-claim.sh from everywhere, except kickstart scripts
* create cloud.d if it does not exist.
* better error handling and logging
* handle proxy disable
* merged master
* fix cmakelists for new files
* left-overs removal
* Include libcurl in required dependencies.
* Fix typo in dependency script.
* Use pkg-config for finding cURL.
This properly handles transitive dependencies, unlike the FindCURL
module.
* netdata installer writes claiming info to /etc/netdata/claim.conf
* remove claim from netdata
* add libcurl to windows packages
* add libcurl to windows packages
* compile-on-windows.sh installs too
* add NODE_ID streaming back to child and INDIRECT cloud status
* log child kill on windows
* fixes for spawn server on windows to ensure we have a valid pid and the process is properly terminated
* better handling of windows process exit codes
* pass the cloud url from parents to children
* add retries and timeout to claiming curl request
* remove FILE * from plugins.d
* spawn-tester to unittest spawning processes communication
* spawn-tester now tests FILE pointer I/O
* external plugins run in posix mode
* set blocking I/O on all pipes
* working spawn server on windows
* latest changes in spawn_popen applied to linux tools
* push environment
* repeated tests of fds
* export variable CYGWIN_BASE_PATH
* renamed to NETDATA_CYGWIN_BASE_PATH
* added cmd and help to adapt the command and the information to be presented to users during claiming
* split spawn server versions into files
* restored spawn server libuv based
* working libuv based spawn server
* fixes in libuv for windows
* working spawn server based on posix_spawn()
* fix fd leaks on all spawn servers
* fixed windows spawn server
* fix signal handling to ensure proper cooperation with libuv
* switched windows to posix_spawn() based spawn server
* improvement on libuv version
* callocz() event loop
* simplification of libuv spawn server
* minor fixes in libuv and spawn tester
* api split into parts and separated by version; introduced /api/v3; no changes to old /api/v1 and /api/v2
* completed APIs splitting
* function renames
* remove dead code
* split basic functions into a directory
* execute external plugins in nofork spawn server with posix_spawn() for improved performance
* reset signals when using posix_spawn()
* fix spawn server logs and log cmdline in posix server
* bearer_get_token() implemented as function
* agent cloud status now exposes parent claim_id in indirect mode
* fixes for node id streaming from parent to children
* extract claimed id to separate file
* claim_id is no longer in host structure; there is a global claim_id for this agent and there are parent and origin claim ids in host structure
* fix issue on older compilers
* implement /api/v3 using calls from v1 and v2
* prevent asan leaks on local-sockets callback
* codacy fixes
* moved claim web api to web/api/v2
* when the agent is offline, prefer indirect connection when available; log a warning when a node changes node id
* improve inheritance of claim id from parent
* claim_id for bearer token should match any of the claim ids known
* aclk_connected replaced with functions
* aclk api can now be limited to node information, implementing [cloud].scope = license manager
* comment out most options in stream.conf so that internal defaults will be applied
* respect negative matches for send charts matching
* hidden functions are not accessible via the API; bearer_get_token function checks the request is coming from Netdata Cloud
* /api/v3/settings API
* added error logs to settings api
* saving and loading of bearer tokens
* Fix parameter when calling send_to_plugin
* Prevent overflow
* expose struct parser and typedef PARSER to enforce strict type checking on send_to_plugin()
* ensure the parser will not go away randomly from the receiver - it is now cleared when the receiver lock is acquired; also ensure the output sockets are set in the parser as long as the parser runs
* Add newline
* Send parent claim id downstream
* do not send anything when nodeid is zero
* code re-organization and cleanup
* add aclk capabilities, nodes summary and api version and protection to /api/v2,3/info
* added /api/v3/me which returns information about the current user
* make /api/v3/info accessible always
* Partially revert "remove netdata-claim.sh from everywhere, except kickstart scripts"
Due to how we handle files in our static builds and local builds, we
actually need to continue installing `netdata-claim.sh` to enable a
seamless transition to the new claiming mechanism without breaking
compatibility with existing installs or existing automation tooling that
is directly invoking the claiming script.
The script itself will be rewritten in a subsequent commit to simply
wrap the new claiming methodology, together with some additional changes
to ensure that a warning is issued if the script is invoked by anything
other than the kickstart script.
* Rewrite claiming script to use new claiming method.
* Revert "netdata installer writes claiming info to /etc/netdata/claim.conf"
Same reasoning as for 2e27bedb3fbf9df523bff407f2e8c8428e350e38.
We need to keep the old claiming support code in the kickstart script
for the foreseeable future so that existing installs can still be
claimed, since the kickstart script is _NOT_ versioned with the agent.
A later commit will add native support for the new claiming method and
use that in preference to the claiming script if it appears to be
available.
* Add support for new claiming method to kickstart.sh.
This adds native support to the kickstart script to use the new claiming
method without depending on the claiming script, as well as adding a few
extra tweaks to the claiming script to enable it to better handle the
transition.
Expected behavior is for the kickstart script to use the new claiming
code path if the claiming script is either not installed, or does not
contain the specific string `%%NEW_CLAIMING_METHOD%%`. This way we will
skip the claiming script on systems which have the updated copy that
uses the new claiming approach, which should keep kickstart behavior
consistent with what Netdata itself supports.
* Depend on JSON-C 0.14 as a minimum supported version.
Needed for uint64 functions.
* Fix claiming option validation in kickstart script.
* do not cache auth in web client
* reuse bearer tokens when the request to create one matches an existing
* dictionaries dfe loops now allow using return statement
* bearer token files are now fixed for specific agents by having the machine guid of the agent in them
* systemd journal now respects facets and disables the default facets when not given
* fixed commands.c
* restored log for not opening config file
* Fix Netdata group templating for claiming script.
* Warn on failed templating in claiming script.
* Make `--require-cloud` a silent no-op.
We don’t need to warn users that it does nothing, we should just have it
do nothing.
* added debugging info to claiming
* log also the response
* do not send double / at the url
* properly remove keyword from parameters
* disable debug during claiming
* fix log messages
* Update packaging/installer/kickstart.sh
* Update packaging/installer/kickstart.sh
* implemented POST request payload parsing for systemd-journal
* added missing reset of facets in json parsing
* JSON payload does not need hashes any more. I can accept the raw values
---------
Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud>
Co-authored-by: Austin S. Hemmelgarn <austin@netdata.cloud>
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
Co-authored-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
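The code-path selection described under "Add support for new claiming method to kickstart.sh" comes down to one predicate: use the new claiming path when the claiming script is missing or lacks the marker string. A minimal Python sketch of that rule exactly as the message states it; the real logic lives in the shell kickstart script, and the function name and any path passed in are hypothetical:

```python
from pathlib import Path

# Marker string named in the commit message.
MARKER = "%%NEW_CLAIMING_METHOD%%"


def use_new_claiming_path(claim_script):
    """Return True when the new claiming code path should be used."""
    script = Path(claim_script)
    if not script.is_file():
        # The claiming script is not installed.
        return True
    # Installed script: use the new path only if the marker string is absent.
    return MARKER not in script.read_text(errors="replace")
```

Keeping the check to a plain substring test means it needs no knowledge of the script's version or contents beyond the marker itself.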
* Add improved handling for TLS certificates for static builds.
* Properly replace symlinks.
* Fix shellcheck warning.
* Fix option handling.
- Persist certificate handling mode and check URL across reinstalls.
- Properly consume the arguments for the certificate handling options.
* Add five minute hard timeout on certificate check.
* Differentiate specific error results from curl.
* Persist cert handling options regardless of how they’re passed in.
* Escape slashes in REINSTALL_OPTIONS.
* Fix escaping of reinstall options.
* Check for supplementary components in libexec during runtime checks.
This should catch issues like what the PR it’s part of is fixing.
* Fix building DEB packages with CPack.
This both disables the logs-management plugin in the builds (which has
never actually worked properly in our packages for multiple reasons)
and fixes a botched merge involving the OS detection in the build system.
* Fix up CI checks.
Don’t clean repo during static builds.
We’re using a separate build directory in all cases, so there is no
longer any need to try to clean the repository before a static build.
This enables running static builds in linked git worktrees.
* Skip building Go components for Docker CI if they have not changed.
* Properly handle Go code in general checks PR.
* Skip Go code in build checks if it hasn’t changed.
* Fix linting issues.
* Fix propagation of installer flags.
* Fix propagation of environment variables through static build process.
* Fix handling of extra install options in static builds.
* Skip starting the agent in updater checks.
* Fix actionlint warning.
To optimize the CI, we cache build artifacts, namely all the software we build and statically bundle into the static binaries (for each arch). In a nutshell, these are the artifacts of the job source files in https://github.com/netdata/netdata/tree/master/packaging/makeself/jobs. The https://github.com/netdata/netdata/blob/master/.github/scripts/get-static-cache-key.sh script generates the keys for these cached artifacts, taking into account the source files of the jobs, the versions of the software, and the static packages bundled in the base images. The effort in #16303 to move all the versions into a centralized file expanded the problem of the keys not considering the exact versions.
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
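The cache key described above depends on three inputs: the job source files, the software versions, and the static packages in the base image. A minimal Python sketch of deriving one key from all three, so that a change to any exact version invalidates the cache. This is an illustration under assumed input shapes, not a transcription of `get-static-cache-key.sh`; the function names and the choice of SHA-256 are assumptions:

```python
import hashlib


def _feed(digest, label, value):
    """Hash a labelled, length-prefixed value so adjacent fields cannot blur together."""
    digest.update(f"{label}:{len(value)}:".encode())
    digest.update(value)


def static_cache_key(job_sources, versions, base_packages):
    """Build a cache key from job file contents, exact versions, and base packages.

    job_sources:   mapping of job file path -> file contents (bytes)
    versions:      mapping of software name -> exact version string
    base_packages: list of 'name=version' strings from the base image
    """
    digest = hashlib.sha256()
    # Sort every input so the key does not depend on iteration order.
    for path in sorted(job_sources):
        _feed(digest, "job:" + path, job_sources[path])
    for name in sorted(versions):
        _feed(digest, "version:" + name, versions[name].encode())
    for package in sorted(base_packages):
        _feed(digest, "package", package.encode())
    return digest.hexdigest()
```

Because the exact version strings are hashed, bumping a version in a centralized versions file changes the key even when no job source file changes.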
* ndsudo command
* added help
* make ndsudo setuid to root
* fix megacli binary name on FreeBSD
* move ndsudo to collectors/plugins.d/
* address PR comments
* do not print the command line argument, instead print its index
---------
Co-authored-by: Ilya Mashchenko <ilya@netdata.cloud>
* cleanup of logging - wip
* first working iteration
* add errno annotator
* replace old logging functions with netdata_logger()
* cleanup
* update error_limit
* fix remaining error_limit references
* work on fatal()
* started working on structured logs
* full cleanup
* default logging to files; fix all plugins initialization
* fix formatting of numbers
* cleanup and reorg
* fix coverity issues
* cleanup obsolete code
* fix formatting of numbers
* fix log rotation
* fix for older systems
* add detection of systemd journal via stderr
* finished on access.log
* remove left-over transport
* do not add empty fields to the logs
* journal gets compact uuids; X-Transaction-ID header is added in web responses
* allow compiling on systems without memfd sealing
* added libnetdata/uuid directory
* move datetime formatters to libnetdata
* add missing files
* link the makefiles in libnetdata
* added uuid_parse_flexi() to parse UUIDs with and without hyphens; the web server now reads X-Transaction-ID and uses it for functions and web responses
* added stream receiver, sender, proc plugin and pluginsd log stack
* iso8601 advanced usage; line_splitter module in libnetdata; code cleanup
* add message ids to streaming inbound and outbound connections
* cleanup line_splitter between lines to avoid logging garbage; when killing children, kill them with SIGABRT if internal checks are enabled
* send SIGABRT to external plugins only if we are not shutting down
* fix cross cleanup in pluginsd parser
* fatal when there is a stack error in logs
* compile netdata with -fexceptions
* do not kill external plugins with SIGABRT
* metasync info logs to debug level
* added severity to logs
* added json output; added options per log output; added documentation; fixed issues mentioned
* allow memfd only on linux
* moved journal low level functions to journal.c/h
* move health logs to daemon.log with proper priorities
* fixed a couple of bugs; health log in journal
* updated docs
* systemd-cat-native command to push structured logs to journal from the command line
* fix makefiles
* restored NETDATA_LOG_SEVERITY_LEVEL
* fix makefiles
* systemd-cat-native can also work as the logger of Netdata scripts
* do not require a socket to systemd-journal to log-as-netdata
* alarm notify logs in native format
* properly compare log ids
* fatals log alerts; alarm-notify.sh working
* fix overflow warning
* alarm-notify.sh now logs the request (command line)
* annotate external plugins logs with the function cmd they run
* added context, component and type to alarm-notify.sh; shell sanitization removes control characters and characters that may be expanded by bash
* reformatted alarm-notify logs
* unify cgroup-network-helper.sh
* added quotes around params
* charts.d.plugin switched logging to journal native
* quotes for logfmt
* unify the status codes of streaming receivers and senders
* alarm-notify: don't log anything, if there is nothing to do
* all external plugins log to stderr when running outside netdata; alarm-notify now shows an error when notification methods are needed but are not available
* migrate cgroup-name.sh to new logging
* systemd-cat-native now supports messages with newlines
* socket.c logs use priority
* cleanup log field types
* inherit the systemd set INVOCATION_ID if found
* allow systemd-cat-native to send messages to a systemd-journal-remote URL
* log2journal command that can convert structured logs to journal export format
* various fixes and documentation of log2journal
* updated log2journal docs
* updated log2journal docs
* updated documentation of fields
* allow compiling without libcurl
* do not use socket as format string
* added version information to newly added tools
* updated documentation and help messages
* fix the namespace socket path
* print errno with error
* do not timeout
* updated docs
* updated docs
* updated docs
* log2journal updated docs and params
* when talking to a remote journal, systemd-cat-native batches the messages
* enable lz4 compression for systemd-cat-native when sending messages to a systemd-journal-remote
* Revert "enable lz4 compression for systemd-cat-native when sending messages to a systemd-journal-remote"
This reverts commit b079d53c11.
* note about uncompressed traffic
* log2journal: code reorg and cleanup to make modular
* finished rewriting log2journal
* more comments
* rewriting rules support
* increased limits
* updated docs
* updated docs
* fix old log call
* use journal only when stderr is connected to journal
* update netdata.spec for libcurl, libpcre2 and log2journal
* pcre2-devel
* do not require pcre2 in centos < 8, amazonlinux < 2023, open suse
* log2journal only on systems where pcre2 is available
* ignore log2journal in .gitignore
* avoid log2journal on centos 7, amazonlinux 2 and opensuse
* add pcre2-8 to static build
* undo last commit
* Bundle to static
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* Add build deps for deb packages
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* Add dependencies; build from source
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* Test build for amazon linux and centos; expect to fail for suse
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* fix minor oversight
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* Reorg code
* Add the install from source (deps) as a TODO
* Do not enable the build on the suse ecosystem
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
---------
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
Co-authored-by: Tasos Katsoulas <tasos@netdata.cloud>
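The logging changes above mention `uuid_parse_flexi()`, which parses UUIDs with and without hyphens. A minimal Python sketch of that idea, assuming the input is either the hyphenated form or a compact run of 32 hex digits. The Python function name is hypothetical, and the real C function may validate hyphen positions more strictly than this sketch does:

```python
import string
import uuid


def parse_uuid_flexi(text):
    """Parse a UUID given with or without hyphens and return a uuid.UUID."""
    # Simplification: hyphens are dropped wherever they appear.
    compact = text.replace("-", "")
    if len(compact) != 32 or any(c not in string.hexdigits for c in compact):
        raise ValueError(f"not a UUID: {text!r}")
    return uuid.UUID(hex=compact)
```

Both spellings of the same identifier parse to an equal value, which is what lets a header such as X-Transaction-ID be accepted in either form.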
1. Add repo clean up instructions for the netdata/netdata repo (clean up from previous builds)
2. Make the static build instructions use the more generic script
---------
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
Co-authored-by: Austin S. Hemmelgarn <austin@netdata.cloud>
* claim script now accepts the same params as the kickstart
* rewrote buildinfo to unify all methods
* added cloud unavailable in cloud status
* added all exporters
* renamed httpd to h2o
* rename ENABLE_COMPRESSION to ENABLE_LZ4
* rename global variable
* rename ENABLE_HTTPS to ENABLE_OPENSSL
* fix coverity-scan for openssl
* add lz4 to coverity-scan
* added all plugins and most of the features
* added all plugins and most of the features
* generalize bitmap code so that we can have any size of bitmaps
* cleanup
* fix compilation without protobuf
* fix compilation with other allocators
* fix bitmap
* comprehensive bitmaps unit test
* bitmap as macros
* added developer mode
* added system info to build info
* cloud available/unavailable
* added /api/v2/info
* added units and ni to transitions
* when showing instances and transitions, show only the instances that have transitions
* cleanup
* add missing quotes
* add anchor to transitions
* added more to build info
* calculate retention per tier and expose it to /api/v2/info
* added currently collected metrics
* do not show space and retention when no numbers are available
* fix impossible overflow
* Add function for transitions and execute callback
* In case of error, reset and try next dictionary entry
* Fix error message
* simpler logic to maintain retention per tier
* /api/v2/alert_transitions
* Handle case of recipient null
Convert after and before to usec
* Add classification, type and component
* working /api/v2/alert_transitions
* Fix query to properly handle context and alert name
* cleanup
* Add search with transition
* accept transition in /api/v2/alert_transitions
* totally dynamic facets
* fixed debug info
* restructured facets
* cleanup; removal of options=transitions
* updated alert entries flags
* method to exec
* Return also exec run timestamp
Temp table cleanup only when we don't execute with a transition
* cleanup obsolete anchor parameter
* Add sql_get_alert_configuration function
* added options=config to alert_transitions
* added /api/v2/alert_config
* preliminary work for /api/v2/claim
* initialize variables; do not expose expected retention if no disk space info is available; do not report aclk as initializing when not claimed
* fix claim session key filename
* put a newline into the session key file
* more progress on claiming
* final /api/v2/claim endpoint
* after claiming, refresh our state at the output
* Fix query to fetch config
* Remove debug log
* add configuration objects
* add configuration objects - fixed
* respect the NETDATA_DISABLE_CLOUD env variable
* NETDATA_DISABLE_CLOUD env variable sets the default, but the config sets the final value
* use a new claimed_id on every claiming
* regenerate random key on claiming and wait for online status
* ignore write() return value when writing a newline
* don't show cloud status disabled when claimed_id is missing
* added ctx to alert instances
* cleanup config and transitions from /api/v2/alerts
* fix unused variable
* in /api/v2/alert_config show 1 config without an array
* show alert values conditionally, by appending options=values
* When storing host info if the key value is empty, store unknown
* added options=summary to control when the alerts summary is shown
* increased http_api_v2 to version 5
* claiming random key file is now not world readable
* added local-listeners binary that detects all the listening ports, their IPs and their command lines
---------
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com>
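The two `NETDATA_DISABLE_CLOUD` bullets above describe a precedence rule: the environment variable sets the default, and the configuration sets the final value. A minimal Python sketch of that rule; the function name, the `"enabled"` config key, and the way the environment value is interpreted are all assumptions made for illustration, not the agent's actual implementation:

```python
def cloud_enabled(environ, config):
    """Resolve whether cloud is enabled: env sets the default, config has the final say."""
    # Assumption: any non-empty value other than "0" means "disable".
    disabled_by_env = environ.get("NETDATA_DISABLE_CLOUD", "0") not in ("", "0")
    default = not disabled_by_env
    # Hypothetical config key; an explicit config value overrides the env default.
    return config.get("enabled", default)
```

With this ordering, the environment variable only matters when the configuration is silent on the setting.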