This PR merges the feature-branch to make the cloud live. It contains the following work:
Co-authored-by: Andrew Moss <1043609+amoss@users.noreply.github.com(opens in new tab)>
Co-authored-by: Jacek Kolasa <jacek.kolasa@gmail.com(opens in new tab)>
Co-authored-by: Austin S. Hemmelgarn <austin@netdata.cloud(opens in new tab)>
Co-authored-by: James Mills <prologic@shortcircuit.net.au(opens in new tab)>
Co-authored-by: Markos Fountoulakis <44345837+mfundul@users.noreply.github.com(opens in new tab)>
Co-authored-by: Timotej S <6674623+underhood@users.noreply.github.com(opens in new tab)>
Co-authored-by: Stelios Fragkakis <52996999+stelfrag@users.noreply.github.com(opens in new tab)>
* dashboard with new navbars, v1.0-alpha.9: PR #8478
* dashboard v1.0.11: netdata/dashboard#76
Co-authored-by: Jacek Kolasa <jacek.kolasa@gmail.com(opens in new tab)>
* Added installer code to bundle JSON-c if it's not present. PR #8836
Co-authored-by: James Mills <prologic@shortcircuit.net.au(opens in new tab)>
* Fix claiming config PR #8843
* Adds JSON-c as hard dep. for ACLK PR #8838
* Fix SSL renegotiation errors in old versions of openssl. PR #8840. Also - we have a transient problem with opensuse CI so this PR disables them with a commit from @prologic.
Co-authored-by: James Mills <prologic@shortcircuit.net.au(opens in new tab)>
* Fix claiming error handling PR #8850
* Added CI to verify JSON-C bundling code in installer PR #8853
* Make cloud-enabled flag in web/api/v1/info be independent of ACLK build success PR #8866
* Reduce ACLK_STABLE_TIMEOUT from 10 to 3 seconds PR #8871
* remove old-cloud related UI from old dashboard (accessible now via /old suffix) PR #8858
* dashboard v1.0.13 PR #8870
* dashboard v1.0.14 PR #8904
* Provide feedback on proxy setting changes PR #8895
* Change the name of the connect message to update during an ongoing session PR #8927
* Fetch active alarms from alarm_log PR #8944
Preparing for the cloud release. This changes how we handle the feature flag so that it no longer requires installer switches and can be set from the config file. This still requires internal access to use and is not ready for public access yet.
* update_info: New variables
This commit creates inside script and it reads them to Netdata
* update_info: API
This commit changes the web api response
* update_info: Disk space
This commit brings the disk space to info and renames the environment variables inside Netdata
* update_info: Rename variable
This commit renames the environment variable
* update_info: Rename response variable
This commit renames a response variable
* update_info: Labels
This commit creates the missing labels
* update_info: test before free
* update_info: Doc function
This commit brings docummentation to the functions to give instructions to developer
* update_info: Fix info message
This commit removes some info messages from the error.log
* update_info: Remove unecessary ifs, considering free manual
* alarms_values: New endpoint
This commit brings the new endpoint to Netdata
* alarms_values: Documentation
This commit brings the missing documentation for the PR
* alarms_values: New function
This commit brings a new code that removes dupplication
* alarms_values: Fix typo
* alarms_values: Fix missing word
This commit fixes the missing word inside the documentation
* alarms_values: Fix order
This commit fixes the order of the alarm answer
* alarms_values:
Fixes typo and remmove unecessary variable
* alarms_values: Fixes doc
Describe all paramenters present in the endpoint
* alarms_values: Same options
This commit brings the same input pattern for alams and alams_values
* alarms_values: Update swagger
This commit brings the missing information to swagger json
* alarms_values: Update swagger
This commit brings the missing information to swagger yaml
* - Add initial mqtt support
* [WIP] Agent cloud link
- Setup main mqtt thread to connect to a broker using V5 of the MQTT protocol (TBD)
- Send alarms to "netdata/alarm"
- Add error checks to handle connection failures
- Add params for
Broker, port
Maximum concurrent sent / recev messages
- Dummy function to check claiming status
- Generic mqtt_send command to publish message to a base topic , sub topic
It will end up in the form base_topic/sub_topic
- Add host/port in the connection failure error message
* Test libmosquitto libs
* connect to broker locally (assume localhost:1883)
* subscribe to channel netdata/command
* Test try a reload command to trigger health reload
* publish alerts to netdata/alarm
* - Fix compile issues
* - Use sleep_usec instead of usleep
* - Delay reconnection on failure due to misconfiguration (high cpu usage)
* - Remove the TLS connection config
* - Fix NETDATA_MQTT_INITIALIZATION_SLEEP_WAIT to use seconds
* - Gather ACLK related code under aclk folder
- Add aclk_ functions for abstract layer
- Moved low level libs intergration in mqtt.c
* - Add README.md file with initial comment
* - Clean MQTT v5
* - Code cleanup
* - Remove alarm log for now
- Remove the heart beat
* - Remove message properties for V5
* - Remove message properties for V5 (header)
* Fixed the netdata target to use a local static version of libmosquitto.
The installer does not yet have steps to pull and build the local library.
cd project_root
git clone ssh://git@github.com/netdata/mosquitto mosquitto/
(cd mosquitto/lib && make) # Ignore the cpp error
This will leave mosquitto/lib/libmosquitto.a for the build process to use.
* - Fix compile issues with older < 1.6 libmosquitto lib
* - Enable alarm events to check it works
- Re arrange includes
- Rework topic to be agent/guid/. Actual id will be
returned by the is_agent_claimed
* - Add initial metadata info
- Added helper function in web_api
- Added a debug command (info)
* Update the claiming state to retrieve the claimed id.
* - Use define for constants like command and metadata topics
- Function to wait for initialization of the ACLK link
- New aclk_subscribe command with QOS parameter for the mqtt subscription
- Use the is_agent_claimed function to get the real claim id and use it to build the topics
that will be used for the cloud communication
- Change in netdata-claim.sh.in to write the claim id without a trailing \n
* - Use define for constants like command and metadata topics
- Function to wait for initialization of the ACLK link
- New aclk_subscribe command with QOS parameter for the mqtt subscription
- Use the is_agent_claimed function to get the real claim id and use it to build the topics
that will be used for the cloud communication
- Change in netdata-claim.sh.in to write the claim id without a trailing \n
* - Remove the alarm log for now
- Add code (but disabled) to send charts
* - Use dummy anon, anon as username and password for testing purposes
* - Use client id anon as well
* Testing without TLS
* Switching TLS back on to fix docker environment.
* - Added query processing
An incoming URL now calls web_client_api_request_v1_data to handle a request and push the results
back to the "data" topic
- Move the above processing from the message callback to the query handle loop
- Added helper "pause" , "resume" commands to stop and resume query processing to stress test loading the queue
with queries before executing them
- Changed the endpoint topics to "meta", and "cmd" (previously metadata and command)
* make info message follow protocol
* move metadata msg generation into new func
* move metadata msg generation into new func
* - Add metadata to the responses
- Add hook to queue chart changes on creation and dimensions
- Changed the queue mechanism to include delay for X seconds
- Add delayed submittion of charts to the cloud so that all DIMs are defined to avoid resubmission
* - Add additional data info for aclk_queue command
* - Use web_clinet_api_request_v1 to handle the incoming request
This will handle all requests coming from the cloud
* - Cleanup and aclk_query structure
- Add msg_id parameter
- Enable the incoming JSON request
- Enable the outgoing JSON response
* - Added new thread to handle query processing
- Add lock and cond wait to wakeup thread when queries are submitted
- Cleanup on the main init function
* - Add wait time on agent init, to allow for chart, alarms and other definitions to be completed.
- During the wait time, no queries will be queued
* - Send metadata on query thread init
- New generic create header function for the JSON response
- Pack info and charts into one message
- Modified chart to remove entries (test)
- Modified charts mod to remove entries e.g alarms and volatile info
- Change input to aclk_update_chart (RRDHOST / instead of hostname)
* - When a request fails, add to the payload
- We may need to handle in a different key
- Error check in json parsing
* - Add dummy aclk_update_alarm command
* - Move incoming request JSON parsing code away from mqtt.c
- Added #ifdef ACLK_ENABLE so that we can have code merged but disabled by default
- Added version in incoming and outgoing JSON dict
* - Disable code if ACLK_ENABLE is not defined
- Remove references to the mqtt (mosquitto) lib
- Add dummy stubs in mqtt.c for completeness if ACLK_ENABLE is not defined
* - Disable challenge sample code for now
* - Remove libmosquitto from makefile
* - Fix spaces in Makefile.am
- Remove ifdef to avoid warning from LGTM
* - Remove for now the code that builds an along log test message to send to the cloud
* - Add check for ACLK_ENABLE definition and avoid calling the chart update functions
* - Remove commented code
* - Move source files to the correct place (ACLK_PLUGIN_FILES)
* - Remove include file thats not needed
* - Remove include file thats not needed
- Add improved checks for load_claiming_state()
* - Fix error message. Used error() that also logs errno and message
* - Fix some codacy issues
* - Fix more codacy issues, code cleanup
* - Revert code to address codacy warnings
* - Revert spaces added in a previous commit by mistake
* clean up if/else nest
* print error if fopen fails
* minor - error already logs errno
* - Fix version formatting
* - Cleanup all ACLK related compiler warnings
- Re-arrange include files
- Removed unused defines
* - More compilation warnings fixed
- Bug with thread creation fixed
* - Add condition to skip compilation of the ACLK code entirely. Add env variable ACLK="yes" to enable
* - Add condition to skip the libmosquitto
* - Change feature flag from ACLK_ENABLE to ENABLE_ACLK in accordance with the rest of ENABLE_xx flags
- Typo in info message fix
Co-authored-by: Andrew Moss <1043609+amoss@users.noreply.github.com>
Co-authored-by: Timo <6674623+underhood@users.noreply.github.com>
* missing_extern: Fix missing
Fix few externs that were missing in global variables
* missing_extern: Variables
This commit declares the variables inside .c files
Improve the metadata detection for containers. The system_info structure has been updated to hold separate copies of OS_NAME, OS_ID, OS_ID_LIKE, OS_VERSION, OS_VERSION_ID and OS_DETECTION for both the container environment and the host. This new information is communicated through the /api/v1/info endpoint. For the streaming interface a partial copy of the info is carried until the stream protocol is upgraded. The anonymous_statistics script has been updated to carry the new data to Google Analytics. Some minor improvements have been made to OS-X / FreeBSD detection, and the detection of virtualization. The docs have been updated to explain how to pass the host environment to the docker container running Netdata.
* Add labels to the JSON exporting connector
* Add labels to the Graphite exporting connector
* Add labels to the OpenTSDB telnet exporting connector
* Add labels to the OpenTSDB HTTP exporting connector
* Replace control characters in JSON strings
* Add unit tests
Initial work on host labels from the dedicated branch. Includes work for issues #7096, #7400, #7411, #7369, #7410, #7458, #7459, #7412 and #7408 by @vlvkobal, @thiagoftsm, @cakrit and @amoss.
* health_connection: http_error_pattern
This commit brings an unique pattern for the Netdata webserver errors,
now Netdata uses define for all web error
* http_error_pattern: API v1
This PR also brings the pattern for the web_api_v1.c
##### Summary
This is implementation of a prerequisite for the requested feature #6536 (Generate an overall status badge/chart for the health of category)
##### Component Name
web/api/
health/
##### Details
Provide a new, `alarm_count` API call that returns the total number of alarms for given contexts and alarm states. Default is the total number of raised alarms, for all contexts.
* URL_parser_review comments 1
* URL_parser_review restoring web_client.c
* URL_parser_review restoring url.h
* URL_parser_review restoring web_client.h
* URL_parser_review restoring inlined.h
* URL_parser_review restoring various
* URL_parser_review commenting!
* URL_parser_review last checks!
* URL_parser_review registry!
* URL_parser_review codacy errors!
* URL_parser_review codacy errors 2!
* URL_parser_review end of request!
* URL_parser_review
* URL_parser_review format fix
* URL_parser_review restoring
* URL_parser_review stopped at 5!
* URL_parser_review formatting!
* URL_parser_review:
Started the map of the query string when it is necessary
* URL_parser_review:
With these adjusts in the URL library we are now able to parser all the escape characters!
* URL_parser_review: code review
Fixes problems and format asked by coworkers!
* URL_parser_review: adjust script
The script was not 100% according the shellcheck specifications, no less important
it was a direct script instead a .in file
* sslstream: Rebase 2
It was necessary to change a function due the UTF-8
* sslstream: Fixing 6426
We had a cast error introduced by other PR, so I am fixing here
* URL_parser_review
Change .gitignore to avoid considering a script file.
This reverts commit 58b7d95a7e.
---
As agreed with @thiago and @cakrit we revert URL parser changes,
to buy the time on a more detailed investigation
---
* Add array of collector plugins-modules to api/v1/info
* Add system info to api/v1/info, collect data from separate script, use environment vars in anonymous statistics script
##### Summary
fixes#2673fixes#2149fixes#5017fixes#3830fixes#3187fixes#5154
Implements a command API for health which will accept commands via a socket to selectively suppress health checks.
Allows different ports to accept different request types (streaming, dashboard, api, registry, netdata.conf, badges, management)
Removes support for multi-threaded and single-threaded web servers.
##### Component Name
health, daemon
* Minor changes in registy/README.md
* Minor improvements to README files within web/ directory
* Minor
* Added minor comments
* Improved web_client_api_request_v1_registry() comment
* Minor
* Minor formatting
* modularized exporters
* modularized API data queries
* optimized queries
* modularized API data reduction methods
* modularized api queries
* added new directories in makefiles
* added median db query
* moved all RRDR_GROUPING related to query.h
* added stddev query
* operational median and stddev
* working simple exponential smoothing
* too complex to do it right
* fixed ses
* fixed ses
* rewrote query engine
* fix double-exponential-smoothing
* cleanup
* fixed bug identified by @vlvkobal at rrdset_first_slot()
* enable freeipmi on systems with libipmimonitoring; #4440