Connect Agent to Cloud
This guide walks you through the process of securely connecting a Netdata Agent to Netdata Cloud via the encrypted Agent-Cloud Link (ACLK).
When connecting an Agent (also known as a node) to Netdata Cloud, it's essential to complete a verification process. This process ensures that you have the necessary authorization level to manage the node effectively. Verification serves as a crucial security measure, preventing unauthorized access to the data on your node. Note that only administrators of a Space in Netdata Cloud can access the claiming token and the corresponding script generated by Netdata Cloud.
Info
The connection process guarantees that no third party can add your node to a Cloud account, Space, or War Room without your authorization, thus preventing unauthorized access to your node's metrics.
When you connect your node, its data is securely sent to Netdata Cloud using the ACLK. We verify your Agent's identity and can access the data only while it's being transferred. Netdata Cloud does not store or log your monitoring data.
There are several ways to connect your node to Netdata Cloud:
- During Netdata Cloud onboarding: This is the easiest option if you're just getting started.
- From your Space: Click on "Connect Nodes" in the Space's left sidebar.
- Prompts within the app: Look out for prompts to connect nodes throughout the Netdata Cloud app.
Note
Each node can be connected to a single Space in Netdata Cloud. However, you can add that connected node to multiple War Rooms within the same Space. You'll need to repeat the connection process for each additional node you want to monitor.
How to connect a node
There are three main ways to connect your node to Netdata Cloud:
- From your War Room: This is ideal if you're setting up your first node and want to start monitoring right away.
- From the Space Management screen: Click "Connect Nodes" to add a new node to your existing Space.
- From the Nodes tab: While you can see connected nodes here, the connection process itself happens in the Space Management screen.
Empty War Room
When you enter a War Room with no nodes, you can either:
- Connect a New Node: Netdata Cloud will guide you through connecting a new node directly to the War Room. Simply select the environment where your node is running (e.g., Linux, Docker). Once you select the environment, Netdata Cloud will generate a unique script. Copy and paste the script into your terminal.
- Add a Previously Connected Node: If you already have a node connected to Netdata Cloud, you can easily add it to this War Room.
Once you've chosen your option, refer to the instructions for your environment in the sections below.
This process can be repeated for each additional node you want to monitor in Netdata Cloud. You can add nodes during the initial onboarding process or afterward.
Manage Space or War Room area
Accessing Management screens:
- Space Management: Click the cogwheel icon in the bottom-left corner of the UI.
- War Room Management: Click the cogwheel icon next to the War Room name at the top of the UI.
Connecting a Node to a War Room:
- Select War Rooms: Choose which War Rooms you want to add the node to using the dropdown menu.
- Copy and Paste Script: Netdata Cloud will generate a script. Copy the entire script and paste it into your node's terminal window. Press Enter to initiate the connection process.
Note: When connecting from the Nodes tab, the room parameter will be set to the current War Room.
Connect an Agent running in Linux
Netdata Cloud provides a script called kickstart.sh to simplify the process of connecting your Linux node. This script performs two key actions:
- Installs the Netdata Agent (if needed): If the Netdata Agent isn't already installed on your Linux node, the script will take care of the installation process.
- Connects the Node to Netdata Cloud: The script will securely connect your node to Netdata Cloud, allowing you to monitor its performance.
The script should be similar to:
wget -O /tmp/netdata-kickstart.sh https://get.netdata.cloud/kickstart.sh && sh /tmp/netdata-kickstart.sh --claim-token TOKEN --claim-rooms ROOM1,ROOM2 --claim-url https://app.netdata.cloud
Copy and paste the entire script provided by Netdata Cloud into your terminal window and press Enter.
If successful, you should see a message like "Agent was successfully claimed."
If you encounter any errors during the process, or the node doesn't appear in your Space within 60 seconds, refer to the troubleshooting information section.
To run the script, you'll need either root privileges or to run it as the user that runs the Netdata Agent on your node. Refer to the Connect an Agent without root privileges section for more details.
For in-depth information about the optional parameters --claim-token, --claim-rooms, and --claim-url, see Connect node to Netdata Cloud during installation.
Connect an Agent without root privileges
If you don't have root access, you can still connect your node to Netdata Cloud by following these steps:
- Identify the Netdata Agent User: Use the grep command to search your netdata.conf file, which is located in your config directory, and find the run as user setting. This will tell you which user is running the Netdata Agent.
grep "run as user" /etc/netdata/netdata.conf
# run as user = netdata
- Switch User: Use the sudo su - username -s /bin/bash command (replacing username with the identified user) to switch to the Netdata Agent user account. For example, if the run as user setting pointed to netdata, you would use sudo su - netdata -s /bin/bash.
- Run the Script: Once switched to the correct user, copy and paste the script provided by Netdata Cloud into the terminal and press Enter.
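The lookup in the first step can be wrapped in a small helper. This is a minimal sketch, assuming the run as user setting appears in netdata.conf either active or commented out with a leading #; the function name agent_user and the fallback to netdata are illustrative and not part of Netdata.

```shell
#!/bin/sh
# Hypothetical helper: print the user the Agent runs as, read from netdata.conf.
# Assumes a line of the form "run as user = NAME", optionally commented out with "#".
agent_user() {
    conf="${1:-/etc/netdata/netdata.conf}"
    user=$(sed -n 's/^[[:space:]#]*run as user[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$conf" 2>/dev/null | head -n 1)
    # Fall back to "netdata" (an assumption) when the setting is not found.
    echo "${user:-netdata}"
}
```

With this in place, the switch in the second step could be written as sudo su - "$(agent_user)" -s /bin/bash.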
Connect an Agent running in Docker
To ensure the configuration and state information needed for the Cloud connection is preserved across container restarts, the contents of the /var/lib/netdata directory must be persisted. See our documentation for details on using persistent volumes.
Known issues on older hosts with seccomp enabled
The nodes running on the following hosts cannot be claimed:
- libseccomp version less than v2.3.3.
- Docker version less than v18.04.0-ce.
- The kernel is configured with CONFIG_SECCOMP enabled.
To check if your kernel supports seccomp:
# grep CONFIG_SECCOMP= /boot/config-$(uname -r) 2>/dev/null || zgrep CONFIG_SECCOMP /proc/config.gz 2>/dev/null
CONFIG_SECCOMP=y
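The version thresholds listed for this known issue can be checked with a small comparison helper. This is only a sketch: version_lt is a hypothetical name, and it relies on sort -V (version sort), which is available in GNU coreutils and BusyBox but not in every sort implementation.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is strictly lower than version $2.
# Relies on "sort -V" to order the two version strings.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example (requires a running Docker daemon):
#   version_lt "$(docker version --format '{{.Server.Version}}')" 18.04.0 \
#       && echo "Docker is older than 18.04.0"
```

The same helper can be applied to the libseccomp version reported by your package manager, compared against 2.3.3.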
To resolve the issue, do one of the following actions:
- Update to a newer version of Docker and libseccomp (recommended).
- Run without the default seccomp profile (unsafe, not recommended). You can pass unconfined to run a container without the default seccomp profile.
- Create a custom profile and pass it for the container.
  - Download the moby default seccomp profile and change defaultAction to SCMP_ACT_TRACE on line 2.
    sudo wget https://raw.githubusercontent.com/moby/moby/master/profiles/seccomp/default.json -O /etc/docker/seccomp.json
    sudo sed -i '2s/SCMP_ACT_ERRNO/SCMP_ACT_TRACE/' /etc/docker/seccomp.json
  - Specify the new policy for the container explicitly.
    - When using docker run:
      docker run -d --name=netdata \
        --security-opt=seccomp=/etc/docker/seccomp.json \
        ...
    - When using docker-compose:
      ⚠️ The security_opt option is ignored when deploying a stack in swarm mode.
      version: '3'
      services:
        netdata:
          security_opt:
            - seccomp:/etc/docker/seccomp.json
    - When using docker stack deploy: Change the default profile globally by adding --seccomp-profile=/etc/docker/seccomp.json to the options passed to dockerd on startup.
There are two main approaches to connect a Netdata Agent running inside a Docker container to Netdata Cloud:
- Connecting New Agents (Automatic):
  - This method is ideal for new container deployments.
  - You can configure the connection automatically during startup by setting specific environment variables within your Docker container.
- Connecting Existing Agents (Manual). This method is used for existing containers that haven't been connected yet:
  - Using the Agent UI: The Netdata Agent UI provides a "Connect to netdata" button. Click on it and follow the on-screen instructions.
  - Executing Claiming Script Directly (Advanced): For advanced users, you can connect an existing Agent by executing the claiming script directly within the container using the docker exec command.
Using environment variables
The Netdata Docker container looks for the following environment variables on startup:
NETDATA_CLAIM_TOKEN
NETDATA_CLAIM_URL
NETDATA_CLAIM_ROOMS
NETDATA_CLAIM_PROXY
If the token and URL are specified in their corresponding variables and the container is not already connected, it will use these values to attempt to connect to Netdata Cloud, automatically adding the node to the specified War Rooms.
If a proxy is specified, it will be used for the connection process and for connecting to Netdata Cloud.
These variables can be specified using any mechanism supported by your container tooling for setting environment variables inside containers.
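Because the container only attempts the automatic connection when both the token and the URL are set, it can help to verify them before starting the container. The following is a minimal sketch of such a pre-flight check; claim_env_ready is a hypothetical name, and the check runs in the shell that launches the container, not inside it.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed only if the two variables needed for
# an automatic connection are both non-empty; otherwise list what is missing.
claim_env_ready() {
    missing=""
    [ -n "${NETDATA_CLAIM_TOKEN:-}" ] || missing="$missing NETDATA_CLAIM_TOKEN"
    [ -n "${NETDATA_CLAIM_URL:-}" ] || missing="$missing NETDATA_CLAIM_URL"
    if [ -n "$missing" ]; then
        echo "missing:$missing" >&2
        return 1
    fi
}
```

NETDATA_CLAIM_ROOMS and NETDATA_CLAIM_PROXY are left out of the check on the assumption that they are optional.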
When using the docker run script, if you have an Agent container already running, it is important to know that there will be a short period of downtime. This is due to the process of recreating the new Agent container.
The script to connect a new node to Netdata Cloud is:
docker run -d --name=netdata \
-p 19999:19999 \
-v netdataconfig:/etc/netdata \
-v netdatalib:/var/lib/netdata \
-v netdatacache:/var/cache/netdata \
-v /etc/passwd:/host/etc/passwd:ro \
-v /etc/group:/host/etc/group:ro \
-v /proc:/host/proc:ro \
-v /sys:/host/sys:ro \
-v /etc/os-release:/host/etc/os-release:ro \
--restart unless-stopped \
--cap-add SYS_PTRACE \
--security-opt apparmor=unconfined \
-e NETDATA_CLAIM_TOKEN=TOKEN \
-e NETDATA_CLAIM_URL="https://app.netdata.cloud" \
-e NETDATA_CLAIM_ROOMS=ROOM1,ROOM2 \
-e NETDATA_CLAIM_PROXY=PROXY \
netdata/netdata
Note: This command is suggested for connecting a new container. Using this command for an existing container recreates the container, though data and configuration of the old container may be preserved.
If you are claiming an existing container that cannot be recreated, you can add the container by going to Netdata Cloud, clicking the Nodes tab, clicking Connect Nodes, selecting Docker, and following the instructions and scripts provided, or by following the instructions in an empty War Room.
The output that would be seen from the connection process when using other methods will be present in the container logs.
Using the environment variables like this to handle the connection process is the preferred method of connecting Docker containers as it works in the widest variety of situations and simplifies configuration management.
Using Docker compose
If you use docker compose, you can copy the config provided by Netdata Cloud, which should be similar to the one below:
version: '3'
services:
  netdata:
    image: netdata/netdata
    container_name: netdata
    hostname: example.com # set to fqdn of host
    ports:
      - 19999:19999
    restart: unless-stopped
    cap_add:
      - SYS_PTRACE
    security_opt:
      - apparmor:unconfined
    volumes:
      - netdataconfig:/etc/netdata
      - netdatalib:/var/lib/netdata
      - netdatacache:/var/cache/netdata
      - /etc/passwd:/host/etc/passwd:ro
      - /etc/group:/host/etc/group:ro
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /etc/os-release:/host/etc/os-release:ro
    environment:
      - NETDATA_CLAIM_TOKEN=TOKEN
      - NETDATA_CLAIM_URL=https://app.netdata.cloud
      - NETDATA_CLAIM_ROOMS=ROOM1,ROOM2
volumes:
  netdataconfig:
  netdatalib:
  netdatacache:
Then run the following command in the same directory as the docker-compose.yml file to start the container.
docker-compose up -d
Using docker exec
To connect a running Netdata Agent container without recreating it, append the script offered by Netdata Cloud to a docker exec ... command, replacing netdata with the name of your running container:
docker exec -it netdata netdata-claim.sh -token=TOKEN -rooms=ROOM1,ROOM2 -url=https://app.netdata.cloud
The values for ROOM1,ROOM2 can be found in any add-node screens, at the top of the tab. Click on them to reveal them and copy them to your clipboard.
The script should return Agent was successfully claimed. If the connection process returns errors, or if you don't see the node in your Space after 60 seconds, see the troubleshooting information.
Connect an Agent running in macOS
To connect a node running on macOS, Netdata Cloud provides the kickstart script, which installs the Netdata Agent on your node if it isn't already installed and connects the node to Netdata Cloud. It should be similar to:
curl https://get.netdata.cloud/kickstart.sh > /tmp/netdata-kickstart.sh && sh /tmp/netdata-kickstart.sh --install-prefix /usr/local/ --claim-token TOKEN --claim-rooms ROOM1,ROOM2 --claim-url https://app.netdata.cloud
Note
Hit Enter. The script should return Agent was successfully claimed. If the process returns errors, or if you don't see the node in your Space after 60 seconds, see the troubleshooting information.
Connect a Kubernetes cluster's parent Netdata pod
Read our Kubernetes installation guide for details on connecting a cluster's parent Netdata pod.
Connect through a proxy
A Space's administrator can connect a node through an HTTP(S) proxy.
You should first configure the proxy in the [cloud] section of netdata.conf. The proxy settings you specify here will also be used to tunnel the ACLK. The default proxy setting is none.
[cloud]
proxy = none
The proxy setting can take one of the following values:
- none: Do not use a proxy, even if the system is configured otherwise.
- env: Try to read proxy settings from the environment variable http_proxy.
- http://[user:pass@]host:port: The ACLK and connection process will use the specified HTTP(S) proxy.
For example, an HTTP proxy setting may look like the following:
[cloud]
proxy = http://203.0.113.0:1080 # With an IP address
proxy = http://proxy.example.com:1080 # With a hostname
You can now move on to connecting. When you connect with the kickstart script, add the --claim-proxy= parameter and append the same proxy setting you added to netdata.conf.
wget -O /tmp/netdata-kickstart.sh https://get.netdata.cloud/kickstart.sh && sh /tmp/netdata-kickstart.sh --claim-token TOKEN --claim-rooms ROOM1,ROOM2 --claim-url https://app.netdata.cloud --claim-proxy http://[user:pass@]host:port
Note
Hit Enter. The script should return Agent was successfully claimed. If the process returns errors, or if you don't see the node in your Space after 60 seconds, see the troubleshooting information.
Troubleshooting
If you're having trouble connecting a node, this may be because the ACLK cannot connect to Cloud.
With the Netdata Agent running, visit http://NODE:19999/api/v1/info in your browser, replacing NODE with the IP address or hostname of your Agent. The returned JSON contains four keys that will be helpful to diagnose any issues you might be having with the ACLK or connection process.
"cloud-enabled"
"cloud-available"
"agent-claimed"
"aclk-available"
Note
On Netdata Agent version 1.32 (netdata -v to find your version) and newer, sudo netdatacli aclk-state can be used to get some diagnostic information about the ACLK. Sample output:
ACLK Available: Yes
ACLK Implementation: Next Generation
New Cloud Protocol Support: Yes
Claimed: Yes
Claimed Id: 53aa76c2-8af5-448f-849a-b16872cc4ba1
Online: Yes
Used Cloud Protocol: New
Use these keys and the information below to troubleshoot the ACLK.
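If you prefer the terminal to the browser, the four keys can be pulled out of the /api/v1/info response with standard tools. This is a rough sketch rather than a proper JSON parser: it assumes each key appears as "key": true or "key": false, and the function names info_flag and print_cloud_flags are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read the /api/v1/info JSON on stdin and print the value
# of one boolean key. Assumes the key appears as "key": true or "key": false.
info_flag() {
    tr -d ' \n\t' | grep -o "\"$1\":[a-z]*" | head -n 1 | cut -d: -f2
}

# Print the four diagnostic keys, one key=value pair per line.
print_cloud_flags() {
    json=$(cat)
    for key in cloud-enabled cloud-available agent-claimed aclk-available; do
        printf '%s=%s\n' "$key" "$(printf '%s' "$json" | info_flag "$key")"
    done
}

# Example (replace NODE with the IP address or hostname of your Agent):
#   curl -s http://NODE:19999/api/v1/info | print_cloud_flags
```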
kickstart: unsupported Netdata installation
If you run the kickstart script and get the following error, you most probably installed Netdata using an unsupported package:
Existing install appears to be handled manually or through the system package manager.
Note
If you are using an unsupported package, such as a third-party .deb/.rpm package provided by your distribution, please remove that package and reinstall using our recommended kickstart script.
kickstart: Failed to write new machine GUID
If you run the kickstart script without the privileges required to connect to Netdata Cloud, you will get the following error:
Failed to write new machine GUID. Please make sure you have rights to write to /var/lib/netdata/registry/netdata.public.unique.id.
To run the script successfully, use root privileges or run it as the user that runs the Agent. See the Connect an Agent without root privileges section for more details.
bash: netdata-claim.sh: command not found
If you run the claiming script and see a command not found error, you either installed Netdata in a non-standard location or are using an unsupported package. If you installed Netdata in a non-standard path using the --install-prefix option, you need to update your $PATH or run netdata-claim.sh using the full path.
For example, if you installed Netdata to /opt/netdata, use /opt/netdata/bin/netdata-claim.sh to run the claiming script.
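The lookup described above can be sketched as a helper that checks $PATH first and then an install prefix. The function name find_claim_script is hypothetical, and the default prefix /opt/netdata is only the example used in this section; pass your own prefix as the first argument.

```shell
#!/bin/sh
# Hypothetical helper: print the path of netdata-claim.sh, looking first in
# $PATH and then under an install prefix (default: /opt/netdata).
find_claim_script() {
    prefix="${1:-/opt/netdata}"
    if command -v netdata-claim.sh >/dev/null 2>&1; then
        command -v netdata-claim.sh
    elif [ -x "$prefix/bin/netdata-claim.sh" ]; then
        echo "$prefix/bin/netdata-claim.sh"
    else
        return 1
    fi
}
```

If the helper fails, the script is in neither location, which points to a different prefix or an unsupported package.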
Note
If you are using an unsupported package, such as a third-party .deb/.rpm package provided by your distribution, please remove that package and reinstall using our recommended kickstart script.
Connecting on older distributions (Ubuntu 14.04, Debian 8, CentOS 6)
If you're running an older Linux distribution or one that has reached EOL, such as Ubuntu 14.04 LTS, Debian 8, or CentOS 6, your Agent may not be able to securely connect to Netdata Cloud due to an outdated version of OpenSSL. These old versions of OpenSSL cannot perform hostname validation, which helps securely encrypt SSL connections.
We recommend you reinstall Netdata with a static build, which uses an up-to-date version of OpenSSL with hostname validation enabled.
If you choose to continue using the outdated version of OpenSSL, your node will still connect to Netdata Cloud, albeit with hostname verification disabled. Without verification, your Netdata Cloud connection could be vulnerable to man-in-the-middle attacks.
cloud-enabled is false
If cloud-enabled is false, you probably ran the installer with the --disable-cloud option.
Additionally, check that the enabled setting in /var/lib/netdata/cloud.d/cloud.conf is set to true:
[global]
enabled = true
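Reading the setting can also be scripted. The sketch below assumes the file contains a line of the form enabled = VALUE and uses the default library path; the function name cloud_enabled_setting is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the "enabled" setting in cloud.conf.
# Assumes a line of the form "enabled = VALUE"; prints nothing if it is absent.
cloud_enabled_setting() {
    conf="${1:-/var/lib/netdata/cloud.d/cloud.conf}"
    sed -n 's/^[[:space:]]*enabled[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$conf" 2>/dev/null | head -n 1
}
```

An empty result means the setting was not found in the file, not that Cloud is disabled.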
To fix this issue, reinstall Netdata using your preferred method and do not add the --disable-cloud option.
cloud-available is false / ACLK Available: No
If cloud-available is false after you verified Cloud is enabled in the previous step, the most likely issue is that Cloud features failed to build during installation.
If Cloud features fail to build, the installer continues and finishes the process without Cloud functionality as opposed to failing the installation altogether.
We do this to ensure the Agent will always finish installing.
If you can't see an explicit error in the installer's output, you can run the installer with the --require-cloud option. This option causes the installation to fail if Cloud functionality can't be built and enabled, and the installer's output should give you more error details.
You may see one of the following error messages during installation:
Failed to build libmosquitto. The install process will continue, but you will not be able to connect this node to Netdata Cloud.
Unable to fetch sources for libmosquitto. The install process will continue, but you will not be able to connect this node to Netdata Cloud.
Failed to build libwebsockets. The install process will continue, but you may not be able to connect this node to Netdata Cloud.
Unable to fetch sources for libwebsockets. The install process will continue, but you may not be able to connect this node to Netdata Cloud.
Could not find cmake, which is required to build libwebsockets. The install process will continue, but you may not be able to connect this node to Netdata Cloud.
Could not find cmake, which is required to build JSON-C. The install process will continue, but Netdata Cloud support will be disabled.
Failed to build JSON-C. Netdata Cloud support will be disabled.
Unable to fetch sources for JSON-C. Netdata Cloud support will be disabled.
One common cause of the installer failing to build Cloud features is not having one of the following dependencies on your system: cmake, json-c and OpenSSL, including corresponding devel packages.
You can also look for error messages in /var/log/netdata/error.log. Try one of the following two commands to search for ACLK-related errors.
less /var/log/netdata/error.log
grep -i ACLK /var/log/netdata/error.log
If the installer's output does not help you enable Cloud features, contact us by creating an issue on GitHub with details about your system and relevant output from error.log.
agent-claimed is false / Claimed: No
You must connect your node.
aclk-available is false / Online: No
If aclk-available is false and all other keys are true, your Agent is having trouble connecting to the Cloud through the ACLK. Please check your system's firewall.
If your Agent needs to use a proxy to access the internet, you must set up a proxy for connecting.
If you are certain firewall and proxy settings are not the issue, you should consult the Agent's error.log at /var/log/netdata/error.log and contact us by creating an issue on GitHub with details about your system and relevant output from error.log.
Remove and reconnect a node
Linux based installations
To remove a node from your Space in Netdata Cloud, delete the cloud.d/ directory in your Netdata library directory.
cd /var/lib/netdata # Replace with your Netdata library directory, if not /var/lib/netdata/
sudo rm -rf cloud.d/
This node no longer has access to the credentials it used when connecting to Netdata Cloud via the ACLK. You will still be able to see this node in your War Rooms in an unreachable state.
If you want to reconnect this node, you need to:
- Ensure that the /var/lib/netdata/cloud.d directory doesn't exist. In some installations, the path is /opt/netdata/var/lib/netdata/cloud.d.
- Stop the Agent.
- Ensure that the uuidgen-runtime package is installed. Run echo "$(uuidgen)" and validate you get back a UUID.
- Copy the kickstart.sh command to add a node from your Space and add --claim-id "$(uuidgen)" to the end of it. Run the command and look for the message Node was successfully claimed.
- Start the Agent.
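The "validate you get back a UUID" step can be automated with a pattern check. This is a minimal sketch; is_uuid is a hypothetical helper that only verifies the 8-4-4-4-12 hexadecimal layout, not the UUID version or variant.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the argument looks like a UUID
# (8-4-4-4-12 hexadecimal digits), as printed by uuidgen.
is_uuid() {
    printf '%s\n' "$1" | grep -Eqx '[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}'
}

# Example (requires uuidgen to be installed):
#   id=$(uuidgen) && is_uuid "$id" && echo "uuidgen works: $id"
```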
Docker based installations
To remove a node from your Space in Netdata Cloud, and connect it to another Space, follow these steps:
- Enter the running container you wish to remove from your Space:
  docker exec -it CONTAINER_NAME sh
  Replace CONTAINER_NAME with either the container's name or ID.
- Delete /var/lib/netdata/cloud.d and /var/lib/netdata/registry/netdata.public.unique.id:
  rm -rf /var/lib/netdata/cloud.d/
  rm /var/lib/netdata/registry/netdata.public.unique.id
- Stop and remove the container.
  Docker CLI:
  docker stop CONTAINER_NAME
  docker rm CONTAINER_NAME
  Replace CONTAINER_NAME with either the container's name or ID.
  Docker Compose: Inside the directory that has the docker-compose.yml file, run:
  docker compose down
  Docker Swarm: Run the following, replacing STACK with your Stack's name:
  docker stack rm STACK
- Finally, go to your new Space, copy the install command with the new claim token and run it. If you are using a docker-compose.yml file, you will have to overwrite it with the new claiming token.
The node should now appear online in that Space.
The node should now appear online in that Space.
Regenerate Claiming Token
If, for security or other reasons, you need to revoke your previous claiming token and generate a new one, you can do so from the Netdata Cloud UI.
On any screen that shows the command to connect a node to Netdata Cloud, you'll see a Regenerate token button above the command, next to the updates channel. This action invalidates your previous token and generates a new one.
Only the administrators of a Space in Netdata Cloud can trigger this action.
Connecting reference
In the sections below, you can find reference material for the kickstart script, claiming script, connecting via the Agent's command line tool, and details about the files found in cloud.d.
The cloud.conf file
This section defines how and whether your Agent connects to Netdata Cloud using the ACLK.
setting | default | info |
---|---|---|
cloud base url | https://app.netdata.cloud | The URL for the Netdata Cloud web application. You should not change this. If you want to disable Cloud, change the enabled setting. |
enabled | yes | The runtime option to disable the Agent-Cloud link and prevent your Agent from connecting to Netdata Cloud. |
Claiming script
A Space's administrator can also connect an Agent by directly calling the netdata-claim.sh script either with root privileges using sudo, or as the user running the Agent (typically netdata), and passing the following arguments:
-token=TOKEN
where TOKEN is the Space's claiming token.
-rooms=ROOM1,ROOM2,...
where ROOMX is the War Room this node should be added to. This list is optional.
-url=URL_BASE
where URL_BASE is the Netdata Cloud endpoint base URL. By default, this is https://app.netdata.cloud.
-id=AGENT_ID
where AGENT_ID is the unique identifier of the Agent. This is the Agent's MACHINE_GUID by default.
-hostname=HOSTNAME
where HOSTNAME is the result of the hostname command by default.
-proxy=PROXY_URL
where PROXY_URL is the endpoint of a HTTP or HTTPS proxy.
For example, the following command connects an Agent and adds it to rooms room1 and room2:
netdata-claim.sh -token=MYTOKEN1234567 -rooms=room1,room2
You should then notify the netdata service of the result with netdatacli:
netdatacli reload-claiming-state
This reloads the Agent connection state from disk.
Our recommendation is to trigger the connection process using the kickstart whenever possible.
Netdata Agent command line
If a Netdata Agent is running, the Space's administrator can connect a node using the netdata service binary with additional command line parameters:
-W "claim -token=TOKEN -rooms=ROOM1,ROOM2"
For example:
/usr/sbin/netdata -D -W "claim -token=MYTOKEN1234567 -rooms=room1,room2"
If need be, the user can override the Agent's defaults by providing additional arguments like those described here.
Connection directory
Netdata stores the Agent's connection-related state in the Netdata library directory under cloud.d. For a default installation, this directory exists at /var/lib/netdata/cloud.d. The directory and its files should be owned by the user that runs the Agent, which is typically the netdata user.
The cloud.d/token file should contain the claiming-token and the cloud.d/rooms file should contain the list of War Rooms you added that node to.
The user can also put the Cloud endpoint's full certificate chain in cloud.d/cloud_fullchain.pem so that the Agent can trust the endpoint if necessary.