README.md

Connect Agent to Cloud

This guide walks you through the process of securely connecting a Netdata Agent to Netdata Cloud via the encrypted Agent-Cloud Link (ACLK).

When connecting an Agent (also known as a node) to Netdata Cloud, it's essential to complete a verification process. This process ensures that you have the necessary authorization level to manage the node effectively. Verification serves as a crucial security measure, preventing unauthorized access to the data on your node. Note that only administrators of a Space in Netdata Cloud can access the claiming token and the corresponding script generated by Netdata Cloud.

Info

The connection process guarantees that no third party can add your node to a Cloud account, Space, or War Room without your authorization, thus preventing unauthorized access to your node's metrics.

When you connect your node, its data is securely sent to Netdata Cloud using ACLK. We verify your agent's identity and can access the data only while it's being transferred. Netdata Cloud does not store or log your monitoring data.

There are several ways to connect your node to Netdata Cloud:

  • During Netdata Cloud onboarding: This is the easiest option if you're just getting started.
  • From your Space: Click on "Connect Nodes" in the Space's left sidebar.
  • Prompts within the app: Look out for prompts to connect nodes throughout the Netdata Cloud app.

Note

Each node can be connected to a single Space in Netdata Cloud. However, you can add that connected node to multiple War Rooms within the same Space. You'll need to repeat the connection process for each additional node you want to monitor.

How to connect a node

There are three main ways to connect your node to Netdata Cloud:

  • From your War Room: This is ideal if you're setting up your first node and want to start monitoring right away.
  • From the Space Management screen: Click "Connect Nodes" to add a new node to your existing Space.
  • From the Nodes tab: While you can see connected nodes here, the connection process itself happens in the Space Management screen.

Empty War Room

When you enter a War Room with no nodes, you can either:

  1. Connect a New Node: Netdata Cloud will guide you through connecting a new node directly to the War Room. Simply select the environment where your node is running (e.g., Linux, Docker). Once you select the environment, Netdata Cloud will generate a unique script. Copy and paste the script into your terminal.
  2. Add a Previously Connected Node: If you already have a node connected to Netdata Cloud, you can easily add it to this War Room.

Once you've chosen your option, refer to the instructions for your environment in the sections below.

This process can be repeated for each additional node you want to monitor in Netdata Cloud. You can add nodes during the initial onboarding process or afterward.

Manage Space or War Room area

Accessing Management screens:

  • Space Management: Click the cogwheel icon in the bottom-left corner of the UI.
  • War Room Management: Click the cogwheel icon next to the War Room name at the top of the UI.

Connecting a Node to a War Room:

  1. Select War Rooms: Choose which War Rooms you want to add the node to using the dropdown menu.
  2. Copy and Paste Script: Netdata Cloud will generate a script. Copy the entire script and paste it into your node's terminal window. Press Enter to initiate the connection process.

Note: When connecting from the Nodes tab, the room parameter will be set to the current War Room.

Connect an Agent running in Linux

Netdata Cloud provides a script called kickstart.sh to simplify the process of connecting your Linux node. This script performs two key actions:

  1. Installs the Netdata Agent (if needed): If the Netdata Agent isn't already installed on your Linux node, the script will take care of the installation process.
  2. Connects the Node to Netdata Cloud: The script will securely connect your node to Netdata Cloud, allowing you to monitor its performance.

The command provided by Netdata Cloud should be similar to:

wget -O /tmp/netdata-kickstart.sh https://get.netdata.cloud/kickstart.sh && sh /tmp/netdata-kickstart.sh --claim-token TOKEN --claim-rooms ROOM1,ROOM2 --claim-url https://app.netdata.cloud

Copy and paste the entire script provided by Netdata Cloud into your terminal window and press Enter.

If successful, you should see a message like "Agent was successfully claimed."

If you encounter any errors during the process, or the node doesn't appear in your Space within 60 seconds, refer to the troubleshooting information section.

To run the script, you'll need either root privileges or to run it as the user that runs the Netdata Agent on your node. Refer to the Connect an Agent without root privileges section for more details.

For in-depth information about the optional parameters --claim-token, --claim-rooms, and --claim-url, see Connect node to Netdata Cloud during installation.

Connect an Agent without root privileges

If you don't have root access, you can still connect your node to Netdata Cloud by following these steps:

  1. Identify the Netdata Agent User: Use grep to search the netdata.conf file in your config directory for the run as user setting. This tells you which user is running the Netdata Agent.
    grep "run as user" /etc/netdata/netdata.conf
    # run as user = netdata
    
  2. Switch User: Use the sudo su - username -s /bin/bash command (replacing username with the identified user) to switch to the Netdata Agent user account. For example, if the run as user setting pointed to netdata, you would use sudo su - netdata -s /bin/bash.
  3. Run the Script: Once switched to the correct user, copy and paste the script provided by Netdata Cloud into the terminal and press Enter.
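
The lookup in step 1 can be wrapped in a small helper. This is a minimal sketch, not part of Netdata: the function name agent_user_from_conf is hypothetical, and the fallback to netdata when no setting is found is an assumption based on the default shown in the example output above.

```shell
# Print the user the Agent runs as, based on a netdata.conf file.
# Falls back to "netdata" (assumed default) when no "run as user"
# line is found or the file is not readable.
agent_user_from_conf() {
    conf_file="${1:-/etc/netdata/netdata.conf}"
    user=""
    if [ -r "$conf_file" ]; then
        # Accept both active and commented-out lines, for example:
        #   run as user = netdata
        #   # run as user = netdata
        user=$(sed -n 's/^[[:space:]#]*run as user[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$conf_file" | head -n 1)
    fi
    printf '%s\n' "${user:-netdata}"
}

# Possible use for step 2:
#   sudo su - "$(agent_user_from_conf)" -s /bin/bash
```

The helper returns the first matching line, so if your file contains both a commented default and an active setting, check the result by hand.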

Connect an Agent running in Docker

To ensure the configuration and state information needed for the Cloud connection is preserved across container restarts, the contents of the /var/lib/netdata directory must be persisted. See our documentation for details on using persistent volumes.

Known issues on older hosts with seccomp enabled

The nodes running on the following hosts cannot be claimed:

  • libseccomp version less than v2.3.3.
  • Docker version less than v18.04.0-ce.
  • The kernel is configured with CONFIG_SECCOMP enabled.

To check if your kernel supports seccomp:

# grep CONFIG_SECCOMP= /boot/config-$(uname -r) 2>/dev/null || zgrep CONFIG_SECCOMP  /proc/config.gz 2>/dev/null
CONFIG_SECCOMP=y
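
To compare installed versions against the thresholds listed above, a version comparison helper can be useful. The sketch below assumes a sort that supports version ordering with -V (as GNU coreutils does). The function names are hypothetical, and you need to supply the version strings yourself, without the leading v or the -ce suffix.

```shell
# Return success (0) when version $1 is strictly lower than version $2.
# Relies on `sort -V` (version sort).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Print a line for each component that is older than the threshold
# named in the list above. $1 = libseccomp version, $2 = Docker version.
check_seccomp_versions() {
    if version_lt "$1" "2.3.3"; then
        echo "libseccomp $1 is older than 2.3.3"
    fi
    if version_lt "$2" "18.04.0"; then
        echo "Docker $2 is older than 18.04.0"
    fi
}

# Possible use:
#   check_seccomp_versions 2.3.1 17.12.1
```

The helper only reports each component separately; it does not decide on its own whether a host is affected.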

To resolve the issue, do one of the following actions:

  • Update to a newer version of Docker and libseccomp (recommended).
  • Run without the default seccomp profile (unsafe, not recommended). You can pass --security-opt seccomp=unconfined to docker run to start a container without the default seccomp profile.
  • Create a custom profile and pass it for the container.
    • Download the moby default seccomp profile and change defaultAction to SCMP_ACT_TRACE on line 2.

      sudo wget https://raw.githubusercontent.com/moby/moby/master/profiles/seccomp/default.json -O /etc/docker/seccomp.json
      sudo sed -i '2s/SCMP_ACT_ERRNO/SCMP_ACT_TRACE/' /etc/docker/seccomp.json
      
    • Specify the new policy for the container explicitly.

      • When using docker run:

        docker run -d --name=netdata \
        --security-opt=seccomp=/etc/docker/seccomp.json \
        ...
        
      • When using docker-compose:

        ⚠️ The security_opt option is ignored when deploying a stack in swarm mode.

        version: '3'
        services:
          netdata:
            security_opt:
              - seccomp:/etc/docker/seccomp.json
        
      • When using docker stack deploy: Change the default profile globally by adding --seccomp-profile=/etc/docker/seccomp.json to the options passed to dockerd on startup.

There are two main approaches to connect a Netdata Agent running inside a Docker container to Netdata Cloud:

  1. Connecting New Agents (Automatic):

    • This method is ideal for new container deployments.
    • You can configure the connection automatically during startup by setting specific environment variables within your Docker container.
  2. Connecting Existing Agents (Manual). This method is used for existing containers that haven't been connected yet:

    • Using the Agent UI: The Netdata Agent UI provides a "Connect to netdata" button. Click on it and follow the on-screen instructions.
    • Executing Claiming Script Directly (Advanced): For advanced users, you can connect an existing Agent by executing the claiming script directly within the container using the docker exec command.

Using environment variables

The Netdata Docker container looks for the following environment variables on startup:

  • NETDATA_CLAIM_TOKEN
  • NETDATA_CLAIM_URL
  • NETDATA_CLAIM_ROOMS
  • NETDATA_CLAIM_PROXY

If the token and URL are specified in their corresponding variables and the container is not already connected, it will use these values to attempt to connect to Netdata Cloud, automatically adding the node to the specified War Rooms.

If a proxy is specified, it will be used for the connection process and for connecting to Netdata Cloud.

These variables can be specified using any mechanism supported by your container tooling for setting environment variables inside containers.
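
One such mechanism is an env file passed to docker run with --env-file. The following is a sketch under assumptions: the function name write_claim_env_file is hypothetical, and it treats the rooms and proxy values as optional. Values are written without quotes because Docker keeps quotes in env files as part of the value.

```shell
# Write the claim variables to an env file usable with
# `docker run --env-file FILE`.
# Arguments: FILE TOKEN URL [ROOMS] [PROXY]
write_claim_env_file() {
    env_file="$1"; token="$2"; url="$3"; rooms="$4"; proxy="$5"
    {
        printf 'NETDATA_CLAIM_TOKEN=%s\n' "$token"
        printf 'NETDATA_CLAIM_URL=%s\n' "$url"
        # Rooms and proxy are only written when a value is provided.
        if [ -n "$rooms" ]; then
            printf 'NETDATA_CLAIM_ROOMS=%s\n' "$rooms"
        fi
        if [ -n "$proxy" ]; then
            printf 'NETDATA_CLAIM_PROXY=%s\n' "$proxy"
        fi
    } > "$env_file"
    # The token is a credential, so restrict the file to its owner.
    chmod 600 "$env_file"
}

# Possible use:
#   write_claim_env_file ./netdata-claim.env TOKEN https://app.netdata.cloud ROOM1,ROOM2
#   docker run -d --name=netdata --env-file ./netdata-claim.env ... netdata/netdata
```

Keeping the token in a file with restricted permissions also keeps it out of your shell history.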

When using the docker run command, if you have an Agent container already running, it is important to know that there will be a short period of downtime. This is due to the process of recreating the new Agent container.

The script to connect a new node to Netdata Cloud is:

docker run -d --name=netdata \
  -p 19999:19999 \
  -v netdataconfig:/etc/netdata \
  -v netdatalib:/var/lib/netdata \
  -v netdatacache:/var/cache/netdata \
  -v /etc/passwd:/host/etc/passwd:ro \
  -v /etc/group:/host/etc/group:ro \
  -v /proc:/host/proc:ro \
  -v /sys:/host/sys:ro \
  -v /etc/os-release:/host/etc/os-release:ro \
  --restart unless-stopped \
  --cap-add SYS_PTRACE \
  --security-opt apparmor=unconfined \
  -e NETDATA_CLAIM_TOKEN=TOKEN \
  -e NETDATA_CLAIM_URL="https://app.netdata.cloud" \
  -e NETDATA_CLAIM_ROOMS=ROOM1,ROOM2 \
  -e NETDATA_CLAIM_PROXY=PROXY \
  netdata/netdata

Note: This command is suggested for connecting a new container. Using this command for an existing container recreates the container, though data and configuration of the old container may be preserved.

If you are claiming an existing container that cannot be recreated, you can add the container by going to Netdata Cloud, clicking the Nodes tab, clicking Connect Nodes, selecting Docker, and following the instructions and scripts provided, or by following the instructions in an empty War Room.

The output that would be seen from the connection process when using other methods will be present in the container logs.

Using the environment variables like this to handle the connection process is the preferred method of connecting Docker containers as it works in the widest variety of situations and simplifies configuration management.

Using Docker compose

If you use Docker Compose, you can copy the config provided by Netdata Cloud, which should be similar to the one below:

version: '3'
services:
  netdata:
    image: netdata/netdata
    container_name: netdata
    hostname: example.com # set to fqdn of host
    ports:
      - 19999:19999
    restart: unless-stopped
    cap_add:
      - SYS_PTRACE
    security_opt:
      - apparmor:unconfined
    volumes:
      - netdataconfig:/etc/netdata
      - netdatalib:/var/lib/netdata
      - netdatacache:/var/cache/netdata
      - /etc/passwd:/host/etc/passwd:ro
      - /etc/group:/host/etc/group:ro
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /etc/os-release:/host/etc/os-release:ro
    environment:
      - NETDATA_CLAIM_TOKEN=TOKEN
      - NETDATA_CLAIM_URL="https://app.netdata.cloud"
      - NETDATA_CLAIM_ROOMS=ROOM1,ROOM2

volumes:
  netdataconfig:
  netdatalib:
  netdatacache:

Then run the following command in the same directory as the docker-compose.yml file to start the container.

docker-compose up -d

Using docker exec

In order to connect a running Netdata Agent container, where you don't want to recreate the existing container, append the script offered by Netdata Cloud to a docker exec ... command, replacing netdata with the name of your running container:

docker exec -it netdata netdata-claim.sh -token=TOKEN -rooms=ROOM1,ROOM2 -url=https://app.netdata.cloud

The values for ROOM1,ROOM2 can be found in any add-node screens, at the top of the tab. Click on them to reveal them and copy them to your clipboard.

The script should return "Agent was successfully claimed." If the connection process returns errors, or if you don't see the node in your Space after 60 seconds, see the troubleshooting information.

Connect an Agent running in macOS

To connect a node running on macOS, Netdata Cloud provides the kickstart script, which installs the Netdata Agent on your node if it isn't already installed and connects the node to Netdata Cloud. The command should be similar to:

curl https://get.netdata.cloud/kickstart.sh > /tmp/netdata-kickstart.sh && sh /tmp/netdata-kickstart.sh --install-prefix /usr/local/ --claim-token TOKEN --claim-rooms ROOM1,ROOM2 --claim-url https://app.netdata.cloud

Note

Hit Enter. The script should return "Agent was successfully claimed." If the process returns errors, or if you don't see the node in your Space after 60 seconds, see the troubleshooting information.

Connect a Kubernetes cluster's parent Netdata pod

Read our Kubernetes installation for details on connecting a cluster's parent Netdata pod.

Connect through a proxy

A Space's administrator can connect a node through an HTTP(S) proxy.

You should first configure the proxy in the [cloud] section of netdata.conf. The proxy settings you specify here will also be used to tunnel the ACLK. The default proxy setting is none.

[cloud]
    proxy = none

The proxy setting can take one of the following values:

  • none: Do not use a proxy, even if the system configured otherwise.
  • env: Try to read proxy settings from the http_proxy environment variable.
  • http://[user:pass@]host:port: The ACLK and connection process will use the specified HTTP(S) proxy.

For example, an HTTP proxy setting may look like the following:

[cloud]
    proxy = http://203.0.113.0:1080       # With an IP address
    proxy = http://proxy.example.com:1080 # With a URL
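
Before writing a value into netdata.conf, you may want to sanity-check its shape. The sketch below is an illustration only: is_valid_proxy_setting is a hypothetical helper, and it accepts just the forms listed above (none, env, or an http:// URL with optional credentials, a host, and a numeric port as in http://proxy.example.com:1080). The Agent itself may accept more than this check does.

```shell
# Return success when the value looks like an accepted proxy setting:
# "none", "env", or http://[user:pass@]host followed by a numeric port.
is_valid_proxy_setting() {
    case "$1" in
        none|env) return 0 ;;
        http://*) ;;
        *) return 1 ;;
    esac
    # Strip the scheme and optional credentials, then require host:number.
    hostport="${1#http://}"
    hostport="${hostport##*@}"
    printf '%s\n' "$hostport" | grep -Eq '^[A-Za-z0-9.-]+:[0-9]+/?$'
}

# Possible use:
#   is_valid_proxy_setting "http://203.0.113.0:1080" && echo "looks fine"
```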

You can now move on to connecting. When you connect with the kickstart script, add the --claim-proxy parameter followed by the same proxy setting you added to netdata.conf.

wget -O /tmp/netdata-kickstart.sh https://get.netdata.cloud/kickstart.sh && sh /tmp/netdata-kickstart.sh --claim-token TOKEN --claim-rooms ROOM1,ROOM2 --claim-url https://app.netdata.cloud --claim-proxy http://[user:pass@]host:port

Note

Hit Enter. The script should return "Agent was successfully claimed." If the process returns errors, or if you don't see the node in your Space after 60 seconds, see the troubleshooting information.

Troubleshooting

If you're having trouble connecting a node, this may be because the ACLK cannot connect to Cloud.

With the Netdata Agent running, visit http://NODE:19999/api/v1/info in your browser, replacing NODE with the IP address or hostname of your Agent. The returned JSON contains four keys that help diagnose any issues you might be having with the ACLK or connection process.

"cloud-enabled"
"cloud-available"
"agent-claimed"
"aclk-available"

Note

On Netdata Agent version 1.32 (netdata -v to find your version) and newer, sudo netdatacli aclk-state can be used to get some diagnostic information about ACLK. Sample output:

ACLK Available: Yes
ACLK Implementation: Next Generation
New Cloud Protocol Support: Yes
Claimed: Yes
Claimed Id: 53aa76c2-8af5-448f-849a-b16872cc4ba1
Online: Yes
Used Cloud Protocol: New

Use these keys and the information below to troubleshoot the ACLK.
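
If you prefer the command line to the browser, the four keys can be pulled out of a saved copy of the JSON. This is a rough sketch with hypothetical function names (info_flag, diagnose_info). It assumes the keys appear as plain true/false values and uses simple text matching rather than a real JSON parser, so treat it as a convenience only.

```shell
# Read the /api/v1/info JSON on stdin and print the value (true or
# false) of one boolean key, or "unknown" if the key is not found.
info_flag() {
    flag_key="$1"
    value=$(tr -d ' \n\t' | grep -Eo "\"$flag_key\":(true|false)" | head -n 1 | cut -d: -f2)
    printf '%s\n' "${value:-unknown}"
}

# Walk the four keys in the order used by the sections below and
# report the first one that is not true.
diagnose_info() {
    json_file="$1"
    for key in cloud-enabled cloud-available agent-claimed aclk-available; do
        if [ "$(info_flag "$key" < "$json_file")" != "true" ]; then
            echo "$key is not true; see the matching troubleshooting section"
            return 1
        fi
    done
    echo "all four keys are true"
}

# Possible use (replace NODE as described above):
#   curl -s http://NODE:19999/api/v1/info > /tmp/info.json
#   diagnose_info /tmp/info.json
```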

kickstart: unsupported Netdata installation

If you run the kickstart script and get the error "Existing install appears to be handled manually or through the system package manager.", you most probably installed Netdata using an unsupported package.

Note

If you are using an unsupported package, such as a third-party .deb/.rpm package provided by your distribution, please remove that package and reinstall using our recommended kickstart script.

kickstart: Failed to write new machine GUID

If you run the kickstart script without the privileges required to connect to Netdata Cloud, you will get the following error:

Failed to write new machine GUID. Please make sure you have rights to write to /var/lib/netdata/registry/netdata.public.unique.id.

For a successful execution, run the script with root privileges or as the user that is running the Agent. See the Connect an Agent without root privileges section for more details.

bash: netdata-claim.sh: command not found

If you run the claiming script and see a command not found error, you either installed Netdata in a non-standard location or are using an unsupported package. If you installed Netdata in a non-standard path using the --install-prefix option, you need to update your $PATH or run netdata-claim.sh using the full path.

For example, if you installed Netdata to /opt/netdata, use /opt/netdata/bin/netdata-claim.sh to run the claiming script.
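
Locating the script can be automated along these lines. The helper name find_claim_script is hypothetical, and the prefixes you pass in are your own guesses about where Netdata was installed. The only layout assumed here is the bin/ directory under the install prefix, as in the /opt/netdata example above.

```shell
# Print the path of netdata-claim.sh, checking $PATH first and then
# bin/ under each install prefix given as an argument.
find_claim_script() {
    if command -v netdata-claim.sh >/dev/null 2>&1; then
        command -v netdata-claim.sh
        return 0
    fi
    for prefix in "$@"; do
        # Drop one trailing slash so "/opt/netdata/" also works.
        candidate="${prefix%/}/bin/netdata-claim.sh"
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "netdata-claim.sh not found" >&2
    return 1
}

# Possible use:
#   claim_script=$(find_claim_script /opt/netdata) && "$claim_script" -token=TOKEN
```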

Note

If you are using an unsupported package, such as a third-party .deb/.rpm package provided by your distribution, please remove that package and reinstall using our recommended kickstart script.

Connecting on older distributions (Ubuntu 14.04, Debian 8, CentOS 6)

If you're running an older Linux distribution or one that has reached EOL, such as Ubuntu 14.04 LTS, Debian 8, or CentOS 6, your Agent may not be able to securely connect to Netdata Cloud due to an outdated version of OpenSSL. These old versions of OpenSSL cannot perform hostname validation, which helps secure SSL connections by confirming that the Agent is talking to the intended server.

We recommend you reinstall Netdata with a static build, which uses an up-to-date version of OpenSSL with hostname validation enabled.

If you choose to continue using the outdated version of OpenSSL, your node will still connect to Netdata Cloud, albeit with hostname verification disabled. Without verification, your Netdata Cloud connection could be vulnerable to man-in-the-middle attacks.

cloud-enabled is false

If cloud-enabled is false, you probably ran the installer with the --disable-cloud option.

Additionally, check that the enabled setting in /var/lib/netdata/cloud.d/cloud.conf is set to true:

[global]
    enabled = true

To fix this issue, reinstall Netdata using your preferred method and do not add the --disable-cloud option.

cloud-available is false / ACLK Available: No

If cloud-available is false after you verified Cloud is enabled in the previous step, the most likely issue is that Cloud features failed to build during installation.

If Cloud features fail to build, the installer continues and finishes the process without Cloud functionality as opposed to failing the installation altogether.

We do this to ensure the Agent will always finish installing.

If you can't see an explicit error in the installer's output, you can run the installer with the --require-cloud option. This option causes the installation to fail if Cloud functionality can't be built and enabled, and the installer's output should give you more error details.

You may see one of the following error messages during installation:

  • Failed to build libmosquitto. The install process will continue, but you will not be able to connect this node to Netdata Cloud.
  • Unable to fetch sources for libmosquitto. The install process will continue, but you will not be able to connect this node to Netdata Cloud.
  • Failed to build libwebsockets. The install process will continue, but you may not be able to connect this node to Netdata Cloud.
  • Unable to fetch sources for libwebsockets. The install process will continue, but you may not be able to connect this node to Netdata Cloud.
  • Could not find cmake, which is required to build libwebsockets. The install process will continue, but you may not be able to connect this node to Netdata Cloud.
  • Could not find cmake, which is required to build JSON-C. The install process will continue, but Netdata Cloud support will be disabled.
  • Failed to build JSON-C. Netdata Cloud support will be disabled.
  • Unable to fetch sources for JSON-C. Netdata Cloud support will be disabled.

One common cause of the installer failing to build Cloud features is not having one of the following dependencies on your system: cmake, json-c and OpenSSL, including corresponding devel packages.

You can also look for error messages in /var/log/netdata/error.log. Try one of the following two commands to search for ACLK-related errors.

less /var/log/netdata/error.log
grep -i ACLK /var/log/netdata/error.log

If the installer's output does not help you enable Cloud features, contact us by creating an issue on GitHub with details about your system and relevant output from error.log.

agent-claimed is false / Claimed: No

You must connect your node.

aclk-available is false / Online: No

If aclk-available is false and all other keys are true, your Agent is having trouble connecting to the Cloud through the ACLK. Please check your system's firewall.

If your Agent needs to use a proxy to access the internet, you must set up a proxy for connecting.

If you are certain firewall and proxy settings are not the issue, you should consult the Agent's error.log at /var/log/netdata/error.log and contact us by creating an issue on GitHub with details about your system and relevant output from error.log.

Remove and reconnect a node

Linux based installations

To remove a node from your Space in Netdata Cloud, delete the cloud.d/ directory in your Netdata library directory.

cd /var/lib/netdata   # Replace with your Netdata library directory, if not /var/lib/netdata/
sudo rm -rf cloud.d/

This node no longer has access to the credentials it used when connecting to Netdata Cloud via the ACLK. You will still be able to see this node in your War Rooms in an unreachable state.

If you want to reconnect this node, you need to:

  1. Ensure that the /var/lib/netdata/cloud.d directory doesn't exist. In some installations, the path is /opt/netdata/var/lib/netdata/cloud.d
  2. Stop the Agent
  3. Ensure that the uuidgen-runtime package is installed. Run echo "$(uuidgen)" and validate you get back a UUID
  4. Copy the kickstart.sh command for adding a node from your Space and append --claim-id "$(uuidgen)" to it. Run the command and look for the message Node was successfully claimed.
  5. Start the Agent
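
The check in step 3 can be scripted so that a missing or broken uuidgen is caught before the claim command runs. This is a sketch with hypothetical helper names (is_uuid, new_claim_id). It checks only the usual 8-4-4-4-12 hexadecimal layout of a UUID.

```shell
# Return success when the argument looks like a UUID
# (8-4-4-4-12 hexadecimal digits).
is_uuid() {
    printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# Generate a claim ID with uuidgen, failing early if the tool is
# missing or returns something unexpected.
new_claim_id() {
    if ! command -v uuidgen >/dev/null 2>&1; then
        echo "uuidgen not found; install the uuidgen-runtime package" >&2
        return 1
    fi
    claim_id=$(uuidgen)
    if ! is_uuid "$claim_id"; then
        echo "unexpected uuidgen output: $claim_id" >&2
        return 1
    fi
    printf '%s\n' "$claim_id"
}

# Possible use when appending to the kickstart command in step 4:
#   --claim-id "$(new_claim_id)"
```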

Docker based installations

To remove a node from your Space in Netdata Cloud, and connect it to another Space, follow these steps:

  1. Enter the running container you wish to remove from your Space

    docker exec -it CONTAINER_NAME sh
    

    Replace CONTAINER_NAME with either the container's name or ID.

  2. Delete /var/lib/netdata/cloud.d and /var/lib/netdata/registry/netdata.public.unique.id

    rm -rf /var/lib/netdata/cloud.d/
    
    rm /var/lib/netdata/registry/netdata.public.unique.id 
    
  3. Stop and remove the container

    Docker CLI:

    docker stop CONTAINER_NAME
    
    docker rm CONTAINER_NAME
    

    Replace CONTAINER_NAME with either the container's name or ID.

    Docker Compose:
    Inside the directory that has the docker-compose.yml file, run:

    docker compose down
    

    Docker Swarm:
    Run the following, and replace STACK with your Stack's name:

    docker stack rm STACK
    
  4. Finally, go to your new Space, copy the install command with the new claim token and run it.
    If you are using a docker-compose.yml file, you will have to overwrite it with the new claiming token.
    The node should now appear online in that Space.

Regenerate Claiming Token

If you need to revoke your previous claiming token and generate a new one, for security or other reasons, you can do so from the Netdata Cloud UI.

On any screen that shows the command for connecting a node to Netdata Cloud, a Regenerate token button appears above the command, next to the updates channel. This action invalidates your previous token and generates a new one.

Only the administrators of a Space in Netdata Cloud can trigger this action.

Connecting reference

In the sections below, you can find reference material for the kickstart script, claiming script, connecting via the Agent's command line tool, and details about the files found in cloud.d.

The cloud.conf file

This section defines how and whether your Agent connects to Netdata Cloud using the ACLK.

| setting | default | info |
|:--|:--|:--|
| cloud base url | https://app.netdata.cloud | The URL for the Netdata Cloud web application. You should not change this. If you want to disable Cloud, change the enabled setting. |
| enabled | yes | The runtime option to disable the Agent-Cloud link and prevent your Agent from connecting to Netdata Cloud. |

Claiming script

A Space's administrator can also connect an Agent by directly calling the netdata-claim.sh script either with root privileges using sudo, or as the user running the Agent (typically netdata), and passing the following arguments:

-token=TOKEN
    where TOKEN is the Space's claiming token.
-rooms=ROOM1,ROOM2,...
    where ROOMX is the War Room this node should be added to. This list is optional.
-url=URL_BASE
    where URL_BASE is the Netdata Cloud endpoint base URL. By default, this is https://app.netdata.cloud.
-id=AGENT_ID
    where AGENT_ID is the unique identifier of the Agent. This is the Agent's MACHINE_GUID by default.
-hostname=HOSTNAME
    where HOSTNAME is the result of the hostname command by default.
-proxy=PROXY_URL
    where PROXY_URL is the endpoint of an HTTP or HTTPS proxy.

For example, the following command connects an Agent and adds it to rooms room1 and room2:

netdata-claim.sh -token=MYTOKEN1234567 -rooms=room1,room2

You should then notify the netdata service of the result with netdatacli:

netdatacli reload-claiming-state

This reloads the Agent connection state from disk.
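
When the arguments come from variables, for example in a provisioning script, assembling the command line in one place helps avoid empty options. The following sketch uses a hypothetical helper, build_claim_command, and treats the token as required, which is an assumption. It only prints the command, and it assumes the values contain no whitespace.

```shell
# Print a netdata-claim.sh command line built from the given values.
# Arguments: TOKEN [ROOMS] [URL] [PROXY]; empty values are skipped.
build_claim_command() {
    token="$1"; rooms="$2"; url="$3"; proxy="$4"
    if [ -z "$token" ]; then
        echo "a claiming token is required" >&2
        return 1
    fi
    cmd="netdata-claim.sh -token=$token"
    if [ -n "$rooms" ]; then cmd="$cmd -rooms=$rooms"; fi
    if [ -n "$url" ]; then cmd="$cmd -url=$url"; fi
    if [ -n "$proxy" ]; then cmd="$cmd -proxy=$proxy"; fi
    printf '%s\n' "$cmd"
}

# Possible use:
#   build_claim_command MYTOKEN1234567 room1,room2
```

Printing the command first lets you review it before running it with sudo or as the Agent's user.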

Our recommendation is to trigger the connection process using the kickstart whenever possible.

Netdata Agent command line

If a Netdata Agent is running, the Space's administrator can connect a node using the netdata service binary with additional command line parameters:

-W "claim -token=TOKEN -rooms=ROOM1,ROOM2"

For example:

/usr/sbin/netdata -D -W "claim -token=MYTOKEN1234567 -rooms=room1,room2"

If need be, the user can override the Agent's defaults by providing additional arguments like those described in the Claiming script section.

Connection directory

Netdata stores the Agent's connection-related state in the Netdata library directory under cloud.d. For a default installation, this directory exists at /var/lib/netdata/cloud.d. The directory and its files should be owned by the user that runs the Agent, which is typically the netdata user.

The cloud.d/token file should contain the claiming-token and the cloud.d/rooms file should contain the list of War Rooms you added that node to.

The user can also put the Cloud endpoint's full certificate chain in cloud.d/cloud_fullchain.pem so that the Agent can trust the endpoint if necessary.
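
A quick way to inspect this directory is sketched below. The helper check_cloud_dir is hypothetical. It looks only for the token and rooms files named above and uses find with ! -user to list entries that are not owned by the expected user, which can be given as a user name or a numeric ID. The defaults (/var/lib/netdata/cloud.d and netdata) are the typical values mentioned in this section and may differ on your system.

```shell
# Report on a cloud.d directory: whether the token and rooms files
# exist, and whether any entry is owned by someone other than the
# expected user. Arguments: [DIR] [USER]
check_cloud_dir() {
    cloud_dir="${1:-/var/lib/netdata/cloud.d}"
    agent_user="${2:-netdata}"
    if [ ! -d "$cloud_dir" ]; then
        echo "missing directory: $cloud_dir"
        return 1
    fi
    status=0
    for name in token rooms; do
        if [ ! -f "$cloud_dir/$name" ]; then
            echo "missing file: $cloud_dir/$name"
            status=1
        fi
    done
    # List entries not owned by the expected user; a failing find
    # (for example, an unknown user name) is reported as well.
    if ! not_owned=$(find "$cloud_dir" ! -user "$agent_user" 2>/dev/null); then
        echo "could not check ownership for user $agent_user"
        status=1
    elif [ -n "$not_owned" ]; then
        echo "not owned by $agent_user:"
        echo "$not_owned"
        status=1
    fi
    return "$status"
}

# Possible use:
#   sudo sh -c '. ./check_cloud_dir.sh; check_cloud_dir'
```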