How metrics streaming works

Each node running Netdata can stream the metrics it collects, in real time, to another node. Streaming allows you to replicate metrics data across multiple nodes, or centralize all your metrics data into a single time-series database (TSDB).

When one node streams metrics to another, the node receiving metrics can visualize them on the dashboard, run health checks to trigger alerts and send notifications, and export all metrics to an external TSDB. In other words, a Netdata node that receives streamed metrics can do everything with them that it can do with metrics it collects itself.

Streaming lets you decide exactly how you want to store and maintain metrics data. While we believe Netdata's distributed architecture is ideal for speed and scale, streaming provides centralization options and high data availability.

This document will get you started quickly with streaming. More advanced concepts and suggested production deployments can be found in the streaming and replication reference.

Streaming basics

There are three types of nodes in Netdata's streaming ecosystem.

  • Parent: A node, running Netdata, that receives streamed metric data.
  • Child: A node, running Netdata, that streams metric data to one or more parents.
  • Proxy: A node, running Netdata, that receives metric data from a child and forwards it to a separate parent node.

Netdata uses API keys, which are just random GUIDs, to authorize the communication between child and parent nodes. We recommend using uuidgen for generating API keys, which can then be used across any number of streaming connections. Or, you can generate unique API keys for each parent-child relationship.

Once the parent node authorizes the child's API key, the child can start streaming metrics.

It's important to note that the streaming connection uses TCP, UDP, or Unix sockets, not HTTP. To proxy streaming metrics, you need to use a proxy that tunnels OSI layer 4-7 traffic without interfering with it, such as SOCKS or Nginx's TCP/UDP load balancing.

Supported streaming configurations

Netdata supports any combination of parent, child, and proxy nodes. Any node can act as a parent, a child, and a proxy at the same time, sending streamed metrics to, or receiving them from, any number of other nodes.

Here are a few example streaming configurations:

  • Headless collector:
    • Child A, without a database or web dashboard, streams metrics to parent B.
    • A's metrics are available only via the local Agent dashboard for B.
    • B generates alerts for A.
  • Replication:
    • Child A, with a database and web dashboard, streams metrics to parent B.
    • A's metrics are available on both local Agent dashboards, and can be stored with the same or different metrics retention policies.
    • Both A and B generate alerts.
  • Proxy:
    • Child A, with or without a database, sends metrics to proxy C, also with or without a database. C sends metrics to parent B.
    • Any node with a database can generate alerts.

A basic parent-child setup

For a predictable number of non-ephemeral nodes, install a Netdata agent on each node and replicate its data to a Netdata parent, preferably on a management/admin node outside your production infrastructure. There are two variations of the basic setup:

  • When your nodes have sufficient RAM and disk IO, the Netdata agents on each node can run with the default settings for data collection and retention.

  • When your nodes have severe RAM and disk IO limitations (e.g. Raspberry Pis), you should optimize the Netdata agent's performance.

Secure your nodes to protect them from the internet by making their UI accessible only via an nginx proxy, with potentially different subdomains for the parent and even each child, if necessary.

Both the children and the parent are connected to the cloud to enable infrastructure observability without transferring the collected data. Requests for data are always served by a connected Netdata agent. When both a child and its parent are connected, the cloud always selects the parent to query the data the user requested.

An advanced setup

Ephemeral nodes with two parents

When the nodes are ephemeral, we recommend using two parents in an active-active setup, and having the children not store data at all.

Both parents are configured on each child, so that if one parent is not available, the child connects to the other.

The children in this setup are not connected to Netdata Cloud at all, as high availability is achieved with the second parent.
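
As a rough illustration of listing both parents on a child, the sketch below writes a child-side [stream] fragment to a scratch file for review. It assumes that the destination setting accepts a space-separated list of parents and that the child uses the first one it can reach. The two parent addresses and the variable names are placeholders, not values from this guide. Disabling the child's own database is a separate setting in netdata.conf and is not shown here.

```shell
# Sketch: a child-side [stream] fragment that lists two parents.
# PARENT_A and PARENT_B are hypothetical addresses from the
# documentation range 203.0.113.0/24; replace them with your own.
PARENT_A="203.0.113.10:19999"
PARENT_B="203.0.113.11:19999"
API_KEY="11111111-2222-3333-4444-555555555555"

# Write the fragment to a scratch file rather than to /etc/netdata,
# so it can be reviewed before being copied into stream.conf.
FRAGMENT="$(mktemp)"
cat > "$FRAGMENT" <<EOF
[stream]
    enabled = yes
    destination = $PARENT_A $PARENT_B
    api key = $API_KEY
EOF

cat "$FRAGMENT"
```

Both parents would need a matching API key section in their own stream.conf for either connection to be accepted.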

Enable streaming between nodes

The simplest streaming configuration is replication, in which a child node streams its metrics in real time to a parent node, and both nodes retain metrics in their own databases.

To configure replication, you need two nodes, each running Netdata. You'll first enable streaming on your parent node, then on your child node. When you're finished, you'll be able to see the child node's metrics in the parent node's dashboard, quickly switch between the two dashboards, and send alert notifications from either or both nodes.

Enable streaming on the parent node

First, log onto the node that will act as the parent.

Run uuidgen to create a new API key, which is a randomly-generated machine GUID the Netdata Agent uses to identify itself while initiating a streaming connection. Copy that into a separate text file for later use.

Find out how to install uuidgen on your node if you don't already have it.
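
A minimal sketch of the key-generation step follows. The fallback to /proc/sys/kernel/random/uuid is an assumption that only holds on Linux, and API_KEY is just an illustrative variable name.

```shell
# Generate a random GUID to use as the streaming API key.
# If uuidgen is missing, fall back to the Linux kernel's UUID source
# (assumption: this path exists only on Linux systems).
if command -v uuidgen >/dev/null 2>&1; then
    API_KEY="$(uuidgen)"
else
    API_KEY="$(cat /proc/sys/kernel/random/uuid)"
fi

# Print the key so it can be copied into stream.conf on both nodes.
echo "$API_KEY"
```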

Next, open stream.conf using edit-config from within the Netdata config directory.

cd /etc/netdata
sudo ./edit-config stream.conf

Scroll down to the section beginning with [API_KEY]. Paste the API key you generated earlier between the brackets, so that it looks like the following:

[11111111-2222-3333-4444-555555555555]

Set enabled to yes, and default memory mode to dbengine. Leave all the other settings as their defaults. A simplified version of the configuration, minus the commented lines, looks like the following:

[11111111-2222-3333-4444-555555555555]
    enabled = yes
    default memory mode = dbengine

Save the file and close it, then restart Netdata with sudo systemctl restart netdata, or the appropriate method for your system.

Enable streaming on the child node

Connect to your child node with SSH.

Open stream.conf with edit-config, as you did on the parent node. Scroll down to the [stream] section and set enabled to yes. Paste the IP address of your parent node at the end of the destination line, and paste the API key generated on the parent node onto the api key line.

Leave all the other settings as their defaults. A simplified version of the configuration, minus the commented lines, looks like the following:

[stream]
    enabled = yes
    destination = 203.0.113.0
    api key = 11111111-2222-3333-4444-555555555555

Save the file and close it, then restart Netdata with sudo systemctl restart netdata, or the appropriate method for your system.

Enable TLS/SSL on streaming (optional)

While encrypting the connection between your parent and child nodes is recommended for security, it's not required to get started. If you're not interested in encryption, skip ahead to view streamed metrics.

In this example, we'll use self-signed certificates.

On the parent node, use OpenSSL to create the key and certificate, then use chown to make the new files readable by the netdata user.

sudo openssl req -newkey rsa:2048 -nodes -sha512 -x509 -days 365 -keyout /etc/netdata/ssl/key.pem -out /etc/netdata/ssl/cert.pem
sudo chown netdata:netdata /etc/netdata/ssl/cert.pem /etc/netdata/ssl/key.pem
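
The openssl req command above prompts for the certificate's subject fields. As a sketch of a non-interactive variant, the block below passes the subject with -subj and then prints the result for inspection. The scratch directory and the common name netdata-parent are illustrative assumptions; the walkthrough itself stores the files under /etc/netdata/ssl.

```shell
# Sketch: create a self-signed key and certificate without prompts.
# SSL_DIR is a scratch directory used only for this illustration.
SSL_DIR="$(mktemp -d)"

# -subj supplies the subject up front, so openssl asks no questions.
# "netdata-parent" is a placeholder common name.
openssl req -newkey rsa:2048 -nodes -sha512 -x509 -days 365 \
    -subj "/CN=netdata-parent" \
    -keyout "$SSL_DIR/key.pem" -out "$SSL_DIR/cert.pem" 2>/dev/null

# Show the subject and validity window of the new certificate.
openssl x509 -in "$SSL_DIR/cert.pem" -noout -subject -dates
```

The same openssl x509 inspection can be pointed at /etc/netdata/ssl/cert.pem to check the certificate created in the step above.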

Next, enforce TLS/SSL on the web server. Open netdata.conf, scroll down to the [web] section, and look for the bind to setting. Add ^SSL=force to turn on TLS/SSL. See the web server reference for other TLS/SSL options.

[web]
    bind to = *=dashboard|registry|badges|management|streaming|netdata.conf^SSL=force

Next, connect to the child node and open stream.conf. Add :SSL to the end of the existing destination setting to connect to the parent using TLS/SSL. Uncomment the ssl skip certificate verification line to allow the use of self-signed certificates.

[stream]
    enabled = yes
    destination = 203.0.113.0:SSL
    ssl skip certificate verification = yes
    api key = 11111111-2222-3333-4444-555555555555

Restart both the parent and child nodes with sudo systemctl restart netdata, or the appropriate method for your system, to stream encrypted metrics using TLS/SSL.

View streamed metrics in Netdata Cloud

In Netdata Cloud you should now be able to see a new parent showing up in the Home tab under "Nodes by data replication". The replication factor for the child node has now increased to 2, meaning that its data is now highly available.

You don't need to do anything else, as the cloud will automatically prefer to fetch data about the child from the parent and switch to querying the child only when the parent is unavailable, or for some reason doesn't have the requested data (e.g. the connection between parent and the child is broken).

View streamed metrics in Netdata's dashboard

At this point, the child node is streaming its metrics in real time to its parent. Open the local Agent dashboard for the parent by navigating to http://PARENT-NODE:19999 in your browser, replacing PARENT-NODE with its IP address or hostname.

This dashboard shows parent metrics. To see child metrics, open the left-hand sidebar with the hamburger icon in the top panel. Both nodes appear under the Replicated Nodes menu. Click on either of the links to switch between separate parent and child dashboards.

The child dashboard is also available directly at http://PARENT-NODE:19999/host/CHILD-HOSTNAME, which in this example is http://203.0.113.0:19999/host/netdata-child.
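
The URL pattern above can be sketched as a small shell helper. The function name child_dashboard_url is hypothetical and not part of Netdata. It assumes the parent listens on port 19999, as in the examples in this guide.

```shell
# Build the parent-hosted URL for a streamed child's dashboard.
# child_dashboard_url is a hypothetical helper, not a Netdata command.
child_dashboard_url() {
    parent="$1"   # IP address or hostname of the parent node
    child="$2"    # hostname of the child as shown on the parent
    printf 'http://%s:19999/host/%s\n' "$parent" "$child"
}

child_dashboard_url 203.0.113.0 netdata-child
# prints http://203.0.113.0:19999/host/netdata-child
```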