mirror of https://github.com/netdata/netdata.git synced 2025-04-06 22:38:55 +00:00
netdata_netdata/docs/deployment-guides/deployment-strategies.md
Costa Tsaousis a399128dbf
config parsers ()
2024-09-04 14:42:01 +03:00


Deployment Examples

Deployment Options Overview

This section provides a quick overview of a few common deployment options for Netdata.

You can read about Standalone Deployment and Deployment with Centralization Points in the documentation inside this section.

The sections below give configuration examples for these deployment concepts.

Deployment Configuration Details

Stand-alone

The stand-alone setup is configured out of the box with reasonable defaults, but please consult our configuration documentation for details, including the overview of common configuration changes.

Parent-Child

Setups involving Parent and Child Agents must be configured for streaming through the stream.conf configuration file.

This instructs the Child to stream data to the Parent, and the Parent to accept streaming connections from one or more Child Agents. To secure this connection, both sides need a shared API key (which replaces the string API_KEY in the examples below). Additionally, the Child can be configured with the addresses of one or more Parent Agents (PARENT_IP_ADDRESS).

An API key is a UUID created with uuidgen and is used for authentication and/or customization on the Parent side. A Child streams using the API key, and the Parent is configured to accept connections that present it. A Parent can also apply different options to different groups of Children by using multiple API keys. The easiest setup uses just one API key for all Child Agents.
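Where uuidgen is not installed, a value in the same 36-character textual format can be produced with Python's standard uuid module. This is a minimal sketch, assuming any random (version 4) UUID is acceptable as an API key; the variable name api_key is illustrative only.

```python
import uuid

# Generate a random (version 4) UUID. Assumption: any random UUID works
# as the shared API key that replaces API_KEY in the examples below.
api_key = str(uuid.uuid4())

print(api_key)
```

The printed value is what you would paste in place of API_KEY on both the Child and the Parent.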

Child config

We do not recommend claiming the Child to Cloud directly during your setup.

This reduces the footprint of the Netdata Agent on your production system, because some capabilities can be switched off for the Child and kept on for the Parent.

In this example, Machine Learning and Alerting are disabled for the Child, so that the Parent can take the load. We also use RAM instead of disk to store metrics with limited retention, covering temporary network issues.

netdata.conf

On the Child node, edit netdata.conf by using the edit-config script and set the following parameters:

[db]
    # https://github.com/netdata/netdata/blob/master/src/database/README.md
    # none = no retention, ram = some retention in ram
    mode = ram
    # The retention in seconds.
    # This provides some tolerance to the time the child has to find a parent in
    # order to transfer the data. For IoT this can be lowered to 120.
    retention = 1200
    # The granularity of metrics, in seconds.
    # You may increase this to lower CPU resources.
    update every = 1
[ml]
    # Disable Machine Learning
    enabled = no
[health]
    # Disable Health Checks (Alerting)
    enabled = no
[web]
    # Disable remote access to the local dashboard
    bind to = lo
[plugins]
    # Uncomment the following line to disable all external plugins by default
    # in extreme IoT cases.
    # enable running new plugins = no

stream.conf

To edit stream.conf, use the edit-config script again and set the following parameters:

[stream]
    # Stream metrics to another Netdata
    enabled = yes
    # The IP and PORT of the parent
    destination = PARENT_IP_ADDRESS:19999
    # The shared API key, generated by uuidgen
    api key = API_KEY

Parent config

For the Parent, besides setting up streaming, this example also provides configuration for multiple tiers of metrics storage, for 10 Children, with about 2k metrics each. This allows for:

  • 1s granularity at tier 0 for 1 week
  • 1m granularity at tier 1 for 1 month
  • 1h granularity at tier 2 for 1 year

Requiring:

  • 25GB of disk
  • 3.5GB of RAM (2.5GB under pressure)

netdata.conf

On the Parent, edit netdata.conf by using the edit-config script and set the following parameters:

[db]
    mode = dbengine
    dbengine tier backfill = new
    storage tiers = 3
    dbengine page cache size = 1.4GiB
    # storage tier 0
    update every = 1
    dbengine tier 0 retention space = 12GiB
    # storage tier 1
    dbengine tier 1 update every iterations = 60
    dbengine tier 1 retention space = 4GiB
    # storage tier 2
    dbengine tier 2 update every iterations = 60
    dbengine tier 2 retention space = 2GiB
[ml]
    # Enabled by default
    # enabled = yes
[health]
    # Enabled by default
    # enabled = yes
[web]
    # Enabled by default
    # bind to = *
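The granularities listed earlier (1s, 1m, 1h) follow from the tier settings in this netdata.conf. The sketch below redoes that arithmetic, assuming each higher tier stores one point for every "update every iterations" points of the tier beneath it; the variable names are illustrative and not part of Netdata.

```python
# Values copied from the Parent's netdata.conf example.
update_every = 1                  # tier 0: seconds per point
tier_iterations = [60, 60]        # "update every iterations" for tiers 1 and 2
retention_space_gib = [12, 4, 2]  # "retention space" for tiers 0, 1 and 2

# Assumption: each higher tier aggregates `iterations` points of the tier below.
granularity = [update_every]
for iterations in tier_iterations:
    granularity.append(granularity[-1] * iterations)

for tier, seconds in enumerate(granularity):
    print(f"tier {tier}: one point every {seconds} s")

print(f"total retention space: {sum(retention_space_gib)} GiB")
```

Under that assumption the tiers come out at 1, 60 and 3600 seconds per point, and the configured retention space totals 18 GiB.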

stream.conf

On the Parent node, edit stream.conf by using the edit-config script and set the following parameters:

[API_KEY]
    # Accept metrics streaming from other Agents with the specified API key
    enabled = yes

Active-Active Parents

To set up active-active streaming between Parent 1 and Parent 2, Parent 1 needs to be instructed to stream data to Parent 2, and Parent 2 to stream data to Parent 1. The Child Agents need to be configured with the addresses of both Parent Agents. An Agent will only connect to one Parent at a time, falling back to the next upon failure. These examples use the same API key between the Parent Agents and for connections from Child Agents.

On both Netdata Parents and all Child Agents, edit stream.conf by using the edit-config script:

stream.conf on Parent 1

[stream]
    # Stream metrics to another Netdata
    enabled = yes
    # The IP and PORT of Parent 2
    destination = PARENT_2_IP_ADDRESS:19999
    # This is the API key for the outgoing connection to Parent 2
    api key = API_KEY
[API_KEY]
    # Accept metrics streams from Parent 2 and Child Agents
    enabled = yes

stream.conf on Parent 2

[stream]
    # Stream metrics to another Netdata
    enabled = yes
    # The IP and PORT of Parent 1
    destination = PARENT_1_IP_ADDRESS:19999
    api key = API_KEY
[API_KEY]
    # Accept metrics streams from Parent 1 and Child Agents
    enabled = yes
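The two Parent files differ only in the destination line, so they can be generated from one template to keep them from drifting apart. The following is a sketch only: render_parent_stream_conf and the parents mapping are hypothetical helpers, and the placeholders are the same ones used in the examples above.

```python
def render_parent_stream_conf(peer_address: str, api_key: str, port: int = 19999) -> str:
    """Return stream.conf text for one Parent that streams to its peer."""
    return (
        "[stream]\n"
        "    enabled = yes\n"
        f"    destination = {peer_address}:{port}\n"
        f"    api key = {api_key}\n"
        f"[{api_key}]\n"
        "    enabled = yes\n"
    )

# Placeholder values, as in the examples above.
parents = {"parent1": "PARENT_1_IP_ADDRESS", "parent2": "PARENT_2_IP_ADDRESS"}
api_key = "API_KEY"

configs = {}
for name, address in parents.items():
    # Each Parent's destination is the other Parent's address.
    peer = next(addr for other, addr in parents.items() if other != name)
    configs[name] = render_parent_stream_conf(peer, api_key)

print(configs["parent1"])
```

Generating both files from the same function makes it harder for one Parent to end up pointing at itself or using a different API key.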

stream.conf on Child Agents

[stream]
    # Stream metrics to another Netdata
    enabled = yes
    # The IP and PORT of the parent
    destination = PARENT_1_IP_ADDRESS:19999 PARENT_2_IP_ADDRESS:19999
    # The shared API key, generated by uuidgen
    api key = API_KEY
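The fallback behaviour described for Child Agents (one Parent at a time, moving to the next on failure) can be pictured as follows. This illustrates the idea only and is not Netdata's actual connection logic; parse_destinations, pick_parent, and the can_connect callback are hypothetical names.

```python
from typing import Callable, List, Optional

def parse_destinations(value: str) -> List[str]:
    """Split a space-separated `destination` value into host:port entries."""
    return value.split()

def pick_parent(destinations: List[str],
                can_connect: Callable[[str], bool]) -> Optional[str]:
    """Return the first destination that accepts a connection, or None."""
    for destination in destinations:
        if can_connect(destination):
            return destination
    return None

destinations = parse_destinations(
    "PARENT_1_IP_ADDRESS:19999 PARENT_2_IP_ADDRESS:19999"
)

# Simulate Parent 1 being down: only Parent 2 accepts the connection.
down = {"PARENT_1_IP_ADDRESS:19999"}
chosen = pick_parent(destinations, lambda d: d not in down)
print(chosen)
```

With Parent 1 marked as unreachable, the sketch selects Parent 2; with both reachable, it would select Parent 1 because that address is listed first.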

Further Reading

We strongly recommend the following configuration changes for production deployments:

  1. Understand Netdata's security and privacy design and secure your nodes

    To safeguard your infrastructure and comply with your organization's security policies.

  2. Optimize the Netdata Agent's system utilization and performance

    To save valuable system resources, especially when running on weak IoT devices.

We also suggest that you:

  1. Use Netdata Cloud to access the dashboards

    For increased security, user management and access to our latest features, tools and troubleshooting solutions.

  2. Change how long Netdata stores metrics

    To control Netdata's memory use when you have a lot of ephemeral metrics.

  3. Use host labels

    To organize systems, metrics, and alerts.