mirror of https://github.com/netdata/netdata.git synced 2025-04-17 11:12:42 +00:00

Collecting metrics docs grammar pass ()

* grammar pass

* grammar pass

* grammar pass and some edits

* grammar pass

* grammar pass on the whole dir and remove duplicates

* simplify wording application-metrics.md

* Update docs/collecting-metrics/application-metrics.md

---------

Co-authored-by: ilyam8 <ilya@netdata.cloud>
This commit is contained in:
Fotis Voutsas 2024-05-27 11:16:06 +03:00 committed by GitHub
parent 84daebfd14
commit 9bdc1f595e
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
7 changed files with 120 additions and 304 deletions

@ -1,83 +1,35 @@
<!--
title: "Collect application metrics with Netdata"
sidebar_label: "Application metrics"
description: "Monitor and troubleshoot every application on your infrastructure with per-second metrics, zero configuration, and meaningful charts."
custom_edit_url: "https://github.com/netdata/netdata/edit/master/docs/collecting-metrics/application-metrics.md"
learn_status: "Published"
learn_topic_type: "Concepts"
learn_rel_path: "Concepts"
-->
# Collect Application Metrics with Netdata
# Collect application metrics with Netdata
Netdata collects per-second metrics from a wide variety of applications running on your systems, including web servers,
databases, message brokers, email servers, search platforms, and more. These metrics collectors are pre-installed with
every Netdata Agent and typically require no configuration. Netdata also
uses [`apps.plugin`](/src/collectors/apps.plugin/README.md) to gather and visualize resource utilization per application
on Linux systems.
Netdata instantly collects per-second metrics from many different types of applications running on your systems, such as
web servers, databases, message brokers, email servers, search platforms, and much more. Metrics collectors are
pre-installed with every Netdata Agent and usually require zero configuration. Netdata also collects and visualizes
resource utilization per application on Linux systems using `apps.plugin`.
The `apps.plugin` inspects the Linux process tree every second, similar to `top` or `ps fax`, and collects resource
utilization data for every running process. However, Netdata goes a step further: instead of just displaying raw data,
it transforms it into easy-to-understand charts. Rather than presenting a long list of processes, Netdata categorizes
applications into meaningful groups, such as "web servers" or "databases." Each category has its own charts in the
**Applications** section of your Netdata dashboard. Additionally, there are charts for individual users and user groups
under the **Users** and **User Groups** sections.
[**apps.plugin**](/src/collectors/apps.plugin/README.md) looks at the Linux process tree every second, much like `top` or
`ps fax`, and collects resource utilization information on every running process. By reading the process tree, Netdata
shows CPU, disk, networking, processes, and eBPF for every application or Linux user. Unlike `top` or `ps fax`, Netdata
adds a layer of meaningful visualization on top of the process tree metrics, such as grouping applications into useful
dimensions, and then creates per-application charts under the **Applications** section of a Netdata dashboard, per-user
charts under **Users**, and per-user group charts under **User Groups**.
In addition to charts, `apps.plugin` offers the **Processes** [Function](/docs/top-monitoring-netdata-functions.md),
which visualizes process entries in a table and allows for intuitive exploration of the processes. For more details on
how the visualization of Functions works, check out the documentation on
the [Top tab](/docs/dashboards-and-charts/top-tab.md).
Our most popular application collectors:
- [Prometheus endpoints](/src/go/collectors/go.d.plugin/modules/prometheus/README.md): Gathers
metrics from one or more Prometheus endpoints that use the OpenMetrics exposition format. Auto-detects more than 600
endpoints.
- [Web server logs (Apache, NGINX)](/src/go/collectors/go.d.plugin/modules/weblog/README.md):
Tail access logs and provide very detailed web server performance statistics. This module is able to parse 200k+
rows in less than half a second.
- [MySQL](/src/go/collectors/go.d.plugin/modules/mysql/README.md): Collect database global,
replication, and per-user statistics.
- [Redis](/src/go/collectors/go.d.plugin/modules/redis/README.md): Monitor database status by
reading the server's response to the `INFO` command.
- [Apache](/src/go/collectors/go.d.plugin/modules/apache/README.md): Collect Apache web server
performance metrics via the `server-status?auto` endpoint.
- [Nginx](/src/go/collectors/go.d.plugin/modules/nginx/README.md): Monitor web server status
information by gathering metrics via `ngx_http_stub_status_module`.
- [Postgres](/src/go/collectors/go.d.plugin/modules/postgres/README.md): Collect database health
and performance metrics.
- [ElasticSearch](/src/go/collectors/go.d.plugin/modules/elasticsearch/README.md): Collect search
engine performance and health statistics. Optionally collects per-index metrics.
- [PHP-FPM](/src/go/collectors/go.d.plugin/modules/phpfpm/README.md): Collect application summary
and processes health metrics by scraping the status page (`/status?full`).
Our [supported collectors list](/src/collectors/COLLECTORS.md#service-and-application-collectors) shows all Netdata's
application metrics collectors, including those for containers/k8s clusters.
## Collect metrics from applications running on Windows
Netdata is fully capable of collecting and visualizing metrics from applications running on Windows systems. The only
caveat is that you must [install Netdata](/packaging/installer/README.md) on a separate system or a compatible VM because there
is no native Windows version of the Netdata Agent.
Once you have Netdata running on that separate system, you can follow the [collectors configuration reference](/src/collectors/REFERENCE.md) documentation to tell the collector to look for exposed metrics on the Windows system's IP
address or hostname, plus the applicable port.
For example, suppose you have a MySQL database with a root password of `my-secret-pw` running on a Windows system with the IP
address 203.0.113.0. You can configure the [MySQL
collector](/src/go/collectors/go.d.plugin/modules/mysql/README.md) to look at `203.0.113.0:3306`:
```yml
jobs:
  - name: local
    dsn: root:my-secret-pw@tcp(203.0.113.0:3306)/
```
This same logic applies to any application in our [supported collectors
list](/src/collectors/COLLECTORS.md#service-and-application-collectors) that can run on Windows.
## What's next?
If you haven't yet seen the [supported collectors list](/src/collectors/COLLECTORS.md), give it a once-over for any
additional applications you may want to monitor using Netdata's native collectors or the [generic Prometheus
collector](/src/go/collectors/go.d.plugin/modules/prometheus/README.md).
Collecting all the available metrics on your nodes, and across your entire infrastructure, is just one piece of the
puzzle. Next, learn more about Netdata's famous real-time visualizations by [seeing an overview of your
infrastructure](/docs/dashboards-and-charts/home-tab.md) using Netdata Cloud.
Popular application collectors:
| Collector | Description |
|--------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------|
| [Prometheus](/src/go/collectors/go.d.plugin/modules/prometheus/README.md) | Gathers metrics from one or more Prometheus endpoints that use the OpenMetrics exposition format. Auto-detects more than 600 endpoints. |
| [Web server logs (Apache, NGINX)](/src/go/collectors/go.d.plugin/modules/weblog/README.md) | Tails access logs and provides very detailed web server performance statistics. This module is able to parse 200k+ rows in less than half a second. |
| [MySQL](/src/go/collectors/go.d.plugin/modules/mysql/README.md) | Collects database global, replication, and per-user statistics. |
| [Redis](/src/go/collectors/go.d.plugin/modules/redis/README.md) | Monitors database status by reading the server's response to the `INFO` command. |
| [Apache](/src/go/collectors/go.d.plugin/modules/apache/README.md) | Collects Apache web server performance metrics via the `server-status?auto` endpoint. |
| [Nginx](/src/go/collectors/go.d.plugin/modules/nginx/README.md) | Monitors web server status information by gathering metrics via `ngx_http_stub_status_module`. |
| [Postgres](/src/go/collectors/go.d.plugin/modules/postgres/README.md) | Collects database health and performance metrics. |
| [ElasticSearch](/src/go/collectors/go.d.plugin/modules/elasticsearch/README.md) | Collects search engine performance and health statistics. Can optionally collect per-index metrics as well. |
Check the available [data collection integrations](/src/collectors/COLLECTORS.md#available-data-collection-integrations) for
a comprehensive view of all the integrations you can use to gather metrics with Netdata.
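
As a rough illustration of how little configuration a collector job typically needs, the sketch below defines a job for the Nginx collector from the table above. It assumes the job lives in `go.d/nginx.conf` and that `stub_status` is served at `http://127.0.0.1/stub_status`; the job name and the URL are illustrative and should be adjusted to your setup.

```yml
# go.d/nginx.conf (assumed file name; edit it with the edit-config script)
jobs:
  - name: local                          # illustrative job name
    url: http://127.0.0.1/stub_status    # assumed stub_status location
```

If the collector already auto-detects the endpoint, an explicit job entry like this should not be needed.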

@ -1,48 +1,20 @@
<!--
title: "Collect container metrics with Netdata"
sidebar_label: "Container metrics"
description: "Use Netdata to collect per-second utilization and application-level metrics from Linux/Docker containers and Kubernetes clusters."
custom_edit_url: "https://github.com/netdata/netdata/edit/master/docs/collecting-metrics/container-metrics.md"
learn_status: "Published"
learn_topic_type: "Concepts"
learn_rel_path: "Concepts"
-->
# Collect container metrics with Netdata
# Collect Container Metrics with Netdata
Thanks to close integration with Linux cgroups and the virtual files it maintains under `/sys/fs/cgroup`, Netdata can
monitor the health, status, and resource utilization of many different types of Linux containers.
monitor health, status, and resource utilization of many different types of Linux containers.
Netdata uses [cgroups.plugin](/src/collectors/cgroups.plugin/README.md) to poll `/sys/fs/cgroup` and convert the raw data
into human-readable metrics and meaningful visualizations. Through cgroups, Netdata is compatible with **all Linux
containers**, such as Docker, LXC, LXD, Libvirt, systemd-nspawn, and more. Read more about [Docker-specific
monitoring](#collect-docker-metrics) below.
Netdata uses the [cgroups.plugin](/src/collectors/cgroups.plugin/README.md) to poll `/sys/fs/cgroup` and convert the raw data into human-readable metrics and meaningful visualizations. Through cgroups, Netdata is compatible with **all Linux containers**, such as Docker, LXC, LXD, Libvirt, systemd-nspawn, and more. Read more about [Docker-specific monitoring](#collect-docker-metrics) below.
Netdata also has robust **Kubernetes monitoring** support thanks to a
[Helm chart](/packaging/installer/methods/kubernetes.md) to automate deployment, collectors for k8s agent services, and
robust [service discovery](https://github.com/netdata/agent-service-discovery/#service-discovery) to monitor the
services running inside of pods in your k8s cluster. Read more about [Kubernetes
monitoring](#collect-kubernetes-metrics) below.
Netdata also has robust **Kubernetes monitoring** support thanks to a [Helm chart](/packaging/installer/methods/kubernetes.md) to automate deployment, collectors for k8s agent services, and robust [service discovery](https://github.com/netdata/agent-service-discovery/#service-discovery) to monitor the services running inside of pods in your k8s cluster. Read more about [Kubernetes monitoring](#collect-kubernetes-metrics) below.
A handful of additional collectors gather metrics from container-related services, such as
[dockerd](/src/go/collectors/go.d.plugin/modules/docker/README.md) or [Docker
Engine](/src/go/collectors/go.d.plugin/modules/docker_engine/README.md). You can find all
container collectors in our supported collectors list under the
[containers/VMs](/src/collectors/COLLECTORS.md#containers-and-vms) and
[Kubernetes](/src/collectors/COLLECTORS.md#containers-and-vms) headings.
A handful of additional collectors gather metrics from container-related services, such as [dockerd](/src/go/collectors/go.d.plugin/modules/docker/README.md) or [Docker Engine](/src/go/collectors/go.d.plugin/modules/docker_engine/README.md). You can find all
container collectors in our data collection integrations list under the [containers/VMs](/src/collectors/COLLECTORS.md#containers-and-vms) and [Kubernetes](/src/collectors/COLLECTORS.md#kubernetes) sections.
## Collect Docker metrics
Netdata has robust Docker monitoring thanks to the aforementioned
[cgroups.plugin](/src/collectors/cgroups.plugin/README.md). By polling cgroups every second, Netdata can produce meaningful
visualizations about the CPU, memory, disk, and network utilization of all running containers on the host system with
zero configuration.
Netdata has robust Docker monitoring thanks to the aforementioned [cgroups.plugin](/src/collectors/cgroups.plugin/README.md).
Netdata also collects metrics from applications running inside of Docker containers. For example, if you create a MySQL
database container using `docker run --name some-mysql -e MYSQL_ROOT_PASSWORD=my-secret-pw -d mysql:tag`, it exposes
metrics on port 3306. You can configure the [MySQL
collector](/src/go/collectors/go.d.plugin/modules/mysql/README.md) to look at `127.0.0.1:3306` for
MySQL metrics:
Netdata also collects metrics from applications running inside of Docker containers. For example, if you create a MySQL database container using `docker run --name some-mysql -e MYSQL_ROOT_PASSWORD=my-secret-pw -d mysql:tag`, it exposes metrics on port 3306. You can configure the [MySQL collector](/src/go/collectors/go.d.plugin/modules/mysql/README.md) to look at `127.0.0.1:3306` for MySQL metrics:
```yml
jobs:
@ -50,52 +22,22 @@ jobs:
dsn: root:my-secret-pw@tcp(127.0.0.1:3306)/
```
Netdata then collects metrics from the container itself, along with dozens of [MySQL-specific
metrics](/src/go/collectors/go.d.plugin/modules/mysql/README.md#charts).
Netdata then collects metrics from the container itself, along with dozens of [MySQL-specific metrics](/src/go/collectors/go.d.plugin/modules/mysql/README.md#charts).
### Collect metrics from applications running in Docker containers
You could use this technique to monitor an entire infrastructure of Docker containers. The same [enable and configure](/src/collectors/REFERENCE.md) procedures apply whether an application runs on the host system or inside
a container. You may need to configure the target endpoint if it's not the application's default.
You could use this technique to monitor an entire infrastructure of Docker containers. The same [enable and configure](/src/collectors/REFERENCE.md) procedures apply whether an application runs on the host system or inside a container. You may need to configure the target endpoint if it's not the application's default.
Netdata can even [run in a Docker container](/packaging/docker/README.md) itself, and then collect metrics about the
host system, its own container with cgroups, and any applications you want to monitor.
Netdata can even [run in a Docker container](/packaging/docker/README.md) itself, and then collect metrics about the host system, its own container with cgroups, and any applications you want to monitor.
See our [application metrics doc](/docs/collecting-metrics/application-metrics.md) for details about Netdata's application metrics
collection capabilities.
See our [application metrics doc](/docs/collecting-metrics/application-metrics.md) for details about Netdata's application metrics collection capabilities.
## Collect Kubernetes metrics
We already have a few complementary tools and collectors for monitoring the many layers of a Kubernetes cluster,
_entirely for free_. These methods work together to help you troubleshoot performance or availability issues across
your k8s infrastructure.
- A [Helm chart](https://github.com/netdata/helmchart), which bootstraps a Netdata Agent pod on every node in your
cluster, plus an additional parent pod for storing metrics and managing alert notifications.
- A [service discovery plugin](https://github.com/netdata/agent-service-discovery), which discovers and creates
configuration files for [compatible
applications](https://github.com/netdata/helmchart#service-discovery-and-supported-services) and any endpoints
covered by our [generic Prometheus
collector](/src/go/collectors/go.d.plugin/modules/prometheus/README.md). With these
configuration files, Netdata collects metrics from any compatible applications as they run _inside_ a pod.
Service discovery happens without manual intervention as pods are created, destroyed, or moved between nodes.
- A [Kubelet collector](/src/go/collectors/go.d.plugin/modules/k8s_kubelet/README.md), which runs
on each node in a k8s cluster to monitor the number of pods/containers, the volume of operations on each container,
and more.
- A [kube-proxy collector](/src/go/collectors/go.d.plugin/modules/k8s_kubeproxy/README.md), which
also runs on each node and monitors latency and the volume of HTTP requests to the proxy.
- A [cgroups collector](/src/collectors/cgroups.plugin/README.md), which collects CPU, memory, and bandwidth metrics for
each container running on your k8s cluster.
For a holistic view of Netdata's Kubernetes monitoring capabilities, see our guide: [_Monitor a Kubernetes (k8s) cluster
with Netdata_](/docs/developer-and-contributor-corner/kubernetes-k8s-netdata.md).
## What's next?
Netdata is capable of collecting metrics from hundreds of applications, such as web servers, databases, messaging
brokers, and more. See more in the [application metrics doc](/docs/collecting-metrics/application-metrics.md).
If you already have all the information you need about collecting metrics, move into Netdata's meaningful visualizations
with [seeing an overview of your infrastructure](/docs/dashboards-and-charts/home-tab.md) using Netdata Cloud.
Netdata provides a host of tools and collectors for monitoring the many layers of a Kubernetes cluster. These methods work together to help you troubleshoot performance or availability issues across your k8s infrastructure.
- A [Helm chart](https://github.com/netdata/helmchart), which bootstraps a Netdata Agent pod on every node in your cluster, plus an additional parent pod for storing metrics and managing alert notifications.
- A [service discovery plugin](https://github.com/netdata/agent-service-discovery), which discovers and creates configuration files for [compatible applications](https://github.com/netdata/helmchart#service-discovery-and-supported-services) and any endpoints covered by our [generic Prometheus collector](/src/go/collectors/go.d.plugin/modules/prometheus/README.md). With these configuration files, Netdata collects metrics from any compatible applications as they run _inside_ a pod. Service discovery happens without manual intervention as pods are created, destroyed, or moved between nodes.
- A [Kubelet collector](/src/go/collectors/go.d.plugin/modules/k8s_kubelet/README.md), which runs on each node in a k8s cluster to monitor the number of pods/containers, the volume of operations on each container, and more.
- A [kube-proxy collector](/src/go/collectors/go.d.plugin/modules/k8s_kubeproxy/README.md), which also runs on each node and monitors latency and the volume of HTTP requests to the proxy.
- A [cgroups collector](/src/collectors/cgroups.plugin/README.md), which collects CPU, memory, and bandwidth metrics for each container running on your k8s cluster.
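
When the collectors above are configured by hand rather than through the Helm chart, a job definition might look like the following sketch for the Kubelet collector. The file name `go.d/k8s_kubelet.conf`, the job name, and the read-only Kubelet port `10255` are assumptions; check the Kubelet collector's README for the values that match your cluster.

```yml
# go.d/k8s_kubelet.conf (assumed file name)
jobs:
  - name: local                            # illustrative job name
    url: http://127.0.0.1:10255/metrics    # assumed Kubelet metrics endpoint
```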

@ -1,62 +1,13 @@
<!--
title: "Collect system metrics with Netdata"
sidebar_label: "System metrics"
description: "Netdata collects thousands of metrics from physical and virtual systems, IoT/edge devices, and containers with zero configuration."
custom_edit_url: "https://github.com/netdata/netdata/edit/master/docs/collecting-metrics/system-metrics.md"
learn_status: "Published"
learn_topic_type: "Concepts"
learn_rel_path: "Concepts"
-->
# Collect System Metrics with Netdata
# Collect system metrics with Netdata
Netdata collects thousands of metrics directly from the operating systems of physical and virtual machines, IoT/edge devices, and [containers](/docs/collecting-metrics/container-metrics.md) with zero configuration.
Netdata collects thousands of metrics directly from the operating systems of physical and virtual systems, IoT/edge
devices, and [containers](/docs/collecting-metrics/container-metrics.md) with zero configuration.
To gather system metrics, Netdata uses various plugins, each of which has one or more collectors for very specific metrics exposed by the host. The system metrics Netdata users interact with most for health monitoring and performance troubleshooting are collected and visualized by `proc.plugin`, `cgroups.plugin`, and `ebpf.plugin`.
To gather system metrics, Netdata uses roughly a dozen plugins, each of which has one or more collectors for very
specific metrics exposed by the host. The system metrics Netdata users interact with most for health monitoring and
performance troubleshooting are collected and visualized by `proc.plugin`, `cgroups.plugin`, and `ebpf.plugin`.
[**proc.plugin**](/src/collectors/proc.plugin/README.md) gathers metrics from the `/proc` and `/sys` folders in Linux systems, along with a few other endpoints, and is responsible for the bulk of the system metrics collected and visualized by Netdata. It collects CPU, memory, disks, load, networking, mount points, and more with zero configuration. It also allows Netdata to monitor its own resource utilization.
[**proc.plugin**](/src/collectors/proc.plugin/README.md) gathers metrics from the `/proc` and `/sys` folders in Linux
systems, along with a few other endpoints, and is responsible for the bulk of the system metrics collected and
visualized by Netdata. It collects CPU, memory, disks, load, networking, mount points, and more with zero configuration.
It even allows Netdata to monitor its own resource utilization!
[**cgroups.plugin**](/src/collectors/cgroups.plugin/README.md) collects rich metrics about containers and virtual machines
using the virtual files under `/sys/fs/cgroup`. By reading cgroups, Netdata can instantly collect resource utilization
metrics for systemd services, all containers (Docker, LXC, LXD, Libvirt, systemd-nspawn), and more. Learn more in the
[collecting container metrics](/docs/collecting-metrics/container-metrics.md) doc.
[**ebpf.plugin**](/src/collectors/ebpf.plugin/README.md): Netdata's extended Berkeley Packet Filter (eBPF) collector
monitors Linux kernel-level metrics for file descriptors, virtual filesystem IO, and process management. You can use our
eBPF collector to analyze how and when a process accesses files, when it makes system calls, whether it leaks memory or
creates zombie processes, and more.
While the above plugins and associated collectors are the most important for system metrics, there are many others. You
can find all system collectors in our [supported collectors list](/src/collectors/COLLECTORS.md#system-collectors).
## Collect Windows system metrics
Netdata is also capable of monitoring Windows systems. The [Windows
collector](/src/go/collectors/go.d.plugin/modules/windows/README.md) integrates with
[windows_exporter](https://github.com/prometheus-community/windows_exporter), a small Go-based binary that you can run
on Windows systems. The Windows collector then gathers metrics from an endpoint created by windows_exporter. For more
details, see [the requirements](/src/go/collectors/go.d.plugin/modules/windows/README.md#requirements).
Next, [configure](/src/go/collectors/go.d.plugin/modules/windows/README.md#configuration) the Windows
collector to point to the URL and port of your exposed endpoint. Restart Netdata with `sudo systemctl restart netdata`, or the [appropriate
method](/packaging/installer/README.md#maintaining-a-netdata-agent-installation) for your system. You'll start seeing Windows system metrics, such as CPU
utilization, memory, bandwidth per NIC, number of processes, and much more.
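
A minimal job for the Windows collector could look like the sketch below. It assumes windows_exporter is listening on port `9182` on a host reachable at `203.0.113.10`; the address, the port, and the job name are placeholders rather than values taken from this guide.

```yml
# go.d/windows.conf (assumed file name)
jobs:
  - name: win_server1                        # illustrative job name
    url: http://203.0.113.10:9182/metrics    # placeholder host and assumed exporter port
```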
For information about collecting metrics from applications _running on Windows systems_, see the [application metrics
doc](/docs/collecting-metrics/application-metrics.md#collect-metrics-from-applications-running-on-windows).
## What's next?
Because there's some overlap between system metrics and [container metrics](/docs/collecting-metrics/container-metrics.md), you
should investigate Netdata's container compatibility if you use them heavily in your infrastructure.
If you don't use containers, skip ahead to collecting [application metrics](/docs/collecting-metrics/application-metrics.md) with
Netdata.
[**cgroups.plugin**](/src/collectors/cgroups.plugin/README.md) collects rich metrics about containers and virtual machines using the virtual files under `/sys/fs/cgroup`. By reading cgroups, Netdata can instantly collect resource utilization metrics for systemd services, all containers (Docker, LXC, LXD, Libvirt, systemd-nspawn), and more. Learn more in the [collecting container metrics](/docs/collecting-metrics/container-metrics.md) doc.
[**ebpf.plugin**](/src/collectors/ebpf.plugin/README.md): Netdata's extended Berkeley Packet Filter (eBPF) collector monitors Linux kernel-level metrics for file descriptors, virtual filesystem IO, and process management. You can use our eBPF collector to analyze how and when a process accesses files, when it makes system calls, whether it leaks memory or creates zombie processes, and more.
While the above plugins and associated collectors are the most important for system metrics, there are many others. You can find all of our data collection integrations [here](/src/collectors/COLLECTORS.md#system-collectors).

@ -1,6 +1,6 @@
# Deployment Guides
Netdata can be used to monitor all kinds of infrastructure, from stand-alone tiny IoT devices to complex hybrid setups combining on-premise and cloud infrastructure, mixing bare-metal servers, virtual machines and containers.
Netdata can be used to monitor all kinds of infrastructure, from tiny stand-alone IoT devices to complex hybrid setups combining on-premise and cloud infrastructure, mixing bare-metal servers, virtual machines and containers.
There are 3 components to structure your Netdata ecosystem:

@ -1,60 +1,38 @@
# Deployment strategies
# Deployment Examples
## Deployment Options Overview
This section provides a quick overview of a few common deployment options. The next sections go into configuration examples and further reading.
This section provides a quick overview of a few common deployment options for Netdata.
### Stand-alone Deployment
You can read about [Standalone Deployment](/docs/deployment-guides/standalone-deployment.md) and [Deployment with Centralization Points](/docs/deployment-guides/deployment-with-centralization-points.md) in the documentation inside this section.
To help our users have a complete experience of Netdata when they install it for the first time, a Netdata Agent with default configuration
is a complete monitoring solution out of the box, having all these features enabled and available.
The sections below go into configuration examples about these deployment concepts.
The Agent will act as a _stand-alone_ Agent by default, and this is great to start out with for small setups and home labs. By [connecting each Agent to Cloud](/src/claim/README.md), you can see an overview of all your nodes, with aggregated charts and centralized alerting, without setting up a Parent.
## Deployment Configuration Details
![image](https://github.com/netdata/netdata/assets/116741/6a638175-aec4-4d46-85a6-520c283ab6a8)
### Stand-alone
### Parent Child Deployment
The stand-alone setup is configured out of the box with reasonable defaults, but please consult our [configuration documentation](/docs/netdata-agent/configuration/README.md) for details, including the overview of [common configuration changes](/docs/netdata-agent/configuration/common-configuration-changes.md).
An Agent connected to a Parent is called a _Child_. It will _stream_ metrics to its Parent. The Parent can then take care of storing metrics on behalf of that node (with longer retention), handle metrics queries for showing dashboards, and provide alerting.
### Parent Child
When using Cloud, it is recommended that just the Parent is connected to Cloud. Child Agents can then be configured to have short retention, in RAM instead of on Disk, and have alerting and other features disabled. Because they don't need to connect to Cloud themselves, those children can then be further secured by not allowing outbound traffic.
For setups involving Parent and Child Agents, they need to be configured for [streaming](docs/observability-centralization-points/metrics-centralization-points/configuration.md), through the configuration file `stream.conf`.
![image](https://github.com/netdata/netdata/assets/116741/cb65698d-a6b7-43ee-a2d1-c30d0a46f084)
This will instruct the Child to stream data to the Parent and the Parent to accept streaming connections for one or more Child Agents. To secure this connection, both need a shared API key (to replace the string `API_KEY` in the examples below). Additionally, the Child can be configured with one or more addresses of Parent Agents (`PARENT_IP_ADDRESS`).
This setup allows for leaner Child nodes and is good for setups with more than a handful of nodes. Metrics data remains accessible if the Child node is temporarily unavailable or decommissioned, although there is no failover in case the Parent becomes unavailable.
### ActiveActive Parent Deployment
For high availability, Parents can be configured to stream data for their children between them, and keep the data sets in sync. Child Agents are configured with the addresses of both Parent Agents, but will only stream to one of them at a time. When that Parent becomes unavailable, it reconnects to another. When the first Parent becomes available again, that Parent will catch up by receiving the backlog from the second.
With both Parent Agents connected to Cloud, Cloud will route queries to either Parent transparently, depending on their availability. Alerts triggered on either Parent will stream to Cloud, and Cloud will deduplicate and debounce state changes to prevent spurious notifications.
![image](https://github.com/netdata/netdata/assets/116741/6ae2b10c-7f7d-4503-aac4-0a9381c6f80b)
## Configuration Details
### Stand-alone Deployment
The stand-alone setup is configured out of the box with reasonable defaults, but please consult our [configuration documentation](/docs/netdata-agent/configuration/cheatsheet.md) for details, including the overview of [common configuration changes](/docs/netdata-agent/configuration/common-configuration-changes.md).
### Parent Child Deployment
For setups involving Child and Parent Agents, the Agents need to be configured for [_streaming_](/src/streaming/README.md), through the configuration file `stream.conf`. This will instruct the Child to stream data to the Parent and the Parent to accept streaming connections for one or more Child Agents. To secure this connection, both need to set up a shared API key (to replace the string `API_KEY` in the examples below). Additionally, the Child is configured with one or more addresses of Parent Agents (`PARENT_IP_ADDRESS`).
An API key is a key created with `uuidgen` and is used for authentication and/or customization in the Parent side. I.e. a Child will stream using the API key, and a Parent is configured to accept connections from Child, but can also apply different options for children by using multiple different API keys. The easiest setup uses just one API key for all Child Agents.
An API key is a key created with `uuidgen` and is used for authentication and/or customization on the Parent side. For example, a Child can stream using the API key, and a Parent can be configured to accept connections from the Child, but it can also apply different options for Children by using multiple different API keys. The easiest setup uses just one API key for all Child Agents.
#### Child config
As mentioned above, the recommendation is to not claim the Child to Cloud directly during your setup, avoiding establishing an [ACLK](/src/aclk/README.md) connection.
As mentioned above, we do not recommend claiming the Child to Cloud directly during your setup.
To reduce the footprint of the Netdata Agent on your production system, some capabilities can be switched OFF on the Child and kept ON on the Parent. In this example, Machine Learning and Alerting are disabled in the Child, so that the Parent can take the load. We also use RAM instead of disk to store metrics with limited retention, covering temporary network issues.
This is done in order to reduce the footprint of the Netdata Agent on your production system, as some capabilities can be switched OFF for the Child and kept ON for the Parent.
In this example, Machine Learning and Alerting are disabled for the Child, so that the Parent can take the load. We also use RAM instead of disk to store metrics with limited retention, covering temporary network issues.
##### netdata.conf
On the child node, edit `netdata.conf` by using the edit-config script: `/etc/netdata/edit-config netdata.conf` set the following parameters:
On the Child node, edit `netdata.conf` by using the [edit-config](/docs/netdata-agent/configuration/README.md#edit-netdataconf) script and set the following parameters:
```yaml
[db]
@ -85,9 +63,7 @@ On the child node, edit `netdata.conf` by using the edit-config script: `/etc/ne
##### stream.conf
To edit `stream.conf`, again use the edit-config script: `/etc/netdata/edit-config stream.conf`.
Set the following parameters:
To edit `stream.conf`, again use the [edit-config](/docs/netdata-agent/configuration/README.md#edit-netdataconf) script and set the following parameters:
```yaml
[stream]
@ -101,7 +77,7 @@ Set the following parameters:
#### Parent config
For the Parent, besides setting up streaming, the example will also provide an example configuration of multiple [tiers](/src/database/engine/README.md#tiering) of metrics [storage](/docs/netdata-agent/configuration/optimizing-metrics-database/change-metrics-storage.md), for 10 children, with about 2k metrics each.
For the Parent, besides setting up streaming, this example also provides configuration for multiple [tiers of metrics storage](/docs/netdata-agent/configuration/optimizing-metrics-database/change-metrics-storage.md#calculate-the-system-resources-ram-disk-space-needed-to-store-metrics), for 10 Children, with about 2k metrics each. This allows for:
- 1s granularity at tier 0 for 1 week
- 1m granularity at tier 1 for 1 month
@ -114,7 +90,7 @@ Requiring:
##### netdata.conf
On the Parent, edit `netdata.conf` with `/etc/netdata/edit-config netdata.conf` and set the following parameters:
On the Parent, edit `netdata.conf` by using the [edit-config](/docs/netdata-agent/configuration/README.md#edit-netdataconf) script and set the following parameters:
```yaml
[db]
@ -149,7 +125,7 @@ On the Parent, edit `netdata.conf` with `/etc/netdata/edit-config netdata.conf`
##### stream.conf
On the Parent node, edit `stream.conf` with `/etc/netdata/edit-config stream.conf`, and then set the following parameters:
On the Parent node, edit `stream.conf` by using the [edit-config](/docs/netdata-agent/configuration/README.md#edit-netdataconf) script and set the following parameters:
```yaml
[API_KEY]
@ -157,13 +133,13 @@ On the Parent node, edit `stream.conf` with `/etc/netdata/edit-config stream.con
enabled = yes
```
### ActiveActive Parent Deployment
### Active-Active Parents
In order to setup activeactive streaming between Parent 1 and Parent 2, Parent 1 needs to be instructed to stream data to Parent 2 and Parent 2 to stream data to Parent 1. The Child Agents need to be configured with the addresses of both Parent Agents. The Agent will only connect to one Parent at a time, falling back to the next if the previous failed. These examples use the same API key between Parent Agents as for connections from Child Agents.
To set up active-active streaming between Parent 1 and Parent 2, Parent 1 needs to be instructed to stream data to Parent 2, and Parent 2 to stream data to Parent 1. The Child Agents need to be configured with the addresses of both Parent Agents. An Agent will only connect to one Parent at a time, falling back to the next upon failure. These examples use the same API key for connections between Parent Agents and for connections from Child Agents.
On both Netdata Parent and all Child Agents, edit `stream.conf` with `/etc/netdata/edit-config stream.conf`:
On both Netdata Parents and all Child Agents, edit `stream.conf` by using the [edit-config](/docs/netdata-agent/configuration/README.md#edit-netdataconf) script:
##### stream.conf on Parent 1
#### stream.conf on Parent 1
```yaml
[stream]
@ -178,7 +154,7 @@ On both Netdata Parent and all Child Agents, edit `stream.conf` with `/etc/netda
enabled = yes
```
##### stream.conf on Parent 2
#### stream.conf on Parent 2
```yaml
[stream]
@ -192,7 +168,7 @@ On both Netdata Parent and all Child Agents, edit `stream.conf` with `/etc/netda
enabled = yes
```
##### stream.conf on Child Agents
#### stream.conf on Child Agents
```yaml
[stream]
@ -208,19 +184,11 @@ On both Netdata Parent and all Child Agents, edit `stream.conf` with `/etc/netda
We strongly recommend the following configuration changes for production deployments:
1. Understand Netdata's [security and privacy design](/docs/security-and-privacy-design/README.md) and
[secure your nodes](/docs/netdata-agent/securing-netdata-agents.md)
1. Understand Netdata's [security and privacy design](/docs/security-and-privacy-design/README.md) and [secure your nodes](/docs/netdata-agent/securing-netdata-agents.md)
To safeguard your infrastructure and comply with your organization's security policies.
2. Set up [streaming and replication](/src/streaming/README.md) to:
- Offload Netdata Agents running on production systems and free system resources for the production applications running on them.
- Isolate production systems from the rest of the world and improve security.
- Increase data retention.
- Make your data highly available.
3. [Optimize the Netdata Agents system utilization and performance](/docs/netdata-agent/configuration/optimize-the-netdata-agents-performance.md)
2. [Optimize the Netdata Agent's system utilization and performance](/docs/netdata-agent/configuration/optimize-the-netdata-agents-performance.md)
To save valuable system resources, especially when running on weak IoT devices.
@ -228,11 +196,11 @@ We also suggest that you:
1. [Use Netdata Cloud to access the dashboards](/docs/netdata-cloud/monitor-your-infrastructure.md)
For increased security, user management and access to our latest tools for advanced dashboarding and troubleshooting.
For increased security, user management and access to our latest features, tools and troubleshooting solutions.
2. [Change how long Netdata stores metrics](/docs/netdata-agent/configuration/optimizing-metrics-database/change-metrics-storage.md)
To control Netdata's memory use, when you have a lot of ephemeral metrics.
To control Netdata's memory use, when you have a lot of ephemeral metrics.
3. [Use host labels](/docs/netdata-agent/configuration/organize-systems-metrics-and-alerts.md)


@ -14,7 +14,7 @@ When metrics and logs are centralized, the Children are never queried for metric
| Unified infrastructure dashboards for logs | All logs are accessible via the same dashboard at Netdata Cloud, although they are unified per Netdata Parent |
| Centrally configured alerts | Yes, at Netdata Parents |
| Centrally dispatched alert notifications | Yes, at Netdata Cloud |
| Data are exclusively on-prem | Yes, Netdata Cloud queries Netdata Agents to satisfy dashboard queries. |
| Data are exclusively on-prem | Yes, Netdata Cloud queries Netdata Agents to satisfy dashboard queries. |
A configuration with 2 observability centralization points looks like this:
@ -24,7 +24,7 @@ flowchart LR
dashboard
for all nodes"]]
NC(["<b>Netdata Cloud</b>
decides which agents
decides which Agents
need to be queried"])
SA1["Netdata at AWS
A1"]
@ -93,16 +93,24 @@ flowchart LR
SB1 & SB2 & SBN ---|stream| PB
```
### Configuration steps for deploying Netdata with Observability Centralization Points
## Active-Active Parent Deployment
For high availability, Parents can be configured to stream data for their Children between them, and keep their data sets in sync. Children are configured with the addresses of both Parents, but will only stream to one of them at a time. When one Parent becomes unavailable, the Child reconnects to the other. When the first Parent becomes available again, that Parent will catch up by receiving the backlog from the second.
With both Parent Agents connected to Netdata Cloud, it will route queries to either of them transparently, depending on their availability. Alerts triggered on either Parent will stream to Cloud, and Cloud will deduplicate and debounce state changes to prevent spurious notifications.
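The Child side of this failover behavior can be sketched in `stream.conf` roughly as follows. This is a minimal sketch under stated assumptions: `PARENT_1_IP_ADDRESS`, `PARENT_2_IP_ADDRESS` and `API_KEY` are placeholders, and port `19999` assumes the Parents listen on the default port.

```yaml
# stream.conf on a Child (illustrative sketch)
[stream]
    enabled = yes
    # space-separated list of Parents; the Child streams to one at a time
    # and moves on to the next entry when the current one is unavailable
    destination = PARENT_1_IP_ADDRESS:19999 PARENT_2_IP_ADDRESS:19999
    # placeholder for the shared key that both Parents accept
    api key = API_KEY
```

Each Parent would additionally be configured to stream to the other, so that both hold the same data.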
## Configuration steps for deploying Netdata with Observability Centralization Points
For Metrics:
- Install Netdata agents on all systems and the Netdata Parents.
- Install Netdata Agents on all systems and the Netdata Parents.
- Configure `stream.conf` at the Netdata Parents to enable streaming access with an API key.
- Configure `stream.conf` at the Netdata Children to enable streaming to the configured Netdata Parents.
Check the [related section in our documentation](/docs/observability-centralization-points/metrics-centralization-points/README.md) for more info.
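As a hedged sketch of the two `stream.conf` steps above, the smallest working pair might look like this. `API_KEY` and `PARENT_IP_ADDRESS` are placeholders, and port `19999` assumes the Parent uses the default port.

```yaml
# stream.conf on the Parent (illustrative sketch)
# The section name is the API key that Children must present.
[API_KEY]
    enabled = yes

# stream.conf on each Child (illustrative sketch, a separate file on a separate node)
[stream]
    enabled = yes
    destination = PARENT_IP_ADDRESS:19999
    api key = API_KEY
```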
For Logs:
- Install `systemd-journal-remote` on all systems and the Netdata Parents.
@ -111,11 +119,4 @@ For Logs:
- Configure `systemd-journal-upload` at the Netdata Children to enable transmission of their logs to the Netdata Parents.
Optionally:
- Disable ML, health checks and dashboard access at Netdata Children to save resources and avoid duplicate notifications.
When using Netdata Cloud:
- Optionally: disable dashboard access on all Netdata agents (including Netdata Parents).
- Optionally: disable alert notifications on all Netdata agents (including Netdata Parents).
Check the [related section in our documentation](/docs/observability-centralization-points/logs-centralization-points-with-systemd-journald/README.md) for more info.


@ -1,22 +1,22 @@
# Standalone Deployment
To help our users have a complete experience of Netdata when they install it for the first time, a Netdata Agent with default configuration is a complete monitoring solution out of the box, having all its features enabled and available.
To help our users have a complete experience of Netdata when they install it for the first time, the Netdata Agent with default configuration is a complete monitoring solution out of the box, with features enabled and available.
So, each Netdata agent acts as a standalone monitoring system by default.
So, each Netdata Agent acts as a standalone monitoring system by default.
## Standalone agents, without Netdata Cloud
## Standalone Agents, without Netdata Cloud
| Feature | How it works |
|:---------------------------------------------:|:----------------------------------------------------:|
| Unified infrastructure dashboards for metrics | No, each Netdata agent provides its own dashboard |
| Unified infrastructure dashboards for logs | No, each Netdata agent exposes its own logs |
| Unified infrastructure dashboards for metrics | No, each Netdata Agent provides its own dashboard |
| Unified infrastructure dashboards for logs | No, each Netdata Agent exposes its own logs |
| Centrally configured alerts | No, each Netdata has its own alerts configuration |
| Centrally dispatched alert notifications | No, each Netdata agent sends notifications by itself |
| Centrally dispatched alert notifications | No, each Netdata Agent sends notifications by itself |
| Data are exclusively on-prem | Yes |
When using Standalone Netdata agents, each of them offers an API and a dashboard, at its own unique URL, that looks like `http://agent-ip:19999`.
When using Standalone Netdata Agents, each of them offers an API and a dashboard, at its own unique URL, that looks like `http://agent-ip:19999`.
So, each of the Netdata agents has to be accessed individually and independently of the others:
So, each of the Netdata Agents has to be accessed individually and independently of the others:
```mermaid
flowchart LR
@ -37,7 +37,7 @@ flowchart LR
WEB -->|URL N| SN
```
The same is true for alert notifications. Each of the Netdata agents runs its own alerts and sends notifications by itself, according to its configuration:
The same is true for alert notifications. Each of the Netdata Agents runs its own alerts and sends notifications by itself, according to its configuration:
```mermaid
flowchart LR
@ -61,23 +61,23 @@ flowchart LR
S1 & S2 & SN ==> OTHER
```
### Configuration steps for standalone Netdata agents without Netdata Cloud
### Configuration steps for standalone Netdata Agents without Netdata Cloud
No special configuration needed.
- Install Netdata agents on all your systems, then access each of them via its own unique URL, that looks like `http://agent-ip:19999/`.
- Install Netdata Agents on all your systems, then access each of them via its own unique URL, that looks like `http://agent-ip:19999/`.
## Standalone agents, with Netdata Cloud
## Standalone Agents, with Netdata Cloud
| Feature | How it works |
|:---------------------------------------------:|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
| Unified infrastructure dashboards for metrics | Yes, via Netdata Cloud, all charts aggregate metrics from all servers. |
| Unified infrastructure dashboards for logs | All logs are accessible via the same dashboard at Netdata Cloud, although they are not unified (ie. logs from different servers are not multiplexed into a single view) |
| Centrally configured alerts | No, each Netdata has its own alerts configuration |
| Centrally configured alerts | No, each Netdata has its own alerts configuration |
| Centrally dispatched alert notifications | Yes, via Netdata Cloud |
| Data are exclusively on-prem | Yes, Netdata Cloud queries Netdata Agents to satisfy dashboard queries. |
By [connecting all Netdata agents to Netdata Cloud](/src/claim/README.md), you can have a unified infrastructure view of all your nodes, with aggregated charts, without configuring [observability centralization points](/docs/observability-centralization-points/README.md).
By [connecting all Netdata Agents to Netdata Cloud](/src/claim/README.md), you can have a unified infrastructure view of all your nodes, with aggregated charts, without configuring [observability centralization points](/docs/observability-centralization-points/README.md).
```mermaid
flowchart LR
@ -85,7 +85,7 @@ flowchart LR
dashboard
for all nodes"]]
NC(["<b>Netdata Cloud</b>
decides which agents
decides which Agents
need to be queried"])
S1["Standalone
Netdata
@ -100,7 +100,7 @@ flowchart LR
NC -->|queries| S1 & S2 & SN
```
Similarly for alerts, Netdata Cloud receives all alert transitions from all agents, decides which notifications should be sent and how, applies silencing rules, maintenance windows and based on each Netdata Cloud space and user settings, dispatches notifications:
Similarly for alerts, Netdata Cloud receives all alert transitions from all Agents, decides which notifications should be sent and how, applies silencing rules and maintenance windows, and dispatches notifications based on each Netdata Cloud space and user settings:
```mermaid
flowchart LR
@ -128,12 +128,14 @@ flowchart LR
S1 & S2 & SN -->|alert transition| NC
```
> Note that alerts are still triggered by Netdata agents. Netdata Cloud takes care of the notifications only.
> **Note**
>
> Alerts are still triggered by Netdata Agents. Netdata Cloud only takes care of the notifications.
### Configuration steps for standalone Netdata agents with Netdata Cloud
### Configuration steps for standalone Netdata Agents with Netdata Cloud
- Install Netdata agents using the commands given by Netdata Cloud, so that they will be automatically added to your Netdata Cloud space. Otherwise, install Netdata agents and then claim them via the command line or their dashboard.
- Install Netdata Agents using the commands given by Netdata Cloud, so that they will be automatically connected to your Netdata Cloud space. Otherwise, install Netdata Agents and then claim them via the command line or their dashboard.
- Optionally: disable their direct dashboard access to secure them.
- Optionally: disable their alert notifications to avoid receiving email notifications directly from them (email notifications are automatically enabled when a working MTA is found on the systems Netdata agents are installed).
- Optionally: disable their alert notifications to avoid receiving email notifications directly from them (email notifications are automatically enabled when a working MTA is found on the systems where Netdata Agents are installed).
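
For the optional dashboard step, one possible approach is to switch off the Agent's local web server in `netdata.conf`. This is a sketch and not taken from the guide above; it assumes the dashboards are then reached only through Netdata Cloud.

```yaml
# netdata.conf on each standalone Agent (illustrative sketch)
[web]
    # assumption: with the local web server off, the local dashboard and API
    # are no longer reachable at http://agent-ip:19999
    mode = none
```

Email notifications sent by the Agent itself are controlled separately, for example by setting `SEND_EMAIL="NO"` in `health_alarm_notify.conf`.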