mirror of https://github.com/netdata/netdata.git synced 2025-04-02 20:48:06 +00:00

docs: grammar/format fixes to docs/netdata-agent/ ()

Ilya Mashchenko 2024-11-05 14:04:11 +02:00 committed by GitHub
parent 8af00b9dbe
commit 20a280aeb4
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
14 changed files with 113 additions and 115 deletions


@ -1,6 +1,6 @@
# Netdata Agent
The Netdata Agent is the main building block in a Netdata ecosystem. It is installed on all monitored systems to monitor system components, containers and applications.
The Netdata Agent is the main building block in the Netdata ecosystem. It is installed on all monitored systems to monitor system components, containers and applications.
The Netdata Agent is an **observability pipeline in a box** that can either operate standalone, or blend into a bigger pipeline made by more Netdata Agents (Children and Parents).
@ -53,7 +53,7 @@ stateDiagram-v2
1. **Discover**: auto-detect metric sources on localhost, auto-discover metric sources on Kubernetes.
2. **Collect**: query data sources to collect metric samples, using the optimal protocol for each data source. 800+ integrations supported, including dozens of native application protocols, OpenMetrics and StatsD.
3. **Detect Anomalies**: use the trained machine learning models for each metric, to detect in real-time if each sample collected is an outlier (an anomaly), or not.
3. **Detect Anomalies**: use the trained machine learning models for each metric to detect in real-time if each sample collected is an outlier (an anomaly), or not.
4. **Store**: keep collected samples and their anomaly status, in the time-series database (database mode `dbengine`) or a ring buffer (database modes `ram` and `alloc`).
5. **Learn**: train multiple machine learning models for each metric collected, learning behaviors and patterns for detecting anomalies.
6. **Check**: a health engine, triggering alerts and sending notifications. Netdata comes with hundreds of alert configurations that are automatically attached to metrics when they get collected, detecting errors, common configuration errors and performance issues.
@ -69,7 +69,7 @@ stateDiagram-v2
2. **Automation**: Netdata is designed to automate most of the process of setting up and running an observability solution. It is designed to instantly provide comprehensive dashboards and fully automated alerts, with zero configuration.
3. **High Fidelity Monitoring**: Netdata was born from our need to kill the console for observability. So, it provides metrics and logs in the same granularity and fidelity console tools do, but also comes with tools that go beyond metrics and logs, to provide a holistic view of the monitored infrastructure (e.g. check [Top Monitoring](/docs/top-monitoring-netdata-functions.md)).
3. **High Fidelity Monitoring**: Netdata was born from our need to kill the console for observability. So, it provides metrics and logs in the same granularity and fidelity console tools do, but also comes with tools that go beyond metrics and logs, to provide a holistic view of the monitored infrastructure (e.g., check [Top Monitoring](/docs/top-monitoring-netdata-functions.md)).
4. **Minimal impact on monitored systems and applications**: Netdata has been designed to have a minimal impact on the monitored systems and their applications. There are [independent studies](https://www.ivanomalavolta.com/files/papers/ICSOC_2023.pdf) reporting that Netdata excels in CPU usage, RAM utilization, Execution Time and the impact Netdata has on monitored applications and containers.
@ -77,8 +77,8 @@ stateDiagram-v2
## Dashboard Versions
The Netdata agents (Standalone, Children and Parents) **share the dashboard** of Netdata Cloud. However, when the user is logged-in and the Netdata agent is connected to Netdata Cloud, the following are enabled (which are otherwise disabled):
The Netdata agents (Standalone, Children and Parents) **share the dashboard** of Netdata Cloud. However, when the user is logged in and the Netdata agent is connected to Netdata Cloud, the following are enabled (which are otherwise disabled):
1. **Access to Sensitive Data**: Some data, like systemd-journal logs and several [Top Monitoring](/docs/top-monitoring-netdata-functions.md) features expose sensitive data, like IPs, ports, process command lines and more. To access all these when the dashboard is served directly from a Netdata agent, Netdata Cloud is required to verify that the user accessing the dashboard has the required permissions.
2. **Dynamic Configuration**: Netdata agents are configured via configuration files, manually or through some provisioning system. The latest Netdata includes a feature to allow users change some of the configuration (collectors, alerts) via the dashboard. This feature is only available to users of paid Netdata Cloud plan.
2. **Dynamic Configuration**: Netdata agents are configured via configuration files, manually or through some provisioning system. The latest Netdata includes a feature to allow users to change some configurations (collectors, alerts) via the dashboard. This feature is only available to users of a paid Netdata Cloud plan.


@ -6,7 +6,7 @@
## Introduction
When preparing to backup a Netdata Agent it is worth considering that there are different kinds of data that you may wish to backup independently or all together:
When planning a Netdata Agent backup, it's essential to recognize the types of data that can be backed up, either individually or collectively:
| Data type | Description | Location |
|---------------------|------------------------------------------------------|-----------------------------------------------------------------|
@ -18,15 +18,15 @@ When preparing to backup a Netdata Agent it is worth considering that there are
### Backing up to restore data in case of a node failure
In this standard scenario, you are backing up your Netdata Agent in case of a node failure or data corruption so that the metrics and the configuration can be recovered. The purpose is not to backup/restore the application itself.
In this standard scenario, you're backing up your Netdata Agent in case of a node failure or data corruption so that the metrics and the configuration can be recovered. The purpose is not to backup/restore the application itself.
1. Verify that the directory paths in the table above contain the information you expect.
1. Verify that the directory paths in the table above contain the information you expect.
> **Note**
> The specific paths may vary depending on installation method, Operating System, and whether it is a Docker/Kubernetes deployment.
2. It is recommended that you [stop the Netdata Agent](/docs/netdata-agent/start-stop-restart.md) when backing up the Metrics/database files.
Backing up the Agent configuration and Identity folders is straightforward as they should not be changing very frequently.
Backing up the Agent configuration and Identity folders is straightforward as they shouldn't be changing very frequently.
3. Using a backup tool such as `tar` you will need to run the backup as _root_ or as the _netdata_ user to access all the files in the directories.
@ -37,7 +37,7 @@ In this standard scenario, you are backing up your Netdata Agent in case of a no
Stopping the Netdata agent is typically necessary to back up the database files of the Netdata Agent.
If you want to minimize the gap in metrics caused by stopping the Netdata Agent, consider implementing a backup job or script that follows this sequence:
- Back up the Agent configuration and Identity directories
- Stop the Netdata service
- Back up the database files
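The sequence above can be sketched as a small dry-run planner. This is a minimal sketch, not Netdata tooling: the function name `backup_plan`, the archive names, the `systemctl` service name `netdata`, and the configuration and identity paths (`/etc/netdata`, `/var/lib/netdata`) are assumptions; only `/var/cache/netdata` appears on this page. The final restart step is also an assumption, since the list above is cut off by the diff hunk. The function returns the commands in order instead of running them.

```python
from typing import List

# Assumed default locations; verify them against your own installation.
CONFIG_DIR = "/etc/netdata"          # assumption: Agent configuration
IDENTITY_DIR = "/var/lib/netdata"    # assumption: Agent identity
METRICS_DIR = "/var/cache/netdata"   # metrics database (path used on this page)


def backup_plan(dest: str) -> List[List[str]]:
    """Return the backup commands in an order that keeps the metrics gap short."""
    return [
        # 1. Configuration and identity rarely change, so archive them
        #    while the Agent is still running.
        ["tar", "-czf", f"{dest}/netdata-config-identity.tar.gz",
         CONFIG_DIR, IDENTITY_DIR],
        # 2. Stop the service only for the database copy.
        ["systemctl", "stop", "netdata"],
        # 3. Archive the database files while nothing is writing to them.
        ["tar", "-czf", f"{dest}/netdata-metrics.tar.gz", METRICS_DIR],
        # 4. Assumed final step: bring the Agent back up.
        ["systemctl", "start", "netdata"],
    ]


if __name__ == "__main__":
    # Print the plan instead of executing it.
    for command in backup_plan("/backup"):
        print(" ".join(command))
```

If you turn the plan into a real job, the commands would need to run as _root_ or as the _netdata_ user so that `tar` can read every file.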
@ -45,18 +45,18 @@ If you want to minimize the gap in metrics caused by stopping the Netdata Agent,
### Restoring Netdata
1. Ensure that the Netdata agent is installed and is [stopped](/docs/netdata-agent/start-stop-restart.md))
1. Ensure that the Netdata agent is installed and is [stopped](/docs/netdata-agent/start-stop-restart.md)
If you plan to deploy the Agent and restore a backup on top of it, then you might find it helpful to use the [`--dont-start-it`](/packaging/installer/methods/kickstart.md#other-options) option upon installation.
```bash
wget -O /tmp/netdata-kickstart.sh https://get.netdata.cloud/kickstart.sh && sh /tmp/netdata-kickstart.sh --dont-start-it
```
> **Note**
> If you are going to restore the database files then you should first ensure that the Metrics directory is empty.
>
> ```bash
> **Note**
> If you are going to restore the database files, then you should first ensure that the Metrics directory is empty.
>
> ```bash
> sudo rm -Rf /var/cache/netdata
> ```


@ -1,6 +1,6 @@
# Anonymous telemetry events
By default, Netdata collects anonymous usage information from the open-source monitoring agent. For agent events like start,stop,crash etc we use our own cloud function in GCP. For frontend telemetry (page views etc.) on the agent dashboard itself we use the open-source
By default, Netdata collects anonymous usage information from the open-source monitoring agent. For agent events like start, stop, crash, etc. we use our own cloud function in GCP. For frontend telemetry (page views etc.) on the agent dashboard itself, we use the open-source
product analytics platform [PostHog](https://github.com/PostHog/posthog).
We are strongly committed to your [data privacy](https://netdata.cloud/privacy/).
@ -8,10 +8,10 @@ We are strongly committed to your [data privacy](https://netdata.cloud/privacy/)
We use the statistics gathered from this information for two purposes:
1. **Quality assurance**, to help us understand if Netdata behaves as expected, and to help us classify repeated
issues with certain distributions or environments.
issues with certain distributions or environments.
2. **Usage statistics**, to help us interpret how people use the Netdata agent in real-world environments, and to help
us identify how our development/design decisions influence the community.
us identify how our development/design decisions influence the community.
Netdata collects usage information via two different channels:
@ -24,7 +24,7 @@ You can opt-out from sending anonymous statistics to Netdata through three diffe
When you kick off an Agent dashboard session by visiting `http://NODE:19999`, Netdata initializes a PostHog session and masks various event attributes.
_Note_: You can see the relevant code in the [dashboard repository](https://github.com/netdata/dashboard/blob/master/src/domains/global/sagas.ts#L107) where the `window.posthog.register()` call is made.
_Note_: You can see the relevant code in the [dashboard repository](https://github.com/netdata/dashboard/blob/master/src/domains/global/sagas.ts#L107) where the `window.posthog.register()` call is made.
```JavaScript
window.posthog.register({
@ -44,7 +44,7 @@ variable is controlled via the [opt-out mechanism](#opt-out).
## Agent Backend - Anonymous Statistics Script
Every time the daemon is started or stopped and every time a fatal condition is encountered, Netdata uses the anonymous
statistics script to collect system information and send it to the Netdata telemetry cloud function via an http call. The information collected for all
statistics script to collect system information and send it to the Netdata telemetry cloud function via an HTTP call. The information collected for all
events is:
- Netdata version
@ -53,7 +53,7 @@ events is:
- Virtualization technology
- Containerization technology
Furthermore, the FATAL event sends the Netdata process & thread name, along with the source code function, source code
Furthermore, the FATAL event sends the Netdata process and thread name, along with the source code function, source code
filename and source code line number of the fatal error.
Starting with v1.21, we additionally collect information about:
@ -61,7 +61,7 @@ Starting with v1.21, we additionally collect information about:
- Failures to build the dependencies required to use Cloud features.
- Unavailability of Cloud features in an agent.
- Failures to connect to the Cloud in case the [connection process](/src/claim/README.md) has been completed. This includes error codes
to inform the Netdata team about the reason why the connection failed.
to inform the Netdata team about the reason why the connection failed.
To see exactly what and how is collected, you can review the script template `daemon/anonymous-statistics.sh.in`. The
template is converted to a bash script called `anonymous-statistics.sh`, installed under the Netdata `plugins
@ -79,12 +79,12 @@ installation, including manual, offline, and macOS installations. Create the fil
**Pass the option `--disable-telemetry` to any of the installer scripts in the [installation
docs](/packaging/installer/README.md).** You can append this option during the initial installation or a manual
update. You can also export the environment variable `DISABLE_TELEMETRY` with a non-zero or non-empty value
(e.g: `export DISABLE_TELEMETRY=1`).
(e.g., `export DISABLE_TELEMETRY=1`).
When using Docker, **set your `DISABLE_TELEMETRY` environment variable to `1`.** You can set this variable with the following
command: `export DISABLE_TELEMETRY=1`. When creating a container using Netdata's [Docker
image](/packaging/docker/README.md#create-a-new-netdata-agent-container) for the first time, this variable will disable
the anonymous statistics script inside of the container.
the anonymous statistics script inside the container.
Each of these opt-out processes does the following:


@ -1,6 +1,6 @@
# Useful management and configuration actions
Below you will find some of the most common actions that one can take while using Netdata. You can use this page as a quick reference for installing Netdata, connecting a node to the Cloud, properly editing the configuration, accessing Netdata's API, and more!
Below are some of the most common actions one can take while using Netdata. You can use this page as a quick reference for installing Netdata, connecting a node to the Cloud, properly editing the configuration, accessing Netdata's API, and more!
## Install Netdata
@ -45,8 +45,8 @@ sudo ./edit-config go.d.conf # edit a plugin's config
```yaml
modules:
activemq: no # disabled
cockroachdb: yes # enabled
activemq: no # disabled
cockroachdb: yes # enabled
```
### Edit a collector's config


@ -139,6 +139,6 @@ The following restrictions apply to host label names:
- Names cannot start with `_`, but it can be present in other parts of the name.
- Names only accept alphabet letters, numbers, dots, and dashes.
The policy for values is more flexible, but you can not use exclamation marks (`!`), whitespaces (` `), single quotes
The policy for values is more flexible, but you cannot use exclamation marks (`!`), whitespaces (` `), single quotes
(`'`), double quotes (`"`), or asterisks (`*`), because they are used to compare label values in health alerts and
templates.
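The restrictions visible in this hunk can be expressed as two small checks. This is a sketch under assumptions: the function names are hypothetical, the rule list above is cut off by the diff and may have more entries, and treating an empty value as valid is our choice, not something the text states.

```python
import re

# Names: letters, digits, dots and dashes; an underscore is accepted
# anywhere except as the first character.
_NAME_RE = re.compile(r"[A-Za-z0-9.\-][A-Za-z0-9._\-]*")

# Values: anything except '!', whitespace, single or double quotes, and '*'.
_FORBIDDEN_VALUE_RE = re.compile(r"[!\s'\"*]")


def is_valid_label_name(name: str) -> bool:
    """Check a host label name against the rules listed above."""
    return _NAME_RE.fullmatch(name) is not None


def is_valid_label_value(value: str) -> bool:
    """Check a host label value for the characters reserved by health alerts."""
    return _FORBIDDEN_VALUE_RE.search(value) is None


if __name__ == "__main__":
    for candidate in ("type", "_internal", "data center"):
        print(candidate, is_valid_label_name(candidate))
```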


@ -13,7 +13,7 @@ The Dynamic Configuration Manager allows direct configuration of collectors and
> **Info**
>
> To understand what actions users can perform based on their role, refer to the [Role Based Access documentation](/docs/netdata-cloud/authentication-and-authorization/role-based-access-model.md#dynamic-configuration-manager).
> To understand what actions users can perform based on their role, refer to the [Role-Based Access documentation](/docs/netdata-cloud/authentication-and-authorization/role-based-access-model.md#dynamic-configuration-manager).
## Collectors
@ -37,9 +37,9 @@ A job represents a running instance of a module with a specific configuration. T
Every job has a designated "source type" indicating its origin:
- **Stock**: Pre-installed with Netdata and provides basic data collection for common services.
- **User**: Originates from user-created files on the node.
- **User**: Created from user-defined configuration files on the node.
- **Discovered**: Automatically generated by Netdata upon discovering a service running on the node.
- **Dynamic Configuration**: Created and managed using the Dynamic Configuration Manager.
- **Dynamic Configuration**: Managed and created through the Dynamic Configuration Manager.
You can manage individual jobs using the following actions:
@ -53,7 +53,7 @@ You can manage individual jobs using the following actions:
## Health
Each entry in the Health tab contains an Alert template, that then is used to create Alerts.
Each entry in the Health tab contains an Alert template that then is used to create Alerts.
The functionality in the main view is the same as with the [Collectors tab](#collectors).


@ -1,9 +1,9 @@
# How to optimize the Netdata Agent's performance
We designed the Netdata Agent to be incredibly lightweight, even when it's collecting a few thousand dimensions every
second and visualizing that data into hundreds of charts. However, the default settings of the Netdata Agent are not
optimized for performance, but for a simple, standalone setup. We want the first install to give you something you can
run without any configuration. Most of the settings and options are enabled, since we want you to experience the full
second and visualizing that data into hundreds of charts. However, the default settings of the Netdata Agent aren't
optimized for performance, but for a simple, standalone setup. We want the first installation to give you something you can
run without any configuration. Most of the settings and options are enabled since we want you to experience the full
thing.
By default, Netdata will automatically detect applications running on the node it is installed to start collecting
@ -17,16 +17,16 @@ Netdata for production use.
The following table summarizes the effect of each optimization on the CPU, RAM and Disk IO utilization in production.
| Optimization | CPU | RAM | Disk IO |
|-------------------------------------------------------------------------------------------------------------------------------|--------------------|--------------------|--------------------|
| [Use streaming and replication](#use-streaming-and-replication) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| [Disable unneeded plugins or collectors](#disable-unneeded-plugins-or-collectors) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| [Reduce data collection frequency](#reduce-collection-frequency) | :heavy_check_mark: | | :heavy_check_mark: |
| Optimization | CPU | RAM | Disk IO |
|-----------------------------------------------------------------------------------------------------------------------------------|--------------------|--------------------|--------------------|
| [Use streaming and replication](#use-streaming-and-replication) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| [Disable unneeded plugins or collectors](#disable-unneeded-plugins-or-collectors) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| [Reduce data collection frequency](#reduce-collection-frequency) | :heavy_check_mark: | | :heavy_check_mark: |
| [Change how long Netdata stores metrics](/docs/netdata-agent/configuration/optimizing-metrics-database/change-metrics-storage.md) | | :heavy_check_mark: | :heavy_check_mark: |
| [Use a different metric storage database](/src/database/README.md) | | :heavy_check_mark: | :heavy_check_mark: |
| [Disable machine learning](#disable-machine-learning) | :heavy_check_mark: | | |
| [Use a reverse proxy](#run-netdata-behind-a-proxy) | :heavy_check_mark: | | |
| [Disable/lower gzip compression for the agent dashboard](#disablelower-gzip-compression-for-the-dashboard) | :heavy_check_mark: | | |
| [Use a different metric storage database](/src/database/README.md) | | :heavy_check_mark: | :heavy_check_mark: |
| [Disable machine learning](#disable-machine-learning) | :heavy_check_mark: | | |
| [Use a reverse proxy](#run-netdata-behind-a-proxy) | :heavy_check_mark: | | |
| [Disable/lower gzip compression for the agent dashboard](#disablelower-gzip-compression-for-the-dashboard) | :heavy_check_mark: | | |
## Resources required by a default Netdata installation
@ -39,15 +39,15 @@ You can configure almost all aspects of data collection/retention, and certain a
Expect about:
- 1-3% of a single core for the netdata core
- 1-3% of a single core for the various collectors (e.g. go.d.plugin, apps.plugin)
- 1-3% of a single core for the various collectors (e.g., go.d.plugin, apps.plugin)
- 5-10% of a single core, when ML training runs
Your experience may vary depending on the number of metrics collected, the collectors enabled and the specific
environment they run on, i.e. the work they have to do to collect these metrics.
environment they run on, i.e., the work they have to do to collect these metrics.
As a general rule, for modern hardware and VMs, the total CPU consumption of a standalone Netdata installation,
including all its components, should be below 5 - 15% of a single core. For example, on 8 core server it will use only
0.6% - 1.8% of a total CPU capacity, depending on the CPU characteristics.
0.6% - 1.8% of the total CPU capacity, depending on the CPU characteristics.
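The conversion behind these figures is a plain division by the core count. As a minimal sketch (the function name is hypothetical): 5% and 15% of one core on an 8 core server work out to 0.625% and 1.875% of total capacity, in line with the roughly 0.6% - 1.8% quoted above.

```python
def share_of_total_cpu(single_core_pct: float, cores: int) -> float:
    """Convert a percentage of one core into a percentage of the whole machine."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return single_core_pct / cores


if __name__ == "__main__":
    low = share_of_total_cpu(5, 8)
    high = share_of_total_cpu(15, 8)
    print(f"{low}% - {high}% of total CPU capacity")
```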
The Netdata Agent runs with the lowest
possible [process scheduling policy](/src/daemon/README.md#netdata-process-scheduling-policy),
@ -55,7 +55,7 @@ which is `nice 19`, and uses the `idle` process scheduler. Together, these setti
resources when the node has CPU resources to spare. If the node reaches 100% CPU utilization, the Agent is stopped first
to ensure your applications get any available resources.
To reduce CPU usage you can (either one or a combination of the following actions):
To reduce CPU usage, you can (either one or a combination of the following actions):
1. [Disable machine learning](#disable-machine-learning),
2. [Use streaming and replication](#use-streaming-and-replication),
@ -77,16 +77,16 @@ To estimate and control memory consumption, you can (either one or a combination
### Disk footprint and I/O
By default, Netdata should not use more than 1GB of disk space, most of which is dedicated for storing metric data and
metadata. For typical installations collecting 2000 - 3000 metrics, this storage should provide a few days of
By default, Netdata shouldn't use more than 1GB of disk space, most of which is dedicated to storing metric data and
metadata. For typical installations collecting 2000–3000 metrics, this storage should provide a few days of
high-resolution retention (per second), about a month of mid-resolution retention (per minute) and more than a year of
low-resolution retention (per hour).
Netdata spreads I/O operations across time. For typical standalone installations there should be a few write operations
every 5-10 seconds of a few kilobytes each, occasionally up to 1MB. In addition, under heavy load, collectors that
Netdata spreads I/O operations across time. For typical standalone installations, there should be a few write operations
every 5–10 seconds of a few kilobytes each, occasionally up to 1MB. In addition, under a heavy load, collectors that
require disk I/O may stop and show gaps in charts.
To optimize your disk footprint in any aspect described below you can:
To optimize your disk footprint in any aspect described below, you can:
To configure retention, you can:
@ -129,8 +129,7 @@ See [using a different metric storage database](/src/database/README.md).
If you know that you don't need an [entire plugin or a specific
collector](/src/collectors/README.md#collector-architecture-and-terminology),
you can disable any of them. Keep in mind that if a plugin/collector has nothing to do, it simply shuts down and does
not consume system resources. You will only improve the Agent's performance by disabling plugins/collectors that are
you can disable any of them. Keep in mind that if a plugin/collector has nothing to do, it simply shuts down and doesn't consume system resources. You will only improve the Agent's performance by disabling plugins/collectors that are
actively collecting metrics.
Open `netdata.conf` and scroll down to the `[plugins]` section. To disable any plugin, uncomment it and set the value to


@ -1,7 +1,7 @@
# Organize systems, metrics, and alerts
When you use Netdata to monitor and troubleshoot an entire infrastructure, you need sophisticated ways of keeping everything organized.
Netdata allows to organize your observability infrastructure with Spaces, Rooms, virtual nodes, host labels, and metric labels.
Netdata allows organizing your observability infrastructure with Spaces, Rooms, virtual nodes, host labels, and metric labels.
## Spaces and Rooms
@ -10,12 +10,12 @@ grouping of nodes and people. A node can only appear in a single space, while pe
The [Rooms](/docs/netdata-cloud/organize-your-infrastructure-invite-your-team.md#netdata-cloud-rooms) in a space bring together nodes and people in
collaboration areas. Rooms can also be used for fine-tuned
[role based access control](/docs/netdata-cloud/authentication-and-authorization/role-based-access-model.md).
[role-based access control](/docs/netdata-cloud/authentication-and-authorization/role-based-access-model.md).
## Virtual nodes
Netdata's virtual nodes functionality allows you to define nodes in configuration files and have them be treated as regular nodes
in all of the UI, dashboards, tabs, filters etc. For example, you can create a virtual node each for all your Windows machines
in all the UI, dashboards, tabs, filters, etc. For example, you can create a virtual node each for all your Windows machines
and monitor them as discrete entities. Virtual nodes can help you simplify your infrastructure monitoring and focus on the
individual node that matters.
@ -28,9 +28,9 @@ To define your windows server as a virtual node you need to:
guid: <value>
```
Just remember to use a valid guid (On Linux you can use `uuidgen` command to generate one, on Windows just use the `[guid]::NewGuid()` command in PowerShell)
Just remember to use a valid guid (On Linux you can use `uuidgen` command to generate one, on Windows just use the `[guid]::NewGuid()` command in PowerShell)
* Add the vnode config to the data collection job. e.g. in `go.d/windows.conf`:
* Add the vnode config to the data collection job. e.g., in `go.d/windows.conf`:
```yaml
jobs:
@ -44,7 +44,7 @@ To define your windows server as a virtual node you need to:
Host labels can be extremely useful when:
* You need alerts that adapt to the system's purpose
* You need properly-labeled metrics archiving so you can sort, correlate, and mash-up your data to your heart's content.
* You need properly labeled metrics archiving so you can sort, correlate, and mash-up your data to your heart's content.
* You need to keep tabs on ephemeral Docker containers in a Kubernetes cluster.
Let's take a peek into how to create host labels and apply them across a few of Netdata's features to give you more
@ -136,7 +136,7 @@ streamed from a child to its parent node, which concentrates an entire infrastru
and virtualization information in one place: the parent.
Now, if you'd like to remind yourself of how much RAM a certain child node has, you can access
`http://localhost:19999/host/CHILD_HOSTNAME/api/v1/info` and reference the automatically-generated host labels from the
`http://localhost:19999/host/CHILD_HOSTNAME/api/v1/info` and reference the automatically generated host labels from the
child system. It's a vastly simplified way of accessing critical information about your infrastructure.
> ⚠️ Because automatic labels for child nodes are accessible via API calls, and contain sensitive information like
@ -161,16 +161,16 @@ For example, let's use configuration example from earlier:
installed = 20200218
```
You could now create a new health entity (checking if disk space will run out soon) that applies only to any host
You could now create a new health entity (checking if disk space runs out soon) that applies only to any host
labeled `webserver`:
```yaml
template: disk_fill_rate
on: disk.space
lookup: max -1s at -30m unaligned of avail
calc: ($this - $avail) / (30 * 60)
every: 15s
host labels: type = webserver
on: disk.space
lookup: max -1s at -30m unaligned of avail
calc: ($this - $avail) / (30 * 60)
every: 15s
host labels: type = webserver
```
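The `calc` line in this template is a simple rate. As we read it, `$this` is the value of `avail` 30 minutes ago (the result of the `lookup` line) and `$avail` is the current value, so their difference divided by the 1800 seconds in the window gives the space consumed per second. The sketch below mirrors that arithmetic; the function and parameter names are hypothetical, and the unit is whatever the `disk.space` chart reports.

```python
def disk_fill_rate(avail_30m_ago: float, avail_now: float,
                   window_seconds: int = 30 * 60) -> float:
    """Mirror `($this - $avail) / (30 * 60)`: space consumed per second.

    A positive result means free space is shrinking; a negative one
    means space was freed during the window.
    """
    return (avail_30m_ago - avail_now) / window_seconds


if __name__ == "__main__":
    # Hypothetical sample: 100 units free half an hour ago, 82 units free now.
    print(disk_fill_rate(100.0, 82.0))
```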
Or, by using one of the automatic labels, for only webserver systems running a specific OS:
@ -199,7 +199,7 @@ documentation](/src/health/REFERENCE.md#alert-line-host-labels) for more details
If you have enabled any metrics exporting via our experimental [exporters](/src/exporting/README.md), any new host
labels you created manually are sent to the destination database alongside metrics. You can change this behavior by
editing `exporting.conf`, and you can even send automatically-generated labels on with exported metrics.
editing `exporting.conf`, and you can even send automatically generated labels on with exported metrics.
```text
[exporting:global]
@ -228,8 +228,8 @@ more about exporting, read the [documentation](/src/exporting/README.md).
The Netdata aggregate charts allow you to filter and group metrics based on label name-value pairs.
All go.d plugin collectors support the specification of labels at the "collection job" level. Some collectors come with out of the box
labels (e.g. generic Prometheus collector, Kubernetes, Docker and more). But you can also add your own custom labels, by configuring
All go.d plugin collectors support the specification of labels at the "collection job" level. Some collectors come with out-of-the-box
labels (e.g. generic Prometheus collector, Kubernetes, Docker and more). But you can also add your own custom labels by configuring
the data collection jobs.
For example, suppose we have a single Netdata agent, collecting data from two remote Apache web servers, located in different data centers.
@ -251,4 +251,4 @@ jobs:
location: "New York"
```
Of course you may define as many custom label/value pairs as you like, in as many data collection jobs you need.
Of course, you may define as many custom label/value pairs as you like, in as many data collection jobs as you need.


@ -10,13 +10,12 @@ and secure their infrastructures.
Viewers will be able to get some information about the system Netdata is running. This information is everything the dashboard
provides. The dashboard includes a list of the services each system runs (the legends of the charts under the `Systemd Services`
section), the applications running (the legends of the charts under the `Applications` section), the disks of the system and
section), the applications running (the legends of the charts under the `Applications` section), the disks of the system and
their names, the user accounts of the system that are running processes (the `Users` and `User Groups` section of the dashboard),
the network interfaces and their names (not the IPs) and detailed information about the performance of the system and its applications.
This information is not sensitive (meaning that it is not your business data), but **it is important for possible attackers**.
It will give them clues on what to check, what to try and in the case of DDoS against your applications, they will know if they
are doing it right or not.
It will give them clues on what to check, what to try and in the case of DDoS against your applications, they will know if they're doing it right or not.
Also, viewers could use Netdata itself to stress your servers. Although the Netdata daemon runs unprivileged, with the minimum
process priority (scheduling priority `idle` - lower than nice 19) and adjusts its OutOfMemory (OOM) score to 1000 (so that it
@ -29,7 +28,7 @@ that align with your goals and your organization's standards.
- [Disable the local dashboard](#disable-the-local-dashboard): **Simplest and recommended method** for those who have
added nodes to Netdata Cloud and view dashboards and metrics there.
- [Expose Netdata only in a private LAN](#expose-netdata-only-in-a-private-lan). Simplest and recommended method for those who do not use Netdata Cloud.
- [Expose Netdata only in a private LAN](#expose-netdata-only-in-a-private-lan). Simplest and recommended method for those who don't use Netdata Cloud.
- [Fine-grained access control](#fine-grained-access-control): Allow local dashboard access from
only certain IP addresses, such as a trusted static IP or connections from behind a management LAN. Full support for Netdata Cloud.
@ -67,7 +66,7 @@ that node no longer serves its local dashboard.
`netdata.conf` and use
> `edit-config`.
If you are using Netdata with Docker, make sure to set the `NETDATA_HEALTHCHECK_TARGET` environment variable to `cli`.
If you're using Netdata with Docker, make sure to set the `NETDATA_HEALTHCHECK_TARGET` environment variable to `cli`.
## Expose Netdata only in a private LAN
@ -85,7 +84,7 @@ You can bind Netdata to multiple IPs and ports. If you use hostnames, Netdata wi
**This is the best and the suggested way to protect Netdata**. Your systems **should** have a private administration and management
LAN, so that all management tasks are performed without any possibility of them being exposed on the internet.
For cloud based installations, if your cloud provider does not provide such a private LAN (or if you use multiple providers),
For Cloud-based installations, if your cloud provider doesn't provide such a private LAN (or if you use multiple providers),
you can create a virtual management and administration LAN with tools like `tincd` or `gvpe`. These tools create a mesh VPN
allowing all servers to communicate securely and privately. Your administration stations join this mesh VPN to get access to
management and administration tasks on all your cloud servers.
@ -122,7 +121,7 @@ patterns](/src/libnetdata/simple_pattern/README.md).
```
The `allow connections from` setting is global and restricts access to the dashboard, badges, streaming, API, and
`netdata.conf`, but you can also set each of those access lists more granularly if you choose:
`netdata.conf`, but you can also set each of those access lists in more detail if you want:
```text
[web]
@ -140,7 +139,7 @@ dashboard in transit. The connection to Netdata Cloud is always secured with TLS
## Use an authenticating web server in proxy mode
Use one web server to provide authentication in front of **all your Netdata servers**. So, you will be accessing all your Netdata with
URLs like `http://{HOST}/netdata/{NETDATA_HOSTNAME}/` and authentication will be shared among all of them (you will sign-in once for all your servers).
URLs like `http://{HOST}/netdata/{NETDATA_HOSTNAME}/` and authentication will be shared among all of them (you will sign in once for all your servers).
Instructions are provided on how to set the proxy configuration to have Netdata run behind
[nginx](/docs/netdata-agent/configuration/running-the-netdata-agent-behind-a-reverse-proxy/Running-behind-nginx.md),
[HAproxy](/docs/netdata-agent/configuration/running-the-netdata-agent-behind-a-reverse-proxy/Running-behind-haproxy.md),
@ -151,25 +150,25 @@ Instructions are provided on how to set the proxy configuration to have Netdata
## Use Netdata parents as Web Application Firewalls
The Netdata Agents you install on your production systems do not need direct access to the Internet. Even when you use
The Netdata Agents you install on your production systems don't need direct access to the Internet. Even when you use
Netdata Cloud, you can appoint one or more Netdata Parents to act as border gateways or application firewalls, isolating
your production systems from the rest of the world. Netdata
Parents receive metric data from Netdata Agents or other Netdata Parents on one side, and serve most queries using their own
copy of the data to satisfy dashboard requests on the other side.
For more information see [Streaming and replication](/docs/observability-centralization-points/README.md).
For more information, see [Streaming and replication](/docs/observability-centralization-points/README.md).
## Other methods
Of course, there are many more methods you could use to protect Netdata:
- Bind Netdata to localhost and use `ssh -L 19998:127.0.0.1:19999 remote.netdata.ip` to forward connections of local port 19998 to remote port 19999.
This way you can ssh to a Netdata server and then use `http://127.0.0.1:19998/` on your computer to access the remote Netdata dashboard.
This way you can ssh to a Netdata server and then use `http://127.0.0.1:19998/` on your computer to access the remote Netdata dashboard.
- If you are always under a static IP, you can use the script given above to allow direct access to your Netdata servers without authentication,
from all your static IPs.
- If you're always under a static IP, you can use the script given above to allow direct access to your Netdata servers without authentication,
from all your static IPs.
- Install all your Netdata in **headless data collector** mode, forwarding all metrics in real-time to a parent
Netdata server, which will be protected with authentication using an nginx server running locally at the parent
Netdata server. This requires more resources (you will need a bigger parent Netdata server), but does not require
any firewall changes, since all the child Netdata servers will not be listening for incoming connections.
Netdata server, which will be protected with authentication using a nginx server running locally at the parent
Netdata server. This requires more resources (you will need a bigger parent Netdata server), but doesn't require
any firewall changes, since all the child Netdata servers will not be listening for incoming connections.


@ -1,16 +1,16 @@
# Bandwidth Requirements
## On Production Systems, Standalone Netdata
## Production Systems: Standalone Netdata
Standalone Netdata may use network bandwidth under the following conditions:
1. You configured data collection jobs that are fetching data from remote systems. There is no such jobs enabled by default.
1. You configured data collection jobs that are fetching data from remote systems. There are no such jobs enabled by default.
2. You use the dashboard of the Netdata.
3. [Netdata Cloud communication](#netdata-cloud-communication) (see below).
## On Metrics Centralization Points, between Netdata Children & Parents
## Metrics Centralization Points: Between Netdata Children & Parents
Netdata supports multiple compression algorithms for streaming communication. Netdata Children offer all their compression algorithms when connecting to a Netdata Parent, and the Netdata Parent decides which one to use based on algorithms availability and user configuration.
Netdata supports multiple compression algorithms for streaming communication. Netdata Children offer all their compression algorithms when connecting to a Netdata Parent, and the Netdata Parent decides which one to use based on algorithm availability and user configuration.
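The negotiation described above can be sketched as follows. This is only an illustration, not Netdata's actual implementation: the function name `choose_compression`, the idea of a single ordered preference list on the Parent, and the algorithm names used in the example are assumptions made for the sketch.

```python
from typing import Iterable, Optional, Sequence


def choose_compression(child_offers: Iterable[str],
                       parent_order: Sequence[str]) -> Optional[str]:
    """Pick the first algorithm in the Parent's preference order that the Child offers.

    Hypothetical model: the Child sends every algorithm it supports and the
    Parent walks its own configured order. Returning None (no common
    algorithm) is an assumption of this sketch.
    """
    offered = set(child_offers)
    for algorithm in parent_order:
        if algorithm in offered:
            return algorithm
    return None


# Illustrative names only; the real list depends on how Netdata was built and configured.
selected = choose_compression(["gzip", "lz4", "zstd"], ["zstd", "lz4", "brotli", "gzip"])
print(selected)
```

The point of the sketch is that the decision sits with the Parent: changing the Parent's order changes the result without touching any Child.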
| Algorithm | Best for |
|:---------:|:-----------------------------------------------------------------------------------------------------------------------------------:|
@ -42,6 +42,6 @@ The information transferred to Netdata Cloud is:
3. Information about the **metrics available and their retention**.
4. Information about the **configured alerts and their transitions**.
This is not a constant stream of information. Netdata Agents update Netdata Cloud only about status changes on all the above (e.g. an alert being triggered, or a metric stopped being collected). So, there is an initial handshake and exchange of information when Netdata starts, and then there only updates when required.
This is not a constant stream of information. Netdata Agents update Netdata Cloud only about status changes on all the above (e.g., an alert being triggered, or a metric stopped being collected). So, there is an initial handshake and exchange of information when Netdata starts, and then there are only updates when required.
Of course, when you view Netdata Cloud dashboards that need to query the database a Netdata agent maintains, this query is forwarded to an agent that can satisfy it. This means that Netdata Cloud receives metric samples only when a user is accessing a dashboard and the samples transferred are usually aggregations to allow rendering the dashboards.


@ -16,11 +16,11 @@ With default settings on Children, CPU utilization typically falls within the ra
For Netdata Parents (Metrics Centralization Points), we estimate the following CPU utilization:
| Feature | Depends On | Expected Utilization (CPU cores per million) | Key Reasons |
|:--------------------:|:---------------------------------------------------:|:----------------------------------------------------------------:|:------------------------------------------------------------------------:|
| Metrics Ingest | Number of samples received per second | 2 | Decompress and decode received messages, update database |
| Metrics re-streaming | Number of samples resent per second | 2 | Encode and compress messages towards another Parent |
| Machine Learning | Number of unique time-series concurrently collected | 2 | Train machine learning models, query existing models to detect anomalies |
| Feature | Depends On | Expected Utilization (CPU cores per million) | Key Reasons |
|:--------------------:|:---------------------------------------------------:|:--------------------------------------------:|:------------------------------------------------------------------------:|
| Metrics Ingest | Number of samples received per second | 2 | Decompress and decode received messages, update database |
| Metrics re-streaming | Number of samples resent per second | 2 | Encode and compress messages towards another Parent |
| Machine Learning | Number of unique time-series concurrently collected | 2 | Train machine learning models, query existing models to detect anomalies |
To ensure optimal performance, keep total CPU utilization below 60% when the Parent is actively processing metrics, training models, and running health checks.
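As a rough worked example of the table above, the per-feature figures can be added up and then given headroom. This is a back-of-the-envelope sketch: the function name, the example workload, and the use of the 60% guidance as a simple divisor are assumptions, and the estimate ignores anything the table does not list (such as health checks and queries).

```python
# CPU cores per million units, taken from the table above.
CORES_PER_MILLION = {
    "ingest": 2.0,     # per million samples received per second
    "restream": 2.0,   # per million samples resent per second
    "ml": 2.0,         # per million unique time-series concurrently collected
}


def estimate_parent_cores(samples_in_per_sec: float,
                          samples_out_per_sec: float,
                          unique_series: float,
                          target_utilization: float = 0.60) -> tuple[float, float]:
    """Return (busy_cores, provisioned_cores) for a Netdata Parent.

    busy_cores sums the three table rows; provisioned_cores divides by the
    target utilization so the busy share stays at or below it (an assumed
    way of applying the 60% guidance).
    """
    busy = (
        samples_in_per_sec * CORES_PER_MILLION["ingest"]
        + samples_out_per_sec * CORES_PER_MILLION["restream"]
        + unique_series * CORES_PER_MILLION["ml"]
    ) / 1_000_000
    return busy, busy / target_utilization


# Hypothetical workload: 1M samples/s ingested, no re-streaming, 1M unique time-series.
busy, provisioned = estimate_parent_cores(1_000_000, 0, 1_000_000)
print(f"busy: {busy:.1f} cores, provision about {provisioned:.1f} cores")
```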


@ -12,7 +12,7 @@ Netdata offers two database modes to suit your needs for performance and data pe
## `dbengine`
Netdata's `dbengine` mode efficiently stores data on disk using compression. The actual disk space used depends on how well the data compresses.
This mode utilizes a tiered storage approach: data is saved in multiple tiers on disk. Each tier retains data at a different resolution (detail level). Higher tiers store a down-sampled (less detailed) version of the data found in lower tiers.
This mode uses a tiered storage approach: data is saved in multiple tiers on disk. Each tier retains data at a different resolution (detail level). Higher tiers store a down-sampled (less detailed) version of the data found in lower tiers.
```mermaid
gantt
@ -25,7 +25,7 @@ gantt
tier2, 365d :a3, 2023-11-02, 59d
```
`dbengine` supports up to 5 tiers. By default, 3 tiers are used:
`dbengine` supports up to five tiers. By default, three tiers are used:
| Tier | Resolution | Uncompressed Sample Size | Usually On Disk |
|:-------:|:--------------------------------------------------------------------------------------------:|:------------------------:|:---------------:|
@ -40,11 +40,11 @@ gantt
## `ram`
`ram` mode can help when Netdata should not introduce any disk I/O at all. In both of these modes, metric samples exist only in memory, and only while they are collected.
`ram` mode can help when Netdata shouldn't introduce any disk I/O at all. In both of these modes, metric samples exist only in memory, and only while they're collected.
When Netdata is configured to stream its metrics to a Metrics Observability Centralization Point (a Netdata Parent), metric samples are forwarded in real-time to that Netdata Parent. The ring buffers available in these modes is used to cache the collected samples for some time, in case there are network issues, or the Netdata Parent is restarted for maintenance.
When Netdata is configured to stream its metrics to a Metrics Observability Centralization Point (a Netdata Parent), metric samples are forwarded in real-time to that Netdata Parent. The ring buffers available in these modes are used to cache the collected samples for some time, in case there are network issues, or the Netdata Parent is restarted for maintenance.
The memory required per sample in these modes, is 4 bytes: `ram` mode uses `mmap()` behind the scene, and can be incremented in steps of 1024 samples (4KiB). Mode `ram` allows the use of the Linux kernel memory dedupper (Kernel-Same-Page or KSM) to deduplicate Netdata ring buffers and save memory.
The memory required per sample in these modes, is four bytes: `ram` mode uses `mmap()` behind the scene, and can be incremented in steps of 1024 samples (4KiB). Mode `ram` allows the use of the Linux kernel memory dedupper (Kernel-Same-Page or KSM) to deduplicate Netdata ring buffers and save memory.
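The arithmetic behind these figures can be sketched like this. It is an approximation under stated assumptions: each metric is modeled as having its own ring buffer, the buffer is rounded up to the next 1024-sample step, and the function names and example numbers are hypothetical.

```python
BYTES_PER_SAMPLE = 4   # from the text above
STEP_SAMPLES = 1024    # ring buffers grow in steps of 1024 samples (4 KiB)


def ring_buffer_bytes(retention_samples: int) -> int:
    """Bytes for one metric's ring buffer, rounded up to whole 1024-sample steps."""
    steps = -(-retention_samples // STEP_SAMPLES)  # ceiling division
    return steps * STEP_SAMPLES * BYTES_PER_SAMPLE


def ram_mode_bytes(metrics: int, retention_samples: int) -> int:
    """Total ring-buffer memory, assuming one buffer per metric."""
    return metrics * ring_buffer_bytes(retention_samples)


# Hypothetical node: 2000 metrics, one hour of per-second samples.
per_metric = ring_buffer_bytes(3600)
total = ram_mode_bytes(2000, 3600)
print(f"{per_metric} bytes per metric, {total / (1024 * 1024):.2f} MiB in total")
```

Because of the step size, asking for 3600 samples costs the same as asking for 4096 in this model.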
**Configuring ram mode and retention**:


@ -8,7 +8,7 @@ Netdata supports memory ballooning and automatically sizes and limits the memory
With default settings, Netdata should run with 100MB to 200MB of RAM, depending on the number of metrics being collected.
This number can be lowered by limiting the number of database tier or switching database modes. For more information check [Disk Requirements and Retention](/docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md).
This number can be lowered by limiting the number of database tiers or switching database modes. For more information, check [Disk Requirements and Retention](/docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md).
## On Metrics Centralization Points, Netdata Parents
@ -20,7 +20,7 @@ memory = UNIQUE_METRICS x 16KiB + CONFIGURED_CACHES
The default `CONFIGURED_CACHES` is 32MiB.
For 1 million concurrently collected time-series (independently of their data collection frequency), the memory required is:
For one million concurrently collected time-series (independently of their data collection frequency), the memory required is:
```text
UNIQUE_METRICS = 1000000
@ -32,16 +32,16 @@ CONFIGURED_CACHES = 32MiB
about 16 GiB
```
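The same calculation can be written out as a short script. It applies the formula `memory = UNIQUE_METRICS x 16KiB + CONFIGURED_CACHES` quoted in this hunk; the function and constant names are illustrative only.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def parent_memory_bytes(unique_metrics: int,
                        configured_caches: int = 32 * MIB,
                        per_metric: int = 16 * KIB) -> int:
    """memory = UNIQUE_METRICS x 16KiB + CONFIGURED_CACHES, in bytes."""
    return unique_metrics * per_metric + configured_caches


total = parent_memory_bytes(1_000_000)
print(f"{total / GIB:.2f} GiB")  # prints "15.29 GiB", which the text rounds to about 16 GiB
```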
There are 2 cache sizes that can be configured in `netdata.conf`:
There are two cache sizes that can be configured in `netdata.conf`:
1. `[db].dbengine page cache size`: this is the main cache that keeps metrics data into memory. When data are not found in it, the extent cache is consulted, and if not found in that either, they are loaded from disk.
2. `[db].dbengine extent cache size`: this is the compressed extent cache. It keeps in memory compressed data blocks, as they appear on disk, to avoid reading them again. Data found in the extend cache but not in the main cache have to be uncompressed to be queried.
1. `[db].dbengine page cache size`: this is the main cache that keeps metrics data in memory. When data is not found in it, the extent cache is consulted, and if it is not found there either, it is loaded from the disk.
2. `[db].dbengine extent cache size`: this is the compressed extent cache. It keeps in memory compressed data blocks, as they appear on disk, to avoid reading them again. Data found in the extent cache but not in the main cache have to be uncompressed to be queried.
Both of them are dynamically adjusted to use some of the total memory computed above. The configuration in `netdata.conf` allows providing additional memory to them, increasing their caching efficiency.
## I have a Netdata Parent that is also a systemd-journal logs centralization point, what should I know?
Logs usually require significantly more disk space and I/O bandwidth than metrics. For optimal performance we recommend to store metrics and logs on separate, independent disks.
Logs usually require significantly more disk space and I/O bandwidth than metrics. For optimal performance, we recommend to store metrics and logs on separate, independent disks.
Netdata uses direct-I/O for its database, so that it does not pollute the system caches with its own data. We want Netdata to be a nice citizen when it runs side-by-side with production applications, so this was required to guarantee that Netdata does not affect the operation of databases or other sensitive applications running on the same servers.
@ -49,9 +49,9 @@ To optimize disk I/O, Netdata maintains its own private caches. The default sett
`systemd-journal` on the other hand, relies on operating system caches for improving the query performance of logs. When the system lacks free memory, querying logs leads to increased disk I/O.
If you are experiencing slow responses and increased disk reads when metrics queries run, we suggest to dedicate some more RAM to Netdata.
If you are experiencing slow responses and increased disk reads when metrics queries run, we suggest dedicating some more RAM to Netdata.
We frequently see that the following strategy gives best results:
We frequently see that the following strategy gives the best results:
1. Start the Netdata Parent, send all the load you expect it to have and let it stabilize for a few hours. Netdata will now use the minimum memory it believes is required for smooth operation.
2. Check the available system memory.


@ -1,6 +1,6 @@
# Netdata Agent Versions & Platforms
Netdata is evolving rapidly and new features are added at a constant pace. Therefore we have a frequent release cadence to deliver all these features to use as soon as possible.
Netdata is evolving rapidly and new features are added at a constant pace. Therefore, we have a frequent release cadence to deliver all these features to use as soon as possible.
Netdata Agents are available in 2 versions:
@ -58,9 +58,9 @@ The following builds from source should usually work, although we don't regularl
## Static Builds and Unsupported Linux Versions
The static builds of Netdata can be used on any Linux platform of the supported architectures. The only requirement these static builds have is a working Linux kernel, any version. Everything else required for Netdata to run, is inside the package itself.
The static builds of Netdata can be used on any Linux platform of the supported architectures. The only requirement these static builds have is a working Linux kernel, any version. Everything else required for Netdata to run is inside the package itself.
Static builds usually miss certain features that require operating-system support and cannot be provided in a generic way. These features include:
Static builds usually miss certain features that require operating-system support and can't be provided generically. These features include:
- IPMI hardware sensors support
- systemd-journal features