mirror of
https://github.com/netdata/netdata.git
synced 2025-04-16 02:24:15 +00:00
remove stale docs, and update links and optimization documentation (#19089)
* remove stale docs, and update links and optimization documentation
* typos
* simplify

---------

Co-authored-by: ilyam8 <ilya@netdata.cloud>
parent
5247096c85
commit
cf73b36a97
5 changed files with 54 additions and 470 deletions
docs
deployment-guides
netdata-agent/configuration
|
@ -12,7 +12,7 @@ The sections below go into configuration examples about these deployment concept
|
|||
|
||||
### Stand-alone
|
||||
|
||||
The stand-alone setup is configured out of the box with reasonable defaults, but please consult our [configuration documentation](/docs/netdata-agent/configuration/README.md) for details, including the overview of [common configuration changes](/docs/netdata-agent/configuration/common-configuration-changes.md).
|
||||
The stand-alone setup is configured out of the box with reasonable defaults, but please consult our [configuration documentation](/docs/netdata-agent/configuration/README.md) for more details.
|
||||
|
||||
### Parent – Child
|
||||
|
||||
|
|
|
@ -1,113 +0,0 @@
|
|||
# Useful management and configuration actions
|
||||
|
||||
Below are some of the most common actions one can take while using Netdata. You can use this page as a quick reference for installing Netdata, connecting a node to the Cloud, properly editing the configuration, accessing Netdata's API, and more!
|
||||
|
||||
## Install Netdata
|
||||
|
||||
```bash
|
||||
wget -O /tmp/netdata-kickstart.sh https://get.netdata.cloud/kickstart.sh && sh /tmp/netdata-kickstart.sh
|
||||
|
||||
# Or, if you have cURL but not wget (such as on macOS):
|
||||
curl https://get.netdata.cloud/kickstart.sh > /tmp/netdata-kickstart.sh && sh /tmp/netdata-kickstart.sh
|
||||
```
|
||||
|
||||
### Connect a node to Netdata Cloud
|
||||
|
||||
Sign in to Netdata Cloud, open the Nodes tab of your Space, click `Add Nodes`, then paste the provided command into your node’s terminal and run it.
|
||||
You can also copy the Claim token and pass it to the installation script with `--claim-token` and re-run it.
|
||||
|
||||
## Configuration
|
||||
|
||||
**Netdata's config directory** is `/etc/netdata/` but in some operating systems it might be `/opt/netdata/etc/netdata/`.
|
||||
Look for the `# config directory =` line over at `http://NODE_IP:19999/netdata.conf` to find your config directory.
|
||||
|
||||
From within that directory you can run `sudo ./edit-config netdata.conf` **to edit Netdata's configuration.**
|
||||
You can edit other config files too, by specifying their filename after `./edit-config`.
|
||||
You are expected to use this method in all following configuration changes.
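As a rough sketch of the lookup described above, the hypothetical helper below pulls the config directory out of the `netdata.conf` output. It assumes the line has the form `# config directory = /path`. The function name is illustrative and is not part of Netdata.

```bash
# Hypothetical helper (not shipped with Netdata): print the value of the
# first "# config directory =" line read from standard input.
netdata_config_dir() {
  sed -n 's/^[[:space:]]*# config directory = *//p' | head -n 1
}

# Usage, assuming the Agent listens on the default port 19999:
#   curl -s http://NODE_IP:19999/netdata.conf | netdata_config_dir
```

You can then `cd` into the printed directory before running `sudo ./edit-config netdata.conf`.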
|
||||
|
||||
### Enable/disable plugins (groups of collectors)
|
||||
|
||||
```bash
|
||||
sudo ./edit-config netdata.conf
|
||||
```
|
||||
|
||||
```text
|
||||
[plugins]
|
||||
go.d = yes # enabled
|
||||
node.d = no # disabled
|
||||
```
|
||||
|
||||
### Enable/disable specific collectors
|
||||
|
||||
```bash
|
||||
sudo ./edit-config go.d.conf # edit a plugin's config
|
||||
```
|
||||
|
||||
```yaml
|
||||
modules:
|
||||
activemq: no # disabled
|
||||
cockroachdb: yes # enabled
|
||||
```
|
||||
|
||||
### Edit a collector's config
|
||||
|
||||
```bash
|
||||
sudo ./edit-config go.d/mysql.conf
|
||||
```
|
||||
|
||||
## Alerts & notifications
|
||||
|
||||
After any change, reload the Netdata health configuration:
|
||||
|
||||
```bash
|
||||
netdatacli reload-health
|
||||
# Or, if that command doesn't work on your installation, use:
|
||||
killall -USR2 netdata
|
||||
```
|
||||
|
||||
### Configure a specific alert
|
||||
|
||||
```bash
|
||||
sudo ./edit-config health.d/example-alert.conf
|
||||
```
|
||||
|
||||
### Silence a specific alert
|
||||
|
||||
```bash
|
||||
sudo ./edit-config health.d/example-alert.conf
|
||||
```
|
||||
|
||||
```text
|
||||
to: silent
|
||||
```
|
||||
|
||||
## Manage the daemon
|
||||
|
||||
| Intent | Action |
|
||||
|:----------------------------|------------------------------------------------------------:|
|
||||
| Start Netdata | `$ sudo service netdata start` |
|
||||
| Stop Netdata | `$ sudo service netdata stop` |
|
||||
| Restart Netdata | `$ sudo service netdata restart` |
|
||||
| Reload health configuration | `$ sudo netdatacli reload-health` or `$ killall -USR2 netdata` |
|
||||
| View error logs | `less /var/log/netdata/error.log` |
|
||||
| View collectors logs | `less /var/log/netdata/collector.log` |
|
||||
|
||||
### Change the port Netdata listens to (example, set it to port 39999)
|
||||
|
||||
```text
|
||||
[web]
|
||||
default port = 39999
|
||||
```
|
||||
|
||||
## See metrics and dashboards
|
||||
|
||||
### Netdata Cloud: `https://app.netdata.cloud`
|
||||
|
||||
### Local dashboard: `http://NODE:19999`
|
||||
|
||||
> Replace `NODE` with the IP address or hostname of your node. Often `localhost`.
|
||||
|
||||
## Access the Netdata API
|
||||
|
||||
You can access the API like this: `http://NODE:19999/api/VERSION/REQUEST`.
|
||||
If you want to take a look at all the API requests, check our API page at <https://learn.netdata.cloud/api>
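The URL pattern above can be sketched as a small shell helper. The function name is hypothetical. The example assumes the default port 19999 and uses the `v1` `info` request as the sample call.

```bash
# Hypothetical helper: build an Agent API URL from NODE, VERSION and REQUEST.
netdata_api_url() {
  printf 'http://%s:19999/api/%s/%s\n' "$1" "$2" "$3"
}

netdata_api_url localhost v1 info   # prints http://localhost:19999/api/v1/info

# Fetch the response with cURL (requires a running Agent):
#   curl -s "$(netdata_api_url localhost v1 info)"
```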
|
|
@ -1,115 +0,0 @@
|
|||
# Common configuration changes
|
||||
|
||||
The Netdata Agent requires no configuration upon installation to collect thousands of per-second metrics from most
|
||||
systems, containers, and applications, but there are hundreds of settings to tweak if you want to exercise more control
|
||||
over your monitoring platform.
|
||||
|
||||
This document assumes familiarity with
|
||||
using [`edit-config`](/docs/netdata-agent/configuration/README.md) from the Netdata config
|
||||
directory.
|
||||
|
||||
## Change dashboards and visualizations
|
||||
|
||||
The Netdata Agent's [local dashboard](/docs/dashboards-and-charts/README.md), accessible
|
||||
at `http://NODE:19999` is highly configurable. If
|
||||
you use [Netdata Cloud](/docs/netdata-cloud/README.md)
|
||||
for infrastructure monitoring, you
|
||||
will see many of these
|
||||
changes reflected in those visualizations due to the way Netdata Cloud proxies metric data and metadata to your browser.
|
||||
|
||||
### Increase the long-term metrics retention period
|
||||
|
||||
Read our doc on [increasing long-term metrics storage](/src/database/CONFIGURATION.md#tiers) for details.
|
||||
|
||||
## Modify alerts and notifications
|
||||
|
||||
Netdata's health monitoring watchdog uses hundreds of pre-configured health entities, with intelligent thresholds, to
|
||||
generate warning and critical alerts for most production systems and their applications without configuration. However,
|
||||
each alert and notification method is completely customizable.
|
||||
|
||||
### Add a new alert
|
||||
|
||||
To create a new alert configuration file, initiate an empty file, with a filename that ends in `.conf`, in the
|
||||
`health.d/` directory. The Netdata Agent loads any valid alert configuration file ending in `.conf` in that directory.
|
||||
Next, edit the new file with `edit-config`. For example, with a file called `example-alert.conf`.
|
||||
|
||||
```bash
|
||||
sudo touch health.d/example-alert.conf
|
||||
sudo ./edit-config health.d/example-alert.conf
|
||||
```
|
||||
|
||||
Or, append your new alert to an existing file by editing a relevant existing file in the `health.d/` directory.
|
||||
|
||||
Read more about [configuring alerts](/src/health/REFERENCE.md) to
|
||||
get started, and see
|
||||
the [health monitoring reference](/src/health/REFERENCE.md) for a full listing
|
||||
of options available in health entities.
|
||||
|
||||
### Configure a specific alert
|
||||
|
||||
Tweak existing alerts by editing files in the `health.d/` directory. For example, edit `health.d/cpu.conf` to change how
|
||||
the Agent responds to anomalies related to CPU utilization.
|
||||
|
||||
To see which configuration file you need to edit to configure a specific
|
||||
alert, [view your active alerts](/docs/dashboards-and-charts/alerts-tab.md) in
|
||||
Netdata Cloud or the local Agent dashboard and look for the **source** line. For example, it might
|
||||
read `source 4@/usr/lib/netdata/conf.d/health.d/cpu.conf`.
|
||||
|
||||
Because the source path contains `health.d/cpu.conf`, run `sudo ./edit-config health.d/cpu.conf` from the Netdata config directory to configure that alert.
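The mapping from a **source** value to the `edit-config` argument can be sketched in shell. The helper name is hypothetical, and the sketch assumes the source path contains a `conf.d/` component, as in the example above. If it does not, the value is printed unchanged.

```bash
# Hypothetical helper: derive the edit-config argument from an alert's
# "source" value by keeping everything after the last "conf.d/".
alert_config_path() {
  printf '%s\n' "${1##*conf.d/}"
}

alert_config_path "4@/usr/lib/netdata/conf.d/health.d/cpu.conf"
# prints health.d/cpu.conf
```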
|
||||
|
||||
### Disable a specific alert
|
||||
|
||||
Open the configuration file for that alert and set the `to` line to `silent`.
|
||||
|
||||
```text
|
||||
template: disk_fill_rate
|
||||
on: disk.space
|
||||
lookup: max -1s at -30m unaligned of avail
|
||||
calc: ($this - $avail) / (30 * 60)
|
||||
every: 15s
|
||||
to: silent
|
||||
```
|
||||
|
||||
### Turn off all alerts and notifications
|
||||
|
||||
Set `enabled` to `no` in
|
||||
the [`[health]`](/src/daemon/config/README.md#health-section-options)
|
||||
section of `netdata.conf`.
|
||||
|
||||
### Enable alert notifications
|
||||
|
||||
Open `health_alarm_notify.conf` for editing. First, read the [enabling notifications](/src/health/notifications/README.md) doc
|
||||
for an example of the process using Slack, then
|
||||
click on the link to your preferred notification method to find documentation for that specific endpoint.
|
||||
|
||||
## Improve node security
|
||||
|
||||
While the Netdata Agent is both [open and secure by design](https://www.netdata.cloud/blog/netdata-agent-dashboard/), we
|
||||
recommend every user take some action to administer and secure their nodes.
|
||||
|
||||
Learn more about the available options in the [security design documentation](/docs/security-and-privacy-design/README.md).
|
||||
|
||||
## Reduce resource usage
|
||||
|
||||
Read
|
||||
our [performance optimization guide](/docs/netdata-agent/configuration/optimize-the-netdata-agents-performance.md)
|
||||
for a long list of specific changes
|
||||
that can reduce the Netdata Agent's CPU/memory footprint and IO requirements.
|
||||
|
||||
## Organize nodes with host labels
|
||||
|
||||
Beginning with v1.20, Netdata accepts user-defined **host labels**. These labels are sent during streaming, exporting,
|
||||
and as metadata to Netdata Cloud, and help you organize the metrics coming from complex infrastructure. Host labels are
|
||||
defined in the section `[host labels]`.
|
||||
|
||||
For a quick introduction, read
|
||||
the [host label guide](/docs/netdata-agent/configuration/organize-systems-metrics-and-alerts.md).
|
||||
|
||||
The following restrictions apply to host label names:
|
||||
|
||||
- Names cannot start with `_`, but it can be present in other parts of the name.
|
||||
- Names only accept alphabet letters, numbers, dots, and dashes.
|
||||
|
||||
The policy for values is more flexible, but you cannot use exclamation marks (`!`), whitespaces (` `), single quotes
|
||||
(`'`), double quotes (`"`), or asterisks (`*`), because they are used to compare label values in health alerts and
|
||||
templates.
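As an illustration of these rules, the sketch below checks a candidate label name and value before you add them to `[host labels]`. Both functions are hypothetical and not part of Netdata. The name check assumes that underscores are allowed anywhere except the first character, which is one reading of the two name rules above.

```bash
# Hypothetical validators for host labels (not shipped with Netdata).

# Succeeds if the name is non-empty, does not start with "_", and contains
# only letters, numbers, dots, dashes, and (by assumption) underscores.
valid_label_name() {
  case $1 in
    ''|_*) return 1 ;;
    *[!A-Za-z0-9._-]*) return 1 ;;
  esac
  return 0
}

# Succeeds if the value contains none of: whitespace, !, ', ", *
valid_label_value() {
  case $1 in
    *[[:space:]!\'\"*]*) return 1 ;;
  esac
  return 0
}

# Usage:
#   valid_label_name "location" && valid_label_value "us-east" && echo "ok"
```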
|
|
@ -1,260 +1,71 @@
|
|||
# How to optimize the Netdata Agent's performance
|
||||
# Agent Performance Optimization Guide
|
||||
|
||||
We designed the Netdata Agent to be incredibly lightweight, even when it's collecting a few thousand dimensions every
|
||||
second and visualizing that data into hundreds of charts. However, the default settings of the Netdata Agent aren’t
|
||||
optimized for performance, but for a simple, standalone setup. We want the first installation to give you something you can
|
||||
run without any configuration. Most of the settings and options are enabled since we want you to experience the full
|
||||
thing.
|
||||
While Netdata Agents prioritize simplicity and out-of-the-box functionality, their default configuration focuses on comprehensive monitoring rather than performance optimization.
|
||||
|
||||
By default, Netdata will automatically detect applications running on the node it is installed to start collecting
|
||||
metrics in real-time, has health monitoring enabled to evaluate alerts and trains Machine Learning (ML) models for each
|
||||
metric, to detect anomalies.
|
||||
By default, Agents provide:
|
||||
|
||||
This document describes the resources required for the various default capabilities and the strategies to optimize
|
||||
Netdata for production use.
|
||||
- **Automatic Application Discovery**: Continuously detects and monitors applications running on your node without manual configuration.
|
||||
- **Real-time Metric Collection**: Collects metrics with one-second granularity.
|
||||
- **Health Monitoring**: Actively tracks the health status of your applications and system components with built-in alerting.
|
||||
- **Machine Learning**: Trains models for each metric to detect anomalies and unusual patterns in your system's behavior ([Anomaly Detection](/src/ml/README.md)).
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> For details about Agent resource requirements, see [Resource Utilization](/docs/netdata-agent/sizing-netdata-agents/README.md).
|
||||
|
||||
This document describes various strategies to optimize Netdata's performance for your specific monitoring needs.
|
||||
|
||||
## Summary of performance optimizations
|
||||
|
||||
The following table summarizes the effect of each optimization on the CPU, RAM and Disk IO utilization in production.
|
||||
|
||||
| Optimization | CPU | RAM | Disk IO |
|
||||
|------------------------------------------------------------------------------------------------------------|--------------------|--------------------|--------------------|
|
||||
| [Use streaming and replication](#use-streaming-and-replication) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
|
||||
| [Disable unneeded plugins or collectors](#disable-unneeded-plugins-or-collectors) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
|
||||
| [Reduce data collection frequency](#reduce-collection-frequency) | :heavy_check_mark: | | :heavy_check_mark: |
|
||||
| [Change how long Netdata stores metrics](/src/database/CONFIGURATION.md#tiers) | | :heavy_check_mark: | :heavy_check_mark: |
|
||||
| [Use a different metric storage database](/src/database/CONFIGURATION.md) | | :heavy_check_mark: | :heavy_check_mark: |
|
||||
| [Disable machine learning](#disable-machine-learning) | :heavy_check_mark: | | |
|
||||
| [Use a reverse proxy](#run-netdata-behind-a-proxy) | :heavy_check_mark: | | |
|
||||
| [Disable/lower gzip compression for the Agent dashboard](#disablelower-gzip-compression-for-the-dashboard) | :heavy_check_mark: | | |
|
||||
| Optimization | CPU | RAM | Disk IO |
|
||||
|---------------------------------------------------------------------------|--------------------|--------------------|--------------------|
|
||||
| [Implement Centralization Points](#implement-centralization-points) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
|
||||
| [Disable Plugins or Collectors](#disable-plugins-or-collectors) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
|
||||
| [Adjust data collection frequency](#adjust-data-collection-frequency) | :heavy_check_mark: | | :heavy_check_mark: |
|
||||
| [Optimize metric retention settings](#optimize-metric-retention-settings) | | :heavy_check_mark: | :heavy_check_mark: |
|
||||
| [Select appropriate Database Mode](#select-appropriate-database-mode) | | :heavy_check_mark: | :heavy_check_mark: |
|
||||
| [Disable ML on Children](#disable-machine-learning-on-children) | :heavy_check_mark: | | |
|
||||
|
||||
## Resources required by a default Netdata installation
|
||||
## Implement Centralization Points
|
||||
|
||||
Netdata's performance is primarily affected by **data collection/retention** and **clients accessing data**.
|
||||
In production environments, use Parent nodes as centralization points to collect and aggregate data from Child nodes across your infrastructure. This architecture follows our recommended [Centralization Points](/docs/observability-centralization-points/README.md) pattern.
|
||||
|
||||
You can configure almost all aspects of data collection/retention, and certain aspects of clients accessing data.
|
||||
## Disable Plugins or Collectors
|
||||
|
||||
### CPU consumption
|
||||
You can improve Agent performance by selectively disabling [Plugins or Collectors](/src/collectors/README.md) that you don't need for your monitoring requirements.
|
||||
|
||||
Expect about:
|
||||
> **Note**
|
||||
>
|
||||
> Inactive Plugins and Collectors automatically shut down and don't consume system resources. Performance benefits come only from disabling those that are actively collecting metrics.
|
||||
|
||||
- 1-3% of a single core for the netdata core
|
||||
- 1-3% of a single core for the various collectors (e.g., go.d.plugin, apps.plugin)
|
||||
- 5-10% of a single core, when ML training runs
|
||||
For detailed instructions on managing Plugins and Collectors, see [configuration guide](/src/collectors/REFERENCE.md).
|
||||
|
||||
Your experience may vary depending on the number of metrics collected, the collectors enabled and the specific
|
||||
environment they run on, i.e., the work they have to do to collect these metrics.
|
||||
## Adjust Data Collection Frequency
|
||||
|
||||
As a general rule, for modern hardware and VMs, the total CPU consumption of a standalone Netdata installation,
|
||||
including all its components, should be below 5 - 15% of a single core. For example, on 8 core server it will use only
|
||||
0.6% - 1.8% of the total CPU capacity, depending on the CPU characteristics.
|
||||
One of the most effective ways to reduce the Agent's resource consumption is to modify its data collection frequency.
|
||||
|
||||
The Netdata Agent runs with the lowest
|
||||
possible [process scheduling policy](/src/daemon/README.md#netdata-process-scheduling-policy),
|
||||
which is `nice 19`, and uses the `idle` process scheduler. Together, these settings ensure that the Agent only gets CPU
|
||||
resources when the node has CPU resources to spare. If the node reaches 100% CPU utilization, the Agent is stopped first
|
||||
to ensure your applications get any available resources.
|
||||
If you don't require per-second precision, or if your Agent is consuming excessive CPU during periods of low dashboard activity, you can [reduce the collection frequency](/src/collectors/REFERENCE.md).
|
||||
This adjustment can significantly improve CPU utilization while maintaining meaningful monitoring capabilities.
|
||||
|
||||
To reduce CPU usage, you can (either one or a combination of the following actions):
|
||||
## Optimize Metric Retention Settings
|
||||
|
||||
1. [Disable machine learning](#disable-machine-learning),
|
||||
2. [Use streaming and replication](#use-streaming-and-replication),
|
||||
3. [Reduce the data collection frequency](#reduce-collection-frequency)
|
||||
4. [Disable unneeded plugins or collectors](#disable-unneeded-plugins-or-collectors)
|
||||
5. [Use a reverse proxy](#run-netdata-behind-a-proxy),
|
||||
6. [Disable/lower gzip compression for the Agent dashboard](#disablelower-gzip-compression-for-the-dashboard).
|
||||
You can reduce memory and disk usage by adjusting [how long the Agent stores metrics](/src/database/CONFIGURATION.md).
|
||||
|
||||
### Memory consumption
|
||||
## Select Appropriate Database Mode
|
||||
|
||||
The memory footprint of Netdata is mainly influenced by the number of metrics concurrently being collected. Expect about
|
||||
150MB of RAM for a typical 64-bit server collecting about 2000 to 3000 metrics.
|
||||
For IoT devices and Child nodes in [Centralization Point setups](/docs/observability-centralization-points/README.md), you can optimize performance by [switching to RAM mode](/src/database/CONFIGURATION.md). This can significantly reduce resource usage while maintaining essential monitoring capabilities.
|
||||
|
||||
To estimate and control memory consumption, you can (either one or a combination of the following actions):
|
||||
## Disable Machine Learning on Children
|
||||
|
||||
1. [Disable unneeded plugins or collectors](#disable-unneeded-plugins-or-collectors)
|
||||
2. [Change how long Netdata stores metrics](/src/database/CONFIGURATION.md#tiers)
|
||||
3. [Use a different metric storage database](/src/database/CONFIGURATION.md#modes).
|
||||
For optimal resource allocation, we recommend running Machine Learning only on Parent nodes, or on systems with sufficient CPU and memory capacity.
|
||||
|
||||
### Disk footprint and I/O
|
||||
|
||||
By default, Netdata shouldn’t use more than 1GB of disk space, most of which is dedicated to storing metric data and
|
||||
metadata. For typical installations collecting 2000–3000 metrics, this storage should provide a few days of
|
||||
high-resolution retention (per second), about a month of mid-resolution retention (per minute) and more than a year of
|
||||
low-resolution retention (per hour).
|
||||
|
||||
Netdata spreads I/O operations across time. For typical standalone installations, there should be a few write operations
|
||||
every 5–10 seconds of a few kilobytes each, occasionally up to 1MB. In addition, under a heavy load, collectors that
|
||||
require disk I/O may stop and show gaps in charts.
|
||||
|
||||
To optimize your disk footprint in any aspect described below, you can:
|
||||
|
||||
To configure retention, you can:
|
||||
|
||||
1. [Change how long Netdata stores metrics](/src/database/CONFIGURATION.md#tiers).
|
||||
|
||||
To control disk I/O:
|
||||
|
||||
1. [Use a different metric storage database](/src/database/CONFIGURATION.md),
|
||||
|
||||
Minimize deployment impact on the production system by optimizing disk footprint:
|
||||
|
||||
1. [Using streaming and replication](#use-streaming-and-replication)
|
||||
2. [Reduce the data collection frequency](#reduce-collection-frequency)
|
||||
3. [Disable unneeded plugins or collectors](#disable-unneeded-plugins-or-collectors).
|
||||
|
||||
## Use streaming and replication
|
||||
|
||||
For all production environments, parent Netdata nodes outside the production infrastructure should be receiving all
|
||||
collected data from children Netdata nodes running on the production infrastructure,
|
||||
using [streaming and replication](/docs/observability-centralization-points/README.md).
|
||||
|
||||
### Disable health checks on the child nodes
|
||||
|
||||
When you set up streaming, we recommend you run your health checks on the parent. This saves resources on the children
|
||||
and makes it easier to configure or disable alerts and Agent notifications.
|
||||
|
||||
The parents by default run health checks for each child, as long as the child is connected (the details are
|
||||
in `stream.conf`). On the child nodes you should add to `netdata.conf` the following:
|
||||
|
||||
```text
|
||||
[health]
|
||||
enabled = no
|
||||
```
|
||||
|
||||
### Use memory mode ram for the child nodes
|
||||
|
||||
See [using a different metric storage database](/src/database/README.md#modes).
|
||||
|
||||
## Disable unneeded plugins or collectors
|
||||
|
||||
If you know that you don't need an [entire plugin or a specific collector](/src/collectors/README.md),
|
||||
you can disable any of them. Keep in mind that if a plugin/collector has nothing to do, it simply shuts down and doesn’t consume system resources. You will only improve the Agent's performance by disabling plugins/collectors that are
|
||||
actively collecting metrics.
|
||||
|
||||
Open `netdata.conf` and scroll down to the `[plugins]` section. To disable any plugin, uncomment it and set the value to
|
||||
`no`. For example, to explicitly keep the `proc` and `go.d` plugins enabled while disabling `python.d` and `charts.d`.
|
||||
|
||||
```text
|
||||
[plugins]
|
||||
proc = yes
|
||||
python.d = no
|
||||
charts.d = no
|
||||
go.d = yes
|
||||
```
|
||||
|
||||
Disable specific collectors by opening their respective plugin configuration files, uncommenting the line for the
|
||||
collector, and setting its value to `no`.
|
||||
|
||||
```bash
|
||||
sudo ./edit-config go.d.conf
|
||||
sudo ./edit-config python.d.conf
|
||||
sudo ./edit-config charts.d.conf
|
||||
```
|
||||
|
||||
For example, to disable a few Python collectors:
|
||||
|
||||
```text
|
||||
modules:
|
||||
apache: no
|
||||
dockerd: no
|
||||
fail2ban: no
|
||||
```
|
||||
|
||||
## Reduce collection frequency
|
||||
|
||||
The fastest way to improve the Agent's resource utilization is to reduce how often it collects metrics.
|
||||
|
||||
### Global
|
||||
|
||||
If you don't need per-second metrics, or if the Netdata Agent uses a lot of CPU even when no one is viewing that node's
|
||||
dashboard, [configure the Agent](/docs/netdata-agent/configuration/README.md) to collect
|
||||
metrics less often.
|
||||
|
||||
Open `netdata.conf` and edit the `update every` setting. The default is `1`, meaning that the Agent collects metrics
|
||||
every second.
|
||||
|
||||
If you change this to `2`, Netdata enforces a minimum `update every` setting of 2 seconds, and collects metrics every
|
||||
other second, which will effectively halve CPU utilization. Set this to `5` or `10` to collect metrics every 5 or 10
|
||||
seconds, respectively.
|
||||
|
||||
```text
|
||||
[global]
|
||||
update every = 5
|
||||
```
|
||||
|
||||
### Specific plugin or collector
|
||||
|
||||
Every collector and plugin has its own `update every` setting, which you can also change in the `go.d.conf`,
|
||||
`python.d.conf`, or `charts.d.conf` files, or in individual collector configuration files. If the `update
|
||||
every` for an individual collector is less than the global, the Netdata Agent uses the global setting. See
|
||||
the [collectors configuration reference](/src/collectors/REFERENCE.md) for
|
||||
details.
|
||||
|
||||
To reduce the frequency of an [internal_plugin/collector](/src/collectors/README.md),
|
||||
open `netdata.conf` and find the appropriate section. For example, to reduce the frequency of the `apps` plugin, which
|
||||
collects and visualizes metrics on application resource utilization:
|
||||
|
||||
```text
|
||||
[plugin:apps]
|
||||
update every = 5
|
||||
```
|
||||
|
||||
To configure an individual collector,
|
||||
open its specific configuration file with `edit-config` and look for the `update_every` setting. For example, to reduce
|
||||
the frequency of the `nginx` collector, run `sudo ./edit-config go.d/nginx.conf`:
|
||||
|
||||
```text
|
||||
# [ GLOBAL ]
|
||||
update_every: 10
|
||||
```
|
||||
|
||||
## Lower memory usage for metrics retention
|
||||
|
||||
See how
|
||||
to [change how long Netdata stores metrics](/src/database/CONFIGURATION.md#tiers).
|
||||
|
||||
## Use a different metric storage database
|
||||
|
||||
Consider [using a different metric storage database](/src/database/README.md#modes)
|
||||
when running Netdata on IoT devices, and for children in a parent-child set up based
|
||||
on [streaming and replication](/docs/observability-centralization-points/README.md).
|
||||
|
||||
## Disable machine learning
|
||||
|
||||
Automated anomaly detection is a powerful tool, but we recommend enabling it only on Netdata parents that sit
|
||||
outside your production infrastructure, or if you have CPU and memory to spare. You can disable ML with the following:
|
||||
To reduce resource usage on Child nodes or less powerful systems, you can disable ML by modifying `netdata.conf` using [`edit-config`](/docs/netdata-agent/configuration/README.md#edit-a-configuration-file-using-edit-config):
|
||||
|
||||
```text
|
||||
[ml]
|
||||
enabled = no
|
||||
```
|
||||
|
||||
## Run Netdata behind a proxy
|
||||
|
||||
A dedicated web server like nginx provides more robustness than the Agent's
|
||||
internal [web server](/src/web/README.md).
|
||||
Nginx can handle more concurrent connections, reuse idle connections, and use fast gzip compression to reduce payloads.
|
||||
|
||||
For details on installing another web server as a proxy for the local Agent dashboard,
|
||||
see [reverse proxies](/docs/netdata-agent/configuration/running-the-netdata-agent-behind-a-reverse-proxy/README.md).
|
||||
|
||||
## Disable/lower gzip compression for the dashboard
|
||||
|
||||
If you choose not to run the Agent behind Nginx, you can disable or lower the Agent's web server's gzip compression.
|
||||
While gzip compression does reduce the size of the HTML/CSS/JS payload, it does use additional CPU while a user is
|
||||
looking at the local Agent dashboard.
|
||||
|
||||
To disable gzip compression, open `netdata.conf` and find the `[web]` section:
|
||||
|
||||
```text
|
||||
[web]
|
||||
enable gzip compression = no
|
||||
```
|
||||
|
||||
Or to lower the default compression level:
|
||||
|
||||
```text
|
||||
[web]
|
||||
enable gzip compression = yes
|
||||
gzip compression level = 1
|
||||
```
|
||||
This configuration is particularly beneficial for Child nodes since their primary role is to collect and stream metrics to Parent nodes, where ML analysis can be performed centrally.
|
||||
|
|
|
@ -19,7 +19,7 @@ in all the UI, dashboards, tabs, filters, etc. For example, you can create a vir
|
|||
and monitor them as discrete entities. Virtual nodes can help you simplify your infrastructure monitoring and focus on the
|
||||
individual node that matters.
|
||||
|
||||
To define your windows server as a virtual node you need to:
|
||||
To define your Windows server as a Virtual Node, you need to:
|
||||
|
||||
* Define virtual nodes in `/etc/netdata/vnodes/vnodes.conf`
|
||||
|
||||
|
@ -28,7 +28,7 @@ To define your windows server as a virtual node you need to:
|
|||
guid: <value>
|
||||
```
|
||||
|
||||
Just remember to use a valid guid (On Linux you can use `uuidgen` command to generate one, on Windows just use the `[guid]::NewGuid()` command in PowerShell)
|
||||
Remember to use a valid guid (On Linux you can use `uuidgen` command to generate one, on Windows use the `[guid]::NewGuid()` command in PowerShell)
|
||||
|
||||
* Add the vnode config to the data collection job. e.g., in `go.d/windows.conf`:
|
||||
|
||||
|
@ -61,12 +61,12 @@ They capture the following:
|
|||
* Kernel version
|
||||
* Operating system name and version
|
||||
* CPU architecture, system cores, CPU frequency, RAM, and disk space
|
||||
* Whether Netdata is running inside of a container, and if so, the OS and hardware details about the container's host
|
||||
* Whether Netdata is running inside a container, and if so, the OS and hardware details about the container's host
|
||||
* Whether Netdata is running inside a K8s node
|
||||
* What virtualization layer the system runs on top of, if any
|
||||
* Whether the system is a streaming parent or child
|
||||
|
||||
If you want to organize your systems without manually creating host labels, try the automatic labels in some of the
|
||||
If you want to organize your systems without manually creating host labels, try the automatic labels in some
|
||||
features below. You can see them under `http://HOST-IP:19999/api/v1/info`, beginning with an underscore `_`.
|
||||
|
||||
```json
|
||||
|
@ -87,8 +87,12 @@ cd /etc/netdata # Replace this path with your Netdata config directory, if dif
|
|||
sudo ./edit-config netdata.conf
|
||||
```
|
||||
|
||||
Create a new `[host labels]` section defining a new host label and its value for the system in question. Make sure not
|
||||
to violate any of the [host label naming rules](/docs/netdata-agent/configuration/common-configuration-changes.md#organize-nodes-with-host-labels).
|
||||
Create a new `[host labels]` section defining a new host label and its value for the system in question. Make sure not to violate any of the host label naming rules:
|
||||
|
||||
* Names can’t start with `_`, but it can be present in other parts of the name.
|
||||
* Names only accept alphabet letters, numbers, dots, and dashes.
|
||||
|
||||
The policy for values is more flexible, but you can’t use exclamation marks (`!`), whitespaces (` `), single quotes (`'`), double quotes (`"`), or asterisks (`*`), because they’re used to compare label values in health alerts and templates.
|
||||
|
||||
```text
|
||||
[host labels]
|
||||
|
@ -138,10 +142,7 @@ Now, if you'd like to remind yourself of how much RAM a certain child node has,
|
|||
`http://localhost:19999/host/CHILD_HOSTNAME/api/v1/info` and reference the automatically generated host labels from the
|
||||
child system. It's a vastly simplified way of accessing critical information about your infrastructure.
|
||||
|
||||
> ⚠️ Because automatic labels for child nodes are accessible via API calls, and contain sensitive information like
|
||||
> kernel and operating system versions, you should secure streaming connections with SSL. See the [streaming documentation](/src/streaming/README.md#securing-streaming-with-tlsssl) for details. You may also want to use
|
||||
> [access lists](/src/web/server/README.md#access-lists) or [expose the API only to LAN/localhost
|
||||
> connections](/docs/netdata-agent/securing-netdata-agents.md#expose-netdata-only-in-a-private-lan).
|
||||
> ⚠️ Because automatic labels for child nodes are accessible via API calls, and contain sensitive information like kernel and operating system versions, you should secure streaming connections with SSL. See the [streaming documentation](/src/streaming/README.md#securing-streaming-with-tlsssl) for details. You may also want to use [access lists](/src/web/server/README.md#access-lists) or [expose the API only to LAN/localhost connections](/docs/netdata-agent/securing-netdata-agents.md#restrict-dashboard-access-to-private-lan).
|
||||
|
||||
You can also use `_is_parent`, `_is_child`, and any other host labels in both health entities and metrics
|
||||
exporting. Speaking of which...
|
||||
|
@ -228,7 +229,7 @@ more about exporting, read the [documentation](/src/exporting/README.md).
|
|||
The Netdata aggregate charts allow you to filter and group metrics based on label name-value pairs.
|
||||
|
||||
All go.d plugin collectors support the specification of labels at the "collection job" level. Some collectors come with out-of-the-box
|
||||
labels (e.g. generic Prometheus collector, Kubernetes, Docker and more). But you can also add your own custom labels by configuring
|
||||
labels (e.g., generic Prometheus collector, Kubernetes, Docker and more). But you can also add your own custom labels by configuring
|
||||
the data collection jobs.
|
||||
|
||||
For example, suppose we have a single Netdata Agent, collecting data from two remote Apache web servers, located in different data centers.
|
||||
|
|