mirror of https://github.com/netdata/netdata.git synced 2025-04-06 22:38:55 +00:00

Docs: small fixes and pass on sizing Agents ()

* small fixes and pass on sizing Agents

* improvements

* grammar

* simplify innovations

* update title

---------

Co-authored-by: ilyam8 <ilya@netdata.cloud>
Fotis Voutsas 2024-11-01 13:45:31 +02:00 committed by GitHub
parent 6cf1d17971
commit 4b8a945df9
3 changed files with 46 additions and 59 deletions
docs/netdata-agent/sizing-netdata-agents
packaging


@@ -1,89 +1,80 @@
# Sizing Netdata Agents
Netdata automatically adjusts its resources utilization based on the workload offered to it.
Netdata is designed to automatically adjust its resource consumption based on the specific workload.
This is a map of how Netdata **features impact resources utilization**:
This table shows the specific system resources affected by different Netdata features:
| Feature | CPU | RAM | Disk I/O | Disk Space | Retention | Bandwidth |
|-----------------------------:|:---:|:---:|:--------:|:----------:|:---------:|:---------:|
| Metrics collected | X | X | X | X | X | - |
| Samples collection frequency | X | - | X | X | X | - |
| Database mode and tiers | - | X | X | X | X | - |
| Machine learning | X | X | - | - | - | - |
| Streaming | X | X | - | - | - | X |
| Feature | CPU | RAM | Disk I/O | Disk Space | Network Traffic |
|------------------------:|:---:|:---:|:--------:|:----------:|:---------------:|
| Collected metrics | ✓ | ✓ | ✓ | ✓ | - |
| Sample frequency | ✓ | - | ✓ | ✓ | - |
| Database mode and tiers | - | ✓ | ✓ | ✓ | - |
| Machine learning | ✓ | ✓ | - | - | - |
| Streaming | ✓ | ✓ | - | - | ✓ |
1. **Metrics collected**: The number of metrics collected affects almost every aspect of resources utilization.
1. **Collected metrics**
When you need to lower the resources used by Netdata, this is an obvious first step.
- **Impact**: More metrics mean higher CPU, RAM, disk I/O, and disk space usage.
- **Optimization**: To reduce resource consumption, consider lowering the number of collected metrics by disabling unnecessary data collectors.
2. **Samples collection frequency**: By default Netdata collects metrics with 1-second granularity, unless the metrics collected are not updated that frequently, in which case Netdata collects them at the frequency they are updated. This is controlled per data collection job.
2. **Sample frequency**
Lowering the data collection frequency from every second to every 2 seconds will make Netdata use half the CPU. So, CPU utilization is proportional to the data collection frequency.
- **Impact**: Netdata collects most metrics with 1-second granularity. This high frequency impacts CPU usage.
- **Optimization**: Lowering the sampling frequency (e.g., 1-second to 2-second intervals) can halve CPU usage. Balance the need for detailed data with resource efficiency.
3. **Database Mode and Tiers**: By default Netdata stores metrics in 3 database tiers: high-resolution, mid-resolution, low-resolution. All database tiers are updated in parallel during data collection, and depending on the query duration Netdata may consult one or more tiers to optimize the resources required to satisfy it.
3. **Database Mode and Tiers**
The number of database tiers affects the memory requirements of Netdata. Going from 3-tiers to 1-tier, will make Netdata use half the memory. Of course metrics retention will also be limited to 1 tier.
- **Impact**: The number of database tiers directly affects memory consumption. More tiers mean higher memory usage.
- **Optimization**: The default number of tiers is 3. Choose the appropriate number of tiers based on data retention requirements.
4. **Machine Learning**: By default, Netdata trains multiple machine learning models for every metric collected, to learn its behavior and detect anomalies. Machine Learning is a CPU-intensive process and affects the overall CPU utilization of Netdata.
4. **Machine Learning**
5. **Streaming Compression**: When using Netdata in Parent-Child configurations to create Metrics Centralization Points, the compression algorithm used greatly affects CPU utilization and bandwidth consumption.
- **Impact**: Machine learning model training is CPU-intensive, affecting overall CPU usage.
- **Optimization**: Consider disabling machine learning for less critical metrics or adjusting model training frequency.
Netdata supports multiple streaming compression algorithms, allowing the optimization of either CPU utilization or network bandwidth. The default algorithm, `zstd`, provides the best balance between them.
5. **Streaming Compression**
- **Impact**: Compression algorithm choice affects CPU usage and network traffic.
- **Optimization**: Select an algorithm that balances CPU efficiency with network bandwidth requirements (e.g., zstd for a good balance).
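The sample-frequency guidance above says CPU usage scales with how often metrics are collected. As a rough sketch only, assuming collection cost is strictly proportional to frequency (a simplification; the function name `relative_cpu` is hypothetical and not part of Netdata), the expected change can be estimated like this:

```python
def relative_cpu(default_interval_s: float, new_interval_s: float) -> float:
    """Estimate collection CPU relative to the default collection interval.

    Assumes CPU cost is proportional to collection frequency, so doubling
    the interval halves the estimate. Real usage also depends on which
    collectors run and how many metrics they produce.
    """
    if default_interval_s <= 0 or new_interval_s <= 0:
        raise ValueError("intervals must be positive")
    return default_interval_s / new_interval_s


if __name__ == "__main__":
    # Moving from 1-second to 2-second collection.
    print(relative_cpu(1, 2))
```

Under this assumption, moving from 1-second to 2-second collection yields a factor of 0.5, matching the "halve CPU usage" figure quoted above.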
## Minimizing the resources used by Netdata Agents
To minimize the resources used by Netdata Agents, we suggest configuring Netdata Parents to centralize metric samples and disabling most of the features on Netdata Children. This provides minimal resource utilization at the edge, while all the features of Netdata remain available at the Netdata Parents.
To optimize resource utilization, consider using a **Netdata Parent-Child** setup.
The following guides provide instructions on how to do this.
This approach involves centralizing the collection and processing of metrics on Netdata Parent nodes while running lightweight Netdata Child Agents on edge devices.
## Maximizing the scale of Netdata Parents
Netdata Parents automatically size resource utilization based on the workload they receive. The only possible option for improving query performance is to dedicate more RAM to them, which increases the efficiency of their caches.
Netdata Parents dynamically adjust their resource usage based on the volume of metrics received. However, for optimal query performance, you may need to dedicate more RAM.
Check [RAM Requirements](/docs/netdata-agent/sizing-netdata-agents/ram-requirements.md) for more information.
## Innovations Netdata has for optimal performance and scalability
## Netdata's performance and scalability optimization techniques
The following are some of the innovations the open-source Netdata agent has, that contribute to its excellent performance, and scalability.
1. **Minimal Disk I/O**
1. **Minimal disk I/O**
Netdata directly writes metric data to disk, bypassing system caches and reducing I/O overhead. Additionally, its optimized data structures minimize disk space and memory usage through efficient compression and timestamping.
When Netdata saves data on disk, it stores it in its final place, eliminating the need to reorganize it later.
2. **Compact Storage Engine**
Netdata organizes its data structures in such a way that samples are committed to disk as evenly as possible across time, without affecting its memory requirements.
Netdata uses a custom 32-bit floating-point format tailored for efficient storage of time-series data, along with an anomaly bit. This, combined with a fixed-step database design, enables efficient storage and retrieval of data.
Furthermore, Netdata Agents use direct-I/O for saving and loading metric samples. This prevents Netdata from polluting system caches with metric data. Netdata maintains its own caches for this data.
| Tier | Approximate Sample Size (bytes) |
|-----------------------------------|---------------------------------|
| High-resolution tier (per-second) | 0.6 |
| Mid-resolution tier (per-minute) | 6 |
| Low-resolution tier (per-hour) | 18 |
All these features make Netdata a nice partner and a polite citizen for production applications running on the same systems as Netdata.
Timestamp optimization further reduces storage overhead by storing timestamps at regular intervals.
2. **4 bytes per sample uncompressed**
3. **Intelligent Query Engine**
To achieve optimal memory and disk footprint, Netdata uses a custom 32-bit floating-point number. This floating-point number is used to store the samples collected, together with their anomaly bit. The database of Netdata is fixed-step, so it has predefined slots for every sample, allowing Netdata to store timestamps once every several hundred samples, minimizing both its memory requirements and the disk footprint.
Netdata prioritizes interactive queries over background tasks like machine learning and replication, ensuring optimal user experience, especially under heavy load.
The final disk footprint of Netdata varies due to compression efficiency. It is usually about 0.6 bytes per sample for the high-resolution tier (per-second), 6 bytes per sample for the mid-resolution tier (per-minute) and 18 bytes per sample for the low-resolution tier (per-hour).
4. **Efficient Label Storage**
3. **Query priorities**
Netdata uses pointers to reference shared label key-value pairs, minimizing memory usage, especially in highly dynamic environments.
Alerting, Machine Learning, Streaming and Replication rely on metric queries. When multiple queries are running in parallel, Netdata assigns priorities to all of them, favoring interactive queries over background tasks. This means that queries do not compete equally for resources. Machine learning or replication may slow down when interactive queries are running and the system is starved of resources.
5. **Scalable Streaming Protocol**
4. **A pointer per label**
Apart from metric samples, metric labels and their cardinality are the biggest memory consumer, especially in highly ephemeral environments like Kubernetes. Netdata uses a single pointer for any label key-value pair that is reused. Keys and values are also deduplicated, providing the best possible memory footprint for metric labels.
5. **Streaming Protocol**
The streaming protocol of Netdata allows minimizing the resources consumed on production systems by delegating features to other Netdata Agents (Parents), without compromising monitoring fidelity or responsiveness, enabling the creation of a highly distributed observability platform.
## Netdata vs Prometheus
Netdata outperforms Prometheus in every aspect. -35% CPU Utilization, -49% RAM usage, -12% network bandwidth, -98% disk I/O, -75% in disk footprint for high resolution data, while providing more than a year of retention.
Read the [full comparison here](https://blog.netdata.cloud/netdata-vs-prometheus-performance-analysis/).
## Energy Efficiency
The University of Amsterdam conducted research on the impact monitoring systems have on Docker-based systems.
The study found that Netdata excels in CPU utilization, RAM usage, Execution Time and concluded that **Netdata is the most energy efficient tool**.
Read the [full study here](https://www.ivanomalavolta.com/files/papers/ICSOC_2023.pdf).
Netdata's streaming protocol enables the creation of distributed monitoring setups, where Netdata Agents (Children) offload data processing to Netdata Parents, optimizing resource utilization.
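The approximate per-sample sizes quoted for each tier (0.6, 6, and 18 bytes) can be turned into a back-of-the-envelope disk estimate. The sketch below is illustrative only: the function name, the metric count, and the retention periods are hypothetical, and it assumes every metric is collected every second and aggregated per minute and per hour. Actual footprint varies with compression efficiency.

```python
# Approximate on-disk bytes per sample, as quoted in the document.
BYTES_PER_SAMPLE = {
    "high": 0.6,   # high-resolution tier (per-second)
    "mid": 6.0,    # mid-resolution tier (per-minute)
    "low": 18.0,   # low-resolution tier (per-hour)
}

# Seconds covered by one stored sample in each tier.
SECONDS_PER_SAMPLE = {"high": 1, "mid": 60, "low": 3600}

SECONDS_PER_DAY = 86_400


def tier_disk_bytes(metrics: int, retention_days: float, tier: str) -> float:
    """Estimate the disk space one tier needs for the given retention."""
    samples_per_metric = retention_days * SECONDS_PER_DAY / SECONDS_PER_SAMPLE[tier]
    return metrics * samples_per_metric * BYTES_PER_SAMPLE[tier]


if __name__ == "__main__":
    # Hypothetical node: 2,000 metrics with 14 days, 90 days, and 1 year
    # of retention on the three tiers respectively.
    plan = {"high": 14, "mid": 90, "low": 365}
    for tier, days in plan.items():
        gigabytes = tier_disk_bytes(2000, days, tier) / 1e9
        print(f"{tier}: {gigabytes:.2f} GB for {days} days")
```

For this hypothetical node, the estimate works out to roughly 1.45 GB, 1.56 GB, and 0.32 GB for the three tiers, which illustrates why the lower-resolution tiers can hold much longer retention in comparable space.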


@@ -35,7 +35,7 @@ In most cases, these commands will guide you through the uninstallation process
If you installed Netdata with a custom prefix (different directory location), you may need to specify the original prefix during uninstallation with the `--old-install-prefix` option.
## Uninstalling manually
### Uninstalling manually
Most official installations of Netdata include an uninstaller script that can be manually invoked instead of using the kickstart script (internally, the kickstart script also uses this uninstaller script, it just handles the process outlined below for you).


@@ -62,7 +62,3 @@ Replace `<YOUR_TOKEN>` and `<YOUR_ROOM>` with your actual Netdata Cloud Space cl
> **Note**
>
> The Windows version of Netdata is intended for users on paid plans.
## Uninstalling
To uninstall Netdata, run the `uninstall.exe` file in your Netdata installation directory, typically `<YOUR_INSTALL_LOCATION>\Netdata`.