mirror of https://github.com/netdata/netdata.git synced 2025-04-16 10:31:07 +00:00

docs: simplify "Disk Requirements and Retention" ()

Ilya Mashchenko 2024-07-21 20:47:32 +03:00 committed by GitHub
parent 67e4e7953e
commit d12c353ada
No known key found for this signature in database
GPG key ID: B5690EEEBB952194


@ -2,41 +2,17 @@
## Database Modes and Tiers
Netdata comes with 3 database modes:
Netdata offers two database modes to suit your needs for performance and data persistence:
1. `dbengine`: the default high-performance multi-tier database of Netdata. Metric samples are cached in memory and are saved to disk in multiple tiers, with compression.
2. `ram`: metric samples are stored in ring buffers in memory, with increments of 1024 samples. Metric samples are not committed to disk. Kernel Samepage Merging (KSM) can be used to deduplicate Netdata's memory.
3. `alloc`: metric samples are stored in ring buffers in memory, with flexible increments. Metric samples are not committed to disk.
## `ram` and `alloc`
Modes `ram` and `alloc` can help when Netdata should not introduce any disk I/O at all. In both of these modes, metric samples exist only in memory, and only while they are collected.
When Netdata is configured to stream its metrics to a Metrics Observability Centralization Point (a Netdata Parent), metric samples are forwarded in real-time to that Netdata Parent. The ring buffers available in these modes are used to cache the collected samples for some time, in case there are network issues or the Netdata Parent is restarted for maintenance.
The memory required per sample in these modes is 4 bytes:
- `ram` mode uses `mmap()` behind the scenes, and can be incremented in steps of 1024 samples (4KiB). Mode `ram` allows the use of the Linux kernel memory deduplicator, Kernel Samepage Merging (KSM), to deduplicate Netdata ring buffers and save memory.
- `alloc` mode can be sized for any number of samples per metric. KSM cannot be used in this mode.
To configure database mode `ram` or `alloc`, in `netdata.conf`, set the following:
- `[db].mode` to either `ram` or `alloc`.
- `[db].retention` to the number of samples the ring buffers should maintain. For `ram`, if the value set is not a multiple of 1024, the next multiple of 1024 will be used.
| Mode | Description |
|:------------------:|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| dbengine (default) | High-performance, multi-tier storage with compression. Metric samples are cached in memory and then written to disk in multiple tiers for efficient retrieval and long-term storage. |
| ram | In-memory storage. Metric samples are stored in memory only, and older data is overwritten as new data arrives. This mode prioritizes speed, making it ideal for Netdata Child instances that stream data to a central Netdata Parent. |
## `dbengine`
`dbengine` supports up to 5 tiers. By default, 3 tiers are used, like this:
| Tier | Resolution | Uncompressed Sample Size | Usually On Disk |
|:--------:|:--------------------------------------------------------------------------------------------:|:------------------------:|:---------------:|
| `tier0` | native resolution (metrics collected per-second are stored per-second) | 4 bytes | 0.6 bytes |
| `tier1` | 60 iterations of `tier0`, so when metrics are collected per-second, this tier is per-minute. | 16 bytes | 6 bytes |
| `tier2` | 60 iterations of `tier1`, so when metrics are collected per second, this tier is per-hour. | 16 bytes | 18 bytes |
Data are saved to disk compressed, so the actual size on disk varies depending on compression efficiency.
`dbengine` tiers are overlapping, so higher tiers include a down-sampled version of the samples in lower tiers:
Netdata's `dbengine` mode efficiently stores data on disk using compression. The actual disk space used depends on how well the data compresses.
This mode utilizes a tiered storage approach: data is saved in multiple tiers on disk. Each tier retains data at a different resolution (detail level). Higher tiers store a down-sampled (less detailed) version of the data found in lower tiers.
```mermaid
gantt
@ -49,83 +25,28 @@ gantt
tier2, 365d :a3, 2023-11-02, 59d
```
## Disk Space and Metrics Retention
`dbengine` supports up to 5 tiers. By default, 3 tiers are used:
You can find information about the current disk utilization of a Netdata Parent, at <http://agent-ip:19999/api/v2/info>. The output of this endpoint is like this:
| Tier | Resolution | Uncompressed Sample Size | Usually On Disk |
|:-------:|:--------------------------------------------------------------------------------------------:|:------------------------:|:---------------:|
| `tier0` | native resolution (metrics collected per-second are stored per-second) | 4 bytes | 0.6 bytes |
| `tier1` | 60 iterations of `tier0`, so when metrics are collected per-second, this tier is per-minute. | 16 bytes | 6 bytes |
| `tier2` | 60 iterations of `tier1`, so when metrics are collected per second, this tier is per-hour. | 16 bytes | 18 bytes |
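The "Usually On Disk" figures in the table can be turned into a rough disk estimate. The following is a minimal sketch, not part of Netdata: the names `TIERS` and `estimate_disk_bytes` are hypothetical, metrics are assumed to be collected per-second, and the per-sample sizes are the typical values from the table, so real usage will vary with compression.

```python
SECONDS_PER_DAY = 86_400

# tier -> (samples stored per metric per day, typical bytes per sample on disk)
# Assumes per-second collection; sizes come from the "Usually On Disk" column.
TIERS = {
    "tier0": (SECONDS_PER_DAY, 0.6),           # one sample per second
    "tier1": (SECONDS_PER_DAY // 60, 6.0),     # one sample per minute
    "tier2": (SECONDS_PER_DAY // 3600, 18.0),  # one sample per hour
}


def estimate_disk_bytes(tier: str, metrics: int, days: float) -> float:
    """Approximate disk space needed to keep `days` of data for `metrics` time-series."""
    samples_per_day, bytes_per_sample = TIERS[tier]
    return metrics * samples_per_day * days * bytes_per_sample


if __name__ == "__main__":
    # Example: 1000 metrics, retained for 30 days in each tier.
    for name in TIERS:
        mib = estimate_disk_bytes(name, metrics=1000, days=30) / 2**20
        print(f"{name}: about {mib:.1f} MiB")
```

Because each higher tier stores 60 times fewer samples, it can cover a much longer period in a fraction of the space, even though each of its samples is larger.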
```json
{
// more information about the agent
// then, near the end:
"db_size": [
{
"tier": 0,
"metrics": 43070,
"samples": 88078162001,
"disk_used": 41156409552,
"disk_max": 41943040000,
"disk_percent": 98.1245269,
"from": 1705033983,
"to": 1708856640,
"retention": 3822657,
"expected_retention": 3895720,
"currently_collected_metrics": 27424
},
{
"tier": 1,
"metrics": 72987,
"samples": 5155155269,
"disk_used": 20585157180,
"disk_max": 20971520000,
"disk_percent": 98.1576785,
"from": 1698287340,
"to": 1708856640,
"retention": 10569300,
"expected_retention": 10767675,
"currently_collected_metrics": 27424
},
{
"tier": 2,
"metrics": 148234,
"samples": 314919121,
"disk_used": 5957346684,
"disk_max": 10485760000,
"disk_percent": 56.8136853,
"from": 1667808000,
"to": 1708856640,
"retention": 41048640,
"expected_retention": 72251324,
"currently_collected_metrics": 27424
}
]
}
```
**Configuring dbengine mode and retention**:
In this example:
- Enable dbengine mode: The dbengine mode is already the default, so no configuration change is necessary. For reference, the dbengine mode can be configured by setting `[db].mode` to `dbengine` in `netdata.conf`.
- Adjust retention (optional): see [Change how long Netdata stores metrics](/docs/netdata-agent/configuration/optimizing-metrics-database/change-metrics-storage.md).
- `tier` is the database tier.
- `metrics` is the number of unique time-series in the database.
- `samples` is the number of samples in the database.
- `disk_used` is the currently used disk space in bytes.
- `disk_max` is the configured max disk space in bytes.
- `disk_percent` is the current disk space utilization for this tier.
- `from` is the first (oldest) timestamp in the database for this tier.
- `to` is the latest (newest) timestamp in the database for this tier.
- `retention` is the current retention of the database for this tier, in seconds (divide by 3600 for hours, divide by 86400 for days).
- `expected_retention` is the expected retention in seconds when `disk_percent` will be 100 (divide by 3600 for hours, divide by 86400 for days).
- `currently_collected_metrics` is the number of unique time-series currently being collected for this tier.
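The conversions described in this list (seconds to days, bytes to GiB, average bytes per sample) can be sketched in a few lines. This is an illustrative helper only: `summarize_tier` is a hypothetical name, and the input is the `tier` 0 entry copied from the example output above.

```python
def summarize_tier(entry: dict) -> dict:
    """Derive human-readable figures from one `db_size` entry."""
    return {
        "tier": entry["tier"],
        "disk_used_gib": entry["disk_used"] / 2**30,
        "disk_free_percent": 100.0 - entry["disk_percent"],
        "retention_days": entry["retention"] / 86400,
        "expected_retention_days": entry["expected_retention"] / 86400,
        # Average on-disk size of one sample, after compression.
        "bytes_per_sample": entry["disk_used"] / entry["samples"],
    }


# Values copied from the tier 0 entry of the example output.
tier0 = {
    "tier": 0,
    "samples": 88078162001,
    "disk_used": 41156409552,
    "disk_percent": 98.1245269,
    "retention": 3822657,
    "expected_retention": 3895720,
}

summary = summarize_tier(tier0)

if __name__ == "__main__":
    for key, value in summary.items():
        print(f"{key}: {value}")
```

To inspect a live agent, the same function could be applied to each element of the `db_size` array returned by the endpoint.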
## `ram`
So, for our example above:
`ram` mode can help when Netdata should not introduce any disk I/O at all. In this mode, metric samples exist only in memory, and only while they are collected.
| Tier | # Of Metrics | # Of Samples | Disk Used | Disk Free | Current Retention | Expected Retention | Sample Size |
|-----:|-------------:|--------------:|----------:|----------:|------------------:|-------------------:|------------:|
| 0 | 43.1K | 88.1 billion | 38.4Gi | 1.88% | 44.2 days | 45.0 days | 0.46 B |
| 1 | 73.0K | 5.2 billion | 19.2Gi | 1.84% | 122.3 days | 124.6 days | 3.99 B |
| 2 | 148.3K | 315.0 million | 5.6Gi | 43.19% | 475.1 days | 836.2 days | 18.91 B |
When Netdata is configured to stream its metrics to a Metrics Observability Centralization Point (a Netdata Parent), metric samples are forwarded in real-time to that Netdata Parent. The ring buffers in this mode are used to cache the collected samples for some time, in case there are network issues or the Netdata Parent is restarted for maintenance.
To configure retention, in `netdata.conf`, set the following:
The memory required per sample in this mode is 4 bytes. `ram` mode uses `mmap()` behind the scenes, and its ring buffers grow in steps of 1024 samples (4KiB). It also allows the Linux kernel memory deduplicator, Kernel Samepage Merging (KSM), to deduplicate Netdata ring buffers and save memory.
- `[db].mode` to `dbengine`.
- `[db].dbengine multihost disk space MB`, this is the max disk size for `tier0`. The default is 256MiB.
- `[db].dbengine tier 1 multihost disk space MB`, this is the max disk space for `tier1`. The default is 50% of `tier0`.
- `[db].dbengine tier 2 multihost disk space MB`, this is the max disk space for `tier2`. The default is 50% of `tier1`.
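The default sizing rule in this list (each tier defaults to 50% of the tier below it) can be written out as a small calculation. This is a sketch of the arithmetic only, with the hypothetical name `default_tier_sizes_mib`; it does not read or write `netdata.conf`.

```python
def default_tier_sizes_mib(tier0_mib: float = 256.0, tiers: int = 3) -> list[float]:
    """Return the default max disk space per tier, in MiB.

    tier0 defaults to 256 MiB; every following tier defaults to half
    of the tier before it, as described in the list above.
    """
    sizes = [tier0_mib]
    for _ in range(1, tiers):
        sizes.append(sizes[-1] * 0.5)
    return sizes


if __name__ == "__main__":
    for tier, size in enumerate(default_tier_sizes_mib()):
        print(f"tier{tier}: {size:.0f} MiB")
```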
**Configuring ram mode and retention**:
- Enable ram mode: To use in-memory storage, set `[db].mode` to `ram` in your `netdata.conf` file. Remember, this mode won't retain historical data after restarts.
- Adjust retention (optional): While ram mode focuses on real-time data, you can optionally control the number of samples stored in memory. Set `[db].retention` in `netdata.conf` to the desired number of samples (equivalent to seconds when metrics are collected per second). Note: If the value you choose isn't a multiple of 1024, Netdata will automatically round it up to the next multiple of 1024.
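The rounding rule and the 4-bytes-per-sample figure make it possible to estimate the memory used by `ram` mode. The sketch below only illustrates that arithmetic; the function names are hypothetical, and it ignores any overhead beyond the ring buffers themselves.

```python
BYTES_PER_SAMPLE = 4  # per-sample memory cost stated for ram mode
STEP = 1024           # ring buffers grow in steps of 1024 samples (4KiB)


def effective_retention(requested_samples: int) -> int:
    """Round the requested retention up to the next multiple of 1024 samples."""
    return -(-requested_samples // STEP) * STEP  # ceiling division


def ring_buffer_bytes(requested_samples: int, metrics: int) -> int:
    """Approximate ring-buffer memory for `metrics` time-series."""
    return effective_retention(requested_samples) * BYTES_PER_SAMPLE * metrics


if __name__ == "__main__":
    requested = 3600  # one hour of per-second samples
    print("effective retention:", effective_retention(requested), "samples")
    print("memory for 1000 metrics:", ring_buffer_bytes(requested, 1000), "bytes")
```

A requested retention that is already a multiple of 1024 is left unchanged.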