Fix Remark Lint for READMEs in Database (#6942)

* fix remark lint Database engine
* fix remark lint of database README
* rewrap dbengine readme for consistency
* rewrap database README
* make character limit to 120 not 80

parent 0063c2126d
commit 8982b9968e

2 changed files with 159 additions and 175 deletions

# Database

Although `netdata` does all its calculations using `long double`, it stores all values using a [custom-made 32-bit
number](../libnetdata/storage_number/).

So, for each dimension of a chart, Netdata will need: `4 bytes for the value * the entries of its history`. It will not
store any other data for each value in the time series database. Since all its values are stored in a time series with
fixed step, the time each value corresponds to can be calculated at run time, using the position of a value in the
round robin database.

The default history is 3.600 entries, thus it will need 14.4KB for each chart dimension. If you need 1.000 dimensions,
they will occupy just 14.4MB.

Of course, 3.600 entries is a very short history, especially if data collection frequency is set to 1 second. You will
have just one hour of data.

For a day of data and 1.000 dimensions, you will need: `86.400 seconds * 4 bytes * 1.000 dimensions = 345MB of RAM`.
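
The arithmetic above can be reproduced in a shell as a quick sanity check. This is only a sketch; the variable names
and values are illustrative examples, not Netdata configuration keys:

```sh
# RAM needed by the in-memory round robin database for this example
history=86400        # one day of per-second entries
dimensions=1000      # number of chart dimensions
bytes_per_value=4    # Netdata stores each value as a 32-bit number

echo "$(( history * bytes_per_value * dimensions / 1024 / 1024 )) MiB"   # prints "329 MiB" (about 345 MB)
```
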

One option you have to lower this number is to use **[Memory Deduplication - Kernel Same Page Merging - KSM](#ksm)**.
Another possibility is to use the **[Database Engine](engine/)**.

## Memory modes

Currently Netdata supports 6 memory modes:

1. `ram`, data are purely in memory. Data are never saved on disk. This mode uses `mmap()` and supports [KSM](#ksm).

2. `save`, (the default) data are only in RAM while Netdata runs and are saved to / loaded from disk on Netdata
   restart. It also uses `mmap()` and supports [KSM](#ksm).

3. `map`, data are in memory mapped files. This works like the swap. Keep in mind though, this will have a constant
   write on your disk. When Netdata writes data on its memory, the Linux kernel marks the related memory pages as dirty
   and automatically starts updating them on disk. Unfortunately we cannot control how frequently this works. The Linux
   kernel uses exactly the same algorithm it uses for its swap memory. Check below for additional information on
   running a dedicated central Netdata server. This mode uses `mmap()` but does not support [KSM](#ksm).

4. `none`, without a database (collected metrics can only be streamed to another Netdata).

5. `alloc`, like `ram` but it uses `calloc()` and does not support [KSM](#ksm). This mode is the fallback for all
   others except `none`.

6. `dbengine`, data are in database files. The [Database Engine](engine/) works like a traditional database. There is
   some amount of RAM dedicated to data caching and indexing and the rest of the data reside compressed on disk. The
   number of history entries is not fixed in this case, but depends on the configured disk space and the effective
   compression ratio of the data stored. This is the **only mode** that supports changing the data collection update
   frequency (`update_every`) **without losing** the previously stored metrics. For more details see [here](engine/).

You can select the memory mode by editing `netdata.conf` and setting:

```conf
[global]
    # ram, save (the default, save on exit, load on start), map (swap like)
    memory mode = save
```

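
If you want to confirm which mode a running agent ended up with, one way (assuming the agent listens on the default
`localhost:19999`) is to ask it for its effective configuration, which Netdata serves over its API:

```sh
# print the effective memory mode of the running agent
curl -s http://localhost:19999/netdata.conf | grep "memory mode"
```
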
There are 2 settings for you to tweak:

1. `update every`, which controls the data collection frequency
2. `history`, which controls the size of the database in RAM

By default `update every = 1` and `history = 3600`. This gives you an hour of data with per second updates.

If you set `update every = 2` and `history = 1800`, you will still have an hour of data, but collected once every 2
seconds. This will **cut in half** both CPU and RAM resources consumed by Netdata. Of course experiment a bit. On very
weak devices you might have to use `update every = 5` and `history = 720` (still 1 hour of data, but 1/5 of the CPU and
RAM resources).

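
The trade-off is easy to check: retention in seconds is simply `update every x history`. A small illustrative shell
loop over the combinations mentioned above:

```sh
# retention = update_every * history (all three pairs keep one hour of data)
for pair in "1 3600" "2 1800" "5 720"; do
    set -- $pair
    echo "update every = $1, history = $2 -> $(( $1 * $2 )) seconds of data"
done
```
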

You can also disable [data collection plugins](../collectors) you don't need. Disabling such plugins will also free both
CPU and RAM resources.

## Running a dedicated central Netdata server

Netdata allows streaming data between Netdata nodes. This allows us to have a central Netdata server that will maintain
the entire database for all nodes, and will also run health checks/alarms for all nodes.

For this central Netdata, memory size can be a problem. Fortunately, Netdata supports several memory modes. **One
interesting option** for this setup is `memory mode = map`.

### map

In this mode, the database of Netdata is stored in memory mapped files. Netdata continues to read and write the
database in memory, but the kernel automatically loads and saves memory pages from/to disk.

**We suggest _not_ to use this mode on nodes that run other applications.** There will always be dirty memory to be
synced and this syncing process may influence the way other applications work. This mode however is useful when we need
a central Netdata server that would normally need huge amounts of memory. Using memory mode `map` we can overcome all
memory restrictions.

There are a few kernel options that provide finer control on the way this syncing works. But before explaining them, a
brief introduction of how the Netdata database works is needed.

For each chart, Netdata maps the following files:

1. `chart/main.db`, this is the file that maintains chart information. Every time data are collected for a chart, this
   is updated.
2. `chart/dimension_name.db`, this is the file for each dimension. At its beginning there is a header, followed by the
   round robin database where metrics are stored.

So, every time Netdata collects data, the following pages will become dirty:

1. the chart file
2. the header part of all dimension files
3. if the collected metrics are stored far enough in the dimension file, another page will become dirty, for each
   dimension

Each page in Linux is 4KB. So, with 200 charts and 1000 dimensions, there will be 1200 to 2200 dirty 4KB pages every
second. Of course 1200 of them will always be dirty (the chart header and the dimensions headers) and 1000 will be
dirty for about 1000 seconds (4 bytes per metric, 4KB per page, so 1000 seconds, or 16 minutes per page).

Fortunately, the Linux kernel does not sync all these data every second. The frequency they are synced is controlled by
`/proc/sys/vm/dirty_expire_centisecs` or the `sysctl` `vm.dirty_expire_centisecs`. The default on most systems is 3000
(30 seconds).

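
You can check what your system currently uses before changing anything; this is a read-only query:

```sh
# current writeback expiry in centiseconds (3000 = 30 seconds on most systems)
cat /proc/sys/vm/dirty_expire_centisecs
# equivalent sysctl form
sysctl vm.dirty_expire_centisecs
```
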
On a busy server centralizing metrics from 20+ servers you will experience this:
As you can see, there is quite some stress (this is `iowait`) every 30 seconds.

A simple solution is to increase this time to 10 minutes (60000). This is the same system with this setting in 10
minutes:



Of course, setting this to 10 minutes means that data on disk might be up to 10 minutes old if you get an abnormal
shutdown.

There are 2 more options to tweak:

1. `dirty_background_ratio`, by default `10`.
2. `dirty_ratio`, by default `20`.

These control the amount of memory that should be dirty for disk syncing to be triggered. On dedicated Netdata servers,
you can use: `80` and `90` respectively, so that all RAM is given to Netdata.

With these settings, you can expect a little `iowait` spike once every 10 minutes and in case of system crash, data on
disk will be up to 10 minutes old.



To have these settings automatically applied on boot, create the file `/etc/sysctl.d/netdata-memory.conf` with these
contents:

```conf
vm.dirty_expire_centisecs = 60000
vm.dirty_background_ratio = 80
vm.dirty_ratio = 90
vm.dirty_writeback_centisecs = 0
```
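
To apply the file immediately without rebooting, you can load it with `sysctl` (run as root; shown here as a sketch):

```sh
# load just this file
sysctl -p /etc/sysctl.d/netdata-memory.conf
# or reload everything under /etc/sysctl.d/
sysctl --system
```
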

There is another memory mode to help overcome the memory size problem. What is **most interesting for this setup** is
`memory mode = dbengine`.

### dbengine

In this mode, the database of Netdata is stored in database files. The [Database Engine](engine/) works like a
traditional database. There is some amount of RAM dedicated to data caching and indexing and the rest of the data
reside compressed on disk. The number of history entries is not fixed in this case, but depends on the configured disk
space and the effective compression ratio of the data stored.

We suggest to use **this** mode on nodes that also run other applications. The Database Engine uses direct I/O to avoid
polluting the OS filesystem caches and does not generate excessive I/O traffic so as to create the minimum possible
interference with other applications. Using memory mode `dbengine` we can overcome most memory restrictions. For more
details see [here](engine/).

## KSM

Netdata offers all its round robin database to the kernel for deduplication (except for `memory mode = dbengine`).

In the past KSM has been criticized for consuming a lot of CPU resources. Although this is true when KSM is used for
deduplicating certain applications, it is not true with Netdata, since the Netdata memory is written very infrequently
(if you have 24 hours of metrics in Netdata, each byte in the in-memory database will be updated just once per day).

KSM is a solution that will provide 60+% memory savings to Netdata.

When KSM support is enabled in the kernel, it is just made available; it is up to the user to enable it.

So, if you build a kernel with `CONFIG_KSM=y` you will just get a few files in `/sys/kernel/mm/ksm`. Nothing else
happens. There is no performance penalty (apart, I guess, from the memory this code occupies in the kernel).

The files that `CONFIG_KSM=y` offers include:

- `/sys/kernel/mm/ksm/run`, by default `0`. You have to set this to `1` for the kernel to spawn `ksmd`.
- `/sys/kernel/mm/ksm/sleep_millisecs`, by default `20`. The frequency at which `ksmd` should evaluate memory for
  deduplication.
- `/sys/kernel/mm/ksm/pages_to_scan`, by default `100`. The number of pages `ksmd` will evaluate on each run.

So, by default `ksmd` is just disabled. It will not harm performance and the user/admin can control the CPU resources
they are willing to let `ksmd` use.

### Run `ksmd` kernel daemon

```sh
echo 1 >/sys/kernel/mm/ksm/run
echo 1000 >/sys/kernel/mm/ksm/sleep_millisecs
```

With these settings ksmd does not even appear in the running process list (it will run once per second and evaluate 100
pages for de-duplication).

Put the above lines in your boot sequence (`/etc/rc.local` or equivalent) to have `ksmd` run at boot.
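
To verify that KSM is actually merging Netdata's pages, you can read the counters the kernel exposes; the savings you
see will depend on your metrics and memory mode:

```sh
# pages_shared: merged pages currently in use
cat /sys/kernel/mm/ksm/pages_shared
# pages_sharing: how many page references have been deduplicated into them
cat /sys/kernel/mm/ksm/pages_sharing
```
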

Netdata will create charts for kernel memory de-duplication performance, like this:



[](<>)

# Database engine

The Database Engine works like a traditional database. There is some amount of RAM dedicated to data caching and
indexing and the rest of the data reside compressed on disk. The number of history entries is not fixed in this case,
but depends on the configured disk space and the effective compression ratio of the data stored. This is the **only
mode** that supports changing the data collection update frequency (`update_every`) **without losing** the previously
stored metrics.

## Files

With the DB engine memory mode the metric data are stored in database files. These files are organized in pairs, the
datafiles and their corresponding journalfiles, e.g.:

```sh
datafile-1-0000000001.ndf
journalfile-1-0000000001.njf
datafile-1-0000000002.ndf
journalfile-1-0000000002.njf
datafile-1-0000000003.ndf
journalfile-1-0000000003.njf
...
```

They are located under their host's cache directory in the directory `./dbengine` (e.g. for localhost the default
location is `/var/cache/netdata/dbengine/*`). The higher numbered filenames contain more recent metric data. The user
can safely delete some pairs of files when Netdata is stopped to manually free up some space.

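
A quick, read-only way to see what the engine has written so far (the path assumes the default cache directory of a
local installation):

```sh
# list the datafile/journalfile pairs, in natural version order
ls -lv /var/cache/netdata/dbengine/
```
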
_Users should_ **back up** _their `./dbengine` folders if they consider this data to be important._

## Configuration

There is one DB engine instance per Netdata host/node. That is, there is one `./dbengine` folder per node, and all
charts of `dbengine` memory mode in such a host share the same storage space and DB engine instance memory state. You
can select the memory mode for localhost by editing `netdata.conf` and setting:

```conf
[global]
    memory mode = dbengine
```
For setting the memory mode for the rest of the nodes you should look at
[streaming](../../streaming/).

The `history` configuration option is meaningless for `memory mode = dbengine` and is ignored for any metrics being
stored in the DB engine.

All DB engine instances, for localhost and all other streaming recipient nodes, inherit their configuration from
`netdata.conf`:

```conf
[global]
    page cache size = 32
    dbengine disk space = 256
```

The above values are the default and minimum values for Page Cache size and DB engine disk space quota. Both numbers are
in **MiB**. All DB engine instances will allocate the configured resources separately.

The `page cache size` option determines the amount of RAM in **MiB** that is dedicated to caching Netdata metric values
themselves.

The `dbengine disk space` option determines the amount of disk space in **MiB** that is dedicated to storing Netdata
metric values and all related metadata describing them.

## Operation

The DB engine stores chart metric values in 4096-byte pages in memory. Each chart dimension gets its own page to store
consecutive values generated from the data collectors. Those pages comprise the **Page Cache**.

When those pages fill up they are slowly compressed and flushed to disk. It can take
`4096 / 4 = 1024 seconds = 17 minutes`, for a chart dimension that is being collected every 1 second, to fill a page.
Pages can be cut short when we stop Netdata or the DB engine instance so as to not lose the data. When we query the DB
engine for data we trigger disk read I/O requests that fill the Page Cache with the requested pages and potentially
evict cold (not recently used) pages.

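
The page-fill time quoted above follows directly from the page size and the per-value size. A small illustrative
calculation (the one-second collection interval is just the example used in this section):

```sh
page_size=4096       # bytes per Page Cache page
bytes_per_value=4    # each collected sample is stored as a 32-bit value
update_every=1       # seconds between samples for this dimension

seconds_to_fill=$(( page_size / bytes_per_value * update_every ))
echo "a page fills after $seconds_to_fill seconds (~$(( seconds_to_fill / 60 )) minutes)"   # 1024 seconds, ~17 minutes
```
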
When the disk quota is exceeded the oldest values are removed from the DB engine in real time, by automatically deleting
the oldest datafile and journalfile pair. Any corresponding pages residing in the Page Cache will also be invalidated
and removed. The DB engine logic will try to maintain between 10 and 20 file pairs at any point in time.

The Database Engine uses direct I/O to avoid polluting the OS filesystem caches and does not generate excessive I/O
traffic so as to create the minimum possible interference with other applications.

## Memory requirements

Using memory mode `dbengine` we can overcome most memory restrictions and store a dataset that is much larger than the
available memory.

There are explicit memory requirements **per** DB engine **instance**, meaning **per** Netdata **node** (e.g. localhost
and streaming recipient nodes):

- `page cache size` must be at least `#dimensions-being-collected x 4096 x 2` bytes.
- roughly speaking this is 3% of the uncompressed disk space taken by the DB files.

- for very highly compressible data (compression ratio > 90%) this RAM overhead is comparable to the disk space
  footprint.

An important observation is that RAM usage depends on both the `page cache size` and the `dbengine disk space` options.

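
For a concrete feel of the first requirement in the list above, here is a rough sketch; the 2000 dimensions are just an
example figure, not a recommendation:

```sh
dimensions=2000      # dimensions being collected on this node
page_size=4096
min_page_cache=$(( dimensions * page_size * 2 ))
echo "page cache size should be at least $(( min_page_cache / 1024 / 1024 )) MiB"   # ~15 MiB for this example
```
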
## File descriptor requirements

The Database Engine may keep a **significant** amount of files open per instance (e.g. per streaming slave or master
server). When configuring your system you should make sure there are at least 50 file descriptors available per
`dbengine` instance.

Netdata allocates 25% of the available file descriptors to its Database Engine instances. This means that only 25% of
the file descriptors that are available to the Netdata service are accessible by dbengine instances. You should take
that into account when configuring your service or system-wide file descriptor limits. You can roughly estimate that the
Netdata service needs 2048 file descriptors for every 10 streaming slave hosts when streaming is configured to use
`memory mode = dbengine`.

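
To see the limit your running Netdata service actually received, you can inspect the process directly (a read-only
check that assumes a process named `netdata` is running):

```sh
# "Max open files" is the file descriptor limit applied to the running process
grep "Max open files" /proc/"$(pidof -s netdata)"/limits
```
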
If for example one wants to allocate 65536 file descriptors to the Netdata service on a systemd system one needs to
override the Netdata service by running `sudo systemctl edit netdata` and creating a file with contents:

```sh
[Service]
LimitNOFILE=65536
```

For other types of services one can add the line:

```sh
ulimit -n 65536
```

at the beginning of the service file. Alternatively you can change the system-wide limits of the kernel by changing
`/etc/sysctl.conf`. For Linux that would be:

```conf
fs.file-max = 65536
```

In FreeBSD and OS X you change the lines like this:

```conf
kern.maxfilesperproc=65536
kern.maxfiles=65536
```