Disclaimer
Every module should be compatible with python2 and python3.
All third party libraries should be installed system-wide or in the python_modules directory.
Module configurations are written in YAML and pyYAML is required.
Every configuration file must have one of two formats:
- Configuration for only one job:
update_every : 2 # update frequency
retries : 1 # how many failures in update() are tolerated
priority : 20000 # where it is shown on dashboard
other_var1 : bla # variables passed to module
other_var2 : alb
- Configuration for many jobs (ex. mysql):
# module defaults:
update_every : 2
retries : 1
priority : 20000
local: # job name
update_every : 5 # job update frequency
other_var1 : some_val # module specific variable
other_job:
priority : 5 # job position on dashboard
retries : 20 # job retries
other_var2 : val # module specific variable
update_every, retries, and priority are always optional.
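A minimal sketch of how the two formats could be resolved into per-job settings, assuming the YAML has already been parsed into a Python dict. The function name `resolve_jobs` and the rule "a top-level key whose value is a mapping is a job" are illustrative assumptions, not a description of the actual python.d loader.

```python
def resolve_jobs(config):
    """Return {job_name: settings} for either configuration format.

    Assumption (hypothetical rule): any top-level key whose value is a
    dict is treated as a job; every other key is a module-level default.
    """
    defaults = {k: v for k, v in config.items() if not isinstance(v, dict)}
    jobs = {k: v for k, v in config.items() if isinstance(v, dict)}
    if not jobs:
        # Single-job format: the whole file describes one unnamed job.
        return {None: defaults}
    resolved = {}
    for name, job in jobs.items():
        merged = dict(defaults)  # start from the module defaults
        merged.update(job)       # job-level values take precedence
        resolved[name] = merged
    return resolved


many = {
    "update_every": 2,
    "retries": 1,
    "priority": 20000,
    "local": {"update_every": 5, "other_var1": "some_val"},
    "other_job": {"priority": 5, "retries": 20, "other_var2": "val"},
}
resolved = resolve_jobs(many)
print(resolved["local"]["update_every"])  # job-level value
print(resolved["local"]["retries"])       # inherited module default
```

In this sketch a job only needs to list the values that differ from the module defaults, which matches the `local` and `other_job` entries in the sample above.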
The following python.d modules are supported:
apache
This module will monitor one or more Apache servers depending on configuration.
Requirements:
- apache with mod_status enabled
It produces the following charts:
- Requests in requests/s
- requests
- Connections
- connections
- Async Connections
- keepalive
- closing
- writing
- Bandwidth in kilobytes/s
- sent
- Workers
- idle
- busy
- Lifetime Avg. Requests/s in requests/s
- requests_sec
- Lifetime Avg. Bandwidth/s in kilobytes/s
- size_sec
- Lifetime Avg. Response Size in bytes/request
- size_req
configuration
Needs only url to server's server-status?auto
Here is an example for 2 servers:
update_every : 10
priority : 90100
local:
url : 'http://localhost/server-status?auto'
retries : 20
remote:
url : 'http://www.apache.org/server-status?auto'
update_every : 5
retries : 4
Without configuration, module attempts to connect to http://localhost/server-status?auto
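For orientation, the `server-status?auto` page is a plain list of `Key: value` lines. The sketch below parses such text into a dict; the function name `parse_status_auto` and the sample values are illustrative assumptions, and which keys the module maps to chart dimensions is not stated here.

```python
def parse_status_auto(text):
    """Parse 'Key: value' lines into a dict, converting numbers when possible."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'Key: value' line
        key, value = key.strip(), value.strip()
        try:
            stats[key] = float(value) if "." in value else int(value)
        except ValueError:
            stats[key] = value  # non-numeric, e.g. the Scoreboard string
    return stats


# Illustrative sample, shortened; real output contains more fields.
sample = """Total Accesses: 1234
Total kBytes: 5678
ReqPerSec: 12.34
BusyWorkers: 3
IdleWorkers: 7
Scoreboard: _W__K...
"""
stats = parse_status_auto(sample)
print(stats["BusyWorkers"], stats["IdleWorkers"])
```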
apache_cache
Module monitors apache mod_cache log and produces only one chart:
cached responses in percent cached
- hit
- miss
- other
configuration
Sample:
update_every : 10
priority : 120000
retries : 5
log_path : '/var/log/apache2/cache.log'
If no configuration is given, module will attempt to read log file at /var/log/apache2/cache.log
beanstalk
Module provides server and tube-level statistics:
Requirements:
python-beanstalkc
Server statistics:
- Cpu usage in cpu time
- user
- system
- Jobs rate in jobs/s
- total
- timeouts
- Connections rate in connections/s
- connections
- Commands rate in commands/s
- put
- peek
- peek-ready
- peek-delayed
- peek-buried
- reserve
- use
- watch
- ignore
- delete
- release
- bury
- kick
- stats
- stats-job
- stats-tube
- list-tubes
- list-tube-used
- list-tubes-watched
- pause-tube
- Current tubes in tubes
- tubes
- Current jobs in jobs
- urgent
- ready
- reserved
- delayed
- buried
- Current connections in connections
- written
- producers
- workers
- waiting
- Binlog in records/s
- written
- migrated
- Uptime in seconds
- uptime
Per tube statistics:
- Jobs rate in jobs/s
- jobs
- Jobs in jobs
- using
- ready
- reserved
- delayed
- buried
- Connections in connections
- using
- waiting
- watching
- Commands in commands/s
- deletes
- pauses
- Pause in seconds
- since
- left
configuration
Sample:
host : '127.0.0.1'
port : 11300
If no configuration is given, module will attempt to connect to beanstalkd at address 127.0.0.1:11300.
bind_rndc
Module parses bind dump file to collect real-time performance metrics
Requirements:
- Version of bind must be 9.6+
- Netdata must have permissions to run rndc stats
It produces:
- Name server statistics
- requests
- responses
- success
- auth_answer
- nonauth_answer
- nxrrset
- failure
- nxdomain
- recursion
- duplicate
- rejections
- Incoming queries
- RESERVED0
- A
- NS
- CNAME
- SOA
- PTR
- MX
- TXT
- X25
- AAAA
- SRV
- NAPTR
- A6
- DS
- RSIG
- DNSKEY
- SPF
- ANY
- DLV
- Outgoing queries
- Same as Incoming queries
configuration
Sample:
local:
named_stats_path : '/var/log/bind/named.stats'
If no configuration is given, module will attempt to read named.stats file at /var/log/bind/named.stats
boinc
This module monitors task counts for the Berkeley Open Infrastructure for Network Computing (BOINC) distributed computing client using the same RPC interface that the BOINC monitoring GUI does.
It provides charts tracking the total number of tasks and active tasks, as well as ones tracking each of the possible states for tasks.
configuration
BOINC requires use of a password to access its RPC interface. You can find this password in the gui_rpc_auth.cfg file in your BOINC directory.
By default, the module will try to auto-detect the password by looking in /var/lib/boinc for this file (this is the location most Linux distributions use for a system-wide BOINC installation), so things may just work without needing configuration for the local system.
You can monitor remote systems as well:
remote:
hostname: some-host
password: some-password
chrony
This module monitors the precision and statistics of a local chronyd server.
It produces:
- frequency
- last offset
- RMS offset
- residual freq
- root delay
- root dispersion
- skew
- system time
Requirements:
Verify that user netdata can execute chronyc tracking. If necessary, update /etc/chrony.conf, cmdallow.
Configuration
Sample:
# data collection frequency:
update_every: 1
# chrony query command:
local:
command: 'chronyc -n tracking'
ceph
This module monitors the ceph cluster usage and consumption data of a server.
It produces:
- Cluster statistics (usage, available, latency, objects, read/write rate)
- OSD usage
- OSD latency
- Pool usage
- Pool read/write operations
- Pool read/write rate
- number of objects per pool
Requirements:
- rados python module
- Granting read permissions to ceph group from keyring file
# chmod 640 /etc/ceph/ceph.client.admin.keyring
Configuration
Sample:
local:
config_file: '/etc/ceph/ceph.conf'
keyring_file: '/etc/ceph/ceph.client.admin.keyring'
couchdb
This module monitors vital statistics of a local Apache CouchDB 2.x server, including:
- Overall server reads/writes
- HTTP traffic breakdown
- Request methods (GET, PUT, POST, etc.)
- Response status codes (200, 201, 4xx, etc.)
- Active server tasks
- Replication status (CouchDB 2.1 and up only)
- Erlang VM stats
- Optional per-database statistics: sizes, # of docs, # of deleted docs
Configuration
Sample for a local server running on port 5984:
local:
user: 'admin'
pass: 'password'
node: 'couchdb@127.0.0.1'
Be sure to specify a correct admin-level username and password.
You may also need to change the node name; this should match the value of -name NODENAME in your CouchDB's etc/vm.args file. Typically this is of the form couchdb@fully.qualified.domain.name in a cluster, or couchdb@127.0.0.1 / couchdb@localhost for a single-node server.
If you want per-database statistics, these need to be added to the configuration, separated by spaces:
local:
...
databases: 'db1 db2 db3 ...'
cpufreq
This module shows the current CPU frequency as set by the cpufreq kernel module.
Requirement:
You need to have CONFIG_CPU_FREQ and (optionally) CONFIG_CPU_FREQ_STAT enabled in your kernel.
This module tries to read from one of two possible locations. On initialization, it tries to read the time_in_state files provided by cpufreq_stats. If these files do not exist, or don't contain valid data, it falls back to using the less accurate scaling_cur_freq file (which only represents the current CPU frequency, and doesn't account for any state changes which happen between updates).
It produces one chart with multiple lines (one line per core).
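To illustrate why time_in_state is the more accurate source, here is a minimal sketch that derives a time-weighted average frequency from two consecutive snapshots of such a file. Each line is assumed to hold a frequency in kHz followed by the accumulated time spent at that frequency. The helper names are hypothetical and this is not claimed to be the module's exact computation.

```python
def parse_time_in_state(text):
    """Map frequency (kHz) -> accumulated time from '<freq> <time>' lines."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            result[int(parts[0])] = int(parts[1])
    return result


def average_frequency(previous, current):
    """Time-weighted mean frequency (kHz) between two snapshots, or None."""
    deltas = {freq: current[freq] - previous.get(freq, 0) for freq in current}
    total = sum(deltas.values())
    if total <= 0:
        # No usable data: a caller would fall back to scaling_cur_freq here.
        return None
    return sum(freq * delta for freq, delta in deltas.items()) / float(total)


# Illustrative snapshots taken one update apart.
before = parse_time_in_state("800000 100\n1600000 50\n")
after = parse_time_in_state("800000 130\n1600000 60\n")
print(average_frequency(before, after))
```

Because the result weights every frequency by the time actually spent in it, state changes between two updates are accounted for, which a single read of scaling_cur_freq cannot do.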
configuration
Sample:
sys_dir: "/sys/devices"
If no configuration is given, module will search for cpufreq files in /sys/devices directory.
Directory is also prefixed with NETDATA_HOST_PREFIX if specified.
cpuidle
This module monitors the usage of CPU idle states.
Requirement:
Your kernel needs to have CONFIG_CPU_IDLE enabled.
It produces one stacked chart per CPU, showing the percentage of time spent in each state.
dns_query_time
This module provides DNS query time statistics.
Requirement:
python-dnspython package
It produces one aggregate chart or one chart per DNS server, showing the query time.
dnsdist
Module monitors dnsdist performance and health metrics.
Following charts are drawn:
- Response latency
- latency-slow
- latency100-1000
- latency50-100
- latency10-50
- latency1-10
- latency0-1
- Cache performance
- cache-hits
- cache-misses
- ACL events
- acl-drops
- rule-drop
- rule-nxdomain
- rule-refused
- Noncompliant data
- empty-queries
- no-policy
- noncompliant-queries
- noncompliant-responses
- Queries
- queries
- rdqueries
- rdqueries
- Health
- downstream-send-errors
- downstream-timeouts
- servfail-responses
- trunc-failures
configuration
localhost:
name : 'local'
url : 'http://127.0.0.1:5053/jsonstat?command=stats'
user : 'username'
pass : 'password'
header:
X-API-Key: 'dnsdist-api-key'
docker
Module monitors docker health metrics.
Requirement:
docker package
Following charts are drawn:
- running containers
- count
- healthy containers
- count
- unhealthy containers
- count
configuration
update_every : 1
priority : 60000
dovecot
This module provides statistics information from Dovecot server.
Statistics are taken from dovecot socket by executing EXPORT global command.
More information about dovecot stats can be found on project wiki page.
Requirement: Dovecot UNIX socket with R/W permissions for user netdata or Dovecot with configured TCP/IP socket.
Module gives information with following charts:
- sessions
- active sessions
- logins
- logins
- commands - number of IMAP commands
- commands
- Faults
- minor
- major
- Context Switches
- volountary
- involountary
- disk in bytes/s
- read
- write
- bytes in bytes/s
- read
- write
- number of syscalls in syscalls/s
- read
- write
- lookups - number of lookups per second
- path
- attr
- hits - number of cache hits
- hits
- attempts - authorization attempts
- success
- failure
- cache - cached authorization hits
- hit
- miss
configuration
Sample:
localtcpip:
name : 'local'
host : '127.0.0.1'
port : 24242
localsocket:
name : 'local'
socket : '/var/run/dovecot/stats'
If no configuration is given, module will attempt to connect to dovecot using the unix socket located at /var/run/dovecot/stats
elasticsearch
This module monitors Elasticsearch performance and health metrics.
It produces:
- Search performance charts:
- Number of queries, fetches
- Time spent on queries, fetches
- Query and fetch latency
- Indexing performance charts:
- Number of documents indexed, index refreshes, flushes
- Time spent on indexing, refreshing, flushing
- Indexing and flushing latency
- Memory usage and garbage collection charts:
- JVM heap currently in use, committed
- Count of garbage collections
- Time spent on garbage collections
- Host metrics charts:
- Available file descriptors in percent
- Opened HTTP connections
- Cluster communication transport metrics
- Queues and rejections charts:
- Number of queued/rejected threads in thread pool
- Fielddata cache charts:
- Fielddata cache size
- Fielddata evictions and circuit breaker tripped count
- Cluster health API charts:
- Cluster status
- Nodes and tasks statistics
- Shards statistics
- Cluster stats API charts:
- Nodes statistics
- Query cache statistics
- Docs statistics
- Store statistics
- Indices and shards statistics
configuration
Sample:
local:
host : 'ipaddress' # Server ip address or hostname
port : 'port' # Port on which elasticsearch listens
cluster_health : True/False # Calls to cluster health elasticsearch API. Enabled by default.
cluster_stats : True/False # Calls to cluster stats elasticsearch API. Enabled by default.
If no configuration is given, module will fail to run.
exim
Simple module executing exim -bpc to grab exim queue.
This command can take a lot of time to finish its execution thus it is not recommended to run it every second.
It produces only one chart:
- Exim Queue Emails
- emails
Configuration is not needed.
fail2ban
Module monitors fail2ban log file to show all bans for all active jails
Requirements:
- fail2ban.log file MUST BE readable by netdata (A good idea is to add create 0640 root netdata to fail2ban conf at logrotate.d)
It produces one chart with multiple lines (one line per jail)
configuration
Sample:
local:
log_path: '/var/log/fail2ban.log'
conf_path: '/etc/fail2ban/jail.local'
exclude: 'dropbear apache'
If no configuration is given, module will attempt to read log file at /var/log/fail2ban.log and conf file at /etc/fail2ban/jail.local.
If conf file is not found default jail is ssh.
freeradius
Uses the radclient command to provide freeradius statistics. It is not recommended to run it every second.
It produces:
- Authentication counters:
- access-accepts
- access-rejects
- auth-dropped-requests
- auth-duplicate-requests
- auth-invalid-requests
- auth-malformed-requests
- auth-unknown-types
- Accounting counters: [optional]
- accounting-requests
- accounting-responses
- acct-dropped-requests
- acct-duplicate-requests
- acct-invalid-requests
- acct-malformed-requests
- acct-unknown-types
- Proxy authentication counters: [optional]
- proxy-access-accepts
- proxy-access-rejects
- proxy-auth-dropped-requests
- proxy-auth-duplicate-requests
- proxy-auth-invalid-requests
- proxy-auth-malformed-requests
- proxy-auth-unknown-types
- Proxy accounting counters: [optional]
- proxy-accounting-requests
- proxy-accounting-responses
- proxy-acct-dropped-requests
- proxy-acct-duplicate-requests
- proxy-acct-invalid-requests
- proxy-acct-malformed-requests
- proxy-acct-unknown-types
configuration
Sample:
local:
host : 'localhost'
port : '18121'
secret : 'adminsecret'
acct : False # Freeradius accounting statistics.
proxy_auth : False # Freeradius proxy authentication statistics.
proxy_acct : False # Freeradius proxy accounting statistics.
Freeradius server configuration:
The configuration for the status server is automatically created in the sites-available directory. By default, server is enabled and can be queried from every client. FreeRADIUS will only respond to status-server messages, if the status-server virtual server has been enabled.
To do this, create a link from the sites-enabled directory to the status file in the sites-available directory:
- cd sites-enabled
- ln -s ../sites-available/status status
and restart/reload your FreeRADIUS server.
go_expvar
The go_expvar module can monitor any Go application that exposes its metrics with the use of expvar package from the Go standard library.
go_expvar produces charts for Go runtime memory statistics and optionally any number of custom charts. Please see the wiki page for more info.
For the memory statistics, it produces the following charts:
- Heap allocations in kB
- alloc: size of objects allocated on the heap
- inuse: size of allocated heap spans
- Stack allocations in kB
- inuse: size of allocated stack spans
- MSpan allocations in kB
- inuse: size of allocated mspan structures
- MCache allocations in kB
- inuse: size of allocated mcache structures
- Virtual memory in kB
- sys: size of reserved virtual address space
- Live objects
- live: number of live objects in memory
- GC pauses average in ns
- avg: average duration of all GC stop-the-world pauses
configuration
Please see the wiki page for detailed info about module configuration.
haproxy
Module monitors frontend and backend metrics such as bytes in, bytes out, current sessions, and current sessions in queue, as well as health metrics such as backend server status (server check should be used).
Plugin can obtain data from url OR unix socket.
Requirement: Socket MUST be readable AND writable by netdata user.
It produces:
- Frontend family charts
- Kilobytes in/s
- Kilobytes out/s
- Sessions current
- Sessions in queue current
- Backend family charts
- Kilobytes in/s
- Kilobytes out/s
- Sessions current
- Sessions in queue current
- Health chart
- number of failed servers for every backend (in DOWN state)
configuration
Sample:
via_url:
user : 'username' # ONLY IF stats auth is used
pass : 'password' # ONLY IF stats auth is used
url : 'http://ip.address:port/url;csv;norefresh'
OR
via_socket:
socket : 'path/to/haproxy/sock'
If no configuration is given, module will fail to run.
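The `;csv` suffix in the sample url asks HAProxy for its statistics as CSV, whose first line is a header starting with `# pxname,svname,...`. The sketch below shows how the health chart's "failed servers per backend" figure could be derived from such text. The function name and the sample are illustrative assumptions, and the sample is cut down to four columns; real output has many more.

```python
import csv


def failed_servers(csv_text):
    """Count servers in DOWN state per backend from HAProxy stats CSV."""
    lines = csv_text.splitlines()
    header = lines[0].lstrip("# ").split(",")
    counts = {}
    for row in csv.DictReader(lines[1:], fieldnames=header):
        if row["svname"] in ("FRONTEND", "BACKEND"):
            continue  # aggregate rows, not individual servers
        counts.setdefault(row["pxname"], 0)
        if row["status"].startswith("DOWN"):
            counts[row["pxname"]] += 1
    return counts


# Illustrative, heavily shortened sample.
sample = """# pxname,svname,scur,status
www,FRONTEND,3,OPEN
app,srv1,1,UP
app,srv2,0,DOWN
app,BACKEND,1,UP
"""
print(failed_servers(sample))
```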
hddtemp
Module monitors disk temperatures from one or more hddtemp daemons.
Requirement:
Running hddtemp in daemonized mode with access on tcp port
It produces one chart Temperature with dynamic number of dimensions (one per disk)
configuration
Sample:
update_every: 3
host: "127.0.0.1"
port: 7634
If no configuration is given, module will attempt to connect to hddtemp daemon at address 127.0.0.1:7634.
httpcheck
Module monitors remote http server for availability and response time.
Following charts are drawn per job:
- Response time ms
- Time in 0.1 ms resolution in which the server responds. If the connection failed, the value is missing.
- Status boolean
- Connection successful
- Unexpected content: No Regex match found in the response
- Unexpected status code: Do we get 500 errors?
- Connection failed: port not listening or blocked
- Connection timed out: host or port unreachable
configuration
Sample configuration with default values:
server:
url: 'http://host:port/path' # required
status_accepted: # optional
- 200
timeout: 1 # optional, supports decimals (e.g. 0.2)
update_every: 3 # optional
regex: 'REGULAR_EXPRESSION' # optional, see https://docs.python.org/3/howto/regex.html
redirect: yes # optional
notes
- The status chart is primarily intended for alarms, badges or for access via API.
- A system/service/firewall might block netdata's access if a portscan or similar is detected.
- This plugin is meant for simple use cases. Currently, the accuracy of the response time is low and should be used as reference only.
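As a rough illustration of how `status_accepted` and `regex` interact, the sketch below classifies an already completed response. The function name, the returned labels, and the order of the checks (status code first, then content) are assumptions made for this example, not the module's documented behaviour.

```python
import re


def classify(status_code, body, status_accepted=(200,), regex=None):
    """Return a label for a completed HTTP response (labels are hypothetical)."""
    if status_code not in status_accepted:
        return "unexpected status code"
    if regex is not None and re.search(regex, body) is None:
        return "unexpected content"
    return "connection successful"


print(classify(200, "<h1>Welcome</h1>", regex=r"Welcome"))
print(classify(500, "Internal Server Error"))
print(classify(200, "maintenance page", regex=r"Welcome"))
```

Connection failures and timeouts happen before any response exists, so they would be detected by the code performing the request rather than by a function like this one.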
icecast
This module will monitor number of listeners for active sources.
Requirements:
- icecast version >= 2.4.0
It produces the following charts:
- Listeners in listeners
- source number
configuration
Needs only url to server's /status-json.xsl
Here is an example for remote server:
remote:
url : 'http://1.2.3.4:8443/status-json.xsl'
Without configuration, module attempts to connect to http://localhost:8443/status-json.xsl
IPFS
Module monitors IPFS basic information.
- Bandwidth in kbits/s
- in
- out
- Peers
- peers
configuration
Only url to IPFS server is needed.
Sample:
localhost:
name : 'local'
url : 'http://localhost:5001'
isc_dhcpd
Module monitors leases database to show all active leases for given pools.
Requirements:
- dhcpd leases file MUST BE readable by netdata
- pools MUST BE in CIDR format
It produces:
- Pools utilization Aggregate chart for all pools.
- utilization in percent
- Total leases
- leases (overall number of leases for all pools)
- Active leases for every pool
- leases (number of active leases in pool)
configuration
Sample:
local:
leases_path : '/var/lib/dhcp/dhcpd.leases'
pools : '192.168.3.0/24 192.168.4.0/24 192.168.5.0/24'
In case of python2 you need to install py2-ipaddress to make plugin work.
The module will not work if no configuration is given.
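The CIDR requirement makes pool membership easy to test with the standard `ipaddress` module (the package that py2-ipaddress backports). The following sketch, written for Python 3, counts active leases per pool and turns them into a utilization percentage. The function name is hypothetical, and subtracting the network and broadcast addresses from the pool size is an assumption of this example.

```python
import ipaddress


def pool_utilization(pools, active_ips):
    """Return {pool: (active_leases, percent_used)} for space-separated CIDR pools."""
    addresses = [ipaddress.ip_address(ip) for ip in active_ips]
    result = {}
    for pool in pools.split():
        network = ipaddress.ip_network(pool)
        active = sum(1 for address in addresses if address in network)
        # Assumption: usable hosts exclude the network and broadcast addresses.
        hosts = max(network.num_addresses - 2, 1)
        result[pool] = (active, 100.0 * active / hosts)
    return result


leases = ["192.168.3.10", "192.168.3.11", "192.168.4.5", "10.0.0.1"]
usage = pool_utilization("192.168.3.0/24 192.168.4.0/24", leases)
for pool, (active, percent) in sorted(usage.items()):
    print(pool, active, round(percent, 2))
```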
litespeed
Module monitors litespeed web server performance metrics.
It produces:
- Network Throughput HTTP in kilobits/s
- in
- out
- Network Throughput HTTPS in kilobits/s
- in
- out
- Connections HTTP in connections
- free
- used
- Connections HTTPS in connections
- free
- used
- Requests in requests/s
- requests
- Requests In Processing in requests
- processing
- Public Cache Hits in hits/s
- hits
- Private Cache Hits in hits/s
- hits
- Static Hits in hits/s
- hits
configuration
local:
path : 'PATH'
If no configuration is given, module will use "/tmp/lshttpd/".
logind
This module monitors active sessions, users, and seats tracked by systemd-logind or elogind.
It provides the following charts:
- Sessions Tracks the total number of sessions.
- Graphical: Local graphical sessions (running X11, or Wayland, or something else).
- Console: Local console sessions.
- Remote: Remote sessions.
- Users Tracks total number of unique user logins of each type.
- Graphical
- Console
- Remote
- Seats Total number of seats in use.
- Seats
configuration
This module needs no configuration. Just make sure the netdata user can run the loginctl command and get a session list without having to specify a path.
This will work with any command that can output data in the exact same format as loginctl list-sessions --no-legend. If you have some other command you want to use that outputs data in this format, you can specify it using the command key like so:
command: '/path/to/other/command'
notes
- This module's ability to track logins is dependent on what PAM services are configured to register sessions with logind. In particular, for most systems, it will only track TTY logins, local desktop logins, and logins through remote shell connections.
- The users chart counts usernames, not UIDs. This is potentially important in configurations where multiple users have the same UID.
- The users chart counts any given user name up to once for each type of login. So if the same user has a graphical and a console login on a system, they will show up once in the graphical count, and once in the console count.
- Because the data collection process is rather expensive, this plugin is currently disabled by default, and needs to be explicitly enabled in /etc/netdata/python.d.conf before it will run.
mdstat
Module monitors /proc/mdstat
It produces:
- Health Number of failed disks in every array (aggregate chart).
- Disks stats
- total (number of devices array ideally would have)
- inuse (number of devices currently in use)
- Current status
- resync in percent
- recovery in percent
- reshape in percent
- check in percent
- Operation status (if resync/recovery/reshape/check is active)
- finish in minutes
- speed in megabytes/s
configuration
No configuration is needed.
megacli
Module collects adapter, physical drives and battery stats.
Requirements:
- netdata user needs to be able to sudo the megacli program without password
To grab stats it executes:
sudo -n megacli -LDPDInfo -aAll
sudo -n megacli -AdpBbuCmd -a0
It produces:
- Adapter State
- Physical Drives Media Errors
- Physical Drives Predictive Failures
- Battery Relative State of Charge
- Battery Cycle Count
configuration
Battery stats are disabled by default in the module configuration file.
memcached
Memcached monitoring module. Data grabbed from stats interface.
- Network in kilobytes/s
- read
- written
- Connections per second
- current
- rejected
- total
- Items in cluster
- current
- total
- Evicted and Reclaimed items
- evicted
- reclaimed
- GET requests/s
- hits
- misses
- GET rate rate in requests/s
- rate
- SET rate rate in requests/s
- rate
- DELETE requests/s
- hits
- misses
- CAS requests/s
- hits
- misses
- bad value
- Increment requests/s
- hits
- misses
- Decrement requests/s
- hits
- misses
- Touch requests/s
- hits
- misses
- Touch rate rate in requests/s
- rate
configuration
Sample:
localtcpip:
name : 'local'
host : '127.0.0.1'
port : 11211
If no configuration is given, module will attempt to connect to memcached instance at address 127.0.0.1:11211.
mongodb
Module monitors mongodb performance and health metrics
Requirements:
python-pymongo package.
You need to install it manually.
Number of charts depends on mongodb version, storage engine and other features (replication):
- Read requests:
- query
- getmore (operation the cursor executes to get additional data from query)
- Write requests:
- insert
- delete
- update
- Active clients:
- readers (number of clients with read operations in progress or queued)
- writers (number of clients with write operations in progress or queued)
- Journal transactions:
- commits (count of transactions that have been written to the journal)
- Data written to the journal:
- volume (volume of data)
- Background flush (MMAPv1):
- average ms (average time taken by flushes to execute)
- last ms (time taken by the last flush)
- Read tickets (WiredTiger):
- in use (number of read tickets in use)
- available (number of available read tickets remaining)
- Write tickets (WiredTiger):
- in use (number of write tickets in use)
- available (number of available write tickets remaining)
- Cursors:
- opened (number of cursors currently opened by MongoDB for clients)
- timedOut (number of cursors that have timed out)
- noTimeout (number of open cursors with timeout disabled)
- Connections:
- connected (number of clients currently connected to the database server)
- unused (number of unused connections available for new clients)
- Memory usage metrics:
- virtual
- resident (amount of memory used by the database process)
- mapped
- non mapped
- Page faults:
- page faults (number of times MongoDB had to request from disk)
- Cache metrics (WiredTiger):
- percentage of bytes currently in the cache (amount of space taken by cached data)
- percentage of tracked dirty bytes in the cache (amount of space taken by dirty data)
- Pages evicted from cache (WiredTiger):
- modified
- unmodified
- Queued requests:
- readers (number of read requests currently queued)
- writers (number of write requests currently queued)
- Errors:
- msg (number of message assertions raised)
- warning (number of warning assertions raised)
- regular (number of regular assertions raised)
- user (number of assertions corresponding to errors generated by users)
- Storage metrics (one chart for every database)
- dataSize (size of all documents + padding in the database)
- indexSize (size of all indexes in the database)
- storageSize (size of all extents in the database)
- Documents in the database (one chart for all databases)
- documents (number of objects in the database among all the collections)
- tcmalloc metrics
- central cache free
- current total thread cache
- pageheap free
- pageheap unmapped
- thread cache free
- transfer cache free
- heap size
- Commands total/failed rate
- count
- createIndex
- delete
- eval
- findAndModify
- insert
- Locks metrics (acquireCount metrics - number of times the lock was acquired in the specified mode)
- Global lock
- Database lock
- Collection lock
- Metadata lock
- oplog lock
- Replica set members state
- state
- Oplog window
- window (interval of time between the oldest and the latest entries in the oplog)
- Replication lag
- member (time when last entry from the oplog was applied for every member)
- Replication set member heartbeat latency
- member (time when last heartbeat was received from replica set member)
configuration
Sample:
local:
name : 'local'
host : '127.0.0.1'
port : 27017
user : 'netdata'
pass : 'netdata'
If no configuration is given, module will attempt to connect to mongodb daemon at address 127.0.0.1:27017.
monit
Monit monitoring module. Data is grabbed from the stats XML interface (it has existed for a long time, but is not mentioned in the official documentation). Mostly this plugin shows statuses of monit targets, i.e. statuses of specified checks.
- Filesystems
- Filesystems
- Directories
- Files
- Pipes
- Applications
- Processes (+threads/childs)
- Programs
- Network
- Hosts (+latency)
- Network interfaces
configuration
Sample:
local:
name : 'local'
url : 'http://localhost:2812'
user : admin
pass : monit
If no configuration is given, module will attempt to connect to monit as http://localhost:2812.
mysql
Module monitors one or more mysql servers
Requirements:
It will produce following charts (if data is available):
- Bandwidth in kbps
- in
- out
- Queries in queries/sec
- queries
- questions
- slow queries
- Operations in operations/sec
- opened tables
- flush
- commit
- delete
- prepare
- read first
- read key
- read next
- read prev
- read random
- read random next
- rollback
- save point
- update
- write
- Table Locks in locks/sec
- immediate
- waited
- Select Issues in issues/sec
- full join
- full range join
- range
- range check
- scan
- Sort Issues in issues/sec
- merge passes
- range
- scan
configuration
You can provide, per server, the following:
- username which has access to database (defaults to 'root')
- password (defaults to none)
- mysql my.cnf configuration file
- mysql socket (optional)
- mysql host (ip or hostname)
- mysql port (defaults to 3306)
Here is an example for 3 servers:
update_every : 10
priority : 90100
retries : 5
local:
'my.cnf' : '/etc/mysql/my.cnf'
priority : 90000
local_2:
user : 'root'
pass : 'blablablabla'
socket : '/var/run/mysqld/mysqld.sock'
update_every : 1
remote:
user : 'admin'
pass : 'bla'
host : 'example.org'
port : 9000
retries : 20
If no configuration is given, module will attempt to connect to mysql server via unix socket at /var/run/mysqld/mysqld.sock without password and with username root
nginx
This module will monitor one or more nginx servers depending on configuration. Servers can be either local or remote.
Requirements:
- nginx with configured 'ngx_http_stub_status_module'
- 'location /stub_status'
Example nginx configuration can be found in 'python.d/nginx.conf'
It produces following charts:
- Active Connections
- active
- Requests in requests/s
- requests
- Active Connections by Status
- reading
- writing
- waiting
- Connections Rate in connections/s
- accepts
- handled
configuration
Needs only url to server's stub_status
Here is an example for local server:
update_every : 10
priority : 90100
local:
url : 'http://localhost/stub_status'
retries : 10
Without configuration, module attempts to connect to http://localhost/stub_status
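The stub_status page is a four-line plain-text report. As a sketch of how its counters line up with the charts listed above, the following parses that text with one regular expression. The names `STUB_STATUS` and `parse_stub_status` and the sample values are illustrative, not taken from the module.

```python
import re

STUB_STATUS = re.compile(
    r"Active connections:\s*(?P<active>\d+)\s+"
    r"server accepts handled requests\s+"
    r"(?P<accepts>\d+)\s+(?P<handled>\d+)\s+(?P<requests>\d+)\s+"
    r"Reading:\s*(?P<reading>\d+)\s+"
    r"Writing:\s*(?P<writing>\d+)\s+"
    r"Waiting:\s*(?P<waiting>\d+)"
)


def parse_stub_status(text):
    """Return the stub_status counters as a dict of ints, or None if unparsable."""
    match = STUB_STATUS.search(text)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


sample = (
    "Active connections: 291 \n"
    "server accepts handled requests\n"
    " 16630948 16630948 31070465 \n"
    "Reading: 6 Writing: 179 Waiting: 106 \n"
)
status = parse_stub_status(sample)
print(status["active"], status["requests"], status["waiting"])
```

The accepts, handled and requests values are running totals, so the per-second charts would be derived from the difference between two consecutive reads.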
nginx_plus
This module will monitor one or more nginx_plus servers depending on configuration. Servers can be either local or remote.
Example nginx_plus configuration can be found in 'python.d/nginx_plus.conf'
It produces following charts:
- Requests total in requests/s
  - total
- Requests current in requests
  - current
- Connection Statistics in connections/s
  - accepted
  - dropped
- Workers Statistics in workers
  - idle
  - active
- SSL Handshakes in handshakes/s
  - successful
  - failed
- SSL Session Reuses in sessions/s
  - reused
- SSL Memory Usage in percent
  - usage
- Processes in processes
  - respawned
For every server zone:
- Processing in requests
  - processing
- Requests in requests/s
  - requests
- Responses in requests/s
  - 1xx
  - 2xx
  - 3xx
  - 4xx
  - 5xx
- Traffic in kilobits/s
  - received
  - sent
For every upstream:
- Peers Requests in requests/s
  - peer name (dimension per peer)
- All Peers Responses in responses/s
  - 1xx
  - 2xx
  - 3xx
  - 4xx
  - 5xx
- Peer Responses in requests/s (for every peer)
  - 1xx
  - 2xx
  - 3xx
  - 4xx
  - 5xx
- Peers Connections in active
  - peer name (dimension per peer)
- Peers Connections Usage in percent
  - peer name (dimension per peer)
- All Peers Traffic in KB
  - received
  - sent
- Peer Traffic in KB/s (for every peer)
  - received
  - sent
- Peer Timings in ms (for every peer)
  - header
  - response
- Memory Usage in percent
  - usage
- Peers Status in state
  - peer name (dimension per peer)
- Peers Total Downtime in seconds
  - peer name (dimension per peer)
For every cache:
- Traffic in KB
  - served
  - written
  - bypass
- Memory Usage in percent
  - usage
configuration
Needs only url to server's status.
Here is an example for local server:
local:
  url : 'http://localhost/status'
Without configuration, the module fails to start.
nsd
Module uses the nsd-control stats_noreset command to provide nsd statistics.
Requirements:
- Version of nsd must be 4.0+
- Netdata must have permissions to run nsd-control stats_noreset
It produces:
- Queries
  - queries
- Zones
  - master
  - slave
- Protocol
  - udp
  - udp6
  - tcp
  - tcp6
- Query Type
  - A
  - NS
  - CNAME
  - SOA
  - PTR
  - HINFO
  - MX
  - NAPTR
  - TXT
  - AAAA
  - SRV
  - ANY
- Transfer
  - NOTIFY
  - AXFR
- Return Code
  - NOERROR
  - FORMERR
  - SERVFAIL
  - NXDOMAIN
  - NOTIMP
  - REFUSED
  - YXDOMAIN
Configuration is not needed.
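As a rough illustration, the `key=value` lines printed by nsd-control stats_noreset could be turned into a dictionary like this. The helper name `parse_nsd_stats` and the sample counter values are hypothetical; the module's own parsing may differ.

```python
# Illustrative output in key=value form (values are made up).
SAMPLE = """num.queries=10
num.type.A=7
num.rcode.NOERROR=9
time.boot=12.5
"""


def parse_nsd_stats(output):
    """Parse key=value lines into a dict of floats, skipping anything else."""
    stats = {}
    for line in output.splitlines():
        key, sep, value = line.partition('=')
        if not sep:
            continue  # not a key=value line
        try:
            stats[key.strip()] = float(value)
        except ValueError:
            continue  # non-numeric value
    return stats


stats = parse_nsd_stats(SAMPLE)
```

In a real collector the text would come from running the command, for example with `subprocess.check_output(['nsd-control', 'stats_noreset'])`, which is why the netdata user needs permission to execute it.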
ntpd
Module monitors the system variables of the local ntpd daemon (optionally including variables of the polled peers) using the NTP Control Message Protocol via UDP socket, similar to ntpq, the standard NTP query program.
Requirements:
- Version: NTPv4
- Local interrogation allowed in /etc/ntp.conf (default):
# Local users may interrogate the ntp server more closely.
restrict 127.0.0.1
restrict ::1
It produces:
- system
  - offset
  - jitter
  - frequency
  - delay
  - dispersion
  - stratum
  - tc
  - precision
- peers
  - offset
  - delay
  - dispersion
  - jitter
  - rootdelay
  - rootdispersion
  - stratum
  - hmode
  - pmode
  - hpoll
  - ppoll
  - precision
configuration
Sample:
update_every: 10
host: 'localhost'
port: '123'
show_peers: yes
# hide peers with source address in ranges 127.0.0.0/8 and 192.168.0.0/16
peer_filter: '(127\..*)|(192\.168\..*)'
# check for new/changed peers every 60 updates
peer_rescan: 60
Sample (multiple jobs):
Note: ntp.conf on the host otherhost must be configured to allow queries from our local host by including a line like restrict <IP> nomodify notrap nopeer.
local:
  host: 'localhost'
otherhost:
  host: 'otherhost'
If no configuration is given, module will attempt to connect to ntpd on ::1:123 or 127.0.0.1:123 and show charts for the systemvars. Use show_peers: yes to also show the charts for configured peers. Local peers in the range 127.0.0.0/8 are hidden by default; use peer_filter: '' to show all peers.
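To see what a peer_filter expression hides, the sample pattern can be tried against a few addresses. This sketch assumes the pattern is matched from the start of the peer's source address with `re.match`; the helper `visible_peers` is hypothetical and the module's own matching may differ.

```python
import re

# Pattern taken from the sample configuration above.
PEER_FILTER = r'(127\..*)|(192\.168\..*)'


def visible_peers(addresses, peer_filter=PEER_FILTER):
    """Return the addresses that are NOT hidden by peer_filter.

    An empty filter hides nothing, mirroring peer_filter: ''.
    """
    if not peer_filter:
        return list(addresses)
    pattern = re.compile(peer_filter)
    return [addr for addr in addresses if not pattern.match(addr)]


peers = ['127.127.1.0', '192.168.1.5', '10.0.0.1', '203.0.113.7']
shown = visible_peers(peers)
```

With the sample pattern only the last two addresses remain visible; passing an empty string returns all four.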
ovpn_status_log
Module monitors the openvpn-status log file.
Requirements:
- If you are running multiple OpenVPN instances out of the same directory, MAKE SURE TO EDIT DIRECTIVES which create output files so that multiple instances do not overwrite each other's output files.
- Make sure NETDATA USER CAN READ openvpn-status.log
- Update_every interval MUST MATCH interval on which OpenVPN writes operational status to log file.
It produces:
- Users OpenVPN active users
  - users
- Traffic OpenVPN overall bandwidth usage in kilobit/s
  - in
  - out
configuration
Sample:
default:
  log_path : '/var/log/openvpn-status.log'
phpfpm
This module will monitor one or more php-fpm instances depending on configuration.
Requirements:
- php-fpm with enabled status page
- access to status page via web server
It produces following charts:
- Active Connections
  - active
  - maxActive
  - idle
- Requests in requests/s
  - requests
- Performance
  - reached
  - slow
configuration
Needs only url to server's status.
Here is an example for local instance:
update_every : 3
priority : 90100
local:
  url : 'http://localhost/status'
  retries : 10
Without configuration, module attempts to connect to http://localhost/status
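A sketch of how a php-fpm status response could be reduced to the dimensions above, assuming the page is requested in its JSON form (`?json`). The mapping in `FIELD_MAP` between status fields and dimension names is an assumption made for illustration, and the sample values are made up.

```python
import json

# Assumed mapping: chart dimension -> php-fpm status field.
FIELD_MAP = {
    'active': 'active processes',
    'maxActive': 'max active processes',
    'idle': 'idle processes',
    'requests': 'accepted conn',
    'reached': 'max children reached',
    'slow': 'slow requests',
}

# Illustrative JSON body (not real measurements).
SAMPLE = ('{"pool": "www", "accepted conn": 120, "idle processes": 3, '
          '"active processes": 2, "max active processes": 5, '
          '"max children reached": 0, "slow requests": 1}')


def parse_phpfpm_status(body):
    """Pick the charted fields out of a JSON status body."""
    raw = json.loads(body)
    return {dim: int(raw[field])
            for dim, field in FIELD_MAP.items() if field in raw}


data = parse_phpfpm_status(SAMPLE)
```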
portcheck
Module monitors a remote TCP service.
Following charts are drawn per host:
- Latency ms
  - Time required to connect to a TCP port. Displays latency in 0.1 ms resolution. If the connection failed, the value is missing.
- Status boolean
  - Connection successful
  - Could not create socket: possible DNS problems
  - Connection refused: port not listening or blocked
  - Connection timed out: host or port unreachable
configuration
server:
  host: 'dns or ip' # required
  port: 22 # required
  timeout: 1 # optional
  update_every: 1 # optional
notes
- The error chart is intended for alarms, badges or for access via API.
- A system/service/firewall might block netdata's access if a portscan or similar is detected.
- Currently, the accuracy of the latency is low and should be used as reference only.
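The check described above can be sketched with the standard socket module. This is a Python 3 illustration only: `check_port` and the status names are hypothetical, and folding every other OS-level error into the timeout state is a simplification.

```python
import socket
import time


def check_port(host, port, timeout=1.0):
    """Try one TCP connect and return (status, latency_ms).

    latency_ms is None unless the connection succeeded.
    """
    started = time.time()
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except socket.gaierror:
        return 'no_socket', None   # name resolution failed
    except socket.timeout:
        return 'timeout', None     # no answer within the timeout
    except ConnectionRefusedError:
        return 'refused', None     # port not listening or blocked
    except OSError:
        return 'timeout', None     # other errors, treated as unreachable
    latency_ms = round((time.time() - started) * 1000, 1)  # 0.1 ms resolution
    conn.close()
    return 'success', latency_ms
```

A call such as `check_port('localhost', 22, timeout=1)` corresponds to one data collection for the `server` job in the configuration above.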
postfix
Simple module executing postqueue -p to grab postfix queue.
It produces only two charts:
- Postfix Queue Emails
  - emails
- Postfix Queue Emails Size in KB
  - size
Configuration is not needed.
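For illustration, both values can be read from the summary line that ends the queue listing printed by postqueue -p. The helper `parse_postqueue` is hypothetical and the sample listing is abbreviated; the module's own parsing may differ.

```python
import re

# Summary line at the end of the listing, e.g. "-- 4 Kbytes in 2 Requests."
SUMMARY = re.compile(r'-- (\d+) Kbytes in (\d+) Requests?\.')

# Abbreviated, made-up listing.
SAMPLE = """-Queue ID- --Size-- ----Arrival Time---- -Sender/Recipient-------
ABC123      1024 Mon Jan  1 00:00:00  sender@example.com
                                       rcpt@example.com

-- 4 Kbytes in 2 Requests.
"""


def parse_postqueue(output):
    """Return queue length and size in KB; an empty queue yields zeros."""
    match = SUMMARY.search(output)
    if match is None:
        return {'emails': 0, 'size': 0}
    return {'emails': int(match.group(2)), 'size': int(match.group(1))}


queue = parse_postqueue(SAMPLE)
```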
postgres
Module monitors one or more postgres servers.
Requirements:
- python-psycopg2 package. You have to install it manually.
Following charts are drawn:
- Database size MB
  - size
- Current Backend Processes processes
  - active
- Write-Ahead Logging Statistics files/s
  - total
  - ready
  - done
- Checkpoints writes/s
  - scheduled
  - requested
- Current connections to db count
  - connections
- Tuples returned from db tuples/s
  - sequential
  - bitmap
- Tuple reads from db reads/s
  - disk
  - cache
- Transactions on db transactions/s
  - committed
  - rolled back
- Tuples written to db writes/s
  - inserted
  - updated
  - deleted
  - conflicts
- Locks on db count per type
  - locks
configuration
socket:
  name : 'socket'
  user : 'postgres'
  database : 'postgres'
tcp:
  name : 'tcp'
  user : 'postgres'
  database : 'postgres'
  host : 'localhost'
  port : 5432
When no configuration file is found, module tries to connect to TCP/IP socket: localhost:5432.
powerdns
Module monitors powerdns performance and health metrics.
Powerdns charts:
- Queries and Answers
  - udp-queries
  - udp-answers
  - tcp-queries
  - tcp-answers
- Cache Usage
  - query-cache-hit
  - query-cache-miss
  - packetcache-hit
  - packetcache-miss
- Cache Size
  - query-cache-size
  - packetcache-size
  - key-cache-size
  - meta-cache-size
- Latency
  - latency
Powerdns Recursor charts:
- Questions In
  - questions
  - ipv6-questions
  - tcp-queries
- Questions Out
  - all-outqueries
  - ipv6-outqueries
  - tcp-outqueries
  - throttled-outqueries
- Answer Times
  - answers-slow
  - answers0-1
  - answers1-10
  - answers10-100
  - answers100-1000
- Timeouts
  - outgoing-timeouts
  - outgoing4-timeouts
  - outgoing6-timeouts
- Drops
  - over-capacity-drops
- Cache Usage
  - cache-hits
  - cache-misses
  - packetcache-hits
  - packetcache-misses
- Cache Size
  - cache-entries
  - packetcache-entries
  - negcache-entries
configuration
local:
  name : 'local'
  url : 'http://127.0.0.1:8081/api/v1/servers/localhost/statistics'
  header :
    X-API-Key: 'change_me'
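A minimal sketch of how the configured url and header might be used, assuming the statistics endpoint answers with a JSON list of objects carrying `name` and `value` fields. The helpers `build_request` and `parse_statistics` and the sample body are hypothetical.

```python
import json
from urllib.request import Request

URL = 'http://127.0.0.1:8081/api/v1/servers/localhost/statistics'

# Illustrative response body (values are made up).
SAMPLE = ('[{"name": "udp-queries", "type": "StatisticItem", "value": "42"},'
          ' {"name": "latency", "type": "StatisticItem", "value": "7"},'
          ' {"name": "ring", "type": "RingStatisticItem", "value": []}]')


def build_request(url, api_key):
    """Prepare a request that carries the API key header (nothing is sent)."""
    return Request(url, headers={'X-API-Key': api_key})


def parse_statistics(body):
    """Keep only entries whose value converts to an integer."""
    stats = {}
    for item in json.loads(body):
        try:
            stats[item['name']] = int(item['value'])
        except (KeyError, TypeError, ValueError):
            continue  # skip non-scalar or malformed entries
    return stats


request = build_request(URL, 'change_me')
stats = parse_statistics(SAMPLE)
```

Passing `request` to `urllib.request.urlopen` would perform the actual query against a running server.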
puppet
Monitor status of Puppet Server and Puppet DB.
Following charts are drawn:
- JVM Heap
  - committed (allocated from OS)
  - used (actual use)
- JVM Non-Heap
  - committed (allocated from OS)
  - used (actual use)
- CPU Usage
  - execution
  - GC (taken by garbage collection)
- File Descriptors
  - max
  - used
configuration
puppetdb:
  url: 'https://fqdn.example.com:8081'
  tls_cert_file: /path/to/client.crt
  tls_key_file: /path/to/client.key
  autodetection_retry: 1
  retries: 3600
puppetserver:
  url: 'https://fqdn.example.com:8140'
  autodetection_retry: 1
  retries: 3600
When no configuration is given then https://fqdn.example.com:8140 is tried without any retries.
notes
- Exact Fully Qualified Domain Name of the node should be used.
- Puppet Server/DB startup time is usually VERY long, so the retry count should be reasonably high.
- Secure PuppetDB config may require a client certificate. This does not apply to the default PuppetDB configuration.
rabbitmq
Module monitors rabbitmq performance and health metrics.
Following charts are drawn:
- Queued Messages
  - ready
  - unacknowledged
- Message Rates
  - ack
  - redelivered
  - deliver
  - publish
- Global Counts
  - channels
  - consumers
  - connections
  - queues
  - exchanges
- File Descriptors
  - used descriptors
- Socket Descriptors
  - used descriptors
- Erlang processes
  - used processes
- Erlang run queue
  - Erlang run queue
- Memory
  - free memory in megabytes
- Disk Space
  - free disk space in gigabytes
configuration
socket:
  name : 'local'
  host : '127.0.0.1'
  port : 15672
  user : 'guest'
  pass : 'guest'
When no configuration file is found, module tries to connect to: localhost:15672.
redis
Get INFO data from redis instance.
Following charts are drawn:
- Operations per second
  - operations
- Hit rate in percent
  - rate
- Memory utilization in kilobytes
  - total
  - lua
- Database keys
  - lines are created dynamically based on how many databases there are
- Clients
  - connected
  - blocked
- Slaves
  - connected
configuration
socket:
  name : 'local'
  socket : '/var/lib/redis/redis.sock'
localhost:
  name : 'local'
  host : 'localhost'
  port : 6379
When no configuration file is found, module tries to connect to TCP/IP socket: localhost:6379.
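As an illustration of what the INFO reply contains, the sketch below parses its `key:value` lines and derives a hit rate and per-database key counts. The helper names are hypothetical, the sample reply is made up, and the hit rate here is computed over the lifetime counters, which may differ from how the module charts it.

```python
# Illustrative excerpt of an INFO reply (values are made up).
SAMPLE = ('# Stats\r\n'
          'keyspace_hits:75\r\n'
          'keyspace_misses:25\r\n'
          '# Keyspace\r\n'
          'db0:keys=10,expires=0,avg_ttl=0\r\n')


def parse_info(text):
    """Turn key:value lines into a dict, skipping section headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        key, sep, value = line.partition(':')
        if sep:
            info[key] = value
    return info


def hit_rate(info):
    """Percentage of key lookups that were hits (0.0 when there were none)."""
    hits = int(info.get('keyspace_hits', 0))
    misses = int(info.get('keyspace_misses', 0))
    total = hits + misses
    return 100.0 * hits / total if total else 0.0


def database_keys(info):
    """Map each dbN entry to its number of keys."""
    keys = {}
    for name, value in info.items():
        if name.startswith('db') and name[2:].isdigit():
            fields = dict(part.split('=', 1) for part in value.split(','))
            keys[name] = int(fields['keys'])
    return keys


info = parse_info(SAMPLE)
```

One dimension per `dbN` entry is what makes the lines of the Database keys chart dynamic.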
rethinkdb
Module monitors rethinkdb health metrics.
Following charts are drawn:
- Connected Servers
  - connected
  - missing
- Active Clients
  - active
- Queries per second
  - queries
- Documents per second
  - documents
configuration
localhost:
  name : 'local'
  host : '127.0.0.1'
  port : 28015
  user : "user"
  password : "pass"
When no configuration file is found, module tries to connect to 127.0.0.1:28015.
samba
Performance metrics of Samba file sharing.
It produces the following charts:
- Syscall R/Ws in kilobytes/s
  - sendfile
  - recvfile
- Smb2 R/Ws in kilobytes/s
  - readout
  - writein
  - readin
  - writeout
- Smb2 Create/Close in operations/s
  - create
  - close
- Smb2 Info in operations/s
  - getinfo
  - setinfo
- Smb2 Find in operations/s
  - find
- Smb2 Notify in operations/s
  - notify
- Smb2 Lesser Ops as counters
  - tcon
  - negprot
  - tdis
  - cancel
  - logoff
  - flush
  - lock
  - keepalive
  - break
  - sessetup
configuration
Requires that smbd has been compiled with profiling enabled. It also requires that smbd was started either with the -P 1 option or inside smb.conf using smbd profiling level.
This plugin uses smbstatus -P, which can only be executed by root. It uses sudo and assumes that it is configured such that the netdata user can execute smbstatus as root without password.
For example:
netdata ALL=(ALL) NOPASSWD: /usr/bin/smbstatus -P
update_every : 5 # update frequency
sensors
System sensors information.
Charts are created dynamically.
configuration
For detailed configuration information please read sensors.conf file.
possible issues
There have been reports from users that on certain servers, ACPI ring buffer errors are printed by the kernel (dmesg) when ACPI sensors are being accessed. We are tracking such cases in issue #827. Please join this discussion for help.
spigotmc
This module does some really basic monitoring for Spigot Minecraft servers.
It provides two charts, one tracking server-side ticks-per-second in 1, 5 and 15 minute averages, and one tracking the number of currently active users.
This is not compatible with Spigot plugins which change the format of the data returned by the tps or list console commands.
configuration
host: localhost
port: 25575
password: pass
By default, a connection to port 25575 on the local system is attempted with an empty password.
springboot
This module will monitor one or more Java Spring-boot applications depending on configuration.
It produces following charts:
- Response Codes in requests/s
  - 1xx
  - 2xx
  - 3xx
  - 4xx
  - 5xx
  - others
- Threads
  - daemon
  - total
- GC Time in milliseconds and GC Operations in operations/s
  - Copy
  - MarkSweep
  - ...
- Heap Memory Usage in KB
  - used
  - committed
configuration
Please see the Monitoring Java Spring Boot Applications page for detailed info about module configuration.
squid
This module will monitor one or more squid instances depending on configuration.
It produces following charts:
- Client Bandwidth in kilobits/s
  - in
  - out
  - hits
- Client Requests in requests/s
  - requests
  - hits
  - errors
- Server Bandwidth in kilobits/s
  - in
  - out
- Server Requests in requests/s
  - requests
  - errors
configuration
priority : 50000
local:
  request : 'cache_object://localhost:3128/counters'
  host : 'localhost'
  port : 3128
Without any configuration module will try to autodetect where squid presents its counters data.
smartd_log
Module monitors smartd log files to collect HDD/SSD S.M.A.R.T attributes.
It produces following charts (you can add additional attributes in the module configuration file):
- Read Error Rate attribute 1
- Start/Stop Count attribute 4
- Reallocated Sectors Count attribute 5
- Seek Error Rate attribute 7
- Power-On Hours Count attribute 9
- Power Cycle Count attribute 12
- Load/Unload Cycles attribute 193
- Temperature attribute 194
- Current Pending Sectors attribute 197
- Off-Line Uncorrectable attribute 198
- Write Error Rate attribute 200
configuration
local:
  log_path : '/var/log/smartd/'
If no configuration is given, module will attempt to read log files in /var/log/smartd/ directory.
tomcat
Presents memory utilization of tomcat containers.
Charts:
- Requests per second
  - accesses
- Volume in KB/s
  - volume
- Threads
  - current
  - busy
- JVM Free Memory in MB
  - jvm
configuration
localhost:
  name : 'local'
  url : 'http://127.0.0.1:8080/manager/status?XML=true'
  user : 'tomcat_username'
  pass : 'secret_tomcat_password'
Without configuration, module attempts to connect to http://localhost:8080/manager/status?XML=true, without any credentials.
So it will probably fail.
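A rough sketch of reading the charted values out of the manager status XML. The element and attribute names in the sample are modeled on the Tomcat manager status page and should be checked against your Tomcat version; the helper `parse_status` and all values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative status document (structure assumed, values made up).
SAMPLE = """<status>
  <jvm><memory free="1048576" total="2097152" max="4194304"/></jvm>
  <connector name="http-nio-8080">
    <threadInfo maxThreads="200" currentThreadCount="10" currentThreadsBusy="1"/>
    <requestInfo maxTime="5" processingTime="100" requestCount="42"
                 errorCount="0" bytesReceived="0" bytesSent="2048"/>
  </connector>
</status>"""


def parse_status(xml_text):
    """Extract the values behind the charts from the first connector."""
    root = ET.fromstring(xml_text)
    data = {}
    memory = root.find('jvm/memory')
    if memory is not None:
        data['jvm'] = int(memory.get('free'))          # bytes of free JVM memory
    threads = root.find('connector/threadInfo')
    if threads is not None:
        data['current'] = int(threads.get('currentThreadCount'))
        data['busy'] = int(threads.get('currentThreadsBusy'))
    requests = root.find('connector/requestInfo')
    if requests is not None:
        data['accesses'] = int(requests.get('requestCount'))
        data['volume'] = int(requests.get('bytesSent'))
    return data


status = parse_status(SAMPLE)
```

Fetching the document from a real server requires the user and pass from the job definition, typically sent as HTTP basic authentication.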
Traefik
Module uses the health API to provide statistics.
It produces:
- Responses by statuses
  - success (1xx, 2xx, 304)
  - error (5xx)
  - redirect (3xx except 304)
  - bad (4xx)
  - other (all other responses)
- Responses by codes
  - 2xx (successful)
  - 5xx (internal server errors)
  - 3xx (redirect)
  - 4xx (bad)
  - 1xx (informational)
  - other (non-standard responses)
- Detailed Response Codes requests/s (number of responses for each response code family individually)
- Requests/s
  - request statistics
- Total response time
  - sum of all response time
- Average response time
- Average response time per iteration
- Uptime
  - Traefik server uptime
configuration
Needs only url to server's health.
Here is an example for local server:
update_every : 1
priority : 60000
local:
  url : 'http://localhost:8080/health'
  retries : 10
Without configuration, module attempts to connect to http://localhost:8080/health.
Unbound
Monitoring uses the remote control interface to fetch statistics.
Provides the following charts:
- Queries Processed
  - Ratelimited
  - Cache Misses
  - Cache Hits
  - Expired
  - Prefetched
  - Recursive
- Request List
  - Average Size
  - Max Size
  - Overwritten Requests
  - Overruns
  - Current Size
  - User Requests
- Recursion Timings
  - Average recursion processing time
  - Median recursion processing time
If extended stats are enabled, also provides:
- Cache Sizes
  - Message Cache
  - RRset Cache
  - Infra Cache
  - DNSSEC Key Cache
  - DNSCrypt Shared Secret Cache
  - DNSCrypt Nonce Cache
- DNSCrypt Nonce Cache
configuration
Unbound must be manually configured to enable the remote-control protocol. Check the Unbound documentation for info on how to do this. Additionally, if you want to take advantage of the autodetection this plugin offers, you will need to make sure your unbound.conf file only uses spaces for indentation (the default config shipped by most distributions uses tabs instead of spaces).
Once you have the Unbound control protocol enabled, you need to make sure that either the certificate and key are readable by Netdata (if you're using the regular control interface), or that the socket is accessible to Netdata (if you're using a UNIX socket for the control interface).
By default, for the local system, everything can be auto-detected assuming Unbound is configured correctly and has been told to listen on the loopback interface or a UNIX socket. This is done by looking up info in the Unbound config file specified by the ubconf key.
To enable extended stats for a given job, add extended: yes to the definition.
You can also enable per-thread charts for a given job by adding per_thread: yes to the definition. Note that the number of threads is only checked on startup.
A basic local configuration with extended statistics and per-thread charts looks like this:
local:
  ubconf: /etc/unbound/unbound.conf
  extended: yes
  per_thread: yes
While it's a bit more complicated to set up correctly, it is recommended that you use a UNIX socket as it provides far better performance.
varnish cache
Module uses the varnishstat command to provide varnish cache statistics.
It produces:
- Connections Statistics in connections/s
  - accepted
  - dropped
- Client Requests in requests/s
  - received
- All History Hit Rate Ratio in percent
  - hit
  - miss
  - hitpass
- Current Poll Hit Rate Ratio in percent
  - hit
  - miss
  - hitpass
- Expired Objects in expired/s
  - objects
- Least Recently Used Nuked Objects in nuked/s
  - objects
- Number Of Threads In All Pools in threads
  - threads
- Threads Statistics in threads/s
  - created
  - failed
  - limited
- Current Queue Length in requests
  - in queue
- Backend Connections Statistics in connections/s
  - successful
  - unhealthy
  - reused
  - closed
  - recycled
  - failed
- Requests To The Backend in requests/s
  - received
- ESI Statistics in problems/s
  - errors
  - warnings
- Memory Usage in MB
  - free
  - allocated
- Uptime in seconds
  - uptime
configuration
No configuration is needed.
w1sensor
Data from 1-Wire sensors. On Linux these are supported by the wire, w1_gpio, and w1_therm modules. Currently temperature sensors are supported and automatically detected.
Charts are created dynamically based on the number of detected sensors.
configuration
For detailed configuration information please read w1sensor.conf file.
web_log
Tails the apache/nginx/lighttpd/gunicorn log files to collect real-time web-server statistics.
It produces following charts:
- Response by type requests/s
  - success (1xx, 2xx, 304)
  - error (5xx)
  - redirect (3xx except 304)
  - bad (4xx)
  - other (all other responses)
- Response by code family requests/s
  - 1xx (informational)
  - 2xx (successful)
  - 3xx (redirect)
  - 4xx (bad)
  - 5xx (internal server errors)
  - other (non-standard responses)
  - unmatched (the lines in the log file that are not matched)
- Detailed Response Codes requests/s (number of responses for each response code family individually)
- Bandwidth KB/s
  - received (bandwidth of requests)
  - sent (bandwidth of responses)
- Timings ms (request processing time)
  - min (minimum request processing time)
  - max (maximum request processing time)
  - average (average request processing time)
- Request per url requests/s (configured by user)
- Http Methods requests/s (requests per http method)
- Http Versions requests/s (requests per http version)
- IP protocols requests/s (requests per ip protocol version)
- Current Poll Unique Client IPs unique ips/s (unique client IPs per data collection iteration)
- All Time Unique Client IPs unique ips/s (unique client IPs since the last restart of netdata)
configuration
nginx_log:
  name : 'nginx_log'
  path : '/var/log/nginx/access.log'
apache_log:
  name : 'apache_log'
  path : '/var/log/apache/other_vhosts_access.log'
  categories:
    cacti : 'cacti.*'
    observium : 'observium'
Module has preconfigured jobs for nginx, apache and gunicorn on various distros.
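The grouping used by the "Response by type" chart and the per-url categories can be sketched as follows. Both helpers are hypothetical; in particular, searching the category pattern anywhere in the url with `re.search` is an assumption, and the module's own matching may differ.

```python
import re


def response_type(code):
    """Map an HTTP status code to the 'Response by type' dimension."""
    code = int(code)
    if code == 304 or 100 <= code < 300:
        return 'success'      # 1xx, 2xx and 304
    if 300 <= code < 400:
        return 'redirect'     # 3xx except 304
    if 400 <= code < 500:
        return 'bad'
    if 500 <= code < 600:
        return 'error'
    return 'other'


def url_category(url, categories):
    """Return the first category whose pattern is found in url, else None."""
    for name, pattern in categories.items():
        if re.search(pattern, url):
            return name
    return None


# Categories from the apache_log job above.
CATEGORIES = {'cacti': 'cacti.*', 'observium': 'observium'}
```

For example, a request for `/cacti/graph.php` answered with 304 would count as `success` in the type chart and as `cacti` in the per-url chart.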