mirror of
https://github.com/netdata/netdata.git
synced 2025-04-06 22:38:55 +00:00
New logging layer (#16357)
* cleanup of logging - wip
* first working iteration
* add errno annotator
* replace old logging functions with netdata_logger()
* cleanup
* update error_limit
* fix remaining error_limit references
* work on fatal()
* started working on structured logs
* full cleanup
* default logging to files; fix all plugins initialization
* fix formatting of numbers
* cleanup and reorg
* fix coverity issues
* cleanup obsolete code
* fix formatting of numbers
* fix log rotation
* fix for older systems
* add detection of systemd journal via stderr
* finished on access.log
* remove left-over transport
* do not add empty fields to the logs
* journal gets compact uuids; X-Transaction-ID header is added in web responses
* allow compiling on systems without memfd sealing
* added libnetdata/uuid directory
* move datetime formatters to libnetdata
* add missing files
* link the makefiles in libnetdata
* added uuid_parse_flexi() to parse UUIDs with and without hyphens; the web server now reads X-Transaction-ID and uses it for functions and web responses
* added stream receiver, sender, proc plugin and pluginsd log stack
* iso8601 advanced usage; line_splitter module in libnetdata; code cleanup
* add message ids to streaming inbound and outbound connections
* cleanup line_splitter between lines to avoid logging garbage; when killing children, kill them with SIGABRT if internal checks are enabled
* send SIGABRT to external plugins only if we are not shutting down
* fix cross cleanup in pluginsd parser
* fatal when there is a stack error in logs
* compile netdata with -fexceptions
* do not kill external plugins with SIGABRT
* metasync info logs to debug level
* added severity to logs
* added json output; added options per log output; added documentation; fixed issues mentioned
* allow memfd only on linux
* moved journal low level functions to journal.c/h
* move health logs to daemon.log with proper priorities
* fixed a couple of bugs; health log in journal
* updated docs
* systemd-cat-native command to push structured logs to journal from the command line
* fix makefiles
* restored NETDATA_LOG_SEVERITY_LEVEL
* fix makefiles
* systemd-cat-native can also work as the logger of Netdata scripts
* do not require a socket to systemd-journal to log-as-netdata
* alarm notify logs in native format
* properly compare log ids
* fatals log alerts; alarm-notify.sh working
* fix overflow warning
* alarm-notify.sh now logs the request (command line)
* annotate external plugin logs with the function cmd they run
* added context, component and type to alarm-notify.sh; shell sanitization removes control characters and characters that may be expanded by bash
* reformatted alarm-notify logs
* unify cgroup-network-helper.sh
* added quotes around params
* charts.d.plugin switched logging to journal native
* quotes for logfmt
* unify the status codes of streaming receivers and senders
* alarm-notify: don't log anything if there is nothing to do
* all external plugins log to stderr when running outside netdata; alarm-notify now shows an error when notification methods are needed but are not available
* migrate cgroup-name.sh to new logging
* systemd-cat-native now supports messages with newlines
* socket.c logs use priority
* cleanup log field types
* inherit the systemd set INVOCATION_ID if found
* allow systemd-cat-native to send messages to a systemd-journal-remote URL
* log2journal command that can convert structured logs to journal export format
* various fixes and documentation of log2journal
* updated log2journal docs
* updated log2journal docs
* updated documentation of fields
* allow compiling without libcurl
* do not use socket as format string
* added version information to newly added tools
* updated documentation and help messages
* fix the namespace socket path
* print errno with error
* do not timeout
* updated docs
* updated docs
* updated docs
* log2journal updated docs and params
* when talking to a remote journal, systemd-cat-native batches the messages
* enable lz4 compression for systemd-cat-native when sending messages to a systemd-journal-remote
* Revert "enable lz4 compression for systemd-cat-native when sending messages to a systemd-journal-remote"
This reverts commit b079d53c11.
* note about uncompressed traffic
* log2journal: code reorg and cleanup to make modular
* finished rewriting log2journal
* more comments
* rewriting rules support
* increased limits
* updated docs
* updated docs
* fix old log call
* use journal only when stderr is connected to journal
* update netdata.spec for libcurl, libpcre2 and log2journal
* pcre2-devel
* do not require pcre2 in centos < 8, amazonlinux < 2023, open suse
* log2journal only on systems pcre2 is available
* ignore log2journal in .gitignore
* avoid log2journal on centos 7, amazonlinux 2 and opensuse
* add pcre2-8 to static build
* undo last commit
* Bundle to static
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* Add build deps for deb packages
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* Add dependencies; build from source
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* Test build for amazon linux and centos; expect to fail for suse
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* fix minor oversight
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
* Reorg code
* Add the install from source (deps) as a TODO
* Do not enable the build on the suse ecosystem
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
---------
Signed-off-by: Tasos Katsoulas <tasos@netdata.cloud>
Co-authored-by: Tasos Katsoulas <tasos@netdata.cloud>
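The bullets above introduce systemd-cat-native, which reads journal-native KEY=VALUE blocks on stdin and forwards them as structured logs. A minimal sketch of the message framing this PR's shell helpers emit (field names are taken from the helpers in this diff; `build_native_message` and the identifier `example.sh` are hypothetical, for illustration only):

```shell
#!/bin/sh
# Build one journal-native log entry: KEY=VALUE lines, terminated by an
# empty line. This mirrors what the PR's shell helpers pipe into
# `systemd-cat-native --log-as-netdata`. build_native_message is a
# hypothetical helper, not part of Netdata.
build_native_message() {
  priority="$1"; shift            # syslog priority 0-7 (NDLP_*)
  printf 'SYSLOG_IDENTIFIER=%s\n' "example.sh"
  printf 'PRIORITY=%s\n' "$priority"
  printf 'MESSAGE=%s\n' "$*"
  printf '\n'                     # the empty line ends the entry
}

build_native_message 6 "hello from the new logging layer"
```

Multi-line messages are handled by replacing embedded newlines with a placeholder and invoking systemd-cat-native with `--newline="{NEWLINE}"`, which restores them before submission.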
This commit is contained in:
parent
8f31356a0c
commit
3e508c8f95
120 changed files with 8651 additions and 3037 deletions
.gitignore
Makefile.am
configure.ac
aclk
claim
cli
collectors
apps.plugin
cgroups.plugin
charts.d.plugin
cups.plugin
debugfs.plugin
ebpf.plugin
freeipmi.plugin
nfacct.plugin
perf.plugin
plugins.d
proc.plugin
slabinfo.plugin
statsd.plugin
systemd-journal.plugin
xenstat.plugin
contrib/debian
daemon
database
exporting/aws_kinesis
health
libnetdata
Makefile.am
buffer
buffered_reader
clocks
datetime
functions_evloop
inlined.h
libnetdata.c
libnetdata.h
line_splitter
log
Makefile.am
README.md
journal.c
journal.h
log.c
log.h
log2journal.c
log2journal.md
systemd-cat-native.c
systemd-cat-native.h
socket
threads
uuid
2	.gitignore (vendored)
@@ -41,6 +41,8 @@ sha256sums.txt
 # netdata binaries
 netdata
 netdatacli
+systemd-cat-native
+log2journal
 !netdata/
 upload/
 artifacts/
39	Makefile.am
@@ -128,6 +128,7 @@ AM_CFLAGS = \
 	$(OPTIONAL_CUPS_CFLAGS) \
 	$(OPTIONAL_XENSTAT_CFLAGS) \
 	$(OPTIONAL_BPF_CFLAGS) \
+	$(OPTIONAL_SYSTEMD_CFLAGS) \
 	$(OPTIONAL_GTEST_CFLAGS) \
 	$(NULL)
@@ -145,12 +146,18 @@ LIBNETDATA_FILES = \
 	libnetdata/avl/avl.h \
 	libnetdata/buffer/buffer.c \
 	libnetdata/buffer/buffer.h \
+	libnetdata/buffered_reader/buffered_reader.c \
+	libnetdata/buffered_reader/buffered_reader.h \
 	libnetdata/circular_buffer/circular_buffer.c \
 	libnetdata/circular_buffer/circular_buffer.h \
 	libnetdata/clocks/clocks.c \
 	libnetdata/clocks/clocks.h \
 	libnetdata/completion/completion.c \
 	libnetdata/completion/completion.h \
+	libnetdata/datetime/iso8601.c \
+	libnetdata/datetime/iso8601.h \
+	libnetdata/datetime/rfc7231.c \
+	libnetdata/datetime/rfc7231.h \
 	libnetdata/dictionary/dictionary.c \
 	libnetdata/dictionary/dictionary.h \
 	libnetdata/eval/eval.c \
@@ -167,8 +174,12 @@ LIBNETDATA_FILES = \
 	libnetdata/libnetdata.c \
 	libnetdata/libnetdata.h \
 	libnetdata/required_dummies.h \
+	libnetdata/line_splitter/line_splitter.c \
+	libnetdata/line_splitter/line_splitter.h \
 	libnetdata/locks/locks.c \
 	libnetdata/locks/locks.h \
+	libnetdata/log/journal.c \
+	libnetdata/log/journal.h \
 	libnetdata/log/log.c \
 	libnetdata/log/log.h \
 	libnetdata/onewayalloc/onewayalloc.c \
@@ -195,6 +206,8 @@ LIBNETDATA_FILES = \
 	libnetdata/threads/threads.h \
 	libnetdata/url/url.c \
 	libnetdata/url/url.h \
+	libnetdata/uuid/uuid.c \
+	libnetdata/uuid/uuid.h \
 	libnetdata/json/json.c \
 	libnetdata/json/json.h \
 	libnetdata/json/jsmn.c \
@@ -323,6 +336,16 @@ SYSTEMD_JOURNAL_PLUGIN_FILES = \
 	$(LIBNETDATA_FILES) \
 	$(NULL)
 
+SYSTEMD_CAT_NATIVE_FILES = \
+	libnetdata/log/systemd-cat-native.c \
+	libnetdata/log/systemd-cat-native.h \
+	$(LIBNETDATA_FILES) \
+	$(NULL)
+
+LOG2JOURNAL_FILES = \
+	libnetdata/log/log2journal.c \
+	$(NULL)
+
 CUPS_PLUGIN_FILES = \
 	collectors/cups.plugin/cups_plugin.c \
 	$(LIBNETDATA_FILES) \
@@ -1179,6 +1202,7 @@ NETDATA_COMMON_LIBS = \
 	$(OPTIONAL_MQTT_LIBS) \
 	$(OPTIONAL_UV_LIBS) \
 	$(OPTIONAL_LZ4_LIBS) \
+	$(OPTIONAL_CURL_LIBS) \
 	$(OPTIONAL_ZSTD_LIBS) \
 	$(OPTIONAL_BROTLIENC_LIBS) \
 	$(OPTIONAL_BROTLIDEC_LIBS) \
@@ -1190,6 +1214,7 @@ NETDATA_COMMON_LIBS = \
 	$(OPTIONAL_YAML_LIBS) \
 	$(OPTIONAL_ATOMIC_LIBS) \
 	$(OPTIONAL_DL_LIBS) \
+	$(OPTIONAL_SYSTEMD_LIBS) \
 	$(OPTIONAL_GTEST_LIBS) \
 	$(NULL)
 
@@ -1290,6 +1315,14 @@ if ENABLE_PLUGIN_FREEIPMI
 	$(NULL)
 endif
 
+if ENABLE_LOG2JOURNAL
+    sbin_PROGRAMS += log2journal
+    log2journal_SOURCES = $(LOG2JOURNAL_FILES)
+    log2journal_LDADD = \
+        $(OPTIONAL_PCRE2_LIBS) \
+        $(NULL)
+endif
+
 if ENABLE_PLUGIN_SYSTEMD_JOURNAL
     plugins_PROGRAMS += systemd-journal.plugin
     systemd_journal_plugin_SOURCES = $(SYSTEMD_JOURNAL_PLUGIN_FILES)
@@ -1299,6 +1332,12 @@ if ENABLE_PLUGIN_SYSTEMD_JOURNAL
 	$(NULL)
 endif
 
+sbin_PROGRAMS += systemd-cat-native
+systemd_cat_native_SOURCES = $(SYSTEMD_CAT_NATIVE_FILES)
+systemd_cat_native_LDADD = \
+    $(NETDATA_COMMON_LIBS) \
+    $(NULL)
+
 if ENABLE_PLUGIN_EBPF
     plugins_PROGRAMS += ebpf.plugin
     ebpf_plugin_SOURCES = $(EBPF_PLUGIN_FILES)
90	aclk/aclk.c
@@ -154,7 +154,9 @@ biofailed:
 
 static int wait_till_cloud_enabled()
 {
-    netdata_log_info("Waiting for Cloud to be enabled");
+    nd_log(NDLS_DAEMON, NDLP_INFO,
+           "Waiting for Cloud to be enabled");
+
     while (!netdata_cloud_enabled) {
         sleep_usec(USEC_PER_SEC * 1);
         if (!service_running(SERVICE_ACLK))
@@ -236,14 +238,19 @@ void aclk_mqtt_wss_log_cb(mqtt_wss_log_type_t log_type, const char* str)
         case MQTT_WSS_LOG_WARN:
             error_report("%s", str);
             return;
 
         case MQTT_WSS_LOG_INFO:
-            netdata_log_info("%s", str);
+            nd_log(NDLS_DAEMON, NDLP_INFO,
+                   "%s",
+                   str);
             return;
 
         case MQTT_WSS_LOG_DEBUG:
             netdata_log_debug(D_ACLK, "%s", str);
             return;
 
         default:
-            netdata_log_error("Unknown log type from mqtt_wss");
+            nd_log(NDLS_DAEMON, NDLP_ERR,
+                   "Unknown log type from mqtt_wss");
     }
 }
@@ -297,7 +304,9 @@ static void puback_callback(uint16_t packet_id)
 #endif
 
     if (aclk_shared_state.mqtt_shutdown_msg_id == (int)packet_id) {
-        netdata_log_info("Shutdown message has been acknowledged by the cloud. Exiting gracefully");
+        nd_log(NDLS_DAEMON, NDLP_DEBUG,
+               "Shutdown message has been acknowledged by the cloud. Exiting gracefully");
+
         aclk_shared_state.mqtt_shutdown_msg_rcvd = 1;
     }
 }
@@ -335,9 +344,11 @@ static int handle_connection(mqtt_wss_client client)
     }
 
     if (disconnect_req || aclk_kill_link) {
-        netdata_log_info("Going to restart connection due to disconnect_req=%s (cloud req), aclk_kill_link=%s (reclaim)",
-                         disconnect_req ? "true" : "false",
-                         aclk_kill_link ? "true" : "false");
+        nd_log(NDLS_DAEMON, NDLP_NOTICE,
+               "Going to restart connection due to disconnect_req=%s (cloud req), aclk_kill_link=%s (reclaim)",
+               disconnect_req ? "true" : "false",
+               aclk_kill_link ? "true" : "false");
+
         disconnect_req = 0;
         aclk_kill_link = 0;
         aclk_graceful_disconnect(client);
@@ -390,7 +401,9 @@ static inline void mqtt_connected_actions(mqtt_wss_client client)
 
 void aclk_graceful_disconnect(mqtt_wss_client client)
 {
-    netdata_log_info("Preparing to gracefully shutdown ACLK connection");
+    nd_log(NDLS_DAEMON, NDLP_DEBUG,
+           "Preparing to gracefully shutdown ACLK connection");
+
     aclk_queue_lock();
     aclk_queue_flush();
 
@@ -403,17 +416,22 @@ void aclk_graceful_disconnect(mqtt_wss_client client)
             break;
         }
         if (aclk_shared_state.mqtt_shutdown_msg_rcvd) {
-            netdata_log_info("MQTT App Layer `disconnect` message sent successfully");
+            nd_log(NDLS_DAEMON, NDLP_DEBUG,
+                   "MQTT App Layer `disconnect` message sent successfully");
             break;
         }
     }
-    netdata_log_info("ACLK link is down");
-    netdata_log_access("ACLK DISCONNECTED");
+
+    nd_log(NDLS_DAEMON, NDLP_WARNING, "ACLK link is down");
+    nd_log(NDLS_ACCESS, NDLP_WARNING, "ACLK DISCONNECTED");
+
     aclk_stats_upd_online(0);
     last_disconnect_time = now_realtime_sec();
     aclk_connected = 0;
 
-    netdata_log_info("Attempting to gracefully shutdown the MQTT/WSS connection");
+    nd_log(NDLS_DAEMON, NDLP_DEBUG,
+           "Attempting to gracefully shutdown the MQTT/WSS connection");
+
     mqtt_wss_disconnect(client, 1000);
 }
@@ -455,7 +473,9 @@ static int aclk_block_till_recon_allowed() {
     next_connection_attempt = now_realtime_sec() + (recon_delay / MSEC_PER_SEC);
     last_backoff_value = (float)recon_delay / MSEC_PER_SEC;
 
-    netdata_log_info("Wait before attempting to reconnect in %.3f seconds", recon_delay / (float)MSEC_PER_SEC);
+    nd_log(NDLS_DAEMON, NDLP_DEBUG,
+           "Wait before attempting to reconnect in %.3f seconds", recon_delay / (float)MSEC_PER_SEC);
+
     // we want to wake up from time to time to check netdata_exit
     while (recon_delay)
     {
@@ -593,7 +613,9 @@ static int aclk_attempt_to_connect(mqtt_wss_client client)
         return 1;
     }
 
-    netdata_log_info("Attempting connection now");
+    nd_log(NDLS_DAEMON, NDLP_DEBUG,
+           "Attempting connection now");
+
     memset(&base_url, 0, sizeof(url_t));
     if (url_parse(aclk_cloud_base_url, &base_url)) {
         aclk_status = ACLK_STATUS_INVALID_CLOUD_URL;
@@ -680,7 +702,9 @@ static int aclk_attempt_to_connect(mqtt_wss_client client)
             error_report("Can't use encoding=proto without at least \"proto\" capability.");
             continue;
         }
-        netdata_log_info("New ACLK protobuf protocol negotiated successfully (/env response).");
+
+        nd_log(NDLS_DAEMON, NDLP_DEBUG,
+               "New ACLK protobuf protocol negotiated successfully (/env response).");
 
         memset(&auth_url, 0, sizeof(url_t));
         if (url_parse(aclk_env->auth_endpoint, &auth_url)) {
@@ -750,9 +774,9 @@ static int aclk_attempt_to_connect(mqtt_wss_client client)
 
     if (!ret) {
         last_conn_time_mqtt = now_realtime_sec();
-        netdata_log_info("ACLK connection successfully established");
+        nd_log(NDLS_DAEMON, NDLP_INFO, "ACLK connection successfully established");
         aclk_status = ACLK_STATUS_CONNECTED;
-        netdata_log_access("ACLK CONNECTED");
+        nd_log(NDLS_ACCESS, NDLP_INFO, "ACLK CONNECTED");
         mqtt_connected_actions(client);
         return 0;
     }
@@ -798,7 +822,9 @@ void *aclk_main(void *ptr)
     netdata_thread_disable_cancelability();
 
 #if defined( DISABLE_CLOUD ) || !defined( ENABLE_ACLK )
-    netdata_log_info("Killing ACLK thread -> cloud functionality has been disabled");
+    nd_log(NDLS_DAEMON, NDLP_INFO,
+           "Killing ACLK thread -> cloud functionality has been disabled");
+
     static_thread->enabled = NETDATA_MAIN_THREAD_EXITED;
     return NULL;
 #endif
@@ -857,7 +883,7 @@ void *aclk_main(void *ptr)
             aclk_stats_upd_online(0);
             last_disconnect_time = now_realtime_sec();
             aclk_connected = 0;
-            netdata_log_access("ACLK DISCONNECTED");
+            nd_log(NDLS_ACCESS, NDLP_WARNING, "ACLK DISCONNECTED");
         }
     } while (service_running(SERVICE_ACLK));
 
@@ -924,7 +950,9 @@ void aclk_host_state_update(RRDHOST *host, int cmd)
         rrdhost_aclk_state_unlock(localhost);
         create_query->data.bin_payload.topic = ACLK_TOPICID_CREATE_NODE;
         create_query->data.bin_payload.msg_name = "CreateNodeInstance";
-        netdata_log_info("Registering host=%s, hops=%u", host->machine_guid, host->system_info->hops);
+        nd_log(NDLS_DAEMON, NDLP_DEBUG,
+               "Registering host=%s, hops=%u", host->machine_guid, host->system_info->hops);
+
         aclk_queue_query(create_query);
         return;
     }
@@ -947,8 +975,10 @@ void aclk_host_state_update(RRDHOST *host, int cmd)
     query->data.bin_payload.payload = generate_node_instance_connection(&query->data.bin_payload.size, &node_state_update);
     rrdhost_aclk_state_unlock(localhost);
 
-    netdata_log_info("Queuing status update for node=%s, live=%d, hops=%u",(char*)node_state_update.node_id, cmd,
-                     host->system_info->hops);
+    nd_log(NDLS_DAEMON, NDLP_DEBUG,
+           "Queuing status update for node=%s, live=%d, hops=%u",
+           (char*)node_state_update.node_id, cmd, host->system_info->hops);
+
     freez((void*)node_state_update.node_id);
     query->data.bin_payload.msg_name = "UpdateNodeInstanceConnection";
     query->data.bin_payload.topic = ACLK_TOPICID_NODE_CONN;
@@ -990,9 +1020,10 @@ void aclk_send_node_instances()
             node_state_update.claim_id = localhost->aclk_state.claimed_id;
             query->data.bin_payload.payload = generate_node_instance_connection(&query->data.bin_payload.size, &node_state_update);
             rrdhost_aclk_state_unlock(localhost);
-            netdata_log_info("Queuing status update for node=%s, live=%d, hops=%d",(char*)node_state_update.node_id,
-                             list->live,
-                             list->hops);
+
+            nd_log(NDLS_DAEMON, NDLP_DEBUG,
+                   "Queuing status update for node=%s, live=%d, hops=%d",
+                   (char*)node_state_update.node_id, list->live, list->hops);
 
             freez((void*)node_state_update.capabilities);
             freez((void*)node_state_update.node_id);
@@ -1014,8 +1045,11 @@ void aclk_send_node_instances()
             node_instance_creation.claim_id = localhost->aclk_state.claimed_id,
             create_query->data.bin_payload.payload = generate_node_instance_creation(&create_query->data.bin_payload.size, &node_instance_creation);
             rrdhost_aclk_state_unlock(localhost);
-            netdata_log_info("Queuing registration for host=%s, hops=%d",(char*)node_instance_creation.machine_guid,
-                             list->hops);
+
+            nd_log(NDLS_DAEMON, NDLP_DEBUG,
+                   "Queuing registration for host=%s, hops=%d",
+                   (char*)node_instance_creation.machine_guid, list->hops);
+
             freez((void *)node_instance_creation.machine_guid);
             aclk_queue_query(create_query);
         }
@@ -90,6 +90,12 @@ static bool aclk_web_client_interrupt_cb(struct web_client *w __maybe_unused, vo
 }
 
 static int http_api_v2(struct aclk_query_thread *query_thr, aclk_query_t query) {
+    ND_LOG_STACK lgs[] = {
+        ND_LOG_FIELD_TXT(NDF_SRC_TRANSPORT, "aclk"),
+        ND_LOG_FIELD_END(),
+    };
+    ND_LOG_STACK_PUSH(lgs);
+
     int retval = 0;
     BUFFER *local_buffer = NULL;
     size_t size = 0;
@@ -110,7 +116,7 @@ static int http_api_v2(struct aclk_query_thread *query_thr, aclk_query_t query)
     usec_t t;
     web_client_timeout_checkpoint_set(w, query->timeout);
     if(web_client_timeout_checkpoint_and_check(w, &t)) {
-        netdata_log_access("QUERY CANCELED: QUEUE TIME EXCEEDED %llu ms (LIMIT %d ms)", t / USEC_PER_MS, query->timeout);
+        nd_log(NDLS_ACCESS, NDLP_ERR, "QUERY CANCELED: QUEUE TIME EXCEEDED %llu ms (LIMIT %d ms)", t / USEC_PER_MS, query->timeout);
         retval = 1;
         w->response.code = HTTP_RESP_SERVICE_UNAVAILABLE;
         aclk_http_msg_v2_err(query_thr->client, query->callback_topic, query->msg_id, w->response.code, CLOUD_EC_SND_TIMEOUT, CLOUD_EMSG_SND_TIMEOUT, NULL, 0);
@@ -217,25 +223,8 @@ static int http_api_v2(struct aclk_query_thread *query_thr, aclk_query_t query)
     // send msg.
     w->response.code = aclk_http_msg_v2(query_thr->client, query->callback_topic, query->msg_id, t, query->created, w->response.code, local_buffer->buffer, local_buffer->len);
 
-    struct timeval tv;
-
 cleanup:
-    now_monotonic_high_precision_timeval(&tv);
-    netdata_log_access("%llu: %d '[ACLK]:%d' '%s' (sent/all = %zu/%zu bytes %0.0f%%, prep/sent/total = %0.2f/%0.2f/%0.2f ms) %d '%s'",
-        w->id
-        , gettid()
-        , query_thr->idx
-        , "DATA"
-        , sent
-        , size
-        , size > sent ? -(((size - sent) / (double)size) * 100.0) : ((size > 0) ? (((sent - size ) / (double)size) * 100.0) : 0.0)
-        , dt_usec(&w->timings.tv_ready, &w->timings.tv_in) / 1000.0
-        , dt_usec(&tv, &w->timings.tv_ready) / 1000.0
-        , dt_usec(&tv, &w->timings.tv_in) / 1000.0
-        , w->response.code
-        , strip_control_characters((char *)buffer_tostring(w->url_as_received))
-    );
-
+    web_client_log_completed_request(w, false);
     web_client_release_to_cache(w);
 
     pending_req_list_rm(query->msg_id);
@@ -455,7 +455,7 @@ int cancel_pending_req(const char *msg, size_t msg_len)
         return 1;
     }
 
-    netdata_log_access("ACLK CancelPendingRequest REQ: %s, cloud trace-id: %s", cmd.request_id, cmd.trace_id);
+    nd_log(NDLS_ACCESS, NDLP_NOTICE, "ACLK CancelPendingRequest REQ: %s, cloud trace-id: %s", cmd.request_id, cmd.trace_id);
 
     if (mark_pending_req_cancelled(cmd.request_id))
         error_report("CancelPending Request for %s failed. No such pending request.", cmd.request_id);
@@ -323,11 +323,11 @@ static bool check_claim_param(const char *s) {
 }
 
 void claim_reload_all(void) {
-    error_log_limit_unlimited();
+    nd_log_limits_unlimited();
     load_claiming_state();
    registry_update_cloud_base_url();
     rrdpush_send_claimed_id(localhost);
-    error_log_limit_reset();
+    nd_log_limits_reset();
 }
 
 int api_v2_claim(struct web_client *w, char *url) {
17	cli/cli.c
@@ -3,25 +3,18 @@
 #include "cli.h"
 #include "daemon/pipename.h"
 
-void error_int(int is_collector __maybe_unused, const char *prefix __maybe_unused, const char *file __maybe_unused, const char *function __maybe_unused, const unsigned long line __maybe_unused, const char *fmt, ... ) {
-    FILE *fp = stderr;
-
+void netdata_logger(ND_LOG_SOURCES source, ND_LOG_FIELD_PRIORITY priority, const char *file, const char *function, unsigned long line, const char *fmt, ... ) {
     va_list args;
-    va_start( args, fmt );
-    vfprintf(fp, fmt, args );
-    va_end( args );
+    va_start(args, fmt);
+    vfprintf(stderr, fmt, args );
+    va_end(args);
 }
 
 #ifdef NETDATA_INTERNAL_CHECKS
 
 uint64_t debug_flags;
 
 void debug_int( const char *file __maybe_unused , const char *function __maybe_unused , const unsigned long line __maybe_unused, const char *fmt __maybe_unused, ... )
 {
 }
 
-void fatal_int( const char *file __maybe_unused, const char *function __maybe_unused, const unsigned long line __maybe_unused, const char *fmt __maybe_unused, ... )
+void netdata_logger_fatal( const char *file __maybe_unused, const char *function __maybe_unused, const unsigned long line __maybe_unused, const char *fmt __maybe_unused, ... )
 {
     abort();
 };
@@ -5234,25 +5234,11 @@ static void function_processes(const char *transaction, char *function __maybe_u
 static bool apps_plugin_exit = false;
 
 int main(int argc, char **argv) {
     // debug_flags = D_PROCFILE;
-    stderror = stderr;
-
     clocks_init();
+    nd_log_initialize_for_external_plugins("apps.plugin");
 
     pagesize = (size_t)sysconf(_SC_PAGESIZE);
 
-    // set the name for logging
-    program_name = "apps.plugin";
-
-    // disable syslog for apps.plugin
-    error_log_syslog = 0;
-
-    // set errors flood protection to 100 logs per hour
-    error_log_errors_per_period = 100;
-    error_log_throttle_period = 3600;
-
-    log_set_global_severity_for_external_plugins();
-
     bool send_resource_usage = true;
     {
         const char *s = getenv("NETDATA_INTERNALS_MONITORING");
@@ -12,57 +12,106 @@
 export PATH="${PATH}:/sbin:/usr/sbin:/usr/local/sbin"
 export LC_ALL=C
 
+cmd_line="'${0}' $(printf "'%s' " "${@}")"
+
 # -----------------------------------------------------------------------------
 # logging
 
 PROGRAM_NAME="$(basename "${0}")"
 
-LOG_LEVEL_ERR=1
-LOG_LEVEL_WARN=2
-LOG_LEVEL_INFO=3
-LOG_LEVEL="$LOG_LEVEL_INFO"
+# these should be the same with syslog() priorities
+NDLP_EMERG=0  # system is unusable
+NDLP_ALERT=1  # action must be taken immediately
+NDLP_CRIT=2   # critical conditions
+NDLP_ERR=3    # error conditions
+NDLP_WARN=4   # warning conditions
+NDLP_NOTICE=5 # normal but significant condition
+NDLP_INFO=6   # informational
+NDLP_DEBUG=7  # debug-level messages
 
-set_log_severity_level() {
-  case ${NETDATA_LOG_SEVERITY_LEVEL,,} in
-    "info") LOG_LEVEL="$LOG_LEVEL_INFO";;
-    "warn" | "warning") LOG_LEVEL="$LOG_LEVEL_WARN";;
-    "err" | "error") LOG_LEVEL="$LOG_LEVEL_ERR";;
+# the max (numerically) log level we will log
+LOG_LEVEL=$NDLP_INFO
+
+set_log_min_priority() {
+  case "${NETDATA_LOG_PRIORITY_LEVEL,,}" in
+    "emerg" | "emergency")
+      LOG_LEVEL=$NDLP_EMERG
+      ;;
+
+    "alert")
+      LOG_LEVEL=$NDLP_ALERT
+      ;;
+
+    "crit" | "critical")
+      LOG_LEVEL=$NDLP_CRIT
+      ;;
+
+    "err" | "error")
+      LOG_LEVEL=$NDLP_ERR
+      ;;
+
+    "warn" | "warning")
+      LOG_LEVEL=$NDLP_WARN
+      ;;
+
+    "notice")
+      LOG_LEVEL=$NDLP_NOTICE
+      ;;
+
+    "info")
+      LOG_LEVEL=$NDLP_INFO
+      ;;
+
+    "debug")
+      LOG_LEVEL=$NDLP_DEBUG
+      ;;
   esac
 }
 
-set_log_severity_level
-
-logdate() {
-  date "+%Y-%m-%d %H:%M:%S"
-}
+set_log_min_priority
 
 log() {
-  local status="${1}"
-  shift
+  local level="${1}"
+  shift 1
 
-  echo >&2 "$(logdate): ${PROGRAM_NAME}: ${status}: ${*}"
+  [[ -n "$level" && -n "$LOG_LEVEL" && "$level" -gt "$LOG_LEVEL" ]] && return
 
+  systemd-cat-native --log-as-netdata --newline="{NEWLINE}" <<EOFLOG
+INVOCATION_ID=${NETDATA_INVOCATION_ID}
+SYSLOG_IDENTIFIER=${PROGRAM_NAME}
+PRIORITY=${level}
+THREAD_TAG="cgroup-name"
+ND_LOG_SOURCE=collector
+ND_REQUEST=${cmd_line}
+MESSAGE=${*//[$'\r\n']/{NEWLINE}}
+
+EOFLOG
+  # AN EMPTY LINE IS NEEDED ABOVE
 }
 
 info() {
-  [[ -n "$LOG_LEVEL" && "$LOG_LEVEL_INFO" -gt "$LOG_LEVEL" ]] && return
-  log INFO "${@}"
+  log "$NDLP_INFO" "${@}"
 }
 
 warning() {
-  [[ -n "$LOG_LEVEL" && "$LOG_LEVEL_WARN" -gt "$LOG_LEVEL" ]] && return
-  log WARNING "${@}"
+  log "$NDLP_WARN" "${@}"
 }
 
 error() {
-  [[ -n "$LOG_LEVEL" && "$LOG_LEVEL_ERR" -gt "$LOG_LEVEL" ]] && return
-  log ERROR "${@}"
+  log "$NDLP_ERR" "${@}"
 }
 
 fatal() {
-  log FATAL "${@}"
+  log "$NDLP_ALERT" "${@}"
   exit 1
 }
 
+debug() {
+  log "$NDLP_DEBUG" "${@}"
+}
+
 # -----------------------------------------------------------------------------
 
 function parse_docker_like_inspect_output() {
   local output="${1}"
   eval "$(grep -E "^(NOMAD_NAMESPACE|NOMAD_JOB_NAME|NOMAD_TASK_NAME|NOMAD_SHORT_ALLOC_ID|CONT_NAME|IMAGE_NAME)=" <<<"$output")"
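The rewritten helpers above gate messages numerically against a minimum syslog priority (lower number means more severe). A standalone sketch of that check, runnable on its own (`should_log` is our name for the comparison the `log()` helper performs; it is not a function in this PR):

```shell
#!/bin/sh
# Syslog-style priorities: lower number = more severe.
NDLP_ERR=3
NDLP_INFO=6
NDLP_DEBUG=7

LOG_LEVEL=$NDLP_INFO   # the max (numerically) priority we will log

should_log() {
  # log the message only when its priority is <= the threshold;
  # the scripts above do the inverse test and `return` early instead
  [ "$1" -le "$LOG_LEVEL" ]
}

should_log "$NDLP_ERR"   && echo "ERR is logged"
should_log "$NDLP_DEBUG" || echo "DEBUG is suppressed"
```

Setting `NETDATA_LOG_PRIORITY_LEVEL=debug` in the environment raises the threshold to `NDLP_DEBUG`, so debug messages pass the same check.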
@@ -29,65 +29,117 @@
 
 export LC_ALL=C
 
+cmd_line="'${0}' $(printf "'%s' " "${@}")"
+
 # -----------------------------------------------------------------------------
 # logging
 
 PROGRAM_NAME="$(basename "${0}")"
 
-LOG_LEVEL_ERR=1
-LOG_LEVEL_WARN=2
-LOG_LEVEL_INFO=3
-LOG_LEVEL="$LOG_LEVEL_INFO"
+# these should be the same with syslog() priorities
+NDLP_EMERG=0  # system is unusable
+NDLP_ALERT=1  # action must be taken immediately
+NDLP_CRIT=2   # critical conditions
+NDLP_ERR=3    # error conditions
+NDLP_WARN=4   # warning conditions
+NDLP_NOTICE=5 # normal but significant condition
+NDLP_INFO=6   # informational
+NDLP_DEBUG=7  # debug-level messages
 
-set_log_severity_level() {
-  case ${NETDATA_LOG_SEVERITY_LEVEL,,} in
-    "info") LOG_LEVEL="$LOG_LEVEL_INFO";;
-    "warn" | "warning") LOG_LEVEL="$LOG_LEVEL_WARN";;
-    "err" | "error") LOG_LEVEL="$LOG_LEVEL_ERR";;
+# the max (numerically) log level we will log
+LOG_LEVEL=$NDLP_INFO
+
+set_log_min_priority() {
+  case "${NETDATA_LOG_PRIORITY_LEVEL,,}" in
+    "emerg" | "emergency")
+      LOG_LEVEL=$NDLP_EMERG
+      ;;
+
+    "alert")
+      LOG_LEVEL=$NDLP_ALERT
+      ;;
+
+    "crit" | "critical")
+      LOG_LEVEL=$NDLP_CRIT
+      ;;
+
+    "err" | "error")
+      LOG_LEVEL=$NDLP_ERR
+      ;;
+
+    "warn" | "warning")
+      LOG_LEVEL=$NDLP_WARN
+      ;;
+
+    "notice")
+      LOG_LEVEL=$NDLP_NOTICE
+      ;;
+
+    "info")
+      LOG_LEVEL=$NDLP_INFO
+      ;;
+
+    "debug")
+      LOG_LEVEL=$NDLP_DEBUG
+      ;;
   esac
 }
 
-set_log_severity_level
-
-logdate() {
-  date "+%Y-%m-%d %H:%M:%S"
-}
+set_log_min_priority
 
 log() {
-  local status="${1}"
-  shift
+  local level="${1}"
+  shift 1
 
-  echo >&2 "$(logdate): ${PROGRAM_NAME}: ${status}: ${*}"
+  [[ -n "$level" && -n "$LOG_LEVEL" && "$level" -gt "$LOG_LEVEL" ]] && return
 
+  systemd-cat-native --log-as-netdata --newline="{NEWLINE}" <<EOFLOG
+INVOCATION_ID=${NETDATA_INVOCATION_ID}
+SYSLOG_IDENTIFIER=${PROGRAM_NAME}
+PRIORITY=${level}
+THREAD_TAG="cgroup-network-helper.sh"
+ND_LOG_SOURCE=collector
+ND_REQUEST=${cmd_line}
+MESSAGE=${*//[$'\r\n']/{NEWLINE}}
+
+EOFLOG
+  # AN EMPTY LINE IS NEEDED ABOVE
 }
 
 info() {
-  [[ -n "$LOG_LEVEL" && "$LOG_LEVEL_INFO" -gt "$LOG_LEVEL" ]] && return
-  log INFO "${@}"
+  log "$NDLP_INFO" "${@}"
 }
 
 warning() {
-  [[ -n "$LOG_LEVEL" && "$LOG_LEVEL_WARN" -gt "$LOG_LEVEL" ]] && return
-  log WARNING "${@}"
+  log "$NDLP_WARN" "${@}"
 }
 
 error() {
-  [[ -n "$LOG_LEVEL" && "$LOG_LEVEL_ERR" -gt "$LOG_LEVEL" ]] && return
-  log ERROR "${@}"
+  log "$NDLP_ERR" "${@}"
 }
 
 fatal() {
-  log FATAL "${@}"
-  exit 1
+  log "$NDLP_ALERT" "${@}"
+  exit 1
 }
 
-debug=${NETDATA_CGROUP_NETWORK_HELPER_DEBUG=0}
 debug() {
-  [ "${debug}" = "1" ] && log DEBUG "${@}"
+  log "$NDLP_DEBUG" "${@}"
 }
 
+debug=0
+if [ "${NETDATA_CGROUP_NETWORK_HELPER_DEBUG-0}" = "1" ]; then
+  debug=1
+  LOG_LEVEL=$NDLP_DEBUG
+fi
+
 # -----------------------------------------------------------------------------
 # check for BASH v4+ (required for associative arrays)
 
-[ $(( BASH_VERSINFO[0] )) -lt 4 ] && \
-  fatal "BASH version 4 or later is required (this is ${BASH_VERSION})."
+if [ ${BASH_VERSINFO[0]} -lt 4 ]; then
+  echo >&2 "BASH version 4 or later is required (this is ${BASH_VERSION})."
+  exit 1
+fi
 
 # -----------------------------------------------------------------------------
 # parse the arguments
@@ -99,7 +151,10 @@ do
     case "${1}" in
         --cgroup) cgroup="${2}"; shift 1;;
         --pid|-p) pid="${2}"; shift 1;;
-        --debug|debug) debug=1;;
+        --debug|debug)
+            debug=1
+            LOG_LEVEL=$NDLP_DEBUG
+            ;;
         *) fatal "Cannot understand argument '${1}'";;
     esac
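The helper above gates messages numerically: syslog-style priorities where a larger number means less severe, and anything numerically greater than `LOG_LEVEL` is suppressed before `systemd-cat-native` is invoked. A minimal standalone sketch of that gate (`should_log` is a hypothetical name used here for illustration; the script inlines the same test inside `log()`):

```shell
#!/usr/bin/env bash
# syslog-style priorities: lower number = more severe
NDLP_ERR=3
NDLP_INFO=6
NDLP_DEBUG=7

# the max (numerically) priority we will log
LOG_LEVEL=$NDLP_INFO

# succeeds when a message at this priority should be emitted
should_log() {
    local level="${1}"
    # same test the plugin scripts use before calling systemd-cat-native
    [[ -n "$level" && -n "$LOG_LEVEL" && "$level" -gt "$LOG_LEVEL" ]] && return 1
    return 0
}

should_log "$NDLP_ERR"   && echo "ERR passes"
should_log "$NDLP_INFO"  && echo "INFO passes"
should_log "$NDLP_DEBUG" || echo "DEBUG suppressed"
```

Setting `NETDATA_LOG_PRIORITY_LEVEL=debug` (or passing `--debug`) raises `LOG_LEVEL` to 7, so the debug message would pass the gate too.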
@@ -649,12 +649,11 @@ void usage(void) {
 }
 
 int main(int argc, char **argv) {
-    stderror = stderr;
     pid_t pid = 0;
 
     program_name = argv[0];
     program_version = VERSION;
-    error_log_syslog = 0;
-
     clocks_init();
+    nd_log_initialize_for_external_plugins("cgroup-network");
 
     // since cgroup-network runs as root, prevent it from opening symbolic links
     procfile_open_flags = O_RDONLY|O_NOFOLLOW;

@@ -687,8 +686,6 @@ int main(int argc, char **argv) {
 
     if(argc != 3)
         usage();
 
-    log_set_global_severity_for_external_plugins();
-
     int arg = 1;
     int helper = 1;
@@ -16,24 +16,111 @@
 export PATH="${PATH}:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin"
 
 PROGRAM_FILE="$0"
-PROGRAM_NAME="$(basename $0)"
-PROGRAM_NAME="${PROGRAM_NAME/.plugin/}"
 MODULE_NAME="main"
 
-LOG_LEVEL_ERR=1
-LOG_LEVEL_WARN=2
-LOG_LEVEL_INFO=3
-LOG_LEVEL="$LOG_LEVEL_INFO"
-
 # -----------------------------------------------------------------------------
 # logging
 
-set_log_severity_level() {
-    case ${NETDATA_LOG_SEVERITY_LEVEL,,} in
-        "info") LOG_LEVEL="$LOG_LEVEL_INFO";;
-        "warn" | "warning") LOG_LEVEL="$LOG_LEVEL_WARN";;
-        "err" | "error") LOG_LEVEL="$LOG_LEVEL_ERR";;
-    esac
-}
-set_log_severity_level
+PROGRAM_NAME="$(basename "${0}")"
+
+# these should be the same with syslog() priorities
+NDLP_EMERG=0   # system is unusable
+NDLP_ALERT=1   # action must be taken immediately
+NDLP_CRIT=2    # critical conditions
+NDLP_ERR=3     # error conditions
+NDLP_WARN=4    # warning conditions
+NDLP_NOTICE=5  # normal but significant condition
+NDLP_INFO=6    # informational
+NDLP_DEBUG=7   # debug-level messages
+
+# the max (numerically) log level we will log
+LOG_LEVEL=$NDLP_INFO
+
+set_log_min_priority() {
+    case "${NETDATA_LOG_PRIORITY_LEVEL,,}" in
+        "emerg" | "emergency")
+            LOG_LEVEL=$NDLP_EMERG
+            ;;
+
+        "alert")
+            LOG_LEVEL=$NDLP_ALERT
+            ;;
+
+        "crit" | "critical")
+            LOG_LEVEL=$NDLP_CRIT
+            ;;
+
+        "err" | "error")
+            LOG_LEVEL=$NDLP_ERR
+            ;;
+
+        "warn" | "warning")
+            LOG_LEVEL=$NDLP_WARN
+            ;;
+
+        "notice")
+            LOG_LEVEL=$NDLP_NOTICE
+            ;;
+
+        "info")
+            LOG_LEVEL=$NDLP_INFO
+            ;;
+
+        "debug")
+            LOG_LEVEL=$NDLP_DEBUG
+            ;;
+    esac
+}
+set_log_min_priority
 
+log() {
+    local level="${1}"
+    shift 1
+
+    [[ -n "$level" && -n "$LOG_LEVEL" && "$level" -gt "$LOG_LEVEL" ]] && return
+
+    systemd-cat-native --log-as-netdata <<EOFLOG
+INVOCATION_ID=${NETDATA_INVOCATION_ID}
+SYSLOG_IDENTIFIER=${PROGRAM_NAME}
+PRIORITY=${level}
+THREAD_TAG="charts.d.plugin"
+ND_LOG_SOURCE=collector
+MESSAGE=${MODULE_NAME}: ${*//[$'\r\n']}
+
+EOFLOG
+    # AN EMPTY LINE IS NEEDED ABOVE
+}
+
+info() {
+    log "$NDLP_INFO" "${@}"
+}
+
+warning() {
+    log "$NDLP_WARN" "${@}"
+}
+
+error() {
+    log "$NDLP_ERR" "${@}"
+}
+
+fatal() {
+    log "$NDLP_ALERT" "${@}"
+    echo "DISABLE"
+    exit 1
+}
+
+debug() {
+    [ "$debug" = "1" ] && log "$NDLP_DEBUG" "${@}"
+}
+
 # -----------------------------------------------------------------------------
 # check for BASH v4+ (required for associative arrays)
 
+if [ ${BASH_VERSINFO[0]} -lt 4 ]; then
+    echo >&2 "BASH version 4 or later is required (this is ${BASH_VERSION})."
+    exit 1
+fi
+
 # -----------------------------------------------------------------------------
 # create temp dir
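The new `log()` above streams a systemd journal-export-style block of `KEY=VALUE` fields to `systemd-cat-native`, with CR/LF stripped from the message and a terminating empty line closing each entry (hence the "AN EMPTY LINE IS NEEDED ABOVE" comment). A sketch of that serialization, using a hypothetical `format_entry` helper that only prints the block instead of piping it anywhere:

```shell
#!/usr/bin/env bash
# Hypothetical helper: formats one log entry the way log() feeds
# systemd-cat-native -- KEY=VALUE fields, CR/LF removed from the
# message, and a terminating empty line to close the entry.
format_entry() {
    local program="$1" module="$2" priority="$3"
    shift 3
    printf 'SYSLOG_IDENTIFIER=%s\n' "$program"
    printf 'PRIORITY=%s\n' "$priority"
    printf 'ND_LOG_SOURCE=collector\n'
    # same parameter expansion the plugin uses to strip newlines
    printf 'MESSAGE=%s: %s\n' "$module" "${*//[$'\r\n']}"
    printf '\n'   # an empty line terminates the journal entry
}

format_entry charts.d main 6 "hello"$'\n'"world"
```

Running it prints the four fields followed by a blank line; the embedded newline in the message is removed, so `MESSAGE=main: helloworld`.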
@@ -62,39 +149,6 @@ logdate() {
     date "+%Y-%m-%d %H:%M:%S"
 }
 
-log() {
-    local status="${1}"
-    shift
-
-    echo >&2 "$(logdate): ${PROGRAM_NAME}: ${status}: ${MODULE_NAME}: ${*}"
-
-}
-
-info() {
-    [[ -n "$LOG_LEVEL" && "$LOG_LEVEL_INFO" -gt "$LOG_LEVEL" ]] && return
-    log INFO "${@}"
-}
-
-warning() {
-    [[ -n "$LOG_LEVEL" && "$LOG_LEVEL_WARN" -gt "$LOG_LEVEL" ]] && return
-    log WARNING "${@}"
-}
-
-error() {
-    [[ -n "$LOG_LEVEL" && "$LOG_LEVEL_ERR" -gt "$LOG_LEVEL" ]] && return
-    log ERROR "${@}"
-}
-
-fatal() {
-    log FATAL "${@}"
-    echo "DISABLE"
-    exit 1
-}
-
-debug() {
-    [ $debug -eq 1 ] && log DEBUG "${@}"
-}
-
 # -----------------------------------------------------------------------------
 # check a few commands
@@ -194,12 +248,14 @@ while [ ! -z "$1" ]; do
 
     if [ "$1" = "debug" -o "$1" = "all" ]; then
         debug=1
+        LOG_LEVEL=$NDLP_DEBUG
         shift
         continue
     fi
 
     if [ -f "$chartsd/$1.chart.sh" ]; then
         debug=1
+        LOG_LEVEL=$NDLP_DEBUG
         chart_only="$(echo $1.chart.sh | sed "s/\.chart\.sh$//g")"
         shift
         continue
@@ -207,6 +263,7 @@ while [ ! -z "$1" ]; do
 
     if [ -f "$chartsd/$1" ]; then
         debug=1
+        LOG_LEVEL=$NDLP_DEBUG
         chart_only="$(echo $1 | sed "s/\.chart\.sh$//g")"
         shift
         continue
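The argument-handling hunks above repeat one pattern: any debug-style argument now both sets the script's `debug` flag and raises the journal priority ceiling, so `debug()` messages pass the numeric gate. A self-contained sketch of that pattern (`parse_arg` is a hypothetical name; the plugin does this inline in its `while` loop):

```shell
#!/usr/bin/env bash
NDLP_INFO=6
NDLP_DEBUG=7
LOG_LEVEL=$NDLP_INFO
debug=0

# "debug" or "all" on the command line enables debug output
# and raises the max priority we will log
parse_arg() {
    case "$1" in
        debug|all)
            debug=1
            LOG_LEVEL=$NDLP_DEBUG
            ;;
    esac
}

parse_arg debug
echo "debug=$debug LOG_LEVEL=$LOG_LEVEL"
```

This prints `debug=1 LOG_LEVEL=7`: without the `LOG_LEVEL` bump, the `debug()` helper would format entries that the priority gate then drops.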
@@ -226,22 +226,8 @@ void reset_metrics() {
 }
 
 int main(int argc, char **argv) {
-    stderror = stderr;
     clocks_init();
 
     // ------------------------------------------------------------------------
     // initialization of netdata plugin
 
     program_name = "cups.plugin";
 
-    // disable syslog
-    error_log_syslog = 0;
-
-    // set errors flood protection to 100 logs per hour
-    error_log_errors_per_period = 100;
-    error_log_throttle_period = 3600;
-
-    log_set_global_severity_for_external_plugins();
+    nd_log_initialize_for_external_plugins("cups.plugin");
 
     parse_command_line(argc, argv);
@@ -159,16 +159,8 @@ static void debugfs_parse_args(int argc, char **argv)
 
 int main(int argc, char **argv)
 {
     // debug_flags = D_PROCFILE;
-    stderror = stderr;
-
-    // set the name for logging
-    program_name = "debugfs.plugin";
-
-    // disable syslog for debugfs.plugin
-    error_log_syslog = 0;
-
-    log_set_global_severity_for_external_plugins();
     clocks_init();
+    nd_log_initialize_for_external_plugins("debugfs.plugin");
 
     netdata_configured_host_prefix = getenv("NETDATA_HOST_PREFIX");
     if (verify_netdata_host_prefix() == -1)
@@ -4024,11 +4024,9 @@ static void ebpf_manage_pid(pid_t pid)
 */
 int main(int argc, char **argv)
 {
-    stderror = stderr;
-
-    log_set_global_severity_for_external_plugins();
-
     clocks_init();
+    nd_log_initialize_for_external_plugins("ebpf.plugin");
 
     main_thread_id = gettid();
 
     set_global_variables();

@@ -4038,16 +4036,6 @@ int main(int argc, char **argv)
     if (ebpf_check_conditions())
         return 2;
 
-    // set name
-    program_name = "ebpf.plugin";
-
-    // disable syslog
-    error_log_syslog = 0;
-
-    // set errors flood protection to 100 logs per hour
-    error_log_errors_per_period = 100;
-    error_log_throttle_period = 3600;
-
     if (ebpf_adjust_memory_limit())
         return 3;
@@ -1622,30 +1622,14 @@ static void plugin_exit(int code) {
 }
 
 int main (int argc, char **argv) {
+    clocks_init();
+    nd_log_initialize_for_external_plugins("freeipmi.plugin");
+    netdata_threads_init_for_external_plugins(0); // set the default threads stack size here
+
     bool netdata_do_sel = IPMI_ENABLE_SEL_BY_DEFAULT;
 
-    stderror = stderr;
-    clocks_init();
-
     bool debug = false;
 
-    // ------------------------------------------------------------------------
-    // initialization of netdata plugin
-
-    program_name = "freeipmi.plugin";
-
-    // disable syslog
-    error_log_syslog = 0;
-
-    // set errors flood protection to 100 logs per hour
-    error_log_errors_per_period = 100;
-    error_log_throttle_period = 3600;
-
-    log_set_global_severity_for_external_plugins();
-
-    // initialize the threads
-    netdata_threads_init_for_external_plugins(0); // set the default threads stack size here
-
     // ------------------------------------------------------------------------
     // parse command line parameters
@@ -747,22 +747,8 @@ void nfacct_signals()
 }
 
 int main(int argc, char **argv) {
-    stderror = stderr;
     clocks_init();
 
-    // ------------------------------------------------------------------------
-    // initialization of netdata plugin
-
-    program_name = "nfacct.plugin";
-
-    // disable syslog
-    error_log_syslog = 0;
-
-    // set errors flood protection to 100 logs per hour
-    error_log_errors_per_period = 100;
-    error_log_throttle_period = 3600;
-
-    log_set_global_severity_for_external_plugins();
+    nd_log_initialize_for_external_plugins("nfacct.plugin");
 
     // ------------------------------------------------------------------------
     // parse command line parameters
@@ -1283,22 +1283,8 @@ void parse_command_line(int argc, char **argv) {
 }
 
 int main(int argc, char **argv) {
-    stderror = stderr;
     clocks_init();
 
-    // ------------------------------------------------------------------------
-    // initialization of netdata plugin
-
-    program_name = "perf.plugin";
-
-    // disable syslog
-    error_log_syslog = 0;
-
-    // set errors flood protection to 100 logs per hour
-    error_log_errors_per_period = 100;
-    error_log_throttle_period = 3600;
-
-    log_set_global_severity_for_external_plugins();
+    nd_log_initialize_for_external_plugins("perf.plugin");
 
     parse_command_line(argc, argv);
@@ -47,8 +47,7 @@ static inline bool plugin_is_running(struct plugind *cd) {
     return ret;
 }
 
-static void pluginsd_worker_thread_cleanup(void *arg)
-{
+static void pluginsd_worker_thread_cleanup(void *arg) {
     struct plugind *cd = (struct plugind *)arg;
 
     worker_unregister();

@@ -143,41 +142,64 @@ static void *pluginsd_worker_thread(void *arg) {
 
     netdata_thread_cleanup_push(pluginsd_worker_thread_cleanup, arg);
 
-    struct plugind *cd = (struct plugind *)arg;
-    plugin_set_running(cd);
+    {
+        struct plugind *cd = (struct plugind *) arg;
+        plugin_set_running(cd);
 
-    size_t count = 0;
+        size_t count = 0;
 
-    while (service_running(SERVICE_COLLECTORS)) {
-        FILE *fp_child_input = NULL;
-        FILE *fp_child_output = netdata_popen(cd->cmd, &cd->unsafe.pid, &fp_child_input);
+        while(service_running(SERVICE_COLLECTORS)) {
+            FILE *fp_child_input = NULL;
+            FILE *fp_child_output = netdata_popen(cd->cmd, &cd->unsafe.pid, &fp_child_input);
 
-        if (unlikely(!fp_child_input || !fp_child_output)) {
-            netdata_log_error("PLUGINSD: 'host:%s', cannot popen(\"%s\", \"r\").", rrdhost_hostname(cd->host), cd->cmd);
-            break;
-        }
+            if(unlikely(!fp_child_input || !fp_child_output)) {
+                netdata_log_error("PLUGINSD: 'host:%s', cannot popen(\"%s\", \"r\").",
+                                  rrdhost_hostname(cd->host), cd->cmd);
+                break;
+            }
 
-        netdata_log_info("PLUGINSD: 'host:%s' connected to '%s' running on pid %d",
-                         rrdhost_hostname(cd->host), cd->fullfilename, cd->unsafe.pid);
+            nd_log(NDLS_DAEMON, NDLP_DEBUG,
+                   "PLUGINSD: 'host:%s' connected to '%s' running on pid %d",
+                   rrdhost_hostname(cd->host),
+                   cd->fullfilename, cd->unsafe.pid);
 
-        count = pluginsd_process(cd->host, cd, fp_child_input, fp_child_output, 0);
+            const char *plugin = strrchr(cd->fullfilename, '/');
+            if(plugin)
+                plugin++;
+            else
+                plugin = cd->fullfilename;
 
-        netdata_log_info("PLUGINSD: 'host:%s', '%s' (pid %d) disconnected after %zu successful data collections (ENDs).",
-                         rrdhost_hostname(cd->host), cd->fullfilename, cd->unsafe.pid, count);
+            char module[100];
+            snprintfz(module, sizeof(module), "plugins.d[%s]", plugin);
+            ND_LOG_STACK lgs[] = {
+                ND_LOG_FIELD_TXT(NDF_MODULE, module),
+                ND_LOG_FIELD_TXT(NDF_NIDL_NODE, rrdhost_hostname(cd->host)),
+                ND_LOG_FIELD_TXT(NDF_SRC_TRANSPORT, "pluginsd"),
+                ND_LOG_FIELD_END(),
+            };
+            ND_LOG_STACK_PUSH(lgs);
 
-        killpid(cd->unsafe.pid);
+            count = pluginsd_process(cd->host, cd, fp_child_input, fp_child_output, 0);
 
-        int worker_ret_code = netdata_pclose(fp_child_input, fp_child_output, cd->unsafe.pid);
+            nd_log(NDLS_DAEMON, NDLP_DEBUG,
+                   "PLUGINSD: 'host:%s', '%s' (pid %d) disconnected after %zu successful data collections (ENDs).",
+                   rrdhost_hostname(cd->host), cd->fullfilename, cd->unsafe.pid, count);
 
-        if (likely(worker_ret_code == 0))
-            pluginsd_worker_thread_handle_success(cd);
-        else
-            pluginsd_worker_thread_handle_error(cd, worker_ret_code);
+            killpid(cd->unsafe.pid);
 
-        cd->unsafe.pid = 0;
-        if (unlikely(!plugin_is_enabled(cd)))
-            break;
-    }
+            int worker_ret_code = netdata_pclose(fp_child_input, fp_child_output, cd->unsafe.pid);
+
+            if(likely(worker_ret_code == 0))
+                pluginsd_worker_thread_handle_success(cd);
+            else
+                pluginsd_worker_thread_handle_error(cd, worker_ret_code);
+
+            cd->unsafe.pid = 0;
+
+            if(unlikely(!plugin_is_enabled(cd)))
+                break;
+        }
+    }
 
     netdata_thread_cleanup_pop(1);
     return NULL;
@@ -21,8 +21,6 @@
 #define PLUGINSD_KEYWORD_REPORT_JOB_STATUS "REPORT_JOB_STATUS"
 #define PLUGINSD_KEYWORD_DELETE_JOB "DELETE_JOB"
 
-#define PLUGINSD_MAX_WORDS 30
-
 #define PLUGINSD_MAX_DIRECTORIES 20
 extern char *plugin_directories[PLUGINSD_MAX_DIRECTORIES];
@@ -153,11 +153,12 @@ static inline bool pluginsd_set_scope_chart(PARSER *parser, RRDSET *st, const ch
 
     if(unlikely(old_collector_tid)) {
         if(old_collector_tid != my_collector_tid) {
-            error_limit_static_global_var(erl, 1, 0);
-            error_limit(&erl, "PLUGINSD: keyword %s: 'host:%s/chart:%s' is collected twice (my tid %d, other collector tid %d)",
-                        keyword ? keyword : "UNKNOWN",
-                        rrdhost_hostname(st->rrdhost), rrdset_id(st),
-                        my_collector_tid, old_collector_tid);
+            nd_log_limit_static_global_var(erl, 1, 0);
+            nd_log_limit(&erl, NDLS_COLLECTORS, NDLP_WARNING,
+                         "PLUGINSD: keyword %s: 'host:%s/chart:%s' is collected twice (my tid %d, other collector tid %d)",
+                         keyword ? keyword : "UNKNOWN",
+                         rrdhost_hostname(st->rrdhost), rrdset_id(st),
+                         my_collector_tid, old_collector_tid);
 
             return false;
         }
@@ -389,8 +390,9 @@ static inline PARSER_RC PLUGINSD_DISABLE_PLUGIN(PARSER *parser, const char *keyw
     parser->user.enabled = 0;
 
     if(keyword && msg) {
-        error_limit_static_global_var(erl, 1, 0);
-        error_limit(&erl, "PLUGINSD: keyword %s: %s", keyword, msg);
+        nd_log_limit_static_global_var(erl, 1, 0);
+        nd_log_limit(&erl, NDLS_COLLECTORS, NDLP_INFO,
+                     "PLUGINSD: keyword %s: %s", keyword, msg);
     }
 
     return PARSER_RC_ERROR;
@@ -1109,7 +1111,8 @@ void pluginsd_function_cancel(void *data) {
     dfe_done(t);
 
     if(sent <= 0)
-        netdata_log_error("PLUGINSD: FUNCTION_CANCEL request didn't match any pending function requests in pluginsd.d.");
+        nd_log(NDLS_DAEMON, NDLP_NOTICE,
+               "PLUGINSD: FUNCTION_CANCEL request didn't match any pending function requests in pluginsd.d.");
 }
 
 // this is the function that is called from
@@ -1626,9 +1629,10 @@ static inline PARSER_RC pluginsd_replay_set(char **words, size_t num_words, PARS
     if(!st) return PLUGINSD_DISABLE_PLUGIN(parser, NULL, NULL);
 
     if(!parser->user.replay.rset_enabled) {
-        error_limit_static_thread_var(erl, 1, 0);
-        error_limit(&erl, "PLUGINSD: 'host:%s/chart:%s' got a %s but it is disabled by %s errors",
-                    rrdhost_hostname(host), rrdset_id(st), PLUGINSD_KEYWORD_REPLAY_SET, PLUGINSD_KEYWORD_REPLAY_BEGIN);
+        nd_log_limit_static_thread_var(erl, 1, 0);
+        nd_log_limit(&erl, NDLS_COLLECTORS, NDLP_ERR,
+                     "PLUGINSD: 'host:%s/chart:%s' got a %s but it is disabled by %s errors",
+                     rrdhost_hostname(host), rrdset_id(st), PLUGINSD_KEYWORD_REPLAY_SET, PLUGINSD_KEYWORD_REPLAY_BEGIN);
 
         // we have to return OK here
         return PARSER_RC_OK;
@@ -1675,8 +1679,10 @@ static inline PARSER_RC pluginsd_replay_set(char **words, size_t num_words, PARS
         rd->collector.counter++;
     }
     else {
-        error_limit_static_global_var(erl, 1, 0);
-        error_limit(&erl, "PLUGINSD: 'host:%s/chart:%s/dim:%s' has the ARCHIVED flag set, but it is replicated. Ignoring data.",
+        nd_log_limit_static_global_var(erl, 1, 0);
+        nd_log_limit(&erl, NDLS_COLLECTORS, NDLP_WARNING,
+                     "PLUGINSD: 'host:%s/chart:%s/dim:%s' has the ARCHIVED flag set, but it is replicated. "
+                     "Ignoring data.",
                      rrdhost_hostname(st->rrdhost), rrdset_id(st), rrddim_name(rd));
     }
 }
@@ -2832,61 +2838,6 @@ static inline PARSER_RC streaming_claimed_id(char **words, size_t num_words, PAR
 
 // ----------------------------------------------------------------------------
 
-static inline bool buffered_reader_read(struct buffered_reader *reader, int fd) {
-#ifdef NETDATA_INTERNAL_CHECKS
-    if(reader->read_buffer[reader->read_len] != '\0')
-        fatal("%s(): read_buffer does not start with zero", __FUNCTION__ );
-#endif
-
-    ssize_t bytes_read = read(fd, reader->read_buffer + reader->read_len, sizeof(reader->read_buffer) - reader->read_len - 1);
-    if(unlikely(bytes_read <= 0))
-        return false;
-
-    reader->read_len += bytes_read;
-    reader->read_buffer[reader->read_len] = '\0';
-
-    return true;
-}
-
-static inline bool buffered_reader_read_timeout(struct buffered_reader *reader, int fd, int timeout_ms) {
-    errno = 0;
-    struct pollfd fds[1];
-
-    fds[0].fd = fd;
-    fds[0].events = POLLIN;
-
-    int ret = poll(fds, 1, timeout_ms);
-
-    if (ret > 0) {
-        /* There is data to read */
-        if (fds[0].revents & POLLIN)
-            return buffered_reader_read(reader, fd);
-
-        else if(fds[0].revents & POLLERR) {
-            netdata_log_error("PARSER: read failed: POLLERR.");
-            return false;
-        }
-        else if(fds[0].revents & POLLHUP) {
-            netdata_log_error("PARSER: read failed: POLLHUP.");
-            return false;
-        }
-        else if(fds[0].revents & POLLNVAL) {
-            netdata_log_error("PARSER: read failed: POLLNVAL.");
-            return false;
-        }
-
-        netdata_log_error("PARSER: poll() returned positive number, but POLLIN|POLLERR|POLLHUP|POLLNVAL are not set.");
-        return false;
-    }
-    else if (ret == 0) {
-        netdata_log_error("PARSER: timeout while waiting for data.");
-        return false;
-    }
-
-    netdata_log_error("PARSER: poll() failed with code %d.", ret);
-    return false;
-}
-
 void pluginsd_process_thread_cleanup(void *ptr) {
     PARSER *parser = (PARSER *)ptr;
 
@@ -2905,6 +2856,33 @@ void pluginsd_process_thread_cleanup(void *ptr) {
     parser_destroy(parser);
 }
 
+bool parser_reconstruct_node(BUFFER *wb, void *ptr) {
+    PARSER *parser = ptr;
+    if(!parser || !parser->user.host)
+        return false;
+
+    buffer_strcat(wb, rrdhost_hostname(parser->user.host));
+    return true;
+}
+
+bool parser_reconstruct_instance(BUFFER *wb, void *ptr) {
+    PARSER *parser = ptr;
+    if(!parser || !parser->user.st)
+        return false;
+
+    buffer_strcat(wb, rrdset_name(parser->user.st));
+    return true;
+}
+
+bool parser_reconstruct_context(BUFFER *wb, void *ptr) {
+    PARSER *parser = ptr;
+    if(!parser || !parser->user.st)
+        return false;
+
+    buffer_strcat(wb, string2str(parser->user.st->context));
+    return true;
+}
+
 inline size_t pluginsd_process(RRDHOST *host, struct plugind *cd, FILE *fp_plugin_input, FILE *fp_plugin_output, int trust_durations)
 {
     int enabled = cd->unsafe.enabled;
@@ -2952,33 +2930,51 @@ inline size_t pluginsd_process(RRDHOST *host, struct plugind *cd, FILE *fp_plugi
     // so, parser needs to be allocated before pushing it
     netdata_thread_cleanup_push(pluginsd_process_thread_cleanup, parser);
 
-    buffered_reader_init(&parser->reader);
-    BUFFER *buffer = buffer_create(sizeof(parser->reader.read_buffer) + 2, NULL);
-    while(likely(service_running(SERVICE_COLLECTORS))) {
-        if (unlikely(!buffered_reader_next_line(&parser->reader, buffer))) {
-            if(unlikely(!buffered_reader_read_timeout(&parser->reader, fileno((FILE *)parser->fp_input), 2 * 60 * MSEC_PER_SEC)))
-                break;
+    {
+        ND_LOG_STACK lgs[] = {
+            ND_LOG_FIELD_CB(NDF_REQUEST, line_splitter_reconstruct_line, &parser->line),
+            ND_LOG_FIELD_CB(NDF_NIDL_NODE, parser_reconstruct_node, parser),
+            ND_LOG_FIELD_CB(NDF_NIDL_INSTANCE, parser_reconstruct_instance, parser),
+            ND_LOG_FIELD_CB(NDF_NIDL_CONTEXT, parser_reconstruct_context, parser),
+            ND_LOG_FIELD_END(),
+        };
+        ND_LOG_STACK_PUSH(lgs);
 
-            continue;
-        }
+        buffered_reader_init(&parser->reader);
+        BUFFER *buffer = buffer_create(sizeof(parser->reader.read_buffer) + 2, NULL);
+        while(likely(service_running(SERVICE_COLLECTORS))) {
 
-        if(unlikely(parser_action(parser, buffer->buffer)))
-            break;
+            if(unlikely(!buffered_reader_next_line(&parser->reader, buffer))) {
+                buffered_reader_ret_t ret = buffered_reader_read_timeout(
+                    &parser->reader,
+                    fileno((FILE *) parser->fp_input),
+                    2 * 60 * MSEC_PER_SEC, true
+                );
 
-        buffer->len = 0;
-        buffer->buffer[0] = '\0';
-    }
-    buffer_free(buffer);
+                if(unlikely(ret != BUFFERED_READER_READ_OK))
+                    break;
 
-    cd->unsafe.enabled = parser->user.enabled;
-    count = parser->user.data_collections_count;
+                continue;
+            }
 
-    if (likely(count)) {
-        cd->successful_collections += count;
-        cd->serial_failures = 0;
-    }
-    else
-        cd->serial_failures++;
+            if(unlikely(parser_action(parser, buffer->buffer)))
+                break;
+
+            buffer->len = 0;
+            buffer->buffer[0] = '\0';
+        }
+        buffer_free(buffer);
+
+        cd->unsafe.enabled = parser->user.enabled;
+        count = parser->user.data_collections_count;
+
+        if(likely(count)) {
+            cd->successful_collections += count;
+            cd->serial_failures = 0;
+        }
+        else
+            cd->serial_failures++;
+    }
 
     // free parser with the pop function
     netdata_thread_cleanup_pop(1);
@@ -97,7 +97,6 @@ typedef struct parser {
     PARSER_REPERTOIRE repertoire;
     uint32_t flags;
     int fd;             // Socket
-    size_t line;
     FILE *fp_input;     // Input source e.g. stream
    FILE *fp_output;    // Stream to send commands to plugin
@@ -111,6 +110,8 @@ typedef struct parser {
     PARSER_USER_OBJECT user; // User defined structure to hold extra state between calls
 
     struct buffered_reader reader;
+    struct line_splitter line;
+    PARSER_KEYWORD *keyword;
 
     struct {
         const char *end_keyword;
@@ -162,13 +163,17 @@ static inline PARSER_KEYWORD *parser_find_keyword(PARSER *parser, const char *co
     return NULL;
 }
 
+bool parser_reconstruct_node(BUFFER *wb, void *ptr);
+bool parser_reconstruct_instance(BUFFER *wb, void *ptr);
+bool parser_reconstruct_context(BUFFER *wb, void *ptr);
+
 static inline int parser_action(PARSER *parser, char *input) {
 #ifdef NETDATA_LOG_STREAM_RECEIVE
     static __thread char line[PLUGINSD_LINE_MAX + 1];
     strncpyz(line, input, sizeof(line) - 1);
 #endif
 
-    parser->line++;
+    parser->line.count++;
 
     if(unlikely(parser->flags & PARSER_DEFER_UNTIL_KEYWORD)) {
         char command[100 + 1];
@@ -200,24 +205,25 @@ static inline int parser_action(PARSER *parser, char *input) {
         return 0;
     }
 
-    static __thread char *words[PLUGINSD_MAX_WORDS];
-    size_t num_words = quoted_strings_splitter_pluginsd(input, words, PLUGINSD_MAX_WORDS);
-    const char *command = get_word(words, num_words, 0);
+    parser->line.num_words = quoted_strings_splitter_pluginsd(input, parser->line.words, PLUGINSD_MAX_WORDS);
+    const char *command = get_word(parser->line.words, parser->line.num_words, 0);
 
-    if(unlikely(!command))
+    if(unlikely(!command)) {
+        line_splitter_reset(&parser->line);
         return 0;
+    }
 
     PARSER_RC rc;
-    PARSER_KEYWORD *t = parser_find_keyword(parser, command);
-    if(likely(t)) {
-        worker_is_busy(t->worker_job_id);
+    parser->keyword = parser_find_keyword(parser, command);
+    if(likely(parser->keyword)) {
+        worker_is_busy(parser->keyword->worker_job_id);
 
 #ifdef NETDATA_LOG_STREAM_RECEIVE
-        if(parser->user.stream_log_fp && t->repertoire & parser->user.stream_log_repertoire)
+        if(parser->user.stream_log_fp && parser->keyword->repertoire & parser->user.stream_log_repertoire)
             fprintf(parser->user.stream_log_fp, "%s", line);
 #endif
 
-        rc = parser_execute(parser, t, words, num_words);
+        rc = parser_execute(parser, parser->keyword, parser->line.words, parser->line.num_words);
         // rc = (*t->func)(words, num_words, parser);
         worker_is_idle();
     }
@@ -225,22 +231,13 @@ static inline int parser_action(PARSER *parser, char *input) {
         rc = PARSER_RC_ERROR;
 
     if(rc == PARSER_RC_ERROR) {
-        BUFFER *wb = buffer_create(PLUGINSD_LINE_MAX, NULL);
-        for(size_t i = 0; i < num_words ;i++) {
-            if(i) buffer_fast_strcat(wb, " ", 1);
-
-            buffer_fast_strcat(wb, "\"", 1);
-            const char *s = get_word(words, num_words, i);
-            buffer_strcat(wb, s?s:"");
-            buffer_fast_strcat(wb, "\"", 1);
-        }
-
+        CLEAN_BUFFER *wb = buffer_create(PLUGINSD_LINE_MAX, NULL);
+        line_splitter_reconstruct_line(wb, &parser->line);
         netdata_log_error("PLUGINSD: parser_action('%s') failed on line %zu: { %s } (quotes added to show parsing)",
-                          command, parser->line, buffer_tostring(wb));
-
-        buffer_free(wb);
+                          command, parser->line.count, buffer_tostring(wb));
     }
 
+    line_splitter_reset(&parser->line);
     return (rc == PARSER_RC_ERROR || rc == PARSER_RC_STOP);
 }
@@ -138,6 +138,12 @@ static bool is_lxcfs_proc_mounted() {
     return false;
 }
 
+static bool log_proc_module(BUFFER *wb, void *data) {
+    struct proc_module *pm = data;
+    buffer_sprintf(wb, "proc.plugin[%s]", pm->name);
+    return true;
+}
+
 void *proc_main(void *ptr)
 {
     worker_register("PROC");
@@ -153,46 +159,56 @@ void *proc_main(void *ptr)
 
     netdata_thread_cleanup_push(proc_main_cleanup, ptr);
 
-    config_get_boolean("plugin:proc", "/proc/pagetypeinfo", CONFIG_BOOLEAN_NO);
+    {
+        config_get_boolean("plugin:proc", "/proc/pagetypeinfo", CONFIG_BOOLEAN_NO);
 
-    // check the enabled status for each module
-    int i;
-    for (i = 0; proc_modules[i].name; i++) {
-        struct proc_module *pm = &proc_modules[i];
+        // check the enabled status for each module
+        int i;
+        for(i = 0; proc_modules[i].name; i++) {
+            struct proc_module *pm = &proc_modules[i];
 
-        pm->enabled = config_get_boolean("plugin:proc", pm->name, CONFIG_BOOLEAN_YES);
-        pm->rd = NULL;
+            pm->enabled = config_get_boolean("plugin:proc", pm->name, CONFIG_BOOLEAN_YES);
+            pm->rd = NULL;
 
-        worker_register_job_name(i, proc_modules[i].dim);
-    }
+            worker_register_job_name(i, proc_modules[i].dim);
+        }
 
-    usec_t step = localhost->rrd_update_every * USEC_PER_SEC;
-    heartbeat_t hb;
-    heartbeat_init(&hb);
+        usec_t step = localhost->rrd_update_every * USEC_PER_SEC;
+        heartbeat_t hb;
+        heartbeat_init(&hb);
 
-    inside_lxc_container = is_lxcfs_proc_mounted();
+        inside_lxc_container = is_lxcfs_proc_mounted();
 
-    while (service_running(SERVICE_COLLECTORS)) {
-        worker_is_idle();
-        usec_t hb_dt = heartbeat_next(&hb, step);
+#define LGS_MODULE_ID 0
 
-        if (unlikely(!service_running(SERVICE_COLLECTORS)))
-            break;
+        ND_LOG_STACK lgs[] = {
+            [LGS_MODULE_ID] = ND_LOG_FIELD_TXT(NDF_MODULE, "proc.plugin"),
+            ND_LOG_FIELD_END(),
+        };
+        ND_LOG_STACK_PUSH(lgs);
 
-        for (i = 0; proc_modules[i].name; i++) {
-            if (unlikely(!service_running(SERVICE_COLLECTORS)))
-                break;
+        while(service_running(SERVICE_COLLECTORS)) {
+            worker_is_idle();
+            usec_t hb_dt = heartbeat_next(&hb, step);
 
-            struct proc_module *pm = &proc_modules[i];
-            if (unlikely(!pm->enabled))
-                continue;
+            if(unlikely(!service_running(SERVICE_COLLECTORS)))
+                break;
 
-            netdata_log_debug(D_PROCNETDEV_LOOP, "PROC calling %s.", pm->name);
+            for(i = 0; proc_modules[i].name; i++) {
+                if(unlikely(!service_running(SERVICE_COLLECTORS)))
+                    break;
 
-            worker_is_busy(i);
-            pm->enabled = !pm->func(localhost->rrd_update_every, hb_dt);
-        }
-    }
+                struct proc_module *pm = &proc_modules[i];
+                if(unlikely(!pm->enabled))
+                    continue;
+
+                worker_is_busy(i);
+                lgs[LGS_MODULE_ID] = ND_LOG_FIELD_CB(NDF_MODULE, log_proc_module, pm);
+                pm->enabled = !pm->func(localhost->rrd_update_every, hb_dt);
+                lgs[LGS_MODULE_ID] = ND_LOG_FIELD_TXT(NDF_MODULE, "proc.plugin");
+            }
+        }
+    }
 
     netdata_thread_cleanup_pop(1);
     return NULL;
@@ -336,14 +336,11 @@ void usage(void) {
 }

 int main(int argc, char **argv) {
-    stderror = stderr;
     clocks_init();
+    nd_log_initialize_for_external_plugins("slabinfo.plugin");

     program_name = argv[0];
     program_version = "0.1";
-    error_log_syslog = 0;
-
-    log_set_global_severity_for_external_plugins();

     int update_every = 1, i, n, freq = 0;
@@ -2326,7 +2326,7 @@ static inline void statsd_flush_index_metrics(STATSD_INDEX *index, void (*flush_
         if(unlikely(is_metric_checked(m))) break;

         if(unlikely(!(m->options & STATSD_METRIC_OPTION_CHECKED_IN_APPS))) {
-            netdata_log_access("NEW STATSD METRIC '%s': '%s'", statsd_metric_type_string(m->type), m->name);
+            nd_log(NDLS_ACCESS, NDLP_DEBUG, "NEW STATSD METRIC '%s': '%s'", statsd_metric_type_string(m->type), m->name);
            check_if_metric_is_for_app(index, m);
            m->options |= STATSD_METRIC_OPTION_CHECKED_IN_APPS;
        }
@@ -408,35 +408,50 @@ void netdata_systemd_journal_transform_boot_id(FACETS *facets __maybe_unused, BU
     };

     sd_journal *j = NULL;
-    if(sd_journal_open_files(&j, files, ND_SD_JOURNAL_OPEN_FLAGS) < 0 || !j) {
-        internal_error(true, "JOURNAL: cannot open file '%s' to get boot_id", jf_dfe.name);
+    int r = sd_journal_open_files(&j, files, ND_SD_JOURNAL_OPEN_FLAGS);
+    if(r < 0 || !j) {
+        internal_error(true, "JOURNAL: while looking for the first timestamp of boot_id '%s', "
+                             "sd_journal_open_files('%s') returned %d",
+                             boot_id, jf_dfe.name, r);
         continue;
     }

     char m[100];
     size_t len = snprintfz(m, sizeof(m), "_BOOT_ID=%s", boot_id);

-    if(sd_journal_add_match(j, m, len) < 0) {
-        internal_error(true, "JOURNAL: cannot add match '%s' to file '%s'", m, jf_dfe.name);
+    r = sd_journal_add_match(j, m, len);
+    if(r < 0) {
+        internal_error(true, "JOURNAL: while looking for the first timestamp of boot_id '%s', "
+                             "sd_journal_add_match('%s') on file '%s' returned %d",
+                             boot_id, m, jf_dfe.name, r);
         sd_journal_close(j);
         continue;
     }

-    if(sd_journal_seek_head(j) < 0) {
-        internal_error(true, "JOURNAL: cannot seek head to file '%s'", jf_dfe.name);
+    r = sd_journal_seek_head(j);
+    if(r < 0) {
+        internal_error(true, "JOURNAL: while looking for the first timestamp of boot_id '%s', "
+                             "sd_journal_seek_head() on file '%s' returned %d",
+                             boot_id, jf_dfe.name, r);
         sd_journal_close(j);
         continue;
     }

-    if(sd_journal_next(j) < 0) {
-        internal_error(true, "JOURNAL: cannot get next of file '%s'", jf_dfe.name);
+    r = sd_journal_next(j);
+    if(r < 0) {
+        internal_error(true, "JOURNAL: while looking for the first timestamp of boot_id '%s', "
+                             "sd_journal_next() on file '%s' returned %d",
+                             boot_id, jf_dfe.name, r);
         sd_journal_close(j);
         continue;
     }

     usec_t t_ut = 0;
-    if(sd_journal_get_realtime_usec(j, &t_ut) < 0 || !t_ut) {
-        internal_error(true, "JOURNAL: cannot get realtime_usec of file '%s'", jf_dfe.name);
+    r = sd_journal_get_realtime_usec(j, &t_ut);
+    if(r < 0 || !t_ut) {
+        internal_error(r != -EADDRNOTAVAIL, "JOURNAL: while looking for the first timestamp of boot_id '%s', "
+                             "sd_journal_get_realtime_usec() on file '%s' returned %d",
+                             boot_id, jf_dfe.name, r);
         sd_journal_close(j);
         continue;
     }
@@ -454,25 +469,21 @@ void netdata_systemd_journal_transform_boot_id(FACETS *facets __maybe_unused, BU
         ut = *p_ut;

     if(ut != UINT64_MAX) {
-        time_t timestamp_sec = (time_t)(ut / USEC_PER_SEC);
-        struct tm tm;
-        char buffer[30];
-
-        gmtime_r(&timestamp_sec, &tm);
-        strftime(buffer, sizeof(buffer), "%Y-%m-%d %H:%M:%S", &tm);
+        char buffer[ISO8601_MAX_LENGTH];
+        iso8601_datetime_ut(buffer, sizeof(buffer), ut, ISO8601_UTC);

         switch(scope) {
             default:
             case FACETS_TRANSFORM_DATA:
             case FACETS_TRANSFORM_VALUE:
-                buffer_sprintf(wb, " (%s UTC) ", buffer);
+                buffer_sprintf(wb, " (%s) ", buffer);
                 break;

             case FACETS_TRANSFORM_FACET:
             case FACETS_TRANSFORM_FACET_SORT:
             case FACETS_TRANSFORM_HISTOGRAM:
                 buffer_flush(wb);
-                buffer_sprintf(wb, "%s UTC", buffer);
+                buffer_sprintf(wb, "%s", buffer);
                 break;
         }
     }
@@ -537,13 +548,9 @@ void netdata_systemd_journal_transform_timestamp_usec(FACETS *facets __maybe_unu
     if(*v && isdigit(*v)) {
         uint64_t ut = str2ull(buffer_tostring(wb), NULL);
         if(ut) {
-            time_t timestamp_sec = (time_t)(ut / USEC_PER_SEC);
-            struct tm tm;
-            char buffer[30];
-
-            gmtime_r(&timestamp_sec, &tm);
-            strftime(buffer, sizeof(buffer), "%Y-%m-%d %H:%M:%S", &tm);
-            buffer_sprintf(wb, " (%s.%06llu UTC)", buffer, ut % USEC_PER_SEC);
+            char buffer[ISO8601_MAX_LENGTH];
+            iso8601_datetime_ut(buffer, sizeof(buffer), ut, ISO8601_UTC | ISO8601_MICROSECONDS);
+            buffer_sprintf(wb, " (%s)", buffer);
         }
     }
 }
@@ -703,6 +710,23 @@ void netdata_systemd_journal_message_ids_init(void) {
+    // gnome-shell
+    // https://gitlab.gnome.org/GNOME/gnome-shell/-/blob/main/js/ui/main.js#L56
+    i.msg = "Gnome shell started"; dictionary_set(known_journal_messages_ids, "f3ea493c22934e26811cd62abe8e203a", &i, sizeof(i));
+
+    // flathub
+    // https://docs.flatpak.org/de/latest/flatpak-command-reference.html
+    i.msg = "Flatpak cache"; dictionary_set(known_journal_messages_ids, "c7b39b1e006b464599465e105b361485", &i, sizeof(i));
+
+    // ???
+    i.msg = "Flathub pulls"; dictionary_set(known_journal_messages_ids, "75ba3deb0af041a9a46272ff85d9e73e", &i, sizeof(i));
+    i.msg = "Flathub pull errors"; dictionary_set(known_journal_messages_ids, "f02bce89a54e4efab3a94a797d26204a", &i, sizeof(i));
+
+    // ??
+    i.msg = "Boltd starting"; dictionary_set(known_journal_messages_ids, "dd11929c788e48bdbb6276fb5f26b08a", &i, sizeof(i));
+
+    // Netdata
+    i.msg = "Netdata connection from child"; dictionary_set(known_journal_messages_ids, "ed4cdb8f1beb4ad3b57cb3cae2d162fa", &i, sizeof(i));
+    i.msg = "Netdata connection to parent"; dictionary_set(known_journal_messages_ids, "6e2e3839067648968b646045dbf28d66", &i, sizeof(i));
+    i.msg = "Netdata alert transition"; dictionary_set(known_journal_messages_ids, "9ce0cb58ab8b44df82c4bf1ad9ee22de", &i, sizeof(i));
+    i.msg = "Netdata alert notification"; dictionary_set(known_journal_messages_ids, "6db0018e83e34320ae2a659d78019fb7", &i, sizeof(i));
 }

 void netdata_systemd_journal_transform_message_id(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) {
@@ -333,8 +333,8 @@ static void files_registry_delete_cb(const DICTIONARY_ITEM *item, void *value, v
     struct journal_file *jf = value; (void)jf;
     const char *filename = dictionary_acquired_item_name(item); (void)filename;

-    string_freez(jf->source);
     internal_error(true, "removed journal file '%s'", filename);
+    string_freez(jf->source);
 }

 void journal_directory_scan(const char *dirname, int depth, usec_t last_scan_ut) {
@@ -165,6 +165,18 @@
     "|IMAGE_NAME" /* undocumented */ \
     /* "|CONTAINER_PARTIAL_MESSAGE" */ \
     \
+    \
+    /* --- NETDATA --- */ \
+    \
+    "|ND_NIDL_NODE" \
+    "|ND_NIDL_CONTEXT" \
+    "|ND_LOG_SOURCE" \
+    /*"|ND_MODULE" */ \
+    "|ND_ALERT_NAME" \
+    "|ND_ALERT_CLASS" \
+    "|ND_ALERT_COMPONENT" \
+    "|ND_ALERT_TYPE" \
+    \
     ""

 // ----------------------------------------------------------------------------
@@ -9,19 +9,8 @@ netdata_mutex_t stdout_mutex = NETDATA_MUTEX_INITIALIZER;
 static bool plugin_should_exit = false;

 int main(int argc __maybe_unused, char **argv __maybe_unused) {
-    stderror = stderr;
     clocks_init();
-
-    program_name = "systemd-journal.plugin";
-
-    // disable syslog
-    error_log_syslog = 0;
-
-    // set errors flood protection to 100 logs per hour
-    error_log_errors_per_period = 100;
-    error_log_throttle_period = 3600;
-
-    log_set_global_severity_for_external_plugins();
+    nd_log_initialize_for_external_plugins("systemd-journal.plugin");

     netdata_configured_host_prefix = getenv("NETDATA_HOST_PREFIX");
     if(verify_netdata_host_prefix() == -1) exit(1);
@@ -676,8 +676,6 @@ static void update_freezer_state(UnitInfo *u, UnitAttribute *ua) {
 // ----------------------------------------------------------------------------
 // common helpers

-#define _cleanup_(x) __attribute__((__cleanup__(x)))
-
 static void log_dbus_error(int r, const char *msg) {
     netdata_log_error("SYSTEMD_UNITS: %s failed with error %d (%s)", msg, r, strerror(-r));
 }
@@ -920,7 +920,6 @@ static void xenstat_send_domain_metrics() {
 }

 int main(int argc, char **argv) {
-    stderror = stderr;
     clocks_init();

     // ------------------------------------------------------------------------
@@ -928,14 +927,7 @@ int main(int argc, char **argv) {

     program_name = "xenstat.plugin";

-    // disable syslog
-    error_log_syslog = 0;
-
-    // set errors flood protection to 100 logs per hour
-    error_log_errors_per_period = 100;
-    error_log_throttle_period = 3600;
-
-    log_set_global_severity_for_external_plugins();
+    nd_log_initialize_for_external_plugins();

     // ------------------------------------------------------------------------
     // parse command line parameters
configure.ac
@@ -54,6 +54,8 @@ else
     AC_CHECK_TOOL([AR], [ar])
 fi

+CFLAGS="$CFLAGS -fexceptions"
+
 # -----------------------------------------------------------------------------
 # configurable options
@@ -571,6 +573,48 @@ AC_CHECK_LIB(
     [LZ4_LIBS="-llz4"]
 )

+# -----------------------------------------------------------------------------
+# libcurl
+
+PKG_CHECK_MODULES(
+    [LIBCURL],
+    [libcurl],
+    [AC_CHECK_LIB(
+        [curl],
+        [curl_easy_init],
+        [have_libcurl=yes],
+        [have_libcurl=no]
+    )],
+    [have_libcurl=no]
+)
+
+if test "x$have_libcurl" = "xyes"; then
+    AC_DEFINE([HAVE_CURL], [1], [libcurl usability])
+    OPTIONAL_CURL_LIBS="-lcurl"
+fi
+
+# -----------------------------------------------------------------------------
+# PCRE2
+
+PKG_CHECK_MODULES(
+    [LIBPCRE2],
+    [libpcre2-8],
+    [AC_CHECK_LIB(
+        [pcre2-8],
+        [pcre2_compile_8],
+        [have_libpcre2=yes],
+        [have_libpcre2=no]
+    )],
+    [have_libpcre2=no]
+)
+
+if test "x$have_libpcre2" = "xyes"; then
+    AC_DEFINE([HAVE_PCRE2], [1], [PCRE2 usability])
+    OPTIONAL_PCRE2_LIBS="-lpcre2-8"
+fi
+
+AM_CONDITIONAL([ENABLE_LOG2JOURNAL], [test "${have_libpcre2}" = "yes"])
+
 # -----------------------------------------------------------------------------
 # zstd
@@ -1590,18 +1634,6 @@ PKG_CHECK_MODULES(
     [have_libssl=no]
 )

-PKG_CHECK_MODULES(
-    [LIBCURL],
-    [libcurl],
-    [AC_CHECK_LIB(
-        [curl],
-        [curl_easy_init],
-        [have_libcurl=yes],
-        [have_libcurl=no]
-    )],
-    [have_libcurl=no]
-)
-
 PKG_CHECK_MODULES(
     [AWS_CPP_SDK_CORE],
     [aws-cpp-sdk-core],
@@ -1946,6 +1978,8 @@ AC_SUBST([OPTIONAL_UV_LIBS])
 AC_SUBST([OPTIONAL_LZ4_LIBS])
 AC_SUBST([OPTIONAL_BROTLIENC_LIBS])
 AC_SUBST([OPTIONAL_BROTLIDEC_LIBS])
+AC_SUBST([OPTIONAL_CURL_LIBS])
+AC_SUBST([OPTIONAL_PCRE2_LIBS])
 AC_SUBST([OPTIONAL_ZSTD_LIBS])
 AC_SUBST([OPTIONAL_SSL_LIBS])
 AC_SUBST([OPTIONAL_JSONC_LIBS])
@@ -2073,15 +2107,18 @@ AC_CONFIG_FILES([
     libnetdata/aral/Makefile
     libnetdata/avl/Makefile
     libnetdata/buffer/Makefile
+    libnetdata/buffered_reader/Makefile
     libnetdata/clocks/Makefile
     libnetdata/completion/Makefile
     libnetdata/config/Makefile
+    libnetdata/datetime/Makefile
     libnetdata/dictionary/Makefile
     libnetdata/ebpf/Makefile
     libnetdata/eval/Makefile
     libnetdata/facets/Makefile
     libnetdata/functions_evloop/Makefile
     libnetdata/july/Makefile
+    libnetdata/line_splitter/Makefile
     libnetdata/locks/Makefile
     libnetdata/log/Makefile
     libnetdata/onewayalloc/Makefile
@@ -2095,6 +2132,7 @@ AC_CONFIG_FILES([
     libnetdata/storage_number/tests/Makefile
     libnetdata/threads/Makefile
     libnetdata/url/Makefile
+    libnetdata/uuid/Makefile
     libnetdata/json/Makefile
     libnetdata/health/Makefile
     libnetdata/worker_utilization/Makefile
@@ -4,6 +4,7 @@ Build-Depends: debhelper (>= 9.20160709),
                dpkg-dev (>= 1.13.19),
                zlib1g-dev,
                uuid-dev,
+               libcurl4-openssl-dev,
                libelf-dev,
                libuv1-dev,
                liblz4-dev,
@@ -15,6 +16,7 @@ Build-Depends: debhelper (>= 9.20160709),
                libipmimonitoring-dev,
                libnetfilter-acct-dev,
                libsnappy-dev,
+               libpcre2-8-0,
                libprotobuf-dev,
                libprotoc-dev,
                libsystemd-dev,
@@ -142,10 +142,10 @@ static cmd_status_t cmd_reload_health_execute(char *args, char **message)
     (void)args;
     (void)message;

-    error_log_limit_unlimited();
+    nd_log_limits_unlimited();
     netdata_log_info("COMMAND: Reloading HEALTH configuration.");
     health_reload();
-    error_log_limit_reset();
+    nd_log_limits_reset();

     return CMD_STATUS_SUCCESS;
 }
@@ -155,11 +155,11 @@ static cmd_status_t cmd_save_database_execute(char *args, char **message)
     (void)args;
     (void)message;

-    error_log_limit_unlimited();
+    nd_log_limits_unlimited();
     netdata_log_info("COMMAND: Saving databases.");
     rrdhost_save_all();
     netdata_log_info("COMMAND: Databases saved.");
-    error_log_limit_reset();
+    nd_log_limits_reset();

     return CMD_STATUS_SUCCESS;
 }
@@ -169,10 +169,9 @@ static cmd_status_t cmd_reopen_logs_execute(char *args, char **message)
     (void)args;
     (void)message;

-    error_log_limit_unlimited();
-    netdata_log_info("COMMAND: Reopening all log files.");
-    reopen_all_log_files();
-    error_log_limit_reset();
+    nd_log_limits_unlimited();
+    nd_log_reopen_log_files();
+    nd_log_limits_reset();

     return CMD_STATUS_SUCCESS;
 }
@@ -182,7 +181,7 @@ static cmd_status_t cmd_exit_execute(char *args, char **message)
     (void)args;
     (void)message;

-    error_log_limit_unlimited();
+    nd_log_limits_unlimited();
     netdata_log_info("COMMAND: Cleaning up to exit.");
     netdata_cleanup_and_exit(0);
     exit(0);
@@ -31,22 +31,6 @@ void get_netdata_execution_path(void) {
     dirname(netdata_exe_path);
 }

-static void chown_open_file(int fd, uid_t uid, gid_t gid) {
-    if(fd == -1) return;
-
-    struct stat buf;
-
-    if(fstat(fd, &buf) == -1) {
-        netdata_log_error("Cannot fstat() fd %d", fd);
-        return;
-    }
-
-    if((buf.st_uid != uid || buf.st_gid != gid) && S_ISREG(buf.st_mode)) {
-        if(fchown(fd, uid, gid) == -1)
-            netdata_log_error("Cannot fchown() fd %d.", fd);
-    }
-}
-
 static void fix_directory_file_permissions(const char *dirname, uid_t uid, gid_t gid, bool recursive)
 {
     char filename[FILENAME_MAX + 1];
@@ -150,9 +134,9 @@ int become_user(const char *username, int pid_fd) {
         }
     }

+    nd_log_chown_log_files(uid, gid);
     chown_open_file(STDOUT_FILENO, uid, gid);
     chown_open_file(STDERR_FILENO, uid, gid);
-    chown_open_file(stdaccess_fd, uid, gid);
     chown_open_file(pid_fd, uid, gid);

     if(supplementary_groups && ngroups > 0) {
@@ -315,7 +315,7 @@ void netdata_cleanup_and_exit(int ret) {
     const char *prev_msg = NULL;
     bool timeout = false;

-    error_log_limit_unlimited();
+    nd_log_limits_unlimited();
     netdata_log_info("NETDATA SHUTDOWN: initializing shutdown with code %d...", ret);

     send_statistics("EXIT", ret?"ERROR":"OK","-");
@@ -449,8 +449,9 @@ void netdata_cleanup_and_exit(int ret) {
             running += rrdeng_collectors_running(multidb_ctx[tier]);

     if(running) {
-        error_limit_static_thread_var(erl, 1, 100 * USEC_PER_MS);
-        error_limit(&erl, "waiting for %zu collectors to finish", running);
+        nd_log_limit_static_thread_var(erl, 1, 100 * USEC_PER_MS);
+        nd_log_limit(&erl, NDLS_DAEMON, NDLP_NOTICE,
+                     "waiting for %zu collectors to finish", running);
         // sleep_usec(100 * USEC_PER_MS);
         cleanup_destroyed_dictionaries();
     }
@@ -618,8 +619,14 @@ int killpid(pid_t pid) {
     int ret;
     netdata_log_debug(D_EXIT, "Request to kill pid %d", pid);

+    int signal = SIGTERM;
+//#ifdef NETDATA_INTERNAL_CHECKS
+//    if(service_running(SERVICE_COLLECTORS))
+//        signal = SIGABRT;
+//#endif
+
     errno = 0;
-    ret = kill(pid, SIGTERM);
+    ret = kill(pid, signal);
     if (ret == -1) {
         switch(errno) {
             case ESRCH:
@@ -666,7 +673,7 @@ static void set_nofile_limit(struct rlimit *rl) {
 }

 void cancel_main_threads() {
-    error_log_limit_unlimited();
+    nd_log_limits_unlimited();

     int i, found = 0;
     usec_t max = 5 * USEC_PER_SEC, step = 100000;
@@ -756,7 +763,7 @@ int help(int exitcode) {
     " | '-' '-' '-' '-' real-time performance monitoring, done right! \n"
     " +----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+--->\n"
     "\n"
-    " Copyright (C) 2016-2022, Netdata, Inc. <info@netdata.cloud>\n"
+    " Copyright (C) 2016-2023, Netdata, Inc. <info@netdata.cloud>\n"
     " Released under GNU General Public License v3 or later.\n"
     " All rights reserved.\n"
     "\n"
@@ -845,44 +852,49 @@ static void security_init(){
 #endif

 static void log_init(void) {
+    nd_log_set_facility(config_get(CONFIG_SECTION_LOGS, "facility", "daemon"));
+
+    time_t period = ND_LOG_DEFAULT_THROTTLE_PERIOD;
+    size_t logs = ND_LOG_DEFAULT_THROTTLE_LOGS;
+    period = config_get_number(CONFIG_SECTION_LOGS, "logs flood protection period", period);
+    logs = (unsigned long)config_get_number(CONFIG_SECTION_LOGS, "logs to trigger flood protection", (long long int)logs);
+    nd_log_set_flood_protection(logs, period);
+
+    nd_log_set_priority_level(config_get(CONFIG_SECTION_LOGS, "level", NDLP_INFO_STR));
+
     char filename[FILENAME_MAX + 1];
     snprintfz(filename, FILENAME_MAX, "%s/debug.log", netdata_configured_log_dir);
-    stdout_filename = config_get(CONFIG_SECTION_LOGS, "debug", filename);
+    nd_log_set_user_settings(NDLS_DEBUG, config_get(CONFIG_SECTION_LOGS, "debug", filename));

-    snprintfz(filename, FILENAME_MAX, "%s/error.log", netdata_configured_log_dir);
-    stderr_filename = config_get(CONFIG_SECTION_LOGS, "error", filename);
+    bool with_journal = is_stderr_connected_to_journal() /* || nd_log_journal_socket_available() */;
+    if(with_journal)
+        snprintfz(filename, FILENAME_MAX, "journal");
+    else
+        snprintfz(filename, FILENAME_MAX, "%s/daemon.log", netdata_configured_log_dir);
+    nd_log_set_user_settings(NDLS_DAEMON, config_get(CONFIG_SECTION_LOGS, "daemon", filename));

-    snprintfz(filename, FILENAME_MAX, "%s/collector.log", netdata_configured_log_dir);
-    stdcollector_filename = config_get(CONFIG_SECTION_LOGS, "collector", filename);
+    if(with_journal)
+        snprintfz(filename, FILENAME_MAX, "journal");
+    else
+        snprintfz(filename, FILENAME_MAX, "%s/collector.log", netdata_configured_log_dir);
+    nd_log_set_user_settings(NDLS_COLLECTORS, config_get(CONFIG_SECTION_LOGS, "collector", filename));

     snprintfz(filename, FILENAME_MAX, "%s/access.log", netdata_configured_log_dir);
-    stdaccess_filename = config_get(CONFIG_SECTION_LOGS, "access", filename);
+    nd_log_set_user_settings(NDLS_ACCESS, config_get(CONFIG_SECTION_LOGS, "access", filename));

-    snprintfz(filename, FILENAME_MAX, "%s/health.log", netdata_configured_log_dir);
-    stdhealth_filename = config_get(CONFIG_SECTION_LOGS, "health", filename);
+    if(with_journal)
+        snprintfz(filename, FILENAME_MAX, "journal");
+    else
+        snprintfz(filename, FILENAME_MAX, "%s/health.log", netdata_configured_log_dir);
+    nd_log_set_user_settings(NDLS_HEALTH, config_get(CONFIG_SECTION_LOGS, "health", filename));

 #ifdef ENABLE_ACLK
     aclklog_enabled = config_get_boolean(CONFIG_SECTION_CLOUD, "conversation log", CONFIG_BOOLEAN_NO);
     if (aclklog_enabled) {
         snprintfz(filename, FILENAME_MAX, "%s/aclk.log", netdata_configured_log_dir);
-        aclklog_filename = config_get(CONFIG_SECTION_CLOUD, "conversation log file", filename);
+        nd_log_set_user_settings(NDLS_ACLK, config_get(CONFIG_SECTION_CLOUD, "conversation log file", filename));
     }
 #endif
-
-    char deffacility[8];
-    snprintfz(deffacility,7,"%s","daemon");
-    facility_log = config_get(CONFIG_SECTION_LOGS, "facility", deffacility);
-
-    error_log_throttle_period = config_get_number(CONFIG_SECTION_LOGS, "errors flood protection period", error_log_throttle_period);
-    error_log_errors_per_period = (unsigned long)config_get_number(CONFIG_SECTION_LOGS, "errors to trigger flood protection", (long long int)error_log_errors_per_period);
-    error_log_errors_per_period_backup = error_log_errors_per_period;
-
-    setenv("NETDATA_ERRORS_THROTTLE_PERIOD", config_get(CONFIG_SECTION_LOGS, "errors flood protection period" , ""), 1);
-    setenv("NETDATA_ERRORS_PER_PERIOD", config_get(CONFIG_SECTION_LOGS, "errors to trigger flood protection", ""), 1);
-
-    char *selected_level = config_get(CONFIG_SECTION_LOGS, "severity level", NETDATA_LOG_LEVEL_INFO_STR);
-    global_log_severity_level = log_severity_string_to_severity_level(selected_level);
-    setenv("NETDATA_LOG_SEVERITY_LEVEL", selected_level , 1);
 }

 char *initialize_lock_directory_path(char *prefix)
@@ -1054,6 +1066,17 @@ static void backwards_compatible_config() {
     config_move(CONFIG_SECTION_GLOBAL, "enable zero metrics",
                 CONFIG_SECTION_DB, "enable zero metrics");

+    config_move(CONFIG_SECTION_LOGS, "error",
+                CONFIG_SECTION_LOGS, "daemon");
+
+    config_move(CONFIG_SECTION_LOGS, "severity level",
+                CONFIG_SECTION_LOGS, "level");
+
+    config_move(CONFIG_SECTION_LOGS, "errors to trigger flood protection",
+                CONFIG_SECTION_LOGS, "logs to trigger flood protection");
+
+    config_move(CONFIG_SECTION_LOGS, "errors flood protection period",
+                CONFIG_SECTION_LOGS, "logs flood protection period");
 }

 static int get_hostname(char *buf, size_t buf_size) {
@@ -1354,6 +1377,7 @@ int pluginsd_parser_unittest(void);
 void replication_initialize(void);
 void bearer_tokens_init(void);
 int unittest_rrdpush_compressions(void);
+int uuid_unittest(void);

 int main(int argc, char **argv) {
     // initialize the system clocks
@@ -1363,8 +1387,6 @@ int main(int argc, char **argv) {
     usec_t started_ut = now_monotonic_usec();
     usec_t last_ut = started_ut;
     const char *prev_msg = NULL;
-    // Initialize stderror avoiding coredump when netdata_log_info() or netdata_log_error() is called
-    stderror = stderr;

     int i;
     int config_loaded = 0;
@@ -1516,6 +1538,8 @@ int main(int argc, char **argv) {
             return 1;
         if (ctx_unittest())
             return 1;
+        if (uuid_unittest())
+            return 1;
         fprintf(stderr, "\n\nALL TESTS PASSED\n\n");
         return 0;
     }
@@ -1542,6 +1566,10 @@ int main(int argc, char **argv) {
     unittest_running = true;
     return buffer_unittest();
 }
+else if(strcmp(optarg, "uuidtest") == 0) {
+    unittest_running = true;
+    return uuid_unittest();
+}
 #ifdef ENABLE_DBENGINE
 else if(strcmp(optarg, "mctest") == 0) {
     unittest_running = true;
@@ -1919,10 +1947,10 @@ int main(int argc, char **argv) {
     // get log filenames and settings

     log_init();
-    error_log_limit_unlimited();
+    nd_log_limits_unlimited();

     // initialize the log files
-    open_all_log_files();
+    nd_log_initialize();
     netdata_log_info("Netdata agent version \""VERSION"\" is starting");

     ieee754_doubles = is_system_ieee754_double();
@@ -2103,7 +2131,7 @@ int main(int argc, char **argv) {
     // ------------------------------------------------------------------------
     // enable log flood protection

-    error_log_limit_reset();
+    nd_log_limits_reset();

     // Load host labels
     delta_startup_time("collect host labels");
@@ -203,28 +203,28 @@ void signals_handle(void) {

             switch (signals_waiting[i].action) {
                 case NETDATA_SIGNAL_RELOAD_HEALTH:
-                    error_log_limit_unlimited();
+                    nd_log_limits_unlimited();
                     netdata_log_info("SIGNAL: Received %s. Reloading HEALTH configuration...", name);
-                    error_log_limit_reset();
+                    nd_log_limits_reset();
                     execute_command(CMD_RELOAD_HEALTH, NULL, NULL);
                     break;

                 case NETDATA_SIGNAL_SAVE_DATABASE:
-                    error_log_limit_unlimited();
+                    nd_log_limits_unlimited();
                     netdata_log_info("SIGNAL: Received %s. Saving databases...", name);
-                    error_log_limit_reset();
+                    nd_log_limits_reset();
                     execute_command(CMD_SAVE_DATABASE, NULL, NULL);
                     break;

                 case NETDATA_SIGNAL_REOPEN_LOGS:
-                    error_log_limit_unlimited();
+                    nd_log_limits_unlimited();
                     netdata_log_info("SIGNAL: Received %s. Reopening all log files...", name);
-                    error_log_limit_reset();
+                    nd_log_limits_reset();
                     execute_command(CMD_REOPEN_LOGS, NULL, NULL);
                     break;

                 case NETDATA_SIGNAL_EXIT_CLEANLY:
-                    error_log_limit_unlimited();
+                    nd_log_limits_unlimited();
                     netdata_log_info("SIGNAL: Received %s. Cleaning up to exit...", name);
                     commands_exit();
                     netdata_cleanup_and_exit(0);
@@ -2118,7 +2118,7 @@ int test_dbengine(void)
     RRDDIM *rd[CHARTS][DIMS];
     time_t time_start[REGIONS], time_end[REGIONS];

-    error_log_limit_unlimited();
+    nd_log_limits_unlimited();
     fprintf(stderr, "\nRunning DB-engine test\n");

     default_rrd_memory_mode = RRD_MEMORY_MODE_DBENGINE;
@@ -2347,7 +2347,7 @@ void generate_dbengine_dataset(unsigned history_seconds)
                                            (1024 * 1024);
     default_rrdeng_disk_quota_mb -= default_rrdeng_disk_quota_mb * EXPECTED_COMPRESSION_RATIO / 100;

-    error_log_limit_unlimited();
+    nd_log_limits_unlimited();
     fprintf(stderr, "Initializing localhost with hostname 'dbengine-dataset'");

     host = dbengine_rrdhost_find_or_create("dbengine-dataset");
@@ -2522,7 +2522,7 @@ void dbengine_stress_test(unsigned TEST_DURATION_SEC, unsigned DSET_CHARTS, unsi
     unsigned i, j;
     time_t time_start, test_duration;

-    error_log_limit_unlimited();
+    nd_log_limits_unlimited();

     if (!TEST_DURATION_SEC)
         TEST_DURATION_SEC = 10;
@ -224,26 +224,31 @@ void rrdcontext_hub_checkpoint_command(void *ptr) {
|
|||
struct ctxs_checkpoint *cmd = ptr;
|
||||
|
||||
if(!rrdhost_check_our_claim_id(cmd->claim_id)) {
|
||||
netdata_log_error("RRDCONTEXT: received checkpoint command for claim_id '%s', node id '%s', but this is not our claim id. Ours '%s', received '%s'. Ignoring command.",
|
||||
cmd->claim_id, cmd->node_id,
|
||||
localhost->aclk_state.claimed_id?localhost->aclk_state.claimed_id:"NOT SET",
|
||||
cmd->claim_id);
|
||||
nd_log(NDLS_DAEMON, NDLP_WARNING,
|
||||
"RRDCONTEXT: received checkpoint command for claim_id '%s', node id '%s', "
|
||||
"but this is not our claim id. Ours '%s', received '%s'. Ignoring command.",
|
||||
cmd->claim_id, cmd->node_id,
|
||||
localhost->aclk_state.claimed_id?localhost->aclk_state.claimed_id:"NOT SET",
|
||||
cmd->claim_id);
|
||||
|
||||
return;
|
||||
}
|
||||
|
||||
RRDHOST *host = rrdhost_find_by_node_id(cmd->node_id);
|
||||
if(!host) {
|
||||
netdata_log_error("RRDCONTEXT: received checkpoint command for claim id '%s', node id '%s', but there is no node with such node id here. Ignoring command.",
|
||||
cmd->claim_id,
|
||||
cmd->node_id);
|
||||
nd_log(NDLS_DAEMON, NDLP_WARNING,
|
||||
"RRDCONTEXT: received checkpoint command for claim id '%s', node id '%s', "
|
||||
"but there is no node with such node id here. Ignoring command.",
|
||||
cmd->claim_id, cmd->node_id);
|
||||
|
||||
return;
|
||||
}
|
||||
|
||||
if(rrdhost_flag_check(host, RRDHOST_FLAG_ACLK_STREAM_CONTEXTS)) {
|
||||
netdata_log_info("RRDCONTEXT: received checkpoint command for claim id '%s', node id '%s', while node '%s' has an active context streaming.",
|
||||
cmd->claim_id, cmd->node_id, rrdhost_hostname(host));
|
||||
nd_log(NDLS_DAEMON, NDLP_NOTICE,
|
||||
"RRDCONTEXT: received checkpoint command for claim id '%s', node id '%s', "
|
||||
"while node '%s' has an active context streaming.",
|
||||
cmd->claim_id, cmd->node_id, rrdhost_hostname(host));
|
||||
|
||||
// disable it temporarily, so that our worker will not attempt to send messages in parallel
|
||||
rrdhost_flag_clear(host, RRDHOST_FLAG_ACLK_STREAM_CONTEXTS);
|
||||
|
@ -252,8 +257,10 @@ void rrdcontext_hub_checkpoint_command(void *ptr) {
|
|||
uint64_t our_version_hash = rrdcontext_version_hash(host);
|
||||
|
||||
if(cmd->version_hash != our_version_hash) {
|
||||
netdata_log_error("RRDCONTEXT: received version hash %"PRIu64" for host '%s', does not match our version hash %"PRIu64". Sending snapshot of all contexts.",
|
||||
cmd->version_hash, rrdhost_hostname(host), our_version_hash);
|
||||
nd_log(NDLS_DAEMON, NDLP_NOTICE,
|
||||
"RRDCONTEXT: received version hash %"PRIu64" for host '%s', does not match our version hash %"PRIu64". "
|
||||
"Sending snapshot of all contexts.",
|
||||
cmd->version_hash, rrdhost_hostname(host), our_version_hash);
|
||||
|
||||
#ifdef ENABLE_ACLK
|
||||
// prepare the snapshot
|
||||
|
@@ -275,41 +282,55 @@ void rrdcontext_hub_checkpoint_command(void *ptr) {
 #endif
     }

-    internal_error(true, "RRDCONTEXT: host '%s' enabling streaming of contexts", rrdhost_hostname(host));
+    nd_log(NDLS_DAEMON, NDLP_DEBUG,
+           "RRDCONTEXT: host '%s' enabling streaming of contexts",
+           rrdhost_hostname(host));

     rrdhost_flag_set(host, RRDHOST_FLAG_ACLK_STREAM_CONTEXTS);
     char node_str[UUID_STR_LEN];
     uuid_unparse_lower(*host->node_id, node_str);
-    netdata_log_access("ACLK REQ [%s (%s)]: STREAM CONTEXTS ENABLED", node_str, rrdhost_hostname(host));
+    nd_log(NDLS_ACCESS, NDLP_DEBUG,
+           "ACLK REQ [%s (%s)]: STREAM CONTEXTS ENABLED",
+           node_str, rrdhost_hostname(host));
 }

 void rrdcontext_hub_stop_streaming_command(void *ptr) {
     struct stop_streaming_ctxs *cmd = ptr;

     if(!rrdhost_check_our_claim_id(cmd->claim_id)) {
-        netdata_log_error("RRDCONTEXT: received stop streaming command for claim_id '%s', node id '%s', but this is not our claim id. Ours '%s', received '%s'. Ignoring command.",
-                          cmd->claim_id, cmd->node_id,
-                          localhost->aclk_state.claimed_id?localhost->aclk_state.claimed_id:"NOT SET",
-                          cmd->claim_id);
+        nd_log(NDLS_DAEMON, NDLP_WARNING,
+               "RRDCONTEXT: received stop streaming command for claim_id '%s', node id '%s', "
+               "but this is not our claim id. Ours '%s', received '%s'. Ignoring command.",
+               cmd->claim_id, cmd->node_id,
+               localhost->aclk_state.claimed_id?localhost->aclk_state.claimed_id:"NOT SET",
+               cmd->claim_id);

         return;
     }

     RRDHOST *host = rrdhost_find_by_node_id(cmd->node_id);
     if(!host) {
-        netdata_log_error("RRDCONTEXT: received stop streaming command for claim id '%s', node id '%s', but there is no node with such node id here. Ignoring command.",
-                          cmd->claim_id, cmd->node_id);
+        nd_log(NDLS_DAEMON, NDLP_WARNING,
+               "RRDCONTEXT: received stop streaming command for claim id '%s', node id '%s', "
+               "but there is no node with such node id here. Ignoring command.",
+               cmd->claim_id, cmd->node_id);

         return;
     }

     if(!rrdhost_flag_check(host, RRDHOST_FLAG_ACLK_STREAM_CONTEXTS)) {
-        netdata_log_error("RRDCONTEXT: received stop streaming command for claim id '%s', node id '%s', but node '%s' does not have active context streaming. Ignoring command.",
-                          cmd->claim_id, cmd->node_id, rrdhost_hostname(host));
+        nd_log(NDLS_DAEMON, NDLP_NOTICE,
+               "RRDCONTEXT: received stop streaming command for claim id '%s', node id '%s', "
+               "but node '%s' does not have active context streaming. Ignoring command.",
+               cmd->claim_id, cmd->node_id, rrdhost_hostname(host));

         return;
     }

-    internal_error(true, "RRDCONTEXT: host '%s' disabling streaming of contexts", rrdhost_hostname(host));
+    nd_log(NDLS_DAEMON, NDLP_DEBUG,
+           "RRDCONTEXT: host '%s' disabling streaming of contexts",
+           rrdhost_hostname(host));

     rrdhost_flag_clear(host, RRDHOST_FLAG_ACLK_STREAM_CONTEXTS);
 }

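The hunks above replace ad-hoc `netdata_log_*()` calls with the new `nd_log(source, priority, fmt, ...)` layer, which routes every message through a log source (daemon, access, ...) and a syslog-style priority. The sketch below is a hypothetical, self-contained illustration of that calling convention, not the real netdata implementation; the enum names and the priority filter are assumptions made for the example.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of the PR's NDLS_* sources and NDLP_* priorities. */
typedef enum { LOG_SRC_DAEMON, LOG_SRC_ACCESS } log_source;
typedef enum { LOG_PRIO_ERR, LOG_PRIO_WARNING, LOG_PRIO_NOTICE,
               LOG_PRIO_INFO, LOG_PRIO_DEBUG } log_priority;

/* Messages more verbose than this priority are dropped. */
static log_priority minimum_priority = LOG_PRIO_INFO;

/* Format a message into buf if it passes the priority filter.
 * Returns the number of characters written, or 0 if the message was filtered. */
static int sketch_log(char *buf, size_t len, log_source src, log_priority prio,
                      const char *fmt, ...) {
    (void)src;            /* a real router would pick a sink per source */
    if (prio > minimum_priority)
        return 0;         /* too verbose for the current setting */

    va_list args;
    va_start(args, fmt);
    int n = vsnprintf(buf, len, fmt, args);
    va_end(args);
    return n > 0 ? n : 0;
}
```

Call sites then read like the diff: `sketch_log(buf, sizeof(buf), LOG_SRC_DAEMON, LOG_PRIO_WARNING, "host '%s' ...", hostname)`.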
@@ -1171,9 +1171,10 @@ static bool evict_pages_with_filter(PGC *cache, size_t max_skip, size_t max_evic
     if(all_of_them && !filter) {
         pgc_ll_lock(cache, &cache->clean);
         if(cache->clean.stats->entries) {
-            error_limit_static_global_var(erl, 1, 0);
-            error_limit(&erl, "DBENGINE CACHE: cannot free all clean pages, %zu are still in the clean queue",
-                        cache->clean.stats->entries);
+            nd_log_limit_static_global_var(erl, 1, 0);
+            nd_log_limit(&erl, NDLS_DAEMON, NDLP_NOTICE,
+                         "DBENGINE CACHE: cannot free all clean pages, %zu are still in the clean queue",
+                         cache->clean.stats->entries);
         }
         pgc_ll_unlock(cache, &cache->clean);
     }

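`error_limit()` becomes `nd_log_limit()` here: a static per-call-site state throttles a repetitive message so it is emitted at most once per interval. A minimal self-contained sketch of that rate-limiting pattern (the struct and function names are hypothetical, not netdata's):

```c
#include <time.h>

/* Per-call-site rate limiter: lets a message through at most once per
 * `every` seconds and counts what was suppressed in between. */
typedef struct {
    time_t last_logged;   /* when the message last went through */
    time_t every;         /* minimum seconds between emissions */
    unsigned suppressed;  /* messages dropped since the last emission */
} log_limit;

/* Returns 1 if the caller should emit the message now, 0 if suppressed. */
static int log_limit_should_emit(log_limit *l, time_t now) {
    if (l->last_logged && now - l->last_logged < l->every) {
        l->suppressed++;
        return 0;
    }
    l->last_logged = now;
    l->suppressed = 0;
    return 1;
}
```

The `nd_log_limit_static_global_var(erl, 1, 0)` pattern in the diff plays the role of declaring one such `static` state per call site.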
@@ -479,14 +479,18 @@ int create_new_datafile_pair(struct rrdengine_instance *ctx, bool having_lock)
     int ret;
     char path[RRDENG_PATH_MAX];

-    netdata_log_info("DBENGINE: creating new data and journal files in path %s", ctx->config.dbfiles_path);
+    nd_log(NDLS_DAEMON, NDLP_DEBUG,
+           "DBENGINE: creating new data and journal files in path %s",
+           ctx->config.dbfiles_path);

     datafile = datafile_alloc_and_init(ctx, 1, fileno);
     ret = create_data_file(datafile);
     if(ret)
         goto error_after_datafile;

     generate_datafilepath(datafile, path, sizeof(path));
-    netdata_log_info("DBENGINE: created data file \"%s\".", path);
+    nd_log(NDLS_DAEMON, NDLP_INFO,
+           "DBENGINE: created data file \"%s\".", path);

     journalfile = journalfile_alloc_and_init(datafile);
     ret = journalfile_create(journalfile, datafile);

@@ -494,7 +498,8 @@ int create_new_datafile_pair(struct rrdengine_instance *ctx, bool having_lock)
         goto error_after_journalfile;

     journalfile_v1_generate_path(datafile, path, sizeof(path));
-    netdata_log_info("DBENGINE: created journal file \"%s\".", path);
+    nd_log(NDLS_DAEMON, NDLP_INFO,
+           "DBENGINE: created journal file \"%s\".", path);

     ctx_current_disk_space_increase(ctx, datafile->pos + journalfile->unsafe.pos);
     datafile_list_insert(ctx, datafile, having_lock);

@@ -592,27 +592,30 @@ inline void mrg_update_metric_retention_and_granularity_by_uuid(
         time_t update_every_s, time_t now_s)
 {
     if(unlikely(last_time_s > now_s)) {
-        error_limit_static_global_var(erl, 1, 0);
-        error_limit(&erl, "DBENGINE JV2: wrong last time on-disk (%ld - %ld, now %ld), "
-                    "fixing last time to now",
-                    first_time_s, last_time_s, now_s);
+        nd_log_limit_static_global_var(erl, 1, 0);
+        nd_log_limit(&erl, NDLS_DAEMON, NDLP_WARNING,
+                     "DBENGINE JV2: wrong last time on-disk (%ld - %ld, now %ld), "
+                     "fixing last time to now",
+                     first_time_s, last_time_s, now_s);
         last_time_s = now_s;
     }

     if (unlikely(first_time_s > last_time_s)) {
-        error_limit_static_global_var(erl, 1, 0);
-        error_limit(&erl, "DBENGINE JV2: wrong first time on-disk (%ld - %ld, now %ld), "
-                    "fixing first time to last time",
-                    first_time_s, last_time_s, now_s);
+        nd_log_limit_static_global_var(erl, 1, 0);
+        nd_log_limit(&erl, NDLS_DAEMON, NDLP_WARNING,
+                     "DBENGINE JV2: wrong first time on-disk (%ld - %ld, now %ld), "
+                     "fixing first time to last time",
+                     first_time_s, last_time_s, now_s);

         first_time_s = last_time_s;
     }

     if (unlikely(first_time_s == 0 || last_time_s == 0)) {
-        error_limit_static_global_var(erl, 1, 0);
-        error_limit(&erl, "DBENGINE JV2: zero on-disk timestamps (%ld - %ld, now %ld), "
-                    "using them as-is",
-                    first_time_s, last_time_s, now_s);
+        nd_log_limit_static_global_var(erl, 1, 0);
+        nd_log_limit(&erl, NDLS_DAEMON, NDLP_WARNING,
+                     "DBENGINE JV2: zero on-disk timestamps (%ld - %ld, now %ld), "
+                     "using them as-is",
+                     first_time_s, last_time_s, now_s);
     }

     bool added = false;

@@ -772,7 +772,7 @@ VALIDATED_PAGE_DESCRIPTOR validate_page(

     if(unlikely(!vd.is_valid || updated)) {
 #ifndef NETDATA_INTERNAL_CHECKS
-        error_limit_static_global_var(erl, 1, 0);
+        nd_log_limit_static_global_var(erl, 1, 0);
 #endif
         char uuid_str[UUID_STR_LEN + 1];
         uuid_unparse(*uuid, uuid_str);

@@ -788,7 +788,7 @@ VALIDATED_PAGE_DESCRIPTOR validate_page(
 #ifdef NETDATA_INTERNAL_CHECKS
         internal_error(true,
 #else
-        error_limit(&erl,
+        nd_log_limit(&erl, NDLS_DAEMON, NDLP_ERR,
 #endif
                      "DBENGINE: metric '%s' %s invalid page of type %u "
                      "from %ld to %ld (now %ld), update every %ld, page length %zu, entries %zu (flags: %s)",

@@ -808,7 +808,7 @@ VALIDATED_PAGE_DESCRIPTOR validate_page(
 #ifdef NETDATA_INTERNAL_CHECKS
         internal_error(true,
 #else
-        error_limit(&erl,
+        nd_log_limit(&erl, NDLS_DAEMON, NDLP_ERR,
 #endif
                      "DBENGINE: metric '%s' %s page of type %u "
                      "from %ld to %ld (now %ld), update every %ld, page length %zu, entries %zu (flags: %s), "

@@ -915,8 +915,8 @@ static void epdl_extent_loading_error_log(struct rrdengine_instance *ctx, EPDL *
     if(end_time_s)
         log_date(end_time_str, LOG_DATE_LENGTH, end_time_s);

-    error_limit_static_global_var(erl, 1, 0);
-    error_limit(&erl,
+    nd_log_limit_static_global_var(erl, 1, 0);
+    nd_log_limit(&erl, NDLS_DAEMON, NDLP_ERR,
                 "DBENGINE: error while reading extent from datafile %u of tier %d, at offset %" PRIu64 " (%u bytes) "
                 "%s from %ld (%s) to %ld (%s) %s%s: "
                 "%s",

@@ -1478,12 +1478,19 @@ static void *journal_v2_indexing_tp_worker(struct rrdengine_instance *ctx __mayb
         spinlock_unlock(&datafile->writers.spinlock);

         if(!available) {
-            netdata_log_info("DBENGINE: journal file %u needs to be indexed, but it has writers working on it - skipping it for now", datafile->fileno);
+            nd_log(NDLS_DAEMON, NDLP_NOTICE,
+                   "DBENGINE: journal file %u needs to be indexed, but it has writers working on it - "
+                   "skipping it for now",
+                   datafile->fileno);

             datafile = datafile->next;
             continue;
         }

-        netdata_log_info("DBENGINE: journal file %u is ready to be indexed", datafile->fileno);
+        nd_log(NDLS_DAEMON, NDLP_DEBUG,
+               "DBENGINE: journal file %u is ready to be indexed",
+               datafile->fileno);

         pgc_open_cache_to_journal_v2(open_cache, (Word_t) ctx, (int) datafile->fileno, ctx->config.page_type,
                                      journalfile_migrate_to_v2_callback, (void *) datafile->journalfile);

@@ -1496,7 +1503,10 @@ static void *journal_v2_indexing_tp_worker(struct rrdengine_instance *ctx __mayb
     }

     errno = 0;
-    internal_error(count, "DBENGINE: journal indexing done; %u files processed", count);
+    if(count)
+        nd_log(NDLS_DAEMON, NDLP_DEBUG,
+               "DBENGINE: journal indexing done; %u files processed",
+               count);

     worker_is_idle();

@@ -361,12 +361,12 @@ static void rrdeng_store_metric_create_new_page(struct rrdeng_collect_handle *ha
 #ifdef NETDATA_INTERNAL_CHECKS
         internal_error(true,
 #else
-        error_limit_static_global_var(erl, 1, 0);
-        error_limit(&erl,
-#endif
-                    "DBENGINE: metric '%s' new page from %ld to %ld, update every %ld, has a conflict in main cache "
-                    "with existing %s%s page from %ld to %ld, update every %ld - "
-                    "is it collected more than once?",
+        nd_log_limit_static_global_var(erl, 1, 0);
+        nd_log_limit(&erl, NDLS_DAEMON, NDLP_WARNING,
+#endif
+               "DBENGINE: metric '%s' new page from %ld to %ld, update every %ld, has a conflict in main cache "
+               "with existing %s%s page from %ld to %ld, update every %ld - "
+               "is it collected more than once?",
                uuid,
                page_entry.start_time_s, page_entry.end_time_s, (time_t)page_entry.update_every_s,
                pgc_is_page_hot(pgc_page) ? "hot" : "not-hot",

@@ -521,8 +521,8 @@ static void store_metric_next_error_log(struct rrdeng_collect_handle *handle __m
         collect_page_flags_to_buffer(wb, handle->page_flags);
     }

-    error_limit_static_global_var(erl, 1, 0);
-    error_limit(&erl,
+    nd_log_limit_static_global_var(erl, 1, 0);
+    nd_log_limit(&erl, NDLS_DAEMON, NDLP_NOTICE,
                 "DBENGINE: metric '%s' collected point at %ld, %s last collection at %ld, "
                 "update every %ld, %s page from %ld to %ld, position %u (of %u), flags: %s",
                 uuid,

@@ -535,7 +535,7 @@ static void store_metric_next_error_log(struct rrdeng_collect_handle *handle __m
                 (time_t)(handle->page_end_time_ut / USEC_PER_SEC),
                 handle->page_position, handle->page_entries_max,
                 wb ? buffer_tostring(wb) : ""
-                );
+    );

     buffer_free(wb);
 #else

@@ -1023,7 +1023,7 @@ typedef enum __attribute__ ((__packed__)) rrdhost_flags {

 #ifdef NETDATA_INTERNAL_CHECKS
 #define rrdset_debug(st, fmt, args...) do { if(unlikely(debug_flags & D_RRD_STATS && rrdset_flag_check(st, RRDSET_FLAG_DEBUG))) \
-            debug_int(__FILE__, __FUNCTION__, __LINE__, "%s: " fmt, rrdset_name(st), ##args); } while(0)
+            netdata_logger(NDLS_DEBUG, NDLP_DEBUG, __FILE__, __FUNCTION__, __LINE__, "%s: " fmt, rrdset_name(st), ##args); } while(0)
 #else
 #define rrdset_debug(st, fmt, args...) debug_dummy()
 #endif

@@ -809,10 +809,10 @@ void rrdcalc_delete_alerts_not_matching_host_labels_from_this_host(RRDHOST *host
             continue;

         if(!rrdlabels_match_simple_pattern_parsed(host->rrdlabels, rc->host_labels_pattern, '=', NULL)) {
-            netdata_log_health("Health configuration for alarm '%s' cannot be applied, because the host %s does not have the label(s) '%s'",
-                               rrdcalc_name(rc),
-                               rrdhost_hostname(host),
-                               rrdcalc_host_labels(rc));
+            nd_log(NDLS_DAEMON, NDLP_DEBUG,
+                   "Health configuration for alarm '%s' cannot be applied, "
+                   "because the host %s does not have the label(s) '%s'",
+                   rrdcalc_name(rc), rrdhost_hostname(host), rrdcalc_host_labels(rc));

             rrdcalc_unlink_and_delete(host, rc, false);
         }

@@ -1007,11 +1007,11 @@ int rrd_function_run(RRDHOST *host, BUFFER *result_wb, int timeout, const char *
         // the function can only be executed in async mode
         // put the function into the inflight requests

-        char uuid_str[UUID_STR_LEN];
+        char uuid_str[UUID_COMPACT_STR_LEN];
         if(!transaction) {
             uuid_t uuid;
             uuid_generate_random(uuid);
-            uuid_unparse_lower(uuid, uuid_str);
+            uuid_unparse_lower_compact(uuid, uuid_str);
             transaction = uuid_str;
         }

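The hunk above swaps the 36-character hyphenated UUID (`uuid_unparse_lower`) for netdata's 32-character compact form (`uuid_unparse_lower_compact`), matching the PR note that the journal now "gets compact uuids". A hypothetical string-level sketch of the same transformation (names are mine, not netdata's helper):

```c
#include <string.h>

#define UUID_COMPACT_LEN 33   /* 32 hex digits + NUL, mirroring UUID_COMPACT_STR_LEN */

/* Copy a canonical hyphenated UUID string, dropping the '-' separators. */
static void uuid_str_to_compact(const char *hyphenated, char out[UUID_COMPACT_LEN]) {
    size_t j = 0;
    for (size_t i = 0; hyphenated[i] && j < UUID_COMPACT_LEN - 1; i++)
        if (hyphenated[i] != '-')
            out[j++] = hyphenated[i];
    out[j] = '\0';
}
```

The PR's `uuid_parse_flexi()` goes the other way, accepting both forms on input, so compact transaction ids stay round-trippable.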
@@ -80,8 +80,6 @@ static inline void rrdhost_init() {
 }

 RRDHOST_ACQUIRED *rrdhost_find_and_acquire(const char *machine_guid) {
-    netdata_log_debug(D_RRD_CALLS, "rrdhost_find_and_acquire() host %s", machine_guid);
-
     return (RRDHOST_ACQUIRED *)dictionary_get_and_acquire_item(rrdhost_root_index, machine_guid);
 }

@@ -116,8 +114,9 @@ static inline RRDHOST *rrdhost_index_add_by_guid(RRDHOST *host) {
         rrdhost_option_set(host, RRDHOST_OPTION_INDEXED_MACHINE_GUID);
     else {
         rrdhost_option_clear(host, RRDHOST_OPTION_INDEXED_MACHINE_GUID);
-        netdata_log_error("RRDHOST: %s() host with machine guid '%s' is already indexed",
-                          __FUNCTION__, host->machine_guid);
+        nd_log(NDLS_DAEMON, NDLP_NOTICE,
+               "RRDHOST: host with machine guid '%s' is already indexed. Not adding it again.",
+               host->machine_guid);
     }

     return host;

@@ -126,8 +125,9 @@ static inline RRDHOST *rrdhost_index_add_by_guid(RRDHOST *host) {
 static void rrdhost_index_del_by_guid(RRDHOST *host) {
     if(rrdhost_option_check(host, RRDHOST_OPTION_INDEXED_MACHINE_GUID)) {
         if(!dictionary_del(rrdhost_root_index, host->machine_guid))
-            netdata_log_error("RRDHOST: %s() failed to delete machine guid '%s' from index",
-                              __FUNCTION__, host->machine_guid);
+            nd_log(NDLS_DAEMON, NDLP_NOTICE,
+                   "RRDHOST: failed to delete machine guid '%s' from index",
+                   host->machine_guid);

         rrdhost_option_clear(host, RRDHOST_OPTION_INDEXED_MACHINE_GUID);
     }

@@ -148,8 +148,9 @@ static inline void rrdhost_index_del_hostname(RRDHOST *host) {

     if(rrdhost_option_check(host, RRDHOST_OPTION_INDEXED_HOSTNAME)) {
         if(!dictionary_del(rrdhost_root_index_hostname, rrdhost_hostname(host)))
-            netdata_log_error("RRDHOST: %s() failed to delete hostname '%s' from index",
-                              __FUNCTION__, rrdhost_hostname(host));
+            nd_log(NDLS_DAEMON, NDLP_NOTICE,
+                   "RRDHOST: failed to delete hostname '%s' from index",
+                   rrdhost_hostname(host));

         rrdhost_option_clear(host, RRDHOST_OPTION_INDEXED_HOSTNAME);
     }

@@ -303,11 +304,11 @@ static RRDHOST *rrdhost_create(
     int is_localhost,
     bool archived
 ) {
-    netdata_log_debug(D_RRDHOST, "Host '%s': adding with guid '%s'", hostname, guid);

     if(memory_mode == RRD_MEMORY_MODE_DBENGINE && !dbengine_enabled) {
-        netdata_log_error("memory mode 'dbengine' is not enabled, but host '%s' is configured for it. Falling back to 'alloc'",
-                          hostname);
+        nd_log(NDLS_DAEMON, NDLP_ERR,
+               "memory mode 'dbengine' is not enabled, but host '%s' is configured for it. Falling back to 'alloc'",
+               hostname);

         memory_mode = RRD_MEMORY_MODE_ALLOC;
     }

@@ -392,7 +393,9 @@ int is_legacy = 1;
             (host->rrd_memory_mode == RRD_MEMORY_MODE_DBENGINE && is_legacy))) {
             int r = mkdir(host->cache_dir, 0775);
             if(r != 0 && errno != EEXIST)
-                netdata_log_error("Host '%s': cannot create directory '%s'", rrdhost_hostname(host), host->cache_dir);
+                nd_log(NDLS_DAEMON, NDLP_CRIT,
+                       "Host '%s': cannot create directory '%s'",
+                       rrdhost_hostname(host), host->cache_dir);
         }
     }

@@ -418,7 +421,9 @@ int is_legacy = 1;
             ret = mkdir(dbenginepath, 0775);

             if (ret != 0 && errno != EEXIST)
-                netdata_log_error("Host '%s': cannot create directory '%s'", rrdhost_hostname(host), dbenginepath);
+                nd_log(NDLS_DAEMON, NDLP_CRIT,
+                       "Host '%s': cannot create directory '%s'",
+                       rrdhost_hostname(host), dbenginepath);
             else
                 ret = 0; // succeed

@@ -459,8 +464,9 @@ int is_legacy = 1;
     }

     if (ret) { // check legacy or multihost initialization success
-        netdata_log_error("Host '%s': cannot initialize host with machine guid '%s'. Failed to initialize DB engine at '%s'.",
-                          rrdhost_hostname(host), host->machine_guid, host->cache_dir);
+        nd_log(NDLS_DAEMON, NDLP_CRIT,
+               "Host '%s': cannot initialize host with machine guid '%s'. Failed to initialize DB engine at '%s'.",
+               rrdhost_hostname(host), host->machine_guid, host->cache_dir);

         rrd_wrlock();
         rrdhost_free___while_having_rrd_wrlock(host, true);

@@ -508,10 +514,13 @@ int is_legacy = 1;

     RRDHOST *t = rrdhost_index_add_by_guid(host);
     if(t != host) {
-        netdata_log_error("Host '%s': cannot add host with machine guid '%s' to index. It already exists as host '%s' with machine guid '%s'.",
-                          rrdhost_hostname(host), host->machine_guid, rrdhost_hostname(t), t->machine_guid);
+        nd_log(NDLS_DAEMON, NDLP_NOTICE,
+               "Host '%s': cannot add host with machine guid '%s' to index. It already exists as host '%s' with machine guid '%s'.",
+               rrdhost_hostname(host), host->machine_guid, rrdhost_hostname(t), t->machine_guid);

         if (!is_localhost)
             rrdhost_free___while_having_rrd_wrlock(host, true);

         rrd_unlock();
         return NULL;
     }

@@ -527,21 +536,22 @@ int is_legacy = 1;

     // ------------------------------------------------------------------------

-    netdata_log_info("Host '%s' (at registry as '%s') with guid '%s' initialized"
-                     ", os '%s'"
-                     ", timezone '%s'"
-                     ", tags '%s'"
-                     ", program_name '%s'"
-                     ", program_version '%s'"
-                     ", update every %d"
-                     ", memory mode %s"
-                     ", history entries %d"
-                     ", streaming %s"
-                     " (to '%s' with api key '%s')"
-                     ", health %s"
-                     ", cache_dir '%s'"
-                     ", alarms default handler '%s'"
-                     ", alarms default recipient '%s'"
+    nd_log(NDLS_DAEMON, NDLP_INFO,
+           "Host '%s' (at registry as '%s') with guid '%s' initialized"
+           ", os '%s'"
+           ", timezone '%s'"
+           ", tags '%s'"
+           ", program_name '%s'"
+           ", program_version '%s'"
+           ", update every %d"
+           ", memory mode %s"
+           ", history entries %d"
+           ", streaming %s"
+           " (to '%s' with api key '%s')"
+           ", health %s"
+           ", cache_dir '%s'"
+           ", alarms default handler '%s'"
+           ", alarms default recipient '%s'"
     , rrdhost_hostname(host)
     , rrdhost_registry_hostname(host)
     , host->machine_guid

@@ -621,44 +631,56 @@ static void rrdhost_update(RRDHOST *host
     host->registry_hostname = string_strdupz((registry_hostname && *registry_hostname)?registry_hostname:hostname);

     if(strcmp(rrdhost_hostname(host), hostname) != 0) {
-        netdata_log_info("Host '%s' has been renamed to '%s'. If this is not intentional it may mean multiple hosts are using the same machine_guid.", rrdhost_hostname(host), hostname);
+        nd_log(NDLS_DAEMON, NDLP_WARNING,
+               "Host '%s' has been renamed to '%s'. If this is not intentional it may mean multiple hosts are using the same machine_guid.",
+               rrdhost_hostname(host), hostname);

         rrdhost_init_hostname(host, hostname, true);
     } else {
         rrdhost_index_add_hostname(host);
     }

     if(strcmp(rrdhost_program_name(host), program_name) != 0) {
-        netdata_log_info("Host '%s' switched program name from '%s' to '%s'", rrdhost_hostname(host), rrdhost_program_name(host), program_name);
+        nd_log(NDLS_DAEMON, NDLP_NOTICE,
+               "Host '%s' switched program name from '%s' to '%s'",
+               rrdhost_hostname(host), rrdhost_program_name(host), program_name);

         STRING *t = host->program_name;
         host->program_name = string_strdupz(program_name);
         string_freez(t);
     }

     if(strcmp(rrdhost_program_version(host), program_version) != 0) {
-        netdata_log_info("Host '%s' switched program version from '%s' to '%s'", rrdhost_hostname(host), rrdhost_program_version(host), program_version);
+        nd_log(NDLS_DAEMON, NDLP_NOTICE,
+               "Host '%s' switched program version from '%s' to '%s'",
+               rrdhost_hostname(host), rrdhost_program_version(host), program_version);

         STRING *t = host->program_version;
         host->program_version = string_strdupz(program_version);
         string_freez(t);
     }

     if(host->rrd_update_every != update_every)
-        netdata_log_error("Host '%s' has an update frequency of %d seconds, but the wanted one is %d seconds. "
-                          "Restart netdata here to apply the new settings.",
-                          rrdhost_hostname(host), host->rrd_update_every, update_every);
+        nd_log(NDLS_DAEMON, NDLP_WARNING,
+               "Host '%s' has an update frequency of %d seconds, but the wanted one is %d seconds. "
+               "Restart netdata here to apply the new settings.",
+               rrdhost_hostname(host), host->rrd_update_every, update_every);

     if(host->rrd_memory_mode != mode)
-        netdata_log_error("Host '%s' has memory mode '%s', but the wanted one is '%s'. "
-                          "Restart netdata here to apply the new settings.",
-                          rrdhost_hostname(host),
-                          rrd_memory_mode_name(host->rrd_memory_mode),
-                          rrd_memory_mode_name(mode));
+        nd_log(NDLS_DAEMON, NDLP_WARNING,
+               "Host '%s' has memory mode '%s', but the wanted one is '%s'. "
+               "Restart netdata here to apply the new settings.",
+               rrdhost_hostname(host),
+               rrd_memory_mode_name(host->rrd_memory_mode),
+               rrd_memory_mode_name(mode));

     else if(host->rrd_memory_mode != RRD_MEMORY_MODE_DBENGINE && host->rrd_history_entries < history)
-        netdata_log_error("Host '%s' has history of %d entries, but the wanted one is %ld entries. "
-                          "Restart netdata here to apply the new settings.",
-                          rrdhost_hostname(host),
-                          host->rrd_history_entries,
-                          history);
+        nd_log(NDLS_DAEMON, NDLP_WARNING,
+               "Host '%s' has history of %d entries, but the wanted one is %ld entries. "
+               "Restart netdata here to apply the new settings.",
+               rrdhost_hostname(host),
+               host->rrd_history_entries,
+               history);

     // update host tags
     rrdhost_init_tags(host, tags);

@@ -700,7 +722,9 @@ static void rrdhost_update(RRDHOST *host
             ml_host_new(host);

         rrdhost_load_rrdcontext_data(host);
-        netdata_log_info("Host %s is not in archived mode anymore", rrdhost_hostname(host));
+        nd_log(NDLS_DAEMON, NDLP_DEBUG,
+               "Host %s is not in archived mode anymore",
+               rrdhost_hostname(host));
     }

     spinlock_unlock(&host->rrdhost_update_lock);

@@ -731,8 +755,6 @@ RRDHOST *rrdhost_find_or_create(
     , struct rrdhost_system_info *system_info
     , bool archived
 ) {
-    netdata_log_debug(D_RRDHOST, "Searching for host '%s' with guid '%s'", hostname, guid);
-
     RRDHOST *host = rrdhost_find_by_guid(guid);
     if (unlikely(host && host->rrd_memory_mode != mode && rrdhost_flag_check(host, RRDHOST_FLAG_ARCHIVED))) {

@@ -740,10 +762,11 @@ RRDHOST *rrdhost_find_or_create(
         return host;

     /* If a legacy memory mode instantiates all dbengine state must be discarded to avoid inconsistencies */
-    netdata_log_error("Archived host '%s' has memory mode '%s', but the wanted one is '%s'. Discarding archived state.",
-                      rrdhost_hostname(host),
-                      rrd_memory_mode_name(host->rrd_memory_mode),
-                      rrd_memory_mode_name(mode));
+    nd_log(NDLS_DAEMON, NDLP_INFO,
+           "Archived host '%s' has memory mode '%s', but the wanted one is '%s'. Discarding archived state.",
+           rrdhost_hostname(host),
+           rrd_memory_mode_name(host->rrd_memory_mode),
+           rrd_memory_mode_name(mode));

     rrd_wrlock();
     rrdhost_free___while_having_rrd_wrlock(host, true);

@@ -851,18 +874,26 @@ void dbengine_init(char *hostname) {
     if (read_num > 0 && read_num <= MAX_PAGES_PER_EXTENT)
         rrdeng_pages_per_extent = read_num;
     else {
-        netdata_log_error("Invalid dbengine pages per extent %u given. Using %u.", read_num, rrdeng_pages_per_extent);
+        nd_log(NDLS_DAEMON, NDLP_WARNING,
+               "Invalid dbengine pages per extent %u given. Using %u.",
+               read_num, rrdeng_pages_per_extent);

         config_set_number(CONFIG_SECTION_DB, "dbengine pages per extent", rrdeng_pages_per_extent);
     }

     storage_tiers = config_get_number(CONFIG_SECTION_DB, "storage tiers", storage_tiers);
     if(storage_tiers < 1) {
-        netdata_log_error("At least 1 storage tier is required. Assuming 1.");
+        nd_log(NDLS_DAEMON, NDLP_WARNING,
+               "At least 1 storage tier is required. Assuming 1.");

         storage_tiers = 1;
         config_set_number(CONFIG_SECTION_DB, "storage tiers", storage_tiers);
     }
     if(storage_tiers > RRD_STORAGE_TIERS) {
-        netdata_log_error("Up to %d storage tier are supported. Assuming %d.", RRD_STORAGE_TIERS, RRD_STORAGE_TIERS);
+        nd_log(NDLS_DAEMON, NDLP_WARNING,
+               "Up to %d storage tier are supported. Assuming %d.",
+               RRD_STORAGE_TIERS, RRD_STORAGE_TIERS);

         storage_tiers = RRD_STORAGE_TIERS;
         config_set_number(CONFIG_SECTION_DB, "storage tiers", storage_tiers);
     }

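The `dbengine_init()` hunks above repeat one pattern: clamp a configured value into a valid range, log a warning at the new priority, and write the corrected value back to the config. A minimal standalone sketch of the clamping step (the function is hypothetical, not the netdata config API):

```c
/* Clamp value into [min, max]; set *was_clamped so the caller knows
 * whether to log a warning and rewrite the config entry. */
static long clamp_config(long value, long min, long max, int *was_clamped) {
    *was_clamped = 0;
    if (value < min) { *was_clamped = 1; return min; }
    if (value > max) { *was_clamped = 1; return max; }
    return value;
}
```

Keeping the clamp, the warning, and the config write-back together means a bad value is corrected once and the corrected value persists for the next start.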
@@ -884,7 +915,9 @@ void dbengine_init(char *hostname) {

         int ret = mkdir(dbenginepath, 0775);
         if (ret != 0 && errno != EEXIST) {
-            netdata_log_error("DBENGINE on '%s': cannot create directory '%s'", hostname, dbenginepath);
+            nd_log(NDLS_DAEMON, NDLP_CRIT,
+                   "DBENGINE on '%s': cannot create directory '%s'",
+                   hostname, dbenginepath);
             break;
         }

@@ -904,9 +937,9 @@ void dbengine_init(char *hostname) {
             if(grouping_iterations < 2) {
                 grouping_iterations = 2;
                 config_set_number(CONFIG_SECTION_DB, dbengineconfig, grouping_iterations);
-                netdata_log_error("DBENGINE on '%s': 'dbegnine tier %zu update every iterations' cannot be less than 2. Assuming 2.",
-                                  hostname,
-                                  tier);
+                nd_log(NDLS_DAEMON, NDLP_WARNING,
+                       "DBENGINE on '%s': 'dbegnine tier %zu update every iterations' cannot be less than 2. Assuming 2.",
+                       hostname, tier);
             }

             snprintfz(dbengineconfig, 200, "dbengine tier %zu backfill", tier);

@@ -915,7 +948,10 @@ void dbengine_init(char *hostname) {
             else if(strcmp(bf, "full") == 0) backfill = RRD_BACKFILL_FULL;
             else if(strcmp(bf, "none") == 0) backfill = RRD_BACKFILL_NONE;
             else {
-                netdata_log_error("DBENGINE: unknown backfill value '%s', assuming 'new'", bf);
+                nd_log(NDLS_DAEMON, NDLP_WARNING,
+                       "DBENGINE: unknown backfill value '%s', assuming 'new'",
+                       bf);

                 config_set(CONFIG_SECTION_DB, dbengineconfig, "new");
                 backfill = RRD_BACKFILL_NEW;
             }

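The backfill hunk maps a config string to an enum and falls back to `new` for unknown values, logging a warning and rewriting the config. A self-contained sketch of that lookup (the enum names mirror `RRD_BACKFILL_*` but are local to this example):

```c
#include <string.h>

typedef enum { BACKFILL_NEW, BACKFILL_FULL, BACKFILL_NONE } backfill_mode;

/* Map a config string to a backfill mode; *unknown tells the caller to
 * log a warning and rewrite the config entry with the default. */
static backfill_mode backfill_from_string(const char *bf, int *unknown) {
    *unknown = 0;
    if (strcmp(bf, "new") == 0)  return BACKFILL_NEW;
    if (strcmp(bf, "full") == 0) return BACKFILL_FULL;
    if (strcmp(bf, "none") == 0) return BACKFILL_NONE;
    *unknown = 1;    /* unrecognized: default to "new", as in the hunk above */
    return BACKFILL_NEW;
}
```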
@@ -926,10 +962,10 @@ void dbengine_init(char *hostname) {

             if(tier > 0 && get_tier_grouping(tier) > 65535) {
                 storage_tiers_grouping_iterations[tier] = 1;
-                netdata_log_error("DBENGINE on '%s': dbengine tier %zu gives aggregation of more than 65535 points of tier 0. Disabling tiers above %zu",
-                                  hostname,
-                                  tier,
-                                  tier);
+                nd_log(NDLS_DAEMON, NDLP_WARNING,
+                       "DBENGINE on '%s': dbengine tier %zu gives aggregation of more than 65535 points of tier 0. "
+                       "Disabling tiers above %zu",
+                       hostname, tier, tier);
                 break;
             }

@@ -957,21 +993,19 @@ void dbengine_init(char *hostname) {
         netdata_thread_join(tiers_init[tier].thread, &ptr);

         if(tiers_init[tier].ret != 0) {
-            netdata_log_error("DBENGINE on '%s': Failed to initialize multi-host database tier %zu on path '%s'",
-                              hostname,
-                              tiers_init[tier].tier,
-                              tiers_init[tier].path);
+            nd_log(NDLS_DAEMON, NDLP_ERR,
+                   "DBENGINE on '%s': Failed to initialize multi-host database tier %zu on path '%s'",
+                   hostname, tiers_init[tier].tier, tiers_init[tier].path);
         }
         else if(created_tiers == tier)
             created_tiers++;
     }

     if(created_tiers && created_tiers < storage_tiers) {
-        netdata_log_error("DBENGINE on '%s': Managed to create %zu tiers instead of %zu. Continuing with %zu available.",
-                          hostname,
-                          created_tiers,
-                          storage_tiers,
-                          created_tiers);
+        nd_log(NDLS_DAEMON, NDLP_WARNING,
+               "DBENGINE on '%s': Managed to create %zu tiers instead of %zu. Continuing with %zu available.",
+               hostname, created_tiers, storage_tiers, created_tiers);

         storage_tiers = created_tiers;
     }
     else if(!created_tiers)

@@ -984,7 +1018,10 @@ void dbengine_init(char *hostname) {
 #else
     storage_tiers = config_get_number(CONFIG_SECTION_DB, "storage tiers", 1);
     if(storage_tiers != 1) {
-        netdata_log_error("DBENGINE is not available on '%s', so only 1 database tier can be supported.", hostname);
+        nd_log(NDLS_DAEMON, NDLP_WARNING,
+               "DBENGINE is not available on '%s', so only 1 database tier can be supported.",
+               hostname);

         storage_tiers = 1;
         config_set_number(CONFIG_SECTION_DB, "storage tiers", storage_tiers);
     }

@@ -1000,7 +1037,9 @@ int rrd_init(char *hostname, struct rrdhost_system_info *system_info, bool unitt
             set_late_global_environment(system_info);
             fatal("Failed to initialize SQLite");
         }
-        netdata_log_info("Skipping SQLITE metadata initialization since memory mode is not dbengine");
+
+        nd_log(NDLS_DAEMON, NDLP_DEBUG,
+               "Skipping SQLITE metadata initialization since memory mode is not dbengine");
     }

     if (unlikely(sql_init_context_database(system_info ? 0 : 1))) {

@@ -1015,23 +1054,28 @@ int rrd_init(char *hostname, struct rrdhost_system_info *system_info, bool unitt
     rrdpush_init();

     if (default_rrd_memory_mode == RRD_MEMORY_MODE_DBENGINE || rrdpush_receiver_needs_dbengine()) {
-        netdata_log_info("DBENGINE: Initializing ...");
+        nd_log(NDLS_DAEMON, NDLP_DEBUG,
+               "DBENGINE: Initializing ...");

         dbengine_init(hostname);
     }
-    else {
-        netdata_log_info("DBENGINE: Not initializing ...");
+    else
         storage_tiers = 1;
-    }

     if (!dbengine_enabled) {
         if (storage_tiers > 1) {
-            netdata_log_error("dbengine is not enabled, but %zu tiers have been requested. Resetting tiers to 1",
-                              storage_tiers);
+            nd_log(NDLS_DAEMON, NDLP_WARNING,
+                   "dbengine is not enabled, but %zu tiers have been requested. Resetting tiers to 1",
+                   storage_tiers);

             storage_tiers = 1;
         }

         if (default_rrd_memory_mode == RRD_MEMORY_MODE_DBENGINE) {
-            netdata_log_error("dbengine is not enabled, but it has been given as the default db mode. Resetting db mode to alloc");
+            nd_log(NDLS_DAEMON, NDLP_WARNING,
+                   "dbengine is not enabled, but it has been given as the default db mode. "
+                   "Resetting db mode to alloc");

             default_rrd_memory_mode = RRD_MEMORY_MODE_ALLOC;
         }
     }

@@ -1040,7 +1084,6 @@ int rrd_init(char *hostname, struct rrdhost_system_info *system_info, bool unitt
 if(!unittest)
 metadata_sync_init();

-netdata_log_debug(D_RRDHOST, "Initializing localhost with hostname '%s'", hostname);
 localhost = rrdhost_create(
 hostname
 , registry_get_this_machine_hostname()

@@ -1177,7 +1220,9 @@ void rrdhost_free___while_having_rrd_wrlock(RRDHOST *host, bool force) {
 if(!host) return;

 if (netdata_exit || force) {
-netdata_log_info("RRD: 'host:%s' freeing memory...", rrdhost_hostname(host));
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"RRD: 'host:%s' freeing memory...",
+rrdhost_hostname(host));

 // ------------------------------------------------------------------------
 // first remove it from the indexes, so that it will not be discoverable

@@ -1243,7 +1288,10 @@ void rrdhost_free___while_having_rrd_wrlock(RRDHOST *host, bool force) {
 #endif

 if (!netdata_exit && !force) {
-netdata_log_info("RRD: 'host:%s' is now in archive mode...", rrdhost_hostname(host));
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"RRD: 'host:%s' is now in archive mode...",
+rrdhost_hostname(host));
+
 rrdhost_flag_set(host, RRDHOST_FLAG_ARCHIVED | RRDHOST_FLAG_ORPHAN);
 return;
 }

@@ -1313,7 +1361,9 @@ void rrd_finalize_collection_for_all_hosts(void) {
 void rrdhost_save_charts(RRDHOST *host) {
 if(!host) return;

-netdata_log_info("RRD: 'host:%s' saving / closing database...", rrdhost_hostname(host));
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"RRD: 'host:%s' saving / closing database...",
+rrdhost_hostname(host));

 RRDSET *st;

@@ -1442,7 +1492,9 @@ static void rrdhost_load_config_labels(void) {
 int status = config_load(NULL, 1, CONFIG_SECTION_HOST_LABEL);
 if(!status) {
 char *filename = CONFIG_DIR "/" CONFIG_FILENAME;
-netdata_log_error("RRDLABEL: Cannot reload the configuration file '%s', using labels in memory", filename);
+nd_log(NDLS_DAEMON, NDLP_WARNING,
+"RRDLABEL: Cannot reload the configuration file '%s', using labels in memory",
+filename);
 }

 struct section *co = appconfig_get_section(&netdata_config, CONFIG_SECTION_HOST_LABEL);

@@ -1462,12 +1514,13 @@ static void rrdhost_load_kubernetes_labels(void) {
 sprintf(label_script, "%s/%s", netdata_configured_primary_plugins_dir, "get-kubernetes-labels.sh");

 if (unlikely(access(label_script, R_OK) != 0)) {
-netdata_log_error("Kubernetes pod label fetching script %s not found.",label_script);
+nd_log(NDLS_DAEMON, NDLP_ERR,
+"Kubernetes pod label fetching script %s not found.",
+label_script);
+
 return;
 }

-netdata_log_debug(D_RRDHOST, "Attempting to fetch external labels via %s", label_script);

 pid_t pid;
 FILE *fp_child_input;
 FILE *fp_child_output = netdata_popen(label_script, &pid, &fp_child_input);

@@ -1481,7 +1534,9 @@ static void rrdhost_load_kubernetes_labels(void) {
 // Here we'll inform with an ERROR that the script failed, show whatever (if anything) was added to the list of labels, free the memory and set the return to null
 int rc = netdata_pclose(fp_child_input, fp_child_output, pid);
 if(rc)
-netdata_log_error("%s exited abnormally. Failed to get kubernetes labels.", label_script);
+nd_log(NDLS_DAEMON, NDLP_ERR,
+"%s exited abnormally. Failed to get kubernetes labels.",
+label_script);
 }

 void reload_host_labels(void) {

@@ -1501,7 +1556,9 @@ void reload_host_labels(void) {
 }

 void rrdhost_finalize_collection(RRDHOST *host) {
-netdata_log_info("RRD: 'host:%s' stopping data collection...", rrdhost_hostname(host));
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"RRD: 'host:%s' stopping data collection...",
+rrdhost_hostname(host));

 RRDSET *st;
 rrdset_foreach_read(st, host)
@@ -1515,7 +1572,9 @@ void rrdhost_finalize_collection(RRDHOST *host) {
 void rrdhost_delete_charts(RRDHOST *host) {
 if(!host) return;

-netdata_log_info("RRD: 'host:%s' deleting disk files...", rrdhost_hostname(host));
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"RRD: 'host:%s' deleting disk files...",
+rrdhost_hostname(host));

 RRDSET *st;

@@ -1523,8 +1582,8 @@ void rrdhost_delete_charts(RRDHOST *host) {
 // we get a write lock
 // to ensure only one thread is saving the database
 rrdset_foreach_write(st, host){
-rrdset_delete_files(st);
-}
+rrdset_delete_files(st);
+}
 rrdset_foreach_done(st);
 }

@@ -1537,7 +1596,9 @@ void rrdhost_delete_charts(RRDHOST *host) {
 void rrdhost_cleanup_charts(RRDHOST *host) {
 if(!host) return;

-netdata_log_info("RRD: 'host:%s' cleaning up disk files...", rrdhost_hostname(host));
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"RRD: 'host:%s' cleaning up disk files...",
+rrdhost_hostname(host));

 RRDSET *st;
 uint32_t rrdhost_delete_obsolete_charts = rrdhost_option_check(host, RRDHOST_OPTION_DELETE_OBSOLETE_CHARTS);

@@ -1564,7 +1625,9 @@ void rrdhost_cleanup_charts(RRDHOST *host) {
 // RRDHOST - save all hosts to disk

 void rrdhost_save_all(void) {
-netdata_log_info("RRD: saving databases [%zu hosts(s)]...", rrdhost_hosts_available());
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"RRD: saving databases [%zu hosts(s)]...",
+rrdhost_hosts_available());

 rrd_rdlock();

@@ -1579,7 +1642,9 @@ void rrdhost_save_all(void) {
 // RRDHOST - save or delete all hosts from disk

 void rrdhost_cleanup_all(void) {
-netdata_log_info("RRD: cleaning up database [%zu hosts(s)]...", rrdhost_hosts_available());
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"RRD: cleaning up database [%zu hosts(s)]...",
+rrdhost_hosts_available());

 rrd_rdlock();

@@ -1323,6 +1323,14 @@ void rrddim_store_metric_with_trace(RRDDIM *rd, usec_t point_end_time_ut, NETDAT
 #else // !NETDATA_LOG_COLLECTION_ERRORS
 void rrddim_store_metric(RRDDIM *rd, usec_t point_end_time_ut, NETDATA_DOUBLE n, SN_FLAGS flags) {
 #endif // !NETDATA_LOG_COLLECTION_ERRORS
+
+static __thread struct log_stack_entry lgs[] = {
+[0] = ND_LOG_FIELD_STR(NDF_NIDL_DIMENSION, NULL),
+[1] = ND_LOG_FIELD_END(),
+};
+lgs[0].str = rd->id;
+log_stack_push(lgs);
+
 #ifdef NETDATA_LOG_COLLECTION_ERRORS
 rd->rrddim_store_metric_count++;

@@ -1384,6 +1392,7 @@ void rrddim_store_metric(RRDDIM *rd, usec_t point_end_time_ut, NETDATA_DOUBLE n,
 }

 rrdcontext_collected_rrddim(rd);
+log_stack_pop(&lgs);
 }

 void store_metric_collection_completed() {

@@ -267,7 +267,7 @@ void aclk_push_alert_event(struct aclk_sync_cfg_t *wc)
 int rc;

 if (unlikely(!wc->alert_updates)) {
-netdata_log_access(
+nd_log(NDLS_ACCESS, NDLP_NOTICE,
 "ACLK STA [%s (%s)]: Ignoring alert push event, updates have been turned off for this node.",
 wc->node_id,
 wc->host ? rrdhost_hostname(wc->host) : "N/A");

@@ -424,7 +424,7 @@ void aclk_push_alert_event(struct aclk_sync_cfg_t *wc)

 } else {
 if (wc->alerts_log_first_sequence_id)
-netdata_log_access(
+nd_log(NDLS_ACCESS, NDLP_DEBUG,
 "ACLK RES [%s (%s)]: ALERTS SENT from %" PRIu64 " to %" PRIu64 "",
 wc->node_id,
 wc->host ? rrdhost_hostname(wc->host) : "N/A",

@@ -523,8 +523,11 @@ void aclk_send_alarm_configuration(char *config_hash)
 if (unlikely(!wc))
 return;

-netdata_log_access(
-"ACLK REQ [%s (%s)]: Request to send alert config %s.", wc->node_id, rrdhost_hostname(wc->host), config_hash);
+nd_log(NDLS_ACCESS, NDLP_DEBUG,
+"ACLK REQ [%s (%s)]: Request to send alert config %s.",
+wc->node_id,
+wc->host ? rrdhost_hostname(wc->host) : "N/A",
+config_hash);

 aclk_push_alert_config(wc->node_id, config_hash);
 }

@@ -634,13 +637,13 @@ int aclk_push_alert_config_event(char *node_id __maybe_unused, char *config_hash
 }

 if (likely(p_alarm_config.cfg_hash)) {
-netdata_log_access("ACLK RES [%s (%s)]: Sent alert config %s.", wc->node_id, wc->host ? rrdhost_hostname(wc->host) : "N/A", config_hash);
+nd_log(NDLS_ACCESS, NDLP_DEBUG, "ACLK RES [%s (%s)]: Sent alert config %s.", wc->node_id, wc->host ? rrdhost_hostname(wc->host) : "N/A", config_hash);
 aclk_send_provide_alarm_cfg(&p_alarm_config);
 freez(p_alarm_config.cfg_hash);
 destroy_aclk_alarm_configuration(&alarm_config);
 }
 else
-netdata_log_access("ACLK STA [%s (%s)]: Alert config for %s not found.", wc->node_id, wc->host ? rrdhost_hostname(wc->host) : "N/A", config_hash);
+nd_log(NDLS_ACCESS, NDLP_WARNING, "ACLK STA [%s (%s)]: Alert config for %s not found.", wc->node_id, wc->host ? rrdhost_hostname(wc->host) : "N/A", config_hash);

 bind_fail:
 rc = sqlite3_finalize(res);
@@ -669,20 +672,15 @@ void aclk_start_alert_streaming(char *node_id, bool resets)
 return;

 if (unlikely(!host->health.health_enabled)) {
-netdata_log_access(
-"ACLK STA [%s (N/A)]: Ignoring request to stream alert state changes, health is disabled.", node_id);
+nd_log(NDLS_ACCESS, NDLP_NOTICE, "ACLK STA [%s (N/A)]: Ignoring request to stream alert state changes, health is disabled.", node_id);
 return;
 }

 if (resets) {
-netdata_log_access(
-"ACLK REQ [%s (%s)]: STREAM ALERTS ENABLED (RESET REQUESTED)",
-node_id,
-wc->host ? rrdhost_hostname(wc->host) : "N/A");
+nd_log(NDLS_ACCESS, NDLP_DEBUG, "ACLK REQ [%s (%s)]: STREAM ALERTS ENABLED (RESET REQUESTED)", node_id, wc->host ? rrdhost_hostname(wc->host) : "N/A");
 sql_queue_existing_alerts_to_aclk(host);
 } else
-netdata_log_access(
-"ACLK REQ [%s (%s)]: STREAM ALERTS ENABLED", node_id, wc->host ? rrdhost_hostname(wc->host) : "N/A");
+nd_log(NDLS_ACCESS, NDLP_DEBUG, "ACLK REQ [%s (%s)]: STREAM ALERTS ENABLED", node_id, wc->host ? rrdhost_hostname(wc->host) : "N/A");

 wc->alert_updates = 1;
 wc->alert_queue_removed = SEND_REMOVED_AFTER_HEALTH_LOOPS;

@@ -725,7 +723,7 @@ void sql_process_queue_removed_alerts_to_aclk(char *node_id)

 rc = execute_insert(res);
 if (likely(rc == SQLITE_DONE)) {
-netdata_log_access("ACLK STA [%s (%s)]: QUEUED REMOVED ALERTS", wc->node_id, rrdhost_hostname(wc->host));
+nd_log(NDLS_ACCESS, NDLP_DEBUG, "ACLK STA [%s (%s)]: QUEUED REMOVED ALERTS", wc->node_id, rrdhost_hostname(wc->host));
 rrdhost_flag_set(wc->host, RRDHOST_FLAG_ACLK_STREAM_ALERTS);
 wc->alert_queue_removed = 0;
 }

@@ -758,15 +756,15 @@ void aclk_process_send_alarm_snapshot(char *node_id, char *claim_id __maybe_unus

 RRDHOST *host = find_host_by_node_id(node_id);
 if (unlikely(!host || !(wc = host->aclk_config))) {
-netdata_log_access("ACLK STA [%s (N/A)]: ACLK node id does not exist", node_id);
+nd_log(NDLS_ACCESS, NDLP_WARNING, "ACLK STA [%s (N/A)]: ACLK node id does not exist", node_id);
 return;
 }

-netdata_log_access(
-"IN [%s (%s)]: Request to send alerts snapshot, snapshot_uuid %s",
-node_id,
-wc->host ? rrdhost_hostname(wc->host) : "N/A",
-snapshot_uuid);
+nd_log(NDLS_ACCESS, NDLP_DEBUG,
+"IN [%s (%s)]: Request to send alerts snapshot, snapshot_uuid %s",
+node_id,
+wc->host ? rrdhost_hostname(wc->host) : "N/A",
+snapshot_uuid);

 if (wc->alerts_snapshot_uuid && !strcmp(wc->alerts_snapshot_uuid,snapshot_uuid))
 return;
@@ -855,7 +853,7 @@ void aclk_push_alert_snapshot_event(char *node_id __maybe_unused)
 RRDHOST *host = find_host_by_node_id(node_id);

 if (unlikely(!host)) {
-netdata_log_access("AC [%s (N/A)]: Node id not found", node_id);
+nd_log(NDLS_ACCESS, NDLP_WARNING, "AC [%s (N/A)]: Node id not found", node_id);
 freez(node_id);
 return;
 }

@@ -865,7 +863,7 @@ void aclk_push_alert_snapshot_event(char *node_id __maybe_unused)

 // we perhaps we don't need this for snapshots
 if (unlikely(!wc->alert_updates)) {
-netdata_log_access(
+nd_log(NDLS_ACCESS, NDLP_NOTICE,
 "ACLK STA [%s (%s)]: Ignoring alert snapshot event, updates have been turned off for this node.",
 wc->node_id,
 wc->host ? rrdhost_hostname(wc->host) : "N/A");

@@ -879,7 +877,7 @@ void aclk_push_alert_snapshot_event(char *node_id __maybe_unused)
 if (unlikely(!claim_id))
 return;

-netdata_log_access("ACLK REQ [%s (%s)]: Sending alerts snapshot, snapshot_uuid %s", wc->node_id, rrdhost_hostname(wc->host), wc->alerts_snapshot_uuid);
+nd_log(NDLS_ACCESS, NDLP_DEBUG, "ACLK REQ [%s (%s)]: Sending alerts snapshot, snapshot_uuid %s", wc->node_id, rrdhost_hostname(wc->host), wc->alerts_snapshot_uuid);

 uint32_t cnt = 0;

@@ -1057,9 +1055,9 @@ void aclk_send_alarm_checkpoint(char *node_id, char *claim_id __maybe_unused)
 RRDHOST *host = find_host_by_node_id(node_id);

 if (unlikely(!host || !(wc = host->aclk_config)))
-netdata_log_access("ACLK REQ [%s (N/A)]: ALERTS CHECKPOINT REQUEST RECEIVED FOR INVALID NODE", node_id);
+nd_log(NDLS_ACCESS, NDLP_WARNING, "ACLK REQ [%s (N/A)]: ALERTS CHECKPOINT REQUEST RECEIVED FOR INVALID NODE", node_id);
 else {
-netdata_log_access("ACLK REQ [%s (%s)]: ALERTS CHECKPOINT REQUEST RECEIVED", node_id, rrdhost_hostname(host));
+nd_log(NDLS_ACCESS, NDLP_DEBUG, "ACLK REQ [%s (%s)]: ALERTS CHECKPOINT REQUEST RECEIVED", node_id, rrdhost_hostname(host));
 wc->alert_checkpoint_req = SEND_CHECKPOINT_AFTER_HEALTH_LOOPS;
 }
 }
@@ -1087,14 +1085,14 @@ void aclk_push_alarm_checkpoint(RRDHOST *host __maybe_unused)
 #ifdef ENABLE_ACLK
 struct aclk_sync_cfg_t *wc = host->aclk_config;
 if (unlikely(!wc)) {
-netdata_log_access("ACLK REQ [%s (N/A)]: ALERTS CHECKPOINT REQUEST RECEIVED FOR INVALID NODE", rrdhost_hostname(host));
+nd_log(NDLS_ACCESS, NDLP_WARNING, "ACLK REQ [%s (N/A)]: ALERTS CHECKPOINT REQUEST RECEIVED FOR INVALID NODE", rrdhost_hostname(host));
 return;
 }

 if (rrdhost_flag_check(host, RRDHOST_FLAG_ACLK_STREAM_ALERTS)) {
 //postpone checkpoint send
 wc->alert_checkpoint_req += 3;
-netdata_log_access("ACLK REQ [%s (N/A)]: ALERTS CHECKPOINT POSTPONED", rrdhost_hostname(host));
+nd_log(NDLS_ACCESS, NDLP_NOTICE, "ACLK REQ [%s (N/A)]: ALERTS CHECKPOINT POSTPONED", rrdhost_hostname(host));
 return;
 }

@@ -1157,9 +1155,9 @@ void aclk_push_alarm_checkpoint(RRDHOST *host __maybe_unused)

 aclk_send_provide_alarm_checkpoint(&alarm_checkpoint);
 freez(claim_id);
-netdata_log_access("ACLK RES [%s (%s)]: ALERTS CHECKPOINT SENT", wc->node_id, rrdhost_hostname(host));
+nd_log(NDLS_ACCESS, NDLP_DEBUG, "ACLK RES [%s (%s)]: ALERTS CHECKPOINT SENT", wc->node_id, rrdhost_hostname(host));
 } else
-netdata_log_access("ACLK RES [%s (%s)]: FAILED TO CREATE ALERTS CHECKPOINT HASH", wc->node_id, rrdhost_hostname(host));
+nd_log(NDLS_ACCESS, NDLP_ERR, "ACLK RES [%s (%s)]: FAILED TO CREATE ALERTS CHECKPOINT HASH", wc->node_id, rrdhost_hostname(host));

 wc->alert_checkpoint_req = 0;
 buffer_free(alarms_to_hash);

@@ -43,7 +43,7 @@ static void build_node_collectors(RRDHOST *host)
 dictionary_destroy(dict);
 freez(upd_node_collectors.claim_id);

-netdata_log_access("ACLK RES [%s (%s)]: NODE COLLECTORS SENT", wc->node_id, rrdhost_hostname(host));
+nd_log(NDLS_ACCESS, NDLP_DEBUG, "ACLK RES [%s (%s)]: NODE COLLECTORS SENT", wc->node_id, rrdhost_hostname(host));
 }

 static void build_node_info(RRDHOST *host)

@@ -103,7 +103,7 @@ static void build_node_info(RRDHOST *host)
 node_info.data.host_labels_ptr = host->rrdlabels;

 aclk_update_node_info(&node_info);
-netdata_log_access("ACLK RES [%s (%s)]: NODE INFO SENT for guid [%s] (%s)", wc->node_id, rrdhost_hostname(wc->host), host->machine_guid, wc->host == localhost ? "parent" : "child");
+nd_log(NDLS_ACCESS, NDLP_DEBUG, "ACLK RES [%s (%s)]: NODE INFO SENT for guid [%s] (%s)", wc->node_id, rrdhost_hostname(wc->host), host->machine_guid, wc->host == localhost ? "parent" : "child");

 rrd_unlock();
 freez(node_info.claim_id);

@@ -169,8 +169,9 @@ void aclk_check_node_info_and_collectors(void)
 dfe_done(host);

 if (context_loading || replicating) {
-error_limit_static_thread_var(erl, 10, 100 * USEC_PER_MS);
-error_limit(&erl, "%zu nodes loading contexts, %zu replicating data", context_loading, replicating);
+nd_log_limit_static_thread_var(erl, 10, 100 * USEC_PER_MS);
+nd_log_limit(&erl, NDLS_DAEMON, NDLP_INFO,
+"%zu nodes loading contexts, %zu replicating data", context_loading, replicating);
 }
 }

@@ -914,7 +914,9 @@ void sql_health_alarm_log_load(RRDHOST *host)
 if (unlikely(!host->health_log.next_alarm_id || host->health_log.next_alarm_id <= host->health_max_alarm_id))
 host->health_log.next_alarm_id = host->health_max_alarm_id + 1;

-netdata_log_health("[%s]: Table health_log, loaded %zd alarm entries, errors in %zd entries.", rrdhost_hostname(host), loaded, errored);
+nd_log(NDLS_DAEMON, errored ? NDLP_WARNING : NDLP_DEBUG,
+"[%s]: Table health_log, loaded %zd alarm entries, errors in %zd entries.",
+rrdhost_hostname(host), loaded, errored);

 ret = sqlite3_finalize(res);
 if (unlikely(ret != SQLITE_OK))

@@ -1144,7 +1144,7 @@ void vacuum_database(sqlite3 *database, const char *db_alias, int threshold, int
 if (free_pages > (total_pages * threshold / 100)) {

 int do_free_pages = (int) (free_pages * vacuum_pc / 100);
-netdata_log_info("%s: Freeing %d database pages", db_alias, do_free_pages);
+nd_log(NDLS_DAEMON, NDLP_DEBUG, "%s: Freeing %d database pages", db_alias, do_free_pages);

 char sql[128];
 snprintfz(sql, 127, "PRAGMA incremental_vacuum(%d)", do_free_pages);

@@ -1258,7 +1258,7 @@ static void start_all_host_load_context(uv_work_t *req __maybe_unused)
 RRDHOST *host;

 size_t max_threads = MIN(get_netdata_cpus() / 2, 6);
-netdata_log_info("METADATA: Using %zu threads for context loading", max_threads);
+nd_log(NDLS_DAEMON, NDLP_DEBUG, "METADATA: Using %zu threads for context loading", max_threads);
 struct host_context_load_thread *hclt = callocz(max_threads, sizeof(*hclt));

 size_t thread_index;

@@ -1291,7 +1291,7 @@ static void start_all_host_load_context(uv_work_t *req __maybe_unused)
 cleanup_finished_threads(hclt, max_threads, true);
 freez(hclt);
 usec_t ended_ut = now_monotonic_usec(); (void)ended_ut;
-netdata_log_info("METADATA: host contexts loaded in %0.2f ms", (double)(ended_ut - started_ut) / USEC_PER_MS);
+nd_log(NDLS_DAEMON, NDLP_DEBUG, "METADATA: host contexts loaded in %0.2f ms", (double)(ended_ut - started_ut) / USEC_PER_MS);

 worker_is_idle();
 }
@@ -1556,7 +1556,7 @@ static void metadata_event_loop(void *arg)
 wc->timer_req.data = wc;
 fatal_assert(0 == uv_timer_start(&wc->timer_req, timer_cb, TIMER_INITIAL_PERIOD_MS, TIMER_REPEAT_PERIOD_MS));

-netdata_log_info("Starting metadata sync thread");
+nd_log(NDLS_DAEMON, NDLP_DEBUG, "Starting metadata sync thread");

 struct metadata_cmd cmd;
 memset(&cmd, 0, sizeof(cmd));

@@ -1684,7 +1684,7 @@ static void metadata_event_loop(void *arg)
 freez(loop);
 worker_unregister();

-netdata_log_info("Shutting down event loop");
+nd_log(NDLS_DAEMON, NDLP_DEBUG, "Shutting down event loop");
 completion_mark_complete(&wc->start_stop_complete);
 if (wc->scan_complete) {
 completion_destroy(wc->scan_complete);

@@ -1710,15 +1710,15 @@ void metadata_sync_shutdown(void)

 struct metadata_cmd cmd;
 memset(&cmd, 0, sizeof(cmd));
-netdata_log_info("METADATA: Sending a shutdown command");
+nd_log(NDLS_DAEMON, NDLP_DEBUG, "METADATA: Sending a shutdown command");
 cmd.opcode = METADATA_SYNC_SHUTDOWN;
 metadata_enq_cmd(&metasync_worker, &cmd);

 /* wait for metadata thread to shut down */
-netdata_log_info("METADATA: Waiting for shutdown ACK");
+nd_log(NDLS_DAEMON, NDLP_DEBUG, "METADATA: Waiting for shutdown ACK");
 completion_wait_for(&metasync_worker.start_stop_complete);
 completion_destroy(&metasync_worker.start_stop_complete);
-netdata_log_info("METADATA: Shutdown complete");
+nd_log(NDLS_DAEMON, NDLP_DEBUG, "METADATA: Shutdown complete");
 }

 void metadata_sync_shutdown_prepare(void)
@@ -1735,20 +1735,20 @@ void metadata_sync_shutdown_prepare(void)
 completion_init(compl);
 __atomic_store_n(&wc->scan_complete, compl, __ATOMIC_RELAXED);

-netdata_log_info("METADATA: Sending a scan host command");
+nd_log(NDLS_DAEMON, NDLP_DEBUG, "METADATA: Sending a scan host command");
 uint32_t max_wait_iterations = 2000;
 while (unlikely(metadata_flag_check(&metasync_worker, METADATA_FLAG_PROCESSING)) && max_wait_iterations--) {
 if (max_wait_iterations == 1999)
-netdata_log_info("METADATA: Current worker is running; waiting to finish");
+nd_log(NDLS_DAEMON, NDLP_DEBUG, "METADATA: Current worker is running; waiting to finish");
 sleep_usec(1000);
 }

 cmd.opcode = METADATA_SCAN_HOSTS;
 metadata_enq_cmd(&metasync_worker, &cmd);

-netdata_log_info("METADATA: Waiting for host scan completion");
+nd_log(NDLS_DAEMON, NDLP_DEBUG, "METADATA: Waiting for host scan completion");
 completion_wait_for(wc->scan_complete);
-netdata_log_info("METADATA: Host scan complete; can continue with shutdown");
+nd_log(NDLS_DAEMON, NDLP_DEBUG, "METADATA: Host scan complete; can continue with shutdown");
 }

 // -------------------------------------------------------------

@@ -1766,7 +1766,7 @@ void metadata_sync_init(void)
 completion_wait_for(&wc->start_stop_complete);
 completion_destroy(&wc->start_stop_complete);

-netdata_log_info("SQLite metadata sync initialization complete");
+nd_log(NDLS_DAEMON, NDLP_DEBUG, "SQLite metadata sync initialization complete");
 }

@@ -1825,7 +1825,7 @@ void metadata_queue_load_host_context(RRDHOST *host)
 if (unlikely(!metasync_worker.loop))
 return;
 queue_metadata_cmd(METADATA_LOAD_HOST_CONTEXT, host, NULL);
-netdata_log_info("Queued command to load host contexts");
+nd_log(NDLS_DAEMON, NDLP_DEBUG, "Queued command to load host contexts");
 }

 //

@@ -153,8 +153,8 @@ void aws_kinesis_connector_worker(void *instance_p)
 char error_message[ERROR_LINE_MAX + 1] = "";

 netdata_log_debug(D_EXPORTING,
-"EXPORTING: kinesis_put_record(): dest = %s, id = %s, key = %s, stream = %s, partition_key = %s, \ "
-" buffer = %zu, record = %zu",
+"EXPORTING: kinesis_put_record(): dest = %s, id = %s, key = %s, stream = %s, partition_key = %s, "
+"buffer = %zu, record = %zu",
 instance->config.destination,
 connector_specific_config->auth_key_id,
 connector_specific_config->secure_key,

health/health.c
@@ -82,10 +82,13 @@ static bool prepare_command(BUFFER *wb,
 const char *edit_command,
 const char *machine_guid,
 uuid_t *transition_id,
-const char *summary
+const char *summary,
+const char *context,
+const char *component,
+const char *type
 ) {
 char buf[8192];
-size_t n = 8192 - 1;
+size_t n = sizeof(buf) - 1;

 buffer_strcat(wb, "exec");

@@ -195,6 +198,18 @@ static bool prepare_command(BUFFER *wb,
 return false;
 buffer_sprintf(wb, " '%s'", buf);
+
+if (!sanitize_command_argument_string(buf, context, n))
+return false;
+buffer_sprintf(wb, " '%s'", buf);
+
+if (!sanitize_command_argument_string(buf, component, n))
+return false;
+buffer_sprintf(wb, " '%s'", buf);
+
+if (!sanitize_command_argument_string(buf, type, n))
+return false;
+buffer_sprintf(wb, " '%s'", buf);

 return true;
 }

@@ -342,7 +357,9 @@ static void health_reload_host(RRDHOST *host) {
 if(unlikely(!host->health.health_enabled) && !rrdhost_flag_check(host, RRDHOST_FLAG_INITIALIZED_HEALTH))
 return;

-netdata_log_health("[%s]: Reloading health.", rrdhost_hostname(host));
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"[%s]: Reloading health.",
+rrdhost_hostname(host));

 char *user_path = health_user_config_dir();
 char *stock_path = health_stock_config_dir();

@@ -436,8 +453,10 @@ static inline void health_alarm_execute(RRDHOST *host, ALARM_ENTRY *ae) {

 if(unlikely(ae->new_status <= RRDCALC_STATUS_CLEAR && (ae->flags & HEALTH_ENTRY_FLAG_NO_CLEAR_NOTIFICATION))) {
 // do not send notifications for disabled statuses
-netdata_log_debug(D_HEALTH, "Health not sending notification for alarm '%s.%s' status %s (it has no-clear-notification enabled)", ae_chart_id(ae), ae_name(ae), rrdcalc_status2string(ae->new_status));
-netdata_log_health("[%s]: Health not sending notification for alarm '%s.%s' status %s (it has no-clear-notification enabled)", rrdhost_hostname(host), ae_chart_id(ae), ae_name(ae), rrdcalc_status2string(ae->new_status));
+
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"[%s]: Health not sending notification for alarm '%s.%s' status %s (it has no-clear-notification enabled)",
+rrdhost_hostname(host), ae_chart_id(ae), ae_name(ae), rrdcalc_status2string(ae->new_status));

 // mark it as run, so that we will send the same alarm if it happens again
 goto done;
@@ -454,10 +473,10 @@ static inline void health_alarm_execute(RRDHOST *host, ALARM_ENTRY *ae) {
 // we have executed this alarm notification in the past
 if(last_executed_status == ae->new_status && !(ae->flags & HEALTH_ENTRY_FLAG_IS_REPEATING)) {
 // don't send the notification for the same status again
-netdata_log_debug(D_HEALTH, "Health not sending again notification for alarm '%s.%s' status %s", ae_chart_id(ae), ae_name(ae)
-, rrdcalc_status2string(ae->new_status));
-netdata_log_health("[%s]: Health not sending again notification for alarm '%s.%s' status %s", rrdhost_hostname(host), ae_chart_id(ae), ae_name(ae)
-, rrdcalc_status2string(ae->new_status));
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"[%s]: Health not sending again notification for alarm '%s.%s' status %s",
+rrdhost_hostname(host), ae_chart_id(ae), ae_name(ae),
+rrdcalc_status2string(ae->new_status));
 goto done;
 }
 }

@@ -476,11 +495,16 @@ static inline void health_alarm_execute(RRDHOST *host, ALARM_ENTRY *ae) {

 // Check if alarm notifications are silenced
 if (ae->flags & HEALTH_ENTRY_FLAG_SILENCED) {
-netdata_log_health("[%s]: Health not sending notification for alarm '%s.%s' status %s (command API has disabled notifications)", rrdhost_hostname(host), ae_chart_id(ae), ae_name(ae), rrdcalc_status2string(ae->new_status));
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"[%s]: Health not sending notification for alarm '%s.%s' status %s "
+"(command API has disabled notifications)",
+rrdhost_hostname(host), ae_chart_id(ae), ae_name(ae), rrdcalc_status2string(ae->new_status));
 goto done;
 }

-netdata_log_health("[%s]: Sending notification for alarm '%s.%s' status %s.", rrdhost_hostname(host), ae_chart_id(ae), ae_name(ae), rrdcalc_status2string(ae->new_status));
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"[%s]: Sending notification for alarm '%s.%s' status %s.",
+rrdhost_hostname(host), ae_chart_id(ae), ae_name(ae), rrdcalc_status2string(ae->new_status));

 const char *exec = (ae->exec) ? ae_exec(ae) : string2str(host->health.health_default_exec);
 const char *recipient = (ae->recipient) ? ae_recipient(ae) : string2str(host->health.health_default_recipient);
@@ -581,7 +605,11 @@ static inline void health_alarm_execute(RRDHOST *host, ALARM_ENTRY *ae) {
 edit_command,
 host->machine_guid,
 &ae->transition_id,
-host->health.use_summary_for_notifications && ae->summary?ae_summary(ae):ae_name(ae));
+host->health.use_summary_for_notifications && ae->summary?ae_summary(ae):ae_name(ae),
+string2str(ae->chart_context),
+string2str(ae->component),
+string2str(ae->type)
+);

 const char *command_to_run = buffer_tostring(wb);
 if (ok) {

@@ -778,7 +806,8 @@ static void health_main_cleanup(void *ptr) {
 netdata_log_info("cleaning up...");
 static_thread->enabled = NETDATA_MAIN_THREAD_EXITED;

-netdata_log_health("Health thread ended.");
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"Health thread ended.");
 }

 static void initialize_health(RRDHOST *host)
@@ -790,7 +819,9 @@ static void initialize_health(RRDHOST *host)

 rrdhost_flag_set(host, RRDHOST_FLAG_INITIALIZED_HEALTH);

-netdata_log_health("[%s]: Initializing health.", rrdhost_hostname(host));
+nd_log(NDLS_DAEMON, NDLP_DEBUG,
+"[%s]: Initializing health.",
+rrdhost_hostname(host));

 host->health.health_default_warn_repeat_every = config_get_duration(CONFIG_SECTION_HEALTH, "default repeat warning", "never");
 host->health.health_default_crit_repeat_every = config_get_duration(CONFIG_SECTION_HEALTH, "default repeat critical", "never");

@@ -803,7 +834,11 @@ static void initialize_health(RRDHOST *host)

 long n = config_get_number(CONFIG_SECTION_HEALTH, "in memory max health log entries", host->health_log.max);
 if(n < 10) {
-netdata_log_health("Host '%s': health configuration has invalid max log entries %ld. Using default %u", rrdhost_hostname(host), n, host->health_log.max);
+nd_log(NDLS_DAEMON, NDLP_WARNING,
+"Host '%s': health configuration has invalid max log entries %ld. "
+"Using default %u",
+rrdhost_hostname(host), n, host->health_log.max);
+
 config_set_number(CONFIG_SECTION_HEALTH, "in memory max health log entries", (long)host->health_log.max);
 }
 else
@ -811,7 +846,11 @@ static void initialize_health(RRDHOST *host)
|
|||
|
||||
uint32_t m = config_get_number(CONFIG_SECTION_HEALTH, "health log history", HEALTH_LOG_DEFAULT_HISTORY);
|
||||
if (m < HEALTH_LOG_MINIMUM_HISTORY) {
|
||||
netdata_log_health("Host '%s': health configuration has invalid health log history %u. Using minimum %d", rrdhost_hostname(host), m, HEALTH_LOG_MINIMUM_HISTORY);
|
||||
nd_log(NDLS_DAEMON, NDLP_WARNING,
|
||||
"Host '%s': health configuration has invalid health log history %u. "
|
||||
"Using minimum %d",
|
||||
rrdhost_hostname(host), m, HEALTH_LOG_MINIMUM_HISTORY);
|
||||
|
||||
config_set_number(CONFIG_SECTION_HEALTH, "health log history", HEALTH_LOG_MINIMUM_HISTORY);
|
||||
m = HEALTH_LOG_MINIMUM_HISTORY;
|
||||
}
|
||||
|
@ -823,7 +862,9 @@ static void initialize_health(RRDHOST *host)
|
|||
} else
|
||||
host->health_log.health_log_history = m;
|
||||
|
||||
netdata_log_health("[%s]: Health log history is set to %u seconds (%u days)", rrdhost_hostname(host), host->health_log.health_log_history, host->health_log.health_log_history / 86400);
|
||||
nd_log(NDLS_DAEMON, NDLP_DEBUG,
|
||||
"[%s]: Health log history is set to %u seconds (%u days)",
|
||||
rrdhost_hostname(host), host->health_log.health_log_history, host->health_log.health_log_history / 86400);
|
||||
|
||||
conf_enabled_alarms = simple_pattern_create(config_get(CONFIG_SECTION_HEALTH, "enabled alarms", "*"), NULL,
|
||||
SIMPLE_PATTERN_EXACT, true);
|
||||
|
@ -1049,7 +1090,7 @@ void *health_main(void *ptr) {
|
|||
if (unlikely(check_if_resumed_from_suspension())) {
|
||||
apply_hibernation_delay = 1;
|
||||
|
||||
netdata_log_health(
|
||||
nd_log(NDLS_DAEMON, NDLP_NOTICE,
|
||||
"Postponing alarm checks for %"PRId64" seconds, "
|
||||
"because it seems that the system was just resumed from suspension.",
|
||||
(int64_t)hibernation_delay);
|
||||
|
@ -1058,8 +1099,9 @@ void *health_main(void *ptr) {
|
|||
if (unlikely(silencers->all_alarms && silencers->stype == STYPE_DISABLE_ALARMS)) {
|
||||
static int logged=0;
|
||||
if (!logged) {
|
||||
netdata_log_health("Skipping health checks, because all alarms are disabled via a %s command.",
|
||||
HEALTH_CMDAPI_CMD_DISABLEALL);
|
||||
nd_log(NDLS_DAEMON, NDLP_DEBUG,
|
||||
"Skipping health checks, because all alarms are disabled via a %s command.",
|
||||
HEALTH_CMDAPI_CMD_DISABLEALL);
|
||||
logged = 1;
|
||||
}
|
||||
}
|
||||
|
@ -1081,7 +1123,7 @@ void *health_main(void *ptr) {
|
|||
rrdcalc_delete_alerts_not_matching_host_labels_from_this_host(host);
|
||||
|
||||
if (unlikely(apply_hibernation_delay)) {
|
||||
netdata_log_health(
|
||||
nd_log(NDLS_DAEMON, NDLP_DEBUG,
|
||||
"[%s]: Postponing health checks for %"PRId64" seconds.",
|
||||
rrdhost_hostname(host),
|
||||
(int64_t)hibernation_delay);
|
||||
|
@ -1094,20 +1136,30 @@ void *health_main(void *ptr) {
|
|||
continue;
|
||||
}
|
||||
|
||||
netdata_log_health("[%s]: Resuming health checks after delay.", rrdhost_hostname(host));
|
||||
nd_log(NDLS_DAEMON, NDLP_DEBUG,
|
||||
"[%s]: Resuming health checks after delay.",
|
||||
rrdhost_hostname(host));
|
||||
|
||||
host->health.health_delay_up_to = 0;
|
||||
}
|
||||
|
||||
// wait until cleanup of obsolete charts on children is complete
|
||||
if (host != localhost) {
|
||||
if (unlikely(host->trigger_chart_obsoletion_check == 1)) {
|
||||
netdata_log_health("[%s]: Waiting for chart obsoletion check.", rrdhost_hostname(host));
|
||||
|
||||
nd_log(NDLS_DAEMON, NDLP_DEBUG,
|
||||
"[%s]: Waiting for chart obsoletion check.",
|
||||
rrdhost_hostname(host));
|
||||
|
||||
continue;
|
||||
}
|
||||
}
|
||||
|
||||
if (!health_running_logged) {
|
||||
netdata_log_health("[%s]: Health is running.", rrdhost_hostname(host));
|
||||
nd_log(NDLS_DAEMON, NDLP_DEBUG,
|
||||
"[%s]: Health is running.",
|
||||
rrdhost_hostname(host));
|
||||
|
||||
health_running_logged = true;
|
||||
}
|
||||
|
||||
|
@ -1161,6 +1213,7 @@ void *health_main(void *ptr) {
|
|||
rrdcalc_isrepeating(rc)?HEALTH_ENTRY_FLAG_IS_REPEATING:0);
|
||||
|
||||
if (ae) {
|
||||
health_log_alert(host, ae);
|
||||
health_alarm_log_add_entry(host, ae);
|
||||
rc->old_status = rc->status;
|
||||
rc->status = RRDCALC_STATUS_REMOVED;
|
||||
|
@ -1432,9 +1485,13 @@ void *health_main(void *ptr) {
|
|||
)
|
||||
);
|
||||
|
||||
health_log_alert(host, ae);
|
||||
health_alarm_log_add_entry(host, ae);
|
||||
|
||||
netdata_log_health("[%s]: Alert event for [%s.%s], value [%s], status [%s].", rrdhost_hostname(host), ae_chart_id(ae), ae_name(ae), ae_new_value_string(ae), rrdcalc_status2string(ae->new_status));
|
||||
nd_log(NDLS_DAEMON, NDLP_DEBUG,
|
||||
"[%s]: Alert event for [%s.%s], value [%s], status [%s].",
|
||||
rrdhost_hostname(host), ae_chart_id(ae), ae_name(ae), ae_new_value_string(ae),
|
||||
rrdcalc_status2string(ae->new_status));
|
||||
|
||||
rc->last_status_change_value = rc->value;
|
||||
rc->last_status_change = now;
|
||||
|
@ -1519,6 +1576,7 @@ void *health_main(void *ptr) {
|
|||
)
|
||||
);
|
||||
|
||||
health_log_alert(host, ae);
|
||||
ae->last_repeat = rc->last_repeat;
|
||||
if (!(rc->run_flags & RRDCALC_FLAG_RUN_ONCE) && rc->status == RRDCALC_STATUS_CLEAR) {
|
||||
ae->flags |= HEALTH_ENTRY_RUN_ONCE;
|
||||
|
|
|
@@ -105,4 +105,7 @@ void sql_refresh_hashes(void);
void health_add_host_labels(void);
void health_string2json(BUFFER *wb, const char *prefix, const char *label, const char *value, const char *suffix);

void health_log_alert_transition_with_trace(RRDHOST *host, ALARM_ENTRY *ae, int line, const char *file, const char *function);
#define health_log_alert(host, ae) health_log_alert_transition_with_trace(host, ae, __LINE__, __FILE__, __FUNCTION__)

#endif //NETDATA_HEALTH_H

@@ -1368,7 +1368,10 @@ void health_readdir(RRDHOST *host, const char *user_path, const char *stock_path
                    CONFIG_BOOLEAN_YES);

    if (!stock_enabled) {
        netdata_log_health("[%s]: Netdata will not load stock alarms.", rrdhost_hostname(host));
        nd_log(NDLS_DAEMON, NDLP_DEBUG,
               "[%s]: Netdata will not load stock alarms.",
               rrdhost_hostname(host));

        stock_path = user_path;
    }

@@ -1376,6 +1379,10 @@ void health_readdir(RRDHOST *host, const char *user_path, const char *stock_path
    health_rrdvars = health_rrdvariables_create();

    recursive_config_double_dir_load(user_path, stock_path, subpath, health_readfile, (void *) host, 0);
    netdata_log_health("[%s]: Read health configuration.", rrdhost_hostname(host));

    nd_log(NDLS_DAEMON, NDLP_DEBUG,
           "[%s]: Read health configuration.",
           rrdhost_hostname(host));

    sql_store_hashes = 0;
}

@@ -8,6 +8,79 @@ inline void health_alarm_log_save(RRDHOST *host, ALARM_ENTRY *ae) {
    sql_health_alarm_log_save(host, ae);
}

void health_log_alert_transition_with_trace(RRDHOST *host, ALARM_ENTRY *ae, int line, const char *file, const char *function) {
    ND_LOG_STACK lgs[] = {
            ND_LOG_FIELD_UUID(NDF_MESSAGE_ID, &health_alert_transition_msgid),
            ND_LOG_FIELD_STR(NDF_NIDL_NODE, host->hostname),
            ND_LOG_FIELD_STR(NDF_NIDL_INSTANCE, ae->chart_name),
            ND_LOG_FIELD_STR(NDF_NIDL_CONTEXT, ae->chart_context),
            ND_LOG_FIELD_U64(NDF_ALERT_ID, ae->alarm_id),
            ND_LOG_FIELD_U64(NDF_ALERT_UNIQUE_ID, ae->unique_id),
            ND_LOG_FIELD_U64(NDF_ALERT_EVENT_ID, ae->alarm_event_id),
            ND_LOG_FIELD_UUID(NDF_ALERT_CONFIG_HASH, &ae->config_hash_id),
            ND_LOG_FIELD_UUID(NDF_ALERT_TRANSITION_ID, &ae->transition_id),
            ND_LOG_FIELD_STR(NDF_ALERT_NAME, ae->name),
            ND_LOG_FIELD_STR(NDF_ALERT_CLASS, ae->classification),
            ND_LOG_FIELD_STR(NDF_ALERT_COMPONENT, ae->component),
            ND_LOG_FIELD_STR(NDF_ALERT_TYPE, ae->type),
            ND_LOG_FIELD_STR(NDF_ALERT_EXEC, ae->exec),
            ND_LOG_FIELD_STR(NDF_ALERT_RECIPIENT, ae->recipient),
            ND_LOG_FIELD_STR(NDF_ALERT_SOURCE, ae->exec),
            ND_LOG_FIELD_STR(NDF_ALERT_UNITS, ae->units),
            ND_LOG_FIELD_STR(NDF_ALERT_SUMMARY, ae->summary),
            ND_LOG_FIELD_STR(NDF_ALERT_INFO, ae->info),
            ND_LOG_FIELD_DBL(NDF_ALERT_VALUE, ae->new_value),
            ND_LOG_FIELD_DBL(NDF_ALERT_VALUE_OLD, ae->old_value),
            ND_LOG_FIELD_TXT(NDF_ALERT_STATUS, rrdcalc_status2string(ae->new_status)),
            ND_LOG_FIELD_TXT(NDF_ALERT_STATUS_OLD, rrdcalc_status2string(ae->old_status)),
            ND_LOG_FIELD_I64(NDF_ALERT_DURATION, ae->duration),
            ND_LOG_FIELD_I64(NDF_RESPONSE_CODE, ae->exec_code),
            ND_LOG_FIELD_U64(NDF_ALERT_NOTIFICATION_REALTIME_USEC, ae->delay_up_to_timestamp * USEC_PER_SEC),
            ND_LOG_FIELD_END(),
    };
    ND_LOG_STACK_PUSH(lgs);

    errno = 0;

    ND_LOG_FIELD_PRIORITY priority = NDLP_INFO;

    switch(ae->new_status) {
        case RRDCALC_STATUS_UNDEFINED:
            if(ae->old_status >= RRDCALC_STATUS_CLEAR)
                priority = NDLP_NOTICE;
            else
                priority = NDLP_DEBUG;
            break;

        default:
        case RRDCALC_STATUS_UNINITIALIZED:
        case RRDCALC_STATUS_REMOVED:
            priority = NDLP_DEBUG;
            break;

        case RRDCALC_STATUS_CLEAR:
            priority = NDLP_INFO;
            break;

        case RRDCALC_STATUS_WARNING:
            if(ae->old_status < RRDCALC_STATUS_WARNING)
                priority = NDLP_WARNING;
            break;

        case RRDCALC_STATUS_CRITICAL:
            if(ae->old_status < RRDCALC_STATUS_CRITICAL)
                priority = NDLP_CRIT;
            break;
    }

    netdata_logger(NDLS_HEALTH, priority, file, function, line,
                   "ALERT '%s' of instance '%s' on node '%s', transitioned from %s to %s",
                   string2str(ae->name), string2str(ae->chart), string2str(host->hostname),
                   rrdcalc_status2string(ae->old_status), rrdcalc_status2string(ae->new_status)
    );
}

// ----------------------------------------------------------------------------
// health alarm log management

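The switch above encodes a simple escalation rule: a transition is logged loudly only when the alert got worse, and recoveries or de-escalations fall back to informational or debug level. A minimal standalone sketch of the same mapping, re-stated in shell (status names and syslog numbers as in this PR; the function names are illustrative, not part of netdata):

```shell
#!/usr/bin/env bash
# Hypothetical re-statement of the C switch above: map an alert
# transition (old status -> new status) to a syslog priority number.

# Rank statuses so "worse" is numerically comparable.
rank() {
  case "$1" in
    REMOVED) echo 0 ;; UNINITIALIZED) echo 1 ;; UNDEFINED) echo 2 ;;
    CLEAR) echo 3 ;; WARNING) echo 4 ;; CRITICAL) echo 5 ;;
  esac
}

transition_priority() {
  local old="$1" new="$2"
  case "$new" in
    UNDEFINED) [ "$(rank "$old")" -ge "$(rank CLEAR)" ] && echo 5 || echo 7 ;;     # notice / debug
    CLEAR)     echo 6 ;;                                                           # info
    WARNING)   [ "$(rank "$old")" -lt "$(rank WARNING)" ] && echo 4 || echo 6 ;;   # warning / info
    CRITICAL)  [ "$(rank "$old")" -lt "$(rank CRITICAL)" ] && echo 2 || echo 6 ;;  # crit / info
    *)         echo 7 ;;                                                           # debug
  esac
}

transition_priority CLEAR WARNING     # escalation     -> 4 (warning)
transition_priority CRITICAL WARNING  # de-escalation  -> 6 (info)
transition_priority WARNING CLEAR     # recovery       -> 6 (info)
```

The asymmetry is deliberate: only the crossing into a worse state carries the severe priority, so repeated or improving transitions do not spam high-severity journal entries.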
@@ -42,6 +42,8 @@
# -----------------------------------------------------------------------------
# testing notifications

cmd_line="'${0}' $(printf "'%s' " "${@}")"

if { [ "${1}" = "test" ] || [ "${2}" = "test" ]; } && [ "${#}" -le 2 ]; then
  if [ "${2}" = "test" ]; then
    recipient="${1}"

@@ -78,61 +80,139 @@ export PATH="${PATH}:/sbin:/usr/sbin:/usr/local/sbin"
export LC_ALL=C

# -----------------------------------------------------------------------------
# logging

PROGRAM_NAME="$(basename "${0}")"

LOG_LEVEL_ERR=1
LOG_LEVEL_WARN=2
LOG_LEVEL_INFO=3
LOG_LEVEL="$LOG_LEVEL_INFO"
# these should be the same with syslog() priorities
NDLP_EMERG=0   # system is unusable
NDLP_ALERT=1   # action must be taken immediately
NDLP_CRIT=2    # critical conditions
NDLP_ERR=3     # error conditions
NDLP_WARN=4    # warning conditions
NDLP_NOTICE=5  # normal but significant condition
NDLP_INFO=6    # informational
NDLP_DEBUG=7   # debug-level messages

set_log_severity_level() {
  case ${NETDATA_LOG_SEVERITY_LEVEL,,} in
    "info") LOG_LEVEL="$LOG_LEVEL_INFO";;
    "warn" | "warning") LOG_LEVEL="$LOG_LEVEL_WARN";;
    "err" | "error") LOG_LEVEL="$LOG_LEVEL_ERR";;
# the max (numerically) log level we will log
LOG_LEVEL=$NDLP_INFO

set_log_min_priority() {
  case "${NETDATA_LOG_PRIORITY_LEVEL,,}" in
    "emerg" | "emergency")
      LOG_LEVEL=$NDLP_EMERG
      ;;

    "alert")
      LOG_LEVEL=$NDLP_ALERT
      ;;

    "crit" | "critical")
      LOG_LEVEL=$NDLP_CRIT
      ;;

    "err" | "error")
      LOG_LEVEL=$NDLP_ERR
      ;;

    "warn" | "warning")
      LOG_LEVEL=$NDLP_WARN
      ;;

    "notice")
      LOG_LEVEL=$NDLP_NOTICE
      ;;

    "info")
      LOG_LEVEL=$NDLP_INFO
      ;;

    "debug")
      LOG_LEVEL=$NDLP_DEBUG
      ;;
  esac
}

set_log_severity_level

logdate() {
  date "+%Y-%m-%d %H:%M:%S"
}
set_log_min_priority

log() {
  local status="${1}"
  shift
  local level="${1}"
  shift 1

  echo >&2 "$(logdate): ${PROGRAM_NAME}: ${status}: ${*}"
  [[ -n "$level" && -n "$LOG_LEVEL" && "$level" -gt "$LOG_LEVEL" ]] && return

  systemd-cat-native --log-as-netdata --newline="{NEWLINE}" <<EOFLOG
INVOCATION_ID=${NETDATA_INVOCATION_ID}
SYSLOG_IDENTIFIER=${PROGRAM_NAME}
PRIORITY=${level}
THREAD_TAG="alarm-notify"
ND_LOG_SOURCE=health
ND_NIDL_NODE=${host}
ND_NIDL_INSTANCE=${chart}
ND_NIDL_CONTEXT=${context}
ND_ALERT_NAME=${name}
ND_ALERT_ID=${alarm_id}
ND_ALERT_UNIQUE_ID=${unique_id}
ND_ALERT_EVENT_ID=${alarm_event_id}
ND_ALERT_TRANSITION_ID=${transition_id//-/}
ND_ALERT_CLASS=${classification}
ND_ALERT_COMPONENT=${component}
ND_ALERT_TYPE=${type}
ND_ALERT_RECIPIENT=${roles}
ND_ALERT_VALUE=${value}
ND_ALERT_VALUE_OLD=${old_value}
ND_ALERT_STATUS=${status}
ND_ALERT_STATUS_OLD=${old_status}
ND_ALERT_UNITS=${units}
ND_ALERT_SUMMARY=${summary}
ND_ALERT_INFO=${info}
ND_ALERT_DURATION=${duration}
ND_REQUEST=${cmd_line}
MESSAGE_ID=6db0018e83e34320ae2a659d78019fb7
MESSAGE=[ALERT NOTIFICATION]: ${*//[$'\r\n']/{NEWLINE}}

EOFLOG
  # AN EMPTY LINE IS NEEDED ABOVE
}
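The heredoc above feeds systemd-cat-native one newline-delimited `FIELD=value` record terminated by an empty line. A minimal sketch of building such a record (field names taken from the heredoc above; the helper name and values are placeholders, and the output is printed rather than piped anywhere):

```shell
#!/usr/bin/env bash
# Sketch of the FIELD=value record shape that log() above pipes into
# systemd-cat-native. A record is one field per line, ended by an
# empty line (the "EMPTY LINE IS NEEDED ABOVE" in the script).
emit_record() {
  local identifier="$1" priority="$2" message="$3"
  printf 'SYSLOG_IDENTIFIER=%s\n' "$identifier"
  printf 'PRIORITY=%s\n' "$priority"
  printf 'MESSAGE=%s\n' "$message"
  printf '\n'   # record terminator
}

emit_record alarm-notify 6 "test alert notification"
```

In the real script the record additionally carries the `ND_ALERT_*` annotations, so a single journal query by `MESSAGE_ID` or `ND_ALERT_TRANSITION_ID` can recover the full alert context.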

info() {
  [[ -n "$LOG_LEVEL" && "$LOG_LEVEL_INFO" -gt "$LOG_LEVEL" ]] && return
  log INFO "${@}"
  log "$NDLP_INFO" "${@}"
}

warning() {
  [[ -n "$LOG_LEVEL" && "$LOG_LEVEL_WARN" -gt "$LOG_LEVEL" ]] && return
  log WARNING "${@}"
  log "$NDLP_WARN" "${@}"
}

error() {
  [[ -n "$LOG_LEVEL" && "$LOG_LEVEL_ERR" -gt "$LOG_LEVEL" ]] && return
  log ERROR "${@}"
  log "$NDLP_ERR" "${@}"
}

fatal() {
  log FATAL "${@}"
  log "$NDLP_ALERT" "${@}"
  exit 1
}

debug=${NETDATA_ALARM_NOTIFY_DEBUG-0}
debug() {
  [ "${debug}" = "1" ] && log DEBUG "${@}"
  log "$NDLP_DEBUG" "${@}"
}

debug=0
if [ "${NETDATA_ALARM_NOTIFY_DEBUG-0}" = "1" ]; then
  debug=1
  LOG_LEVEL=$NDLP_DEBUG
fi

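The wrappers above all funnel through log(), and the `[[ "$level" -gt "$LOG_LEVEL" ]]` guard decides what reaches the journal. A self-contained sketch of that gate, using the same syslog numbering where lower is more severe (the helper name is illustrative):

```shell
#!/usr/bin/env bash
# Standalone sketch of the priority gate used by log() above:
# a message passes only if its priority number is <= LOG_LEVEL
# (syslog numbering: 0 emerg ... 7 debug, lower = more severe).
LOG_LEVEL=6  # info

should_log() {
  local level="$1"
  [[ "$level" -gt "$LOG_LEVEL" ]] && return 1
  return 0
}

should_log 4 && echo "warning passes"     # 4 <= 6, passes
should_log 7 || echo "debug suppressed"   # 7 >  6, suppressed
```

This is why fatal() logs at NDLP_ALERT (1): it survives any configured minimum priority except `emerg`.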
# -----------------------------------------------------------------------------
# check for BASH v4+ (required for associative arrays)

if [ ${BASH_VERSINFO[0]} -lt 4 ]; then
  echo >&2 "BASH version 4 or later is required (this is ${BASH_VERSION})."
  exit 1
fi

# -----------------------------------------------------------------------------

docurl() {
  if [ -z "${curl}" ]; then
    error "${curl} is unset."

@@ -199,16 +279,9 @@ ntfy
# this is to be overwritten by the config file

custom_sender() {
  info "not sending custom notification for ${status} of '${host}.${chart}.${name}'"
  info "custom notification mechanism is not configured; not sending ${notification_description}"
}

# -----------------------------------------------------------------------------

# check for BASH v4+ (required for associative arrays)
if [ ${BASH_VERSINFO[0]} -lt 4 ]; then
  fatal "BASH version 4 or later is required (this is ${BASH_VERSION})."
fi

# -----------------------------------------------------------------------------
# defaults to allow running this script by hand

@@ -228,8 +301,8 @@ if [[ ${1} = "unittest" ]]; then
  status="${4}"       # the current status : REMOVED, UNINITIALIZED, UNDEFINED, CLEAR, WARNING, CRITICAL
  old_status="${5}"   # the previous status: REMOVED, UNINITIALIZED, UNDEFINED, CLEAR, WARNING, CRITICAL
elif [[ ${1} = "dump_methods" ]]; then
  dump_methods=1
  status="WARNING"
  dump_methods=1
  status="WARNING"
else
  roles="${1}"        # the roles that should be notified for this event
  args_host="${2}"    # the host generated this event

@@ -263,6 +336,9 @@ else
  child_machine_guid="${28}"  # the machine_guid of the child
  transition_id="${29}"       # the transition_id of the alert
  summary="${30}"             # the summary text field of the alert
  context="${31}"             # the context of the chart
  component="${32}"
  type="${33}"
fi

# -----------------------------------------------------------------------------
@@ -276,18 +352,20 @@ else
  host="${args_host}"
fi

notification_description="notification to '${roles}' for transition from ${old_status} to ${status}, of alert '${name}' = '${value_string}', of instance '${chart}', context '${context}' on host '${host}'"

# -----------------------------------------------------------------------------
# screen statuses we don't need to send a notification

# don't do anything if this is not WARNING, CRITICAL or CLEAR
if [ "${status}" != "WARNING" ] && [ "${status}" != "CRITICAL" ] && [ "${status}" != "CLEAR" ]; then
  info "not sending notification for ${status} of '${host}.${chart}.${name}'"
  debug "not sending ${notification_description}"
  exit 1
fi

# don't do anything if this is CLEAR, but it was not WARNING or CRITICAL
if [ "${clear_alarm_always}" != "YES" ] && [ "${old_status}" != "WARNING" ] && [ "${old_status}" != "CRITICAL" ] && [ "${status}" = "CLEAR" ]; then
  info "not sending notification for ${status} of '${host}.${chart}.${name}' (last status was ${old_status})"
  debug "not sending ${notification_description}"
  exit 1
fi

@@ -434,7 +512,7 @@ else
    debug "Loading config file '${CONFIG}'..."
    source "${CONFIG}" || error "Failed to load config file '${CONFIG}'."
  else
    warning "Cannot find file '${CONFIG}'."
    debug "Cannot find file '${CONFIG}'."
  fi
done
fi

@@ -598,7 +676,16 @@ filter_recipient_by_criticality() {
}

# -----------------------------------------------------------------------------
# verify the delivery methods supported
# check the configured targets

# check email
if [ "${SEND_EMAIL}" = "AUTO" ]; then
  if command -v curl >/dev/null 2>&1; then
    SEND_EMAIL="YES"
  else
    SEND_EMAIL="NO"
  fi
fi

# check slack
[ -z "${SLACK_WEBHOOK_URL}" ] && SEND_SLACK="NO"

@@ -677,112 +764,121 @@ filter_recipient_by_criticality() {
# check custom
[ -z "${DEFAULT_RECIPIENT_CUSTOM}" ] && SEND_CUSTOM="NO"

if [ "${SEND_PUSHOVER}" = "YES" ] ||
  [ "${SEND_SLACK}" = "YES" ] ||
  [ "${SEND_ROCKETCHAT}" = "YES" ] ||
  [ "${SEND_ALERTA}" = "YES" ] ||
  [ "${SEND_PD}" = "YES" ] ||
  [ "${SEND_FLOCK}" = "YES" ] ||
  [ "${SEND_DISCORD}" = "YES" ] ||
  [ "${SEND_HIPCHAT}" = "YES" ] ||
  [ "${SEND_TWILIO}" = "YES" ] ||
  [ "${SEND_MESSAGEBIRD}" = "YES" ] ||
  [ "${SEND_KAVENEGAR}" = "YES" ] ||
  [ "${SEND_TELEGRAM}" = "YES" ] ||
  [ "${SEND_PUSHBULLET}" = "YES" ] ||
  [ "${SEND_KAFKA}" = "YES" ] ||
  [ "${SEND_FLEEP}" = "YES" ] ||
  [ "${SEND_PROWL}" = "YES" ] ||
  [ "${SEND_MATRIX}" = "YES" ] ||
  [ "${SEND_CUSTOM}" = "YES" ] ||
  [ "${SEND_MSTEAMS}" = "YES" ] ||
  [ "${SEND_DYNATRACE}" = "YES" ] ||
  [ "${SEND_OPSGENIE}" = "YES" ] ||
  [ "${SEND_GOTIFY}" = "YES" ] ||
  [ "${SEND_NTFY}" = "YES" ]; then
  # if we need curl, check for the curl command
  if [ -z "${curl}" ]; then
    curl="$(command -v curl 2>/dev/null)"
  fi
  if [ -z "${curl}" ]; then
    error "Cannot find curl command in the system path. Disabling all curl based notifications."
    SEND_PUSHOVER="NO"
    SEND_PUSHBULLET="NO"
    SEND_TELEGRAM="NO"
    SEND_SLACK="NO"
    SEND_MSTEAMS="NO"
    SEND_ROCKETCHAT="NO"
    SEND_ALERTA="NO"
    SEND_PD="NO"
    SEND_FLOCK="NO"
    SEND_DISCORD="NO"
    SEND_TWILIO="NO"
    SEND_HIPCHAT="NO"
    SEND_MESSAGEBIRD="NO"
    SEND_KAVENEGAR="NO"
    SEND_KAFKA="NO"
    SEND_FLEEP="NO"
    SEND_PROWL="NO"
    SEND_MATRIX="NO"
    SEND_CUSTOM="NO"
    SEND_DYNATRACE="NO"
    SEND_OPSGENIE="NO"
    SEND_GOTIFY="NO"
    SEND_NTFY="NO"
  fi
fi
# -----------------------------------------------------------------------------
# check the availability of targets

if [ "${SEND_SMS}" = "YES" ]; then
  if [ -z "${sendsms}" ]; then
    sendsms="$(command -v sendsms 2>/dev/null)"
  fi
  if [ -z "${sendsms}" ]; then
    SEND_SMS="NO"
  fi
fi
# if we need sendmail, check for the sendmail command
if [ "${SEND_EMAIL}" = "YES" ] && [ -z "${sendmail}" ]; then
  sendmail="$(command -v sendmail 2>/dev/null)"
  if [ -z "${sendmail}" ]; then
    debug "Cannot find sendmail command in the system path. Disabling email notifications."
    SEND_EMAIL="NO"
  fi
fi
check_supported_targets() {
  local log=${1}
  shift

  # if we need logger, check for the logger command
  if [ "${SEND_SYSLOG}" = "YES" ] && [ -z "${logger}" ]; then
    logger="$(command -v logger 2>/dev/null)"
    if [ -z "${logger}" ]; then
      debug "Cannot find logger command in the system path. Disabling syslog notifications."
      SEND_SYSLOG="NO"
  if [ "${SEND_PUSHOVER}" = "YES" ] ||
    [ "${SEND_SLACK}" = "YES" ] ||
    [ "${SEND_ROCKETCHAT}" = "YES" ] ||
    [ "${SEND_ALERTA}" = "YES" ] ||
    [ "${SEND_PD}" = "YES" ] ||
    [ "${SEND_FLOCK}" = "YES" ] ||
    [ "${SEND_DISCORD}" = "YES" ] ||
    [ "${SEND_HIPCHAT}" = "YES" ] ||
    [ "${SEND_TWILIO}" = "YES" ] ||
    [ "${SEND_MESSAGEBIRD}" = "YES" ] ||
    [ "${SEND_KAVENEGAR}" = "YES" ] ||
    [ "${SEND_TELEGRAM}" = "YES" ] ||
    [ "${SEND_PUSHBULLET}" = "YES" ] ||
    [ "${SEND_KAFKA}" = "YES" ] ||
    [ "${SEND_FLEEP}" = "YES" ] ||
    [ "${SEND_PROWL}" = "YES" ] ||
    [ "${SEND_MATRIX}" = "YES" ] ||
    [ "${SEND_CUSTOM}" = "YES" ] ||
    [ "${SEND_MSTEAMS}" = "YES" ] ||
    [ "${SEND_DYNATRACE}" = "YES" ] ||
    [ "${SEND_OPSGENIE}" = "YES" ] ||
    [ "${SEND_GOTIFY}" = "YES" ] ||
    [ "${SEND_NTFY}" = "YES" ]; then
    # if we need curl, check for the curl command
    if [ -z "${curl}" ]; then
      curl="$(command -v curl 2>/dev/null)"
    fi
    if [ -z "${curl}" ]; then
      $log "Cannot find curl command in the system path. Disabling all curl based notifications."
      SEND_PUSHOVER="NO"
      SEND_PUSHBULLET="NO"
      SEND_TELEGRAM="NO"
      SEND_SLACK="NO"
      SEND_MSTEAMS="NO"
      SEND_ROCKETCHAT="NO"
      SEND_ALERTA="NO"
      SEND_PD="NO"
      SEND_FLOCK="NO"
      SEND_DISCORD="NO"
      SEND_TWILIO="NO"
      SEND_HIPCHAT="NO"
      SEND_MESSAGEBIRD="NO"
      SEND_KAVENEGAR="NO"
      SEND_KAFKA="NO"
      SEND_FLEEP="NO"
      SEND_PROWL="NO"
      SEND_MATRIX="NO"
      SEND_CUSTOM="NO"
      SEND_DYNATRACE="NO"
      SEND_OPSGENIE="NO"
      SEND_GOTIFY="NO"
      SEND_NTFY="NO"
    fi
  fi
fi

# if we need aws, check for the aws command
if [ "${SEND_AWSSNS}" = "YES" ] && [ -z "${aws}" ]; then
  aws="$(command -v aws 2>/dev/null)"
  if [ -z "${aws}" ]; then
    debug "Cannot find aws command in the system path. Disabling Amazon SNS notifications."
    SEND_AWSSNS="NO"
  if [ "${SEND_SMS}" = "YES" ]; then
    if [ -z "${sendsms}" ]; then
      sendsms="$(command -v sendsms 2>/dev/null)"
    fi
    if [ -z "${sendsms}" ]; then
      SEND_SMS="NO"
    fi
  fi
  # if we need sendmail, check for the sendmail command
  if [ "${SEND_EMAIL}" = "YES" ] && [ -z "${sendmail}" ]; then
    sendmail="$(command -v sendmail 2>/dev/null)"
    if [ -z "${sendmail}" ]; then
      $log "Cannot find sendmail command in the system path. Disabling email notifications."
      SEND_EMAIL="NO"
    fi
  fi
fi

# if we need nc, check for the nc command
if [ "${SEND_IRC}" = "YES" ] && [ -z "${nc}" ]; then
  nc="$(command -v nc 2>/dev/null)"
  if [ -z "${nc}" ]; then
    debug "Cannot find nc command in the system path. Disabling IRC notifications."
    SEND_IRC="NO"
  # if we need logger, check for the logger command
  if [ "${SEND_SYSLOG}" = "YES" ] && [ -z "${logger}" ]; then
    logger="$(command -v logger 2>/dev/null)"
    if [ -z "${logger}" ]; then
      $log "Cannot find logger command in the system path. Disabling syslog notifications."
      SEND_SYSLOG="NO"
    fi
  fi
fi

# if we need aws, check for the aws command
if [ "${SEND_AWSSNS}" = "YES" ] && [ -z "${aws}" ]; then
  aws="$(command -v aws 2>/dev/null)"
  if [ -z "${aws}" ]; then
    $log "Cannot find aws command in the system path. Disabling Amazon SNS notifications."
    SEND_AWSSNS="NO"
  fi
fi

# if we need nc, check for the nc command
if [ "${SEND_IRC}" = "YES" ] && [ -z "${nc}" ]; then
  nc="$(command -v nc 2>/dev/null)"
  if [ -z "${nc}" ]; then
    $log "Cannot find nc command in the system path. Disabling IRC notifications."
    SEND_IRC="NO"
  fi
fi
}

if [ ${dump_methods} ]; then
  check_supported_targets debug
  for name in "${!SEND_@}"; do
    if [ "${!name}" = "YES" ]; then
      echo "$name"
    fi
  done
  exit
  exit 0
fi
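The dump_methods loop above relies on two bash indirection idioms: `"${!SEND_@}"` expands to the *names* of all variables with the `SEND_` prefix, and `"${!name}"` then dereferences each name to its value. A self-contained illustration (the variable names here are examples, not the script's full method list):

```shell
#!/usr/bin/env bash
# "${!SEND_@}" expands to the NAMES of variables starting with SEND_;
# "${!name}" reads each variable's value through its name.
SEND_EMAIL="YES"
SEND_SLACK="NO"
SEND_NTFY="YES"

enabled_methods() {
  local name
  for name in "${!SEND_@}"; do
    [ "${!name}" = "YES" ] && echo "$name"
  done
  return 0
}

enabled_methods
```

This is why the script requires bash 4+: the prefix expansion plus the associative arrays used elsewhere are bash-only features.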

# -----------------------------------------------------------------------------

@@ -790,6 +886,7 @@ fi

# netdata may call us with multiple roles, and roles may have multiple but
# overlapping recipients - so, here we find the unique recipients.
have_to_send_something="NO"
for method_name in ${method_names}; do
  send_var="SEND_${method_name^^}"
  if [ "${!send_var}" = "NO" ]; then

@@ -819,7 +916,11 @@ for method_name in ${method_names}; do
  to_var="to_${method_name}"
  declare to_${method_name}="${!arr_var[*]}"

  [ -z "${!to_var}" ] && declare ${send_var}="NO"
  if [ -z "${!to_var}" ]; then
    declare ${send_var}="NO"
  else
    have_to_send_something="YES"
  fi
done
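The loop above also builds variable names at runtime: `declare to_${method_name}=...` creates a per-method recipient list, and `"${!to_var}"` reads it back by name. A minimal sketch of that pattern in isolation (names and values are illustrative):

```shell
#!/usr/bin/env bash
# Sketch of the computed-variable-name pattern above: write a variable
# whose name is assembled at runtime with declare, read it back with
# ${!...} indirection.
method_name="email"
recipients="sysadmin dba"

declare "to_${method_name}=${recipients}"   # creates $to_email
to_var="to_${method_name}"

echo "${!to_var}"
```

With one such variable per method, the script can then flip `SEND_<METHOD>` to `NO` whenever the corresponding `to_<method>` list ends up empty, which is exactly what feeds `have_to_send_something`.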

# -----------------------------------------------------------------------------

@@ -884,10 +985,18 @@ for method in "${SEND_EMAIL}" \
    break
  fi
done

if [ "$proceed" -eq 0 ]; then
  fatal "All notification methods are disabled. Not sending notification for host '${host}', chart '${chart}' to '${roles}' for '${name}' = '${value}' for status '${status}'."
if [ "${have_to_send_something}" = "NO" ]; then
  debug "All notification methods are disabled; not sending ${notification_description}."
  exit 0
else
  fatal "All notification methods are disabled; not sending ${notification_description}."
fi
fi

check_supported_targets error

# -----------------------------------------------------------------------------
# get the date the alarm happened

@@ -1023,10 +1132,10 @@ send_email() {
    ret=$?

    if [ ${ret} -eq 0 ]; then
      info "sent email notification for: ${host} ${chart}.${name} is ${status} to '${to_email}'"
      info "sent email to '${to_email}' for ${notification_description}"
      return 0
    else
      error "failed to send email notification for: ${host} ${chart}.${name} is ${status} to '${to_email}' with error code ${ret} (${cmd_output})."
      error "failed to send email to '${to_email}' for ${notification_description}, with error code ${ret} (${cmd_output})."
      return 1
    fi
  fi

@@ -1065,10 +1174,10 @@ send_pushover() {
      https://api.pushover.net/1/messages.json)

    if [ "${httpcode}" = "200" ]; then
      info "sent pushover notification for: ${host} ${chart}.${name} is ${status} to '${user}'"
      info "sent pushover notification to '${user}' for ${notification_description}"
      sent=$((sent + 1))
    else
      error "failed to send pushover notification for: ${host} ${chart}.${name} is ${status} to '${user}' with HTTP response status code ${httpcode}."
      error "failed to send pushover notification to '${user}' for ${notification_description}, with HTTP response status code ${httpcode}."
    fi
  done

@@ -1112,10 +1221,10 @@ EOF
    ) "https://api.pushbullet.com/v2/pushes" -X POST)

    if [ "${httpcode}" = "200" ]; then
      info "sent pushbullet notification for: ${host} ${chart}.${name} is ${status} to '${userOrChannelTag}'"
      info "sent pushbullet notification to '${userOrChannelTag}' for ${notification_description}"
      sent=$((sent + 1))
    else
      error "failed to send pushbullet notification for: ${host} ${chart}.${name} is ${status} to '${userOrChannelTag}' with HTTP response status code ${httpcode}."
      error "failed to send pushbullet notification to '${userOrChannelTag}' for ${notification_description}, with HTTP response status code ${httpcode}."
    fi
  done

@@ -1136,10 +1245,10 @@ send_kafka() {
    "${KAFKA_URL}")

  if [ "${httpcode}" = "204" ]; then
    info "sent kafka data for: ${host} ${chart}.${name} is ${status} and ip '${KAFKA_SENDER_IP}'"
    info "sent kafka data to '${KAFKA_SENDER_IP}' for ${notification_description}"
    sent=$((sent + 1))
  else
    error "failed to send kafka data for: ${host} ${chart}.${name} is ${status} and ip '${KAFKA_SENDER_IP}' with HTTP response status code ${httpcode}."
    error "failed to send kafka data to '${KAFKA_SENDER_IP}' for ${notification_description}, with HTTP response status code ${httpcode}."
  fi

  [ ${sent} -gt 0 ] && return 0

@@ -1237,10 +1346,10 @@ EOF
    fi
    httpcode=$(docurl -X POST --data "${payload}" ${url})
    if [ "${httpcode}" = "${response_code}" ]; then
      info "sent pagerduty notification for: ${host} ${chart}.${name} is ${status}'"
      info "sent pagerduty event for ${notification_description}"
      sent=$((sent + 1))
    else
      error "failed to send pagerduty notification for: ${host} ${chart}.${name} is ${status}, with HTTP response status code ${httpcode}."
      error "failed to send pagerduty event for ${notification_description}, with HTTP response status code ${httpcode}."
    fi
  done

@@ -1266,10 +1375,10 @@ send_twilio() {
      "https://api.twilio.com/2010-04-01/Accounts/${accountsid}/Messages.json")

    if [ "${httpcode}" = "201" ]; then
      info "sent Twilio SMS for: ${host} ${chart}.${name} is ${status} to '${user}'"
      info "sent Twilio SMS to '${user}' for ${notification_description}"
      sent=$((sent + 1))
    else
      error "failed to send Twilio SMS for: ${host} ${chart}.${name} is ${status} to '${user}' with HTTP response status code ${httpcode}."
      error "failed to send Twilio SMS to '${user}' for ${notification_description}, with HTTP response status code ${httpcode}."
    fi
  done

@@ -1315,10 +1424,10 @@ send_hipchat() {
      "https://${HIPCHAT_SERVER}/v2/room/${room}/notification")

    if [ "${httpcode}" = "204" ]; then
      info "sent HipChat notification for: ${host} ${chart}.${name} is ${status} to '${room}'"
      info "sent HipChat notification to '${room}' for ${notification_description}"
      sent=$((sent + 1))
    else
      error "failed to send HipChat notification for: ${host} ${chart}.${name} is ${status} to '${room}' with HTTP response status code ${httpcode}."
      error "failed to send HipChat notification to '${room}' for ${notification_description}, with HTTP response status code ${httpcode}."
    fi
  done

@@ -1345,10 +1454,10 @@ send_messagebird() {
      "https://rest.messagebird.com/messages")

    if [ "${httpcode}" = "201" ]; then
      info "sent Messagebird SMS for: ${host} ${chart}.${name} is ${status} to '${user}'"
      info "sent Messagebird SMS to '${user}' for ${notification_description}"
      sent=$((sent + 1))
    else
      error "failed to send Messagebird SMS for: ${host} ${chart}.${name} is ${status} to '${user}' with HTTP response status code ${httpcode}."
      error "failed to send Messagebird SMS to '${user}' for ${notification_description}, with HTTP response status code ${httpcode}."
    fi
  done

@@ -1372,10 +1481,10 @@ send_kavenegar() {
      --data-urlencode "message=${title} ${message}")

    if [ "${httpcode}" = "200" ]; then
      info "sent Kavenegar SMS for: ${host} ${chart}.${name} is ${status} to '${user}'"
      info "sent Kavenegar SMS to '${user}' for ${notification_description}"
|
||||
sent=$((sent + 1))
|
||||
else
|
||||
error "failed to send Kavenegar SMS for: ${host} ${chart}.${name} is ${status} to '${user}' with HTTP response status code ${httpcode}."
|
||||
error "failed to send Kavenegar SMS to '${user}' for ${notification_description}, with HTTP response status code ${httpcode}."
|
||||
fi
|
||||
done
|
||||
|
||||
|
@ -1416,21 +1525,21 @@ send_telegram() {
|
|||
notify_telegram=0
|
||||
|
||||
if [ "${httpcode}" = "200" ]; then
|
||||
info "sent telegram notification for: ${host} ${chart}.${name} is ${status} to '${chatid}'"
|
||||
info "sent telegram notification to '${chatid}' for ${notification_description}"
|
||||
sent=$((sent + 1))
|
||||
elif [ "${httpcode}" = "401" ]; then
|
||||
error "failed to send telegram notification for: ${host} ${chart}.${name} is ${status} to '${chatid}': Wrong bot token."
|
||||
error "failed to send telegram notification to '${chatid}' for ${notification_description}, wrong bot token."
|
||||
elif [ "${httpcode}" = "429" ]; then
|
||||
if [ "$notify_retries" -gt 0 ]; then
|
||||
error "failed to send telegram notification for: ${host} ${chart}.${name} is ${status} to '${chatid}': rate limit exceeded, retrying after 1s."
|
||||
error "failed to send telegram notification to '${chatid}' for ${notification_description}, rate limit exceeded, retrying after 1s."
|
||||
notify_retries=$((notify_retries - 1))
|
||||
notify_telegram=1
|
||||
sleep 1
|
||||
else
|
||||
error "failed to send telegram notification for: ${host} ${chart}.${name} is ${status} to '${chatid}': rate limit exceeded."
|
||||
error "failed to send telegram notification to '${chatid}' for ${notification_description}, rate limit exceeded."
|
||||
fi
|
||||
else
|
||||
error "failed to send telegram notification for: ${host} ${chart}.${name} is ${status} to '${chatid}' with HTTP response status code ${httpcode}."
|
||||
error "failed to send telegram notification to '${chatid}' for ${notification_description}, with HTTP response status code ${httpcode}."
|
||||
fi
|
||||
done
|
||||
done
|
||||
|
@ -1487,10 +1596,10 @@ EOF
|
|||
httpcode=$(docurl -H "Content-Type: application/json" -d "${payload}" "${cur_webhook}")
|
||||
|
||||
if [ "${httpcode}" = "200" ]; then
|
||||
info "sent Microsoft team notification for: ${host} ${chart}.${name} is ${status} to '${cur_webhook}'"
|
||||
info "sent Microsoft team notification to '${cur_webhook}' for ${notification_description}"
|
||||
sent=$((sent + 1))
|
||||
else
|
||||
error "failed to send Microsoft team notification for: ${host} ${chart}.${name} is ${status} to '${cur_webhook}', with HTTP response status code ${httpcode}."
|
||||
error "failed to send Microsoft team to '${cur_webhook}' for ${notification_description}, with HTTP response status code ${httpcode}."
|
||||
fi
|
||||
done
|
||||
|
||||
|
@ -1558,10 +1667,10 @@ EOF
|
|||
|
||||
httpcode=$(docurl -X POST --data-urlencode "payload=${payload}" "${webhook}")
|
||||
if [ "${httpcode}" = "200" ]; then
|
||||
info "sent slack notification for: ${host} ${chart}.${name} is ${status} ${chstr}"
|
||||
info "sent slack notification ${chstr} for ${notification_description}"
|
||||
sent=$((sent + 1))
|
||||
else
|
||||
error "failed to send slack notification for: ${host} ${chart}.${name} is ${status} ${chstr}, with HTTP response status code ${httpcode}."
|
||||
error "failed to send slack notification ${chstr} for ${notification_description}, with HTTP response status code ${httpcode}."
|
||||
fi
|
||||
done
|
||||
|
||||
|
@ -1616,10 +1725,10 @@ EOF
|
|||
|
||||
httpcode=$(docurl -X POST --data-urlencode "payload=${payload}" "${webhook}")
|
||||
if [ "${httpcode}" = "200" ]; then
|
||||
info "sent rocketchat notification for: ${host} ${chart}.${name} is ${status} to '${channel}'"
|
||||
info "sent rocketchat notification to '${channel}' for ${notification_description}"
|
||||
sent=$((sent + 1))
|
||||
else
|
||||
error "failed to send rocketchat notification for: ${host} ${chart}.${name} is ${status} to '${channel}', with HTTP response status code ${httpcode}."
|
||||
error "failed to send rocketchat notification to '${channel}' for ${notification_description}, with HTTP response status code ${httpcode}."
|
||||
fi
|
||||
done
|
||||
|
||||
|
@ -1685,12 +1794,12 @@ EOF
|
|||
httpcode=$(docurl -X POST "${webhook}/alert" -H "Content-Type: application/json" -H "Authorization: $auth" --data "${payload}")
|
||||
|
||||
if [ "${httpcode}" = "200" ] || [ "${httpcode}" = "201" ]; then
|
||||
info "sent alerta notification for: ${host} ${chart}.${name} is ${status} to '${channel}'"
|
||||
info "sent alerta notification to '${channel}' for ${notification_description}"
|
||||
sent=$((sent + 1))
|
||||
elif [ "${httpcode}" = "202" ]; then
|
||||
info "suppressed alerta notification for: ${host} ${chart}.${name} is ${status} to '${channel}'"
|
||||
info "suppressed alerta notification to '${channel}' for ${notification_description}"
|
||||
else
|
||||
error "failed to send alerta notification for: ${host} ${chart}.${name} is ${status} to '${channel}', with HTTP response status code ${httpcode}."
|
||||
error "failed to send alerta notification to '${channel}' for ${notification_description}, with HTTP response status code ${httpcode}."
|
||||
fi
|
||||
done
|
||||
|
||||
|
@ -1740,10 +1849,10 @@ send_flock() {
|
|||
]
|
||||
}")
|
||||
if [ "${httpcode}" = "200" ]; then
|
||||
info "sent flock notification for: ${host} ${chart}.${name} is ${status} to '${channel}'"
|
||||
info "sent flock notification to '${channel}' for ${notification_description}"
|
||||
sent=$((sent + 1))
|
||||
else
|
||||
error "failed to send flock notification for: ${host} ${chart}.${name} is ${status} to '${channel}', with HTTP response status code ${httpcode}."
|
||||
error "failed to send flock notification to '${channel}' for ${notification_description}, with HTTP response status code ${httpcode}."
|
||||
fi
|
||||
done
|
||||
|
||||
|
@ -1801,10 +1910,10 @@ EOF
|
|||
|
||||
httpcode=$(docurl -X POST --data-urlencode "payload=${payload}" "${webhook}")
|
||||
if [ "${httpcode}" = "200" ]; then
|
||||
info "sent discord notification for: ${host} ${chart}.${name} is ${status} to '${channel}'"
|
||||
info "sent discord notification to '${channel}' for ${notification_description}"
|
||||
sent=$((sent + 1))
|
||||
else
|
||||
error "failed to send discord notification for: ${host} ${chart}.${name} is ${status} to '${channel}', with HTTP response status code ${httpcode}."
|
||||
error "failed to send discord notification to '${channel}' for ${notification_description}, with HTTP response status code ${httpcode}."
|
||||
fi
|
||||
done
|
||||
|
||||
|
@ -1830,10 +1939,10 @@ send_fleep() {
|
|||
httpcode=$(docurl -X POST --data "${data}" "https://fleep.io/hook/${hook}")
|
||||
|
||||
if [ "${httpcode}" = "200" ]; then
|
||||
info "sent fleep data for: ${host} ${chart}.${name} is ${status} and user '${FLEEP_SENDER}'"
|
||||
info "sent fleep data to user '${FLEEP_SENDER}' for ${notification_description}"
|
||||
sent=$((sent + 1))
|
||||
else
|
||||
error "failed to send fleep data for: ${host} ${chart}.${name} is ${status} and user '${FLEEP_SENDER}' with HTTP response status code ${httpcode}."
|
||||
error "failed to send fleep data to user '${FLEEP_SENDER}' for ${notification_description}, with HTTP response status code ${httpcode}."
|
||||
fi
|
||||
done
|
||||
|
||||
|
@ -1875,10 +1984,10 @@ send_prowl() {
|
|||
httpcode=$(docurl -X POST --data "${data}" "https://api.prowlapp.com/publicapi/add")
|
||||
|
||||
if [ "${httpcode}" = "200" ]; then
|
||||
info "sent prowl data for: ${host} ${chart}.${name} is ${status}"
|
||||
info "sent prowl event for ${notification_description}"
|
||||
sent=1
|
||||
else
|
||||
error "failed to send prowl data for: ${host} ${chart}.${name} is ${status} with with error code ${httpcode}."
|
||||
error "failed to send prowl event for ${notification_description}, with HTTP response status code ${httpcode}."
|
||||
fi
|
||||
|
||||
[ ${sent} -gt 0 ] && return 0
|
||||
|
@ -1914,10 +2023,10 @@ send_irc() {
|
|||
done
|
||||
|
||||
if [ "${error}" -eq 0 ]; then
|
||||
info "sent irc notification for: ${host} ${chart}.${name} is ${status} to '${CHANNEL}'"
|
||||
info "sent irc notification to '${CHANNEL}' for ${notification_description}"
|
||||
sent=$((sent + 1))
|
||||
else
|
||||
error "failed to send irc notification for: ${host} ${chart}.${name} is ${status} to '${CHANNEL}', with error code ${code}."
|
||||
error "failed to send irc notification to '${CHANNEL}' for ${notification_description}, with error code ${code}."
|
||||
fi
|
||||
done
|
||||
fi
|
||||
|
@ -1942,10 +2051,10 @@ send_awssns() {
|
|||
# Extract the region from the target ARN. We need to explicitly specify the region so that it matches up correctly.
|
||||
region="$(echo ${target} | cut -f 4 -d ':')"
|
||||
if ${aws} sns publish --region "${region}" --subject "${host} ${status_message} - ${name//_/ } - ${chart}" --message "${message}" --target-arn ${target} &>/dev/null; then
|
||||
info "sent Amazon SNS notification for: ${host} ${chart}.${name} is ${status} to '${target}'"
|
||||
info "sent Amazon SNS notification to '${target}' for ${notification_description}"
|
||||
sent=$((sent + 1))
|
||||
else
|
||||
error "failed to send Amazon SNS notification for: ${host} ${chart}.${name} is ${status} to '${target}'"
|
||||
error "failed to send Amazon SNS notification to '${target}' for ${notification_description}"
|
||||
fi
|
||||
done
|
||||
|
||||
|
@ -1987,10 +2096,10 @@ EOF
|
|||
|
||||
httpcode=$(docurl -X POST --data "${payload}" "${webhook}")
|
||||
if [ "${httpcode}" == "200" ]; then
|
||||
info "sent Matrix notification for: ${host} ${chart}.${name} is ${status} to '${room}'"
|
||||
info "sent Matrix notification to '${room}' for ${notification_description}"
|
||||
sent=$((sent + 1))
|
||||
else
|
||||
error "failed to send Matrix notification for: ${host} ${chart}.${name} is ${status} to '${room}', with HTTP response status code ${httpcode}."
|
||||
error "failed to send Matrix notification to '${room}' for ${notification_description}, with HTTP response status code ${httpcode}."
|
||||
fi
|
||||
done
|
||||
|
||||
|
@ -2089,10 +2198,10 @@ send_sms() {
|
|||
errmessage=$($sendsms $phone "$msg" 2>&1)
|
||||
errcode=$?
|
||||
if [ ${errcode} -eq 0 ]; then
|
||||
info "sent smstools3 SMS for: ${host} ${chart}.${name} is ${status} to '${user}'"
|
||||
info "sent smstools3 SMS to '${user}' for ${notification_description}"
|
||||
sent=$((sent + 1))
|
||||
else
|
||||
error "failed to send smstools3 SMS for: ${host} ${chart}.${name} is ${status} to '${user}' with error code ${errcode}: ${errmessage}."
|
||||
error "failed to send smstools3 SMS to '${user}' for ${notification_description}, with error code ${errcode}: ${errmessage}."
|
||||
fi
|
||||
done
|
||||
|
||||
|
@ -2139,14 +2248,14 @@ EOF
|
|||
|
||||
if [ ${ret} -eq 0 ]; then
|
||||
if [ "${httpcode}" = "200" ]; then
|
||||
info "sent ${DYNATRACE_EVENT} to ${DYNATRACE_SERVER}"
|
||||
info "sent Dynatrace event '${DYNATRACE_EVENT}' to '${DYNATRACE_SERVER}' for ${notification_description}"
|
||||
return 0
|
||||
else
|
||||
warning "Dynatrace ${DYNATRACE_SERVER} responded ${httpcode} notification for: ${host} ${chart}.${name} is ${status} was not sent!"
|
||||
warning "failed to send Dynatrace event to '${DYNATRACE_SERVER}' for ${notification_description}, with HTTP response status code ${httpcode}"
|
||||
return 1
|
||||
fi
|
||||
else
|
||||
error "failed to sent ${DYNATRACE_EVENT} notification for: ${host} ${chart}.${name} is ${status} to ${DYNATRACE_SERVER} with error code ${ret}."
|
||||
error "failed to sent Dynatrace '${DYNATRACE_EVENT}' to '${DYNATRACE_SERVER}' for ${notification_description}, with code ${ret}."
|
||||
return 1
|
||||
fi
|
||||
}
|
||||
|
@ -2204,9 +2313,9 @@ EOF
|
|||
httpcode=$(docurl -X POST -H "Content-Type: application/json" -d "${payload}" "${OPSGENIE_API_URL}/v1/json/integrations/webhooks/netdata?apiKey=${OPSGENIE_API_KEY}")
|
||||
# https://docs.opsgenie.com/docs/alert-api#create-alert
|
||||
if [ "${httpcode}" = "200" ]; then
|
||||
info "sent opsgenie notification for: ${host} ${chart}.${name} is ${status}"
|
||||
info "sent opsgenie event for ${notification_description}"
|
||||
else
|
||||
error "failed to send opsgenie notification for: ${host} ${chart}.${name} is ${status}, with HTTP error code ${httpcode}."
|
||||
error "failed to send opsgenie event for ${notification_description}, with HTTP response status code ${httpcode}."
|
||||
return 1
|
||||
fi
|
||||
|
||||
|
@ -2243,9 +2352,9 @@ EOF
|
|||
|
||||
httpcode=$(docurl -X POST -H "Content-Type: application/json" -d "${payload}" "${GOTIFY_APP_URL}/message?token=${GOTIFY_APP_TOKEN}")
|
||||
if [ "${httpcode}" = "200" ]; then
|
||||
info "sent gotify notification for: ${host} ${chart}.${name} is ${status}"
|
||||
info "sent gotify event for ${notification_description}"
|
||||
else
|
||||
error "failed to send gotify notification for: ${host} ${chart}.${name} is ${status}, with HTTP error code ${httpcode}."
|
||||
error "failed to send gotify event for ${notification_description}, with HTTP response status code ${httpcode}."
|
||||
return 1
|
||||
fi
|
||||
|
||||
|
@ -2298,10 +2407,10 @@ send_ntfy() {
|
|||
-d "${msg}" \
|
||||
${recipient})
|
||||
if [ "${httpcode}" == "200" ]; then
|
||||
info "sent ntfy notification for: ${host} ${chart}.${name} is ${status} to '${recipient}'"
|
||||
info "sent ntfy notification to '${recipient}' for ${notification_description}"
|
||||
sent=$((sent + 1))
|
||||
else
|
||||
error "failed to send ntfy notification for: ${host} ${chart}.${name} is ${status} to '${recipient}', with HTTP response status code ${httpcode}."
|
||||
error "failed to send ntfy notification to '${recipient}' for ${notification_description}, with HTTP response status code ${httpcode}."
|
||||
fi
|
||||
done
|
||||
|
||||
|
|
|
@@ -211,8 +211,8 @@ sendsms=""
 # EMAIL_SENDER="\"User Name\" <user@domain>"
 EMAIL_SENDER=""

-# enable/disable sending emails
-SEND_EMAIL="YES"
+# enable/disable sending emails, set this YES, or NO, AUTO to enable/disable based on sendmail availability
+SEND_EMAIL="AUTO"

 # if a role recipient is not configured, an email will be send to:
 DEFAULT_RECIPIENT_EMAIL="root"
@@ -8,9 +8,11 @@ SUBDIRS = \
 	aral \
 	avl \
 	buffer \
+	buffered_reader \
 	clocks \
 	completion \
 	config \
+	datetime \
 	dictionary \
 	ebpf \
 	eval \
@@ -19,6 +21,7 @@ SUBDIRS = \
 	json \
 	july \
 	health \
+	line_splitter \
 	locks \
 	log \
 	onewayalloc \
@@ -31,6 +34,7 @@ SUBDIRS = \
 	string \
 	threads \
 	url \
+	uuid \
 	worker_utilization \
 	tests \
 	$(NULL)
@@ -81,6 +81,7 @@ void buffer_snprintf(BUFFER *wb, size_t len, const char *fmt, ...)

 	va_list args;
 	va_start(args, fmt);
+	// vsnprintfz() returns the number of bytes actually written - after possible truncation
 	wb->len += vsnprintfz(&wb->buffer[wb->len], len, fmt, args);
 	va_end(args);

@@ -89,53 +90,39 @@ void buffer_snprintf(BUFFER *wb, size_t len, const char *fmt, ...)
 	// the buffer is \0 terminated by vsnprintfz
 }

-void buffer_vsprintf(BUFFER *wb, const char *fmt, va_list args)
-{
+inline void buffer_vsprintf(BUFFER *wb, const char *fmt, va_list args) {
 	if(unlikely(!fmt || !*fmt)) return;

-	size_t wrote = 0, need = 2, space_remaining = 0;
+	size_t full_size_bytes = 0, need = 2, space_remaining = 0;

 	do {
-		need += space_remaining * 2;
+		need += full_size_bytes + 2;

-		netdata_log_debug(D_WEB_BUFFER, "web_buffer_sprintf(): increasing web_buffer at position %zu, size = %zu, by %zu bytes (wrote = %zu)\n", wb->len, wb->size, need, wrote);
 		buffer_need_bytes(wb, need);

 		space_remaining = wb->size - wb->len - 1;

-		wrote = (size_t) vsnprintfz(&wb->buffer[wb->len], space_remaining, fmt, args);
+		// Use the copy of va_list for vsnprintf
+		va_list args_copy;
+		va_copy(args_copy, args);
+		// vsnprintf() returns the number of bytes required, even if bigger than the buffer provided
+		full_size_bytes = (size_t) vsnprintf(&wb->buffer[wb->len], space_remaining, fmt, args_copy);
+		va_end(args_copy);

-	} while(wrote >= space_remaining);
+	} while(full_size_bytes >= space_remaining);

-	wb->len += wrote;
+	wb->len += full_size_bytes;

-	// the buffer is \0 terminated by vsnprintf
+	wb->buffer[wb->len] = '\0';
 	buffer_overflow_check(wb);
 }

 void buffer_sprintf(BUFFER *wb, const char *fmt, ...)
 {
 	if(unlikely(!fmt || !*fmt)) return;

 	va_list args;
-	size_t wrote = 0, need = 2, space_remaining = 0;
-
-	do {
-		need += space_remaining * 2;
-
-		netdata_log_debug(D_WEB_BUFFER, "web_buffer_sprintf(): increasing web_buffer at position %zu, size = %zu, by %zu bytes (wrote = %zu)\n", wb->len, wb->size, need, wrote);
-		buffer_need_bytes(wb, need);
-
-		space_remaining = wb->size - wb->len - 1;
-
-		va_start(args, fmt);
-		wrote = (size_t) vsnprintfz(&wb->buffer[wb->len], space_remaining, fmt, args);
-		va_end(args);
-
-	} while(wrote >= space_remaining);
-
-	wb->len += wrote;
-
-	// the buffer is \0 terminated by vsnprintf
+	va_start(args, fmt);
+	buffer_vsprintf(wb, fmt, args);
+	va_end(args);
 }

 // generate a javascript date, the fastest possible way...
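The rewritten buffer_vsprintf() above retries with a larger buffer, and takes a fresh va_copy() of the caller's va_list on every attempt — vsnprintf() consumes the list, so reusing it across retries is undefined behavior (this is the bug the hunk fixes). A minimal standalone sketch of the same grow-and-retry pattern; the names `vformat`/`format` are illustrative, not netdata's:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Grow-and-retry formatting: each attempt must format from a fresh
 * va_copy(), because vsnprintf() consumes the va_list it is given. */
static char *vformat(const char *fmt, va_list args) {
    size_t size = 8;                  /* deliberately small to force a retry */
    char *buf = malloc(size);
    if (!buf) return NULL;

    for (;;) {
        va_list args_copy;
        va_copy(args_copy, args);     /* fresh copy per attempt */
        /* vsnprintf() returns the bytes required, even when truncated */
        int needed = vsnprintf(buf, size, fmt, args_copy);
        va_end(args_copy);

        if (needed < 0) { free(buf); return NULL; }
        if ((size_t)needed < size) return buf;      /* it fit */

        size = (size_t)needed + 1;                  /* grow and retry */
        char *tmp = realloc(buf, size);
        if (!tmp) { free(buf); return NULL; }
        buf = tmp;
    }
}

static char *format(const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);
    char *s = vformat(fmt, args);
    va_end(args);
    return s;
}
```

Without the va_copy(), the second pass through the loop would read already-consumed variadic arguments, which is exactly why the old single-va_list retry loop was unsafe.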
@@ -94,6 +94,8 @@ typedef struct web_buffer {
 	} json;
 } BUFFER;

+#define CLEAN_BUFFER _cleanup_(buffer_freep) BUFFER
+
 #define buffer_cacheable(wb) do { (wb)->options |= WB_CONTENT_CACHEABLE; if((wb)->options & WB_CONTENT_NO_CACHEABLE) (wb)->options &= ~WB_CONTENT_NO_CACHEABLE; } while(0)
 #define buffer_no_cacheable(wb) do { (wb)->options |= WB_CONTENT_NO_CACHEABLE; if((wb)->options & WB_CONTENT_CACHEABLE) (wb)->options &= ~WB_CONTENT_CACHEABLE; (wb)->expires = 0; } while(0)

@@ -135,6 +137,10 @@ BUFFER *buffer_create(size_t size, size_t *statistics);
 void buffer_free(BUFFER *b);
 void buffer_increase(BUFFER *b, size_t free_size_required);

+static inline void buffer_freep(BUFFER **bp) {
+	if(bp) buffer_free(*bp);
+}
+
 void buffer_snprintf(BUFFER *wb, size_t len, const char *fmt, ...) PRINTFLIKE(3, 4);
 void buffer_vsprintf(BUFFER *wb, const char *fmt, va_list args);
 void buffer_sprintf(BUFFER *wb, const char *fmt, ...) PRINTFLIKE(2,3);

@@ -210,6 +216,13 @@ static inline void buffer_fast_rawcat(BUFFER *wb, const char *txt, size_t len) {
 	buffer_overflow_check(wb);
 }

+static inline void buffer_putc(BUFFER *wb, char c) {
+	buffer_need_bytes(wb, 2);
+	wb->buffer[wb->len++] = c;
+	wb->buffer[wb->len] = '\0';
+	buffer_overflow_check(wb);
+}
+
 static inline void buffer_fast_strcat(BUFFER *wb, const char *txt, size_t len) {
 	if(unlikely(!txt || !*txt || !len)) return;

@@ -283,6 +296,19 @@ static inline void buffer_strncat(BUFFER *wb, const char *txt, size_t len) {
 	buffer_overflow_check(wb);
 }

+static inline void buffer_memcat(BUFFER *wb, const void *mem, size_t bytes) {
+	if(unlikely(!mem)) return;
+
+	buffer_need_bytes(wb, bytes + 1);
+
+	memcpy(&wb->buffer[wb->len], mem, bytes);
+
+	wb->len += bytes;
+	wb->buffer[wb->len] = '\0';
+
+	buffer_overflow_check(wb);
+}
+
 static inline void buffer_json_strcat(BUFFER *wb, const char *txt) {
 	if(unlikely(!txt || !*txt)) return;
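The new buffer_putc() and buffer_memcat() helpers above all follow the same append discipline: ensure capacity first, copy, advance `len`, and keep a `'\0'` terminator after the payload so the buffer is always usable as a C string. A self-contained sketch of that discipline on a hypothetical `struct sbuf` (not netdata's BUFFER):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Minimal growable string buffer illustrating the append discipline:
 * ensure capacity, memcpy, advance len, keep a trailing '\0'. */
struct sbuf { char *data; size_t len; size_t size; };

/* Ensure there is room for `more` payload bytes plus the terminator. */
static void sbuf_need(struct sbuf *b, size_t more) {
    if (b->len + more + 1 > b->size) {
        b->size = (b->len + more + 1) * 2;   /* grow geometrically */
        b->data = realloc(b->data, b->size);
    }
}

static void sbuf_memcat(struct sbuf *b, const void *mem, size_t bytes) {
    if (!mem) return;
    sbuf_need(b, bytes);
    memcpy(&b->data[b->len], mem, bytes);
    b->len += bytes;
    b->data[b->len] = '\0';   /* always printable as a C string */
}
```

The real helpers additionally run buffer_overflow_check() in debug builds, which this sketch omits.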
libnetdata/buffered_reader/Makefile.am (new file, 8 lines)
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-3.0-or-later
+
+AUTOMAKE_OPTIONS = subdir-objects
+MAINTAINERCLEANFILES = $(srcdir)/Makefile.in
+
+dist_noinst_DATA = \
+	README.md \
+	$(NULL)

libnetdata/buffered_reader/README.md (new file, empty)

libnetdata/buffered_reader/buffered_reader.c (new file, 3 lines)
@@ -0,0 +1,3 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#include "../libnetdata.h"

libnetdata/buffered_reader/buffered_reader.h (new file, 146 lines)
@@ -0,0 +1,146 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#include "../libnetdata.h"
+
+#ifndef NETDATA_BUFFERED_READER_H
+#define NETDATA_BUFFERED_READER_H
+
+struct buffered_reader {
+	ssize_t read_len;
+	ssize_t pos;
+	char read_buffer[PLUGINSD_LINE_MAX + 1];
+};
+
+static inline void buffered_reader_init(struct buffered_reader *reader) {
+	reader->read_buffer[0] = '\0';
+	reader->read_len = 0;
+	reader->pos = 0;
+}
+
+typedef enum {
+	BUFFERED_READER_READ_OK = 0,
+	BUFFERED_READER_READ_FAILED = -1,
+	BUFFERED_READER_READ_BUFFER_FULL = -2,
+	BUFFERED_READER_READ_POLLERR = -3,
+	BUFFERED_READER_READ_POLLHUP = -4,
+	BUFFERED_READER_READ_POLLNVAL = -5,
+	BUFFERED_READER_READ_POLL_UNKNOWN = -6,
+	BUFFERED_READER_READ_POLL_TIMEOUT = -7,
+	BUFFERED_READER_READ_POLL_FAILED = -8,
+} buffered_reader_ret_t;
+
+
+static inline buffered_reader_ret_t buffered_reader_read(struct buffered_reader *reader, int fd) {
+#ifdef NETDATA_INTERNAL_CHECKS
+	if(reader->read_buffer[reader->read_len] != '\0')
+		fatal("read_buffer does not start with zero");
+#endif
+
+	char *read_at = reader->read_buffer + reader->read_len;
+	ssize_t remaining = sizeof(reader->read_buffer) - reader->read_len - 1;
+
+	if(unlikely(remaining <= 0))
+		return BUFFERED_READER_READ_BUFFER_FULL;
+
+	ssize_t bytes_read = read(fd, read_at, remaining);
+	if(unlikely(bytes_read <= 0))
+		return BUFFERED_READER_READ_FAILED;
+
+	reader->read_len += bytes_read;
+	reader->read_buffer[reader->read_len] = '\0';
+
+	return BUFFERED_READER_READ_OK;
+}
+
+static inline buffered_reader_ret_t buffered_reader_read_timeout(struct buffered_reader *reader, int fd, int timeout_ms, bool log_error) {
+	errno = 0;
+	struct pollfd fds[1];
+
+	fds[0].fd = fd;
+	fds[0].events = POLLIN;
+
+	int ret = poll(fds, 1, timeout_ms);
+
+	if (ret > 0) {
+		/* There is data to read */
+		if (fds[0].revents & POLLIN)
+			return buffered_reader_read(reader, fd);
+
+		else if(fds[0].revents & POLLERR) {
+			if(log_error)
+				netdata_log_error("PARSER: read failed: POLLERR.");
+			return BUFFERED_READER_READ_POLLERR;
+		}
+		else if(fds[0].revents & POLLHUP) {
+			if(log_error)
+				netdata_log_error("PARSER: read failed: POLLHUP.");
+			return BUFFERED_READER_READ_POLLHUP;
+		}
+		else if(fds[0].revents & POLLNVAL) {
+			if(log_error)
+				netdata_log_error("PARSER: read failed: POLLNVAL.");
+			return BUFFERED_READER_READ_POLLNVAL;
+		}
+
+		if(log_error)
+			netdata_log_error("PARSER: poll() returned positive number, but POLLIN|POLLERR|POLLHUP|POLLNVAL are not set.");
+		return BUFFERED_READER_READ_POLL_UNKNOWN;
+	}
+	else if (ret == 0) {
+		if(log_error)
+			netdata_log_error("PARSER: timeout while waiting for data.");
+		return BUFFERED_READER_READ_POLL_TIMEOUT;
+	}
+
+	if(log_error)
+		netdata_log_error("PARSER: poll() failed with code %d.", ret);
+	return BUFFERED_READER_READ_POLL_FAILED;
+}
+
+/* Produce a full line if one exists, statefully return where we start next time.
+ * When we hit the end of the buffer with a partial line move it to the beginning for the next fill.
+ */
+static inline bool buffered_reader_next_line(struct buffered_reader *reader, BUFFER *dst) {
+	buffer_need_bytes(dst, reader->read_len - reader->pos + 2);
+
+	size_t start = reader->pos;
+
+	char *ss = &reader->read_buffer[start];
+	char *se = &reader->read_buffer[reader->read_len];
+	char *ds = &dst->buffer[dst->len];
+	char *de = &ds[dst->size - dst->len - 2];
+
+	if(ss >= se) {
+		*ds = '\0';
+		reader->pos = 0;
+		reader->read_len = 0;
+		reader->read_buffer[reader->read_len] = '\0';
+		return false;
+	}
+
+	// copy all bytes to buffer
+	while(ss < se && ds < de && *ss != '\n') {
+		*ds++ = *ss++;
+		dst->len++;
+	}
+
+	// if we have a newline, return the buffer
+	if(ss < se && ds < de && *ss == '\n') {
+		// newline found in the r->read_buffer
+
+		*ds++ = *ss++; // copy the newline too
+		dst->len++;
+
+		*ds = '\0';
+
+		reader->pos = ss - reader->read_buffer;
+		return true;
+	}
+
+	reader->pos = 0;
+	reader->read_len = 0;
+	reader->read_buffer[reader->read_len] = '\0';
+	return false;
+}
+
+#endif //NETDATA_BUFFERED_READER_H
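The core idea of buffered_reader_next_line() is stateful line splitting: each call consumes one newline-terminated line from the read buffer and remembers `pos` so the next call resumes after it; a partial line at the end waits for the next read. A simplified, self-contained sketch of that idea over an in-memory buffer (hypothetical names; unlike the real helper, this sketch strips the newline and does not reset or refill the buffer):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stateful line extraction: one '\n'-terminated line per call,
 * with the position remembered between calls. */
struct reader { const char *buf; size_t len; size_t pos; };

static bool next_line(struct reader *r, char *dst, size_t dst_size) {
    size_t i = r->pos, o = 0;

    /* copy bytes until a newline, the end of data, or dst is full */
    while (i < r->len && r->buf[i] != '\n' && o + 1 < dst_size)
        dst[o++] = r->buf[i++];

    if (i < r->len && r->buf[i] == '\n') {
        dst[o] = '\0';
        r->pos = i + 1;     /* next call starts after the newline */
        return true;
    }

    return false;           /* partial line: wait for more data */
}
```

This is the pattern that lets the pluginsd parser hand complete protocol lines to the line_splitter while reads arrive in arbitrary chunks.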
@@ -298,8 +298,10 @@ usec_t heartbeat_next(heartbeat_t *hb, usec_t tick) {
 		// TODO: The heartbeat tick should be specified at the heartbeat_init() function
 		usec_t tmp = (now_realtime_usec() * clock_realtime_resolution) % (tick / 2);

-		error_limit_static_global_var(erl, 10, 0);
-		error_limit(&erl, "heartbeat randomness of %"PRIu64" is too big for a tick of %"PRIu64" - setting it to %"PRIu64"", hb->randomness, tick, tmp);
+		nd_log_limit_static_global_var(erl, 10, 0);
+		nd_log_limit(&erl, NDLS_DAEMON, NDLP_NOTICE,
+			"heartbeat randomness of %"PRIu64" is too big for a tick of %"PRIu64" - setting it to %"PRIu64"",
+			hb->randomness, tick, tmp);
 		hb->randomness = tmp;
 	}

@@ -325,13 +327,19 @@ usec_t heartbeat_next(heartbeat_t *hb, usec_t tick) {

 	if(unlikely(now < next)) {
 		errno = 0;
-		error_limit_static_global_var(erl, 10, 0);
-		error_limit(&erl, "heartbeat clock: woke up %"PRIu64" microseconds earlier than expected (can be due to the CLOCK_REALTIME set to the past).", next - now);
+		nd_log_limit_static_global_var(erl, 10, 0);
+		nd_log_limit(&erl, NDLS_DAEMON, NDLP_NOTICE,
+			"heartbeat clock: woke up %"PRIu64" microseconds earlier than expected "
+			"(can be due to the CLOCK_REALTIME set to the past).",
+			next - now);
 	}
 	else if(unlikely(now - next > tick / 2)) {
 		errno = 0;
-		error_limit_static_global_var(erl, 10, 0);
-		error_limit(&erl, "heartbeat clock: woke up %"PRIu64" microseconds later than expected (can be due to system load or the CLOCK_REALTIME set to the future).", now - next);
+		nd_log_limit_static_global_var(erl, 10, 0);
+		nd_log_limit(&erl, NDLS_DAEMON, NDLP_NOTICE,
+			"heartbeat clock: woke up %"PRIu64" microseconds later than expected "
+			"(can be due to system load or the CLOCK_REALTIME set to the future).",
+			now - next);
 	}

 	if(unlikely(!hb->realtime)) {
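The hunk above migrates the old error_limit() calls to the new nd_log_limit() API, which additionally carries a log source (NDLS_DAEMON) and a priority (NDLP_NOTICE). The underlying rate-limiting idea is simple: allow at most N messages per time window, then suppress until the window rolls over. A standalone sketch of that idea, assuming window-based semantics (the names and the exact meaning of nd_log_limit's static-var arguments are illustrative, not netdata's):

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Allow at most `max` messages per `period` seconds, then go silent
 * until the window rolls over. */
struct log_limit {
    time_t window_start;
    unsigned period;    /* seconds per window */
    unsigned max;       /* messages allowed per window */
    unsigned count;     /* messages emitted in the current window */
};

/* Returns 1 if the message was emitted, 0 if it was suppressed. */
static int log_limited(struct log_limit *l, time_t now, const char *msg) {
    if (now - l->window_start >= (time_t)l->period) {
        l->window_start = now;   /* new window: reset the counter */
        l->count = 0;
    }
    if (l->count >= l->max)
        return 0;                /* over budget: suppress */
    l->count++;
    fprintf(stderr, "%s\n", msg);
    return 1;
}
```

Per-call-site static limiter state (as nd_log_limit_static_global_var sets up) keeps a noisy loop like the heartbeat check from flooding the logs while still surfacing the first few occurrences.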
libnetdata/datetime/Makefile.am (new file, 8 lines)
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-3.0-or-later
+
+AUTOMAKE_OPTIONS = subdir-objects
+MAINTAINERCLEANFILES = $(srcdir)/Makefile.in
+
+dist_noinst_DATA = \
+	README.md \
+	$(NULL)

libnetdata/datetime/README.md (new file, 11 lines)
@@ -0,0 +1,11 @@
+<!--
+title: "Datetime"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/libnetdata/datetime/README.md
+sidebar_label: "Datetime"
+learn_topic_type: "Tasks"
+learn_rel_path: "Developers/libnetdata"
+-->
+
+# Datetime
+
+Formatting dates and timestamps.

libnetdata/datetime/iso8601.c (new file, 81 lines)
@@ -0,0 +1,81 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#include "../libnetdata.h"
+
+size_t iso8601_datetime_ut(char *buffer, size_t len, usec_t now_ut, ISO8601_OPTIONS options) {
+	if(unlikely(!buffer || len == 0))
+		return 0;
+
+	time_t t = (time_t)(now_ut / USEC_PER_SEC);
+	struct tm *tmp, tmbuf;
+
+	if(options & ISO8601_UTC)
+		// Use gmtime_r for UTC time conversion.
+		tmp = gmtime_r(&t, &tmbuf);
+	else
+		// Use localtime_r for local time conversion.
+		tmp = localtime_r(&t, &tmbuf);
+
+	if (unlikely(!tmp)) {
+		buffer[0] = '\0';
+		return 0;
+	}
+
+	// Format the date and time according to the ISO 8601 format.
+	size_t used_length = strftime(buffer, len, "%Y-%m-%dT%H:%M:%S", tmp);
+	if (unlikely(used_length == 0)) {
+		buffer[0] = '\0';
+		return 0;
+	}
+
+	if(options & ISO8601_MILLISECONDS) {
+		// Calculate the remaining microseconds
+		int milliseconds = (int) ((now_ut % USEC_PER_SEC) / USEC_PER_MS);
+		if(milliseconds && len - used_length > 4)
+			used_length += snprintfz(buffer + used_length, len - used_length, ".%03d", milliseconds);
+	}
+	else if(options & ISO8601_MICROSECONDS) {
+		// Calculate the remaining microseconds
+		int microseconds = (int) (now_ut % USEC_PER_SEC);
+		if(microseconds && len - used_length > 7)
+			used_length += snprintfz(buffer + used_length, len - used_length, ".%06d", microseconds);
+	}
+
+	if(options & ISO8601_UTC) {
+		if(used_length + 1 < len) {
+			buffer[used_length++] = 'Z';
+			buffer[used_length] = '\0'; // null-terminate the string.
+		}
+	}
+	else {
+		// Calculate the timezone offset in hours and minutes from UTC.
+		long offset = tmbuf.tm_gmtoff;
+		int hours = (int) (offset / 3600); // Convert offset seconds to hours.
+		int minutes = (int) ((offset % 3600) / 60); // Convert remainder to minutes (keep the sign for minutes).
+
+		// Check if timezone is UTC.
+		if(hours == 0 && minutes == 0) {
+			// For UTC, append 'Z' to the timestamp.
+			if(used_length + 1 < len) {
+				buffer[used_length++] = 'Z';
+				buffer[used_length] = '\0'; // null-terminate the string.
+			}
+		}
+		else {
+			// For non-UTC, format the timezone offset. Omit minutes if they are zero.
+			if(minutes == 0) {
+				// Check enough space is available for the timezone offset string.
+				if(used_length + 3 < len) // "+hh\0"
+					used_length += snprintfz(buffer + used_length, len - used_length, "%+03d", hours);
+			}
+			else {
+				// Check enough space is available for the timezone offset string.
+				if(used_length + 6 < len) // "+hh:mm\0"
+					used_length += snprintfz(buffer + used_length, len - used_length,
+						"%+03d:%02d", hours, abs(minutes));
+			}
+		}
+	}
+
+	return used_length;
+}
libnetdata/datetime/iso8601.h (new file, 18 lines)
@@ -0,0 +1,18 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#include "../libnetdata.h"

#ifndef NETDATA_ISO8601_H
#define NETDATA_ISO8601_H

typedef enum __attribute__((__packed__)) {
    ISO8601_UTC            = (1 << 0),
    ISO8601_LOCAL_TIMEZONE = (1 << 1),
    ISO8601_MILLISECONDS   = (1 << 2),
    ISO8601_MICROSECONDS   = (1 << 3),
} ISO8601_OPTIONS;

#define ISO8601_MAX_LENGTH 64
size_t iso8601_datetime_ut(char *buffer, size_t len, usec_t now_ut, ISO8601_OPTIONS options);

#endif //NETDATA_ISO8601_H
libnetdata/datetime/rfc7231.c (new file, 29 lines)
@@ -0,0 +1,29 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#include "../libnetdata.h"

inline size_t rfc7231_datetime(char *buffer, size_t len, time_t now_t) {
    if (unlikely(!buffer || !len))
        return 0;

    struct tm *tmp, tmbuf;

    // Use gmtime_r for UTC time conversion.
    tmp = gmtime_r(&now_t, &tmbuf);

    if (unlikely(!tmp)) {
        buffer[0] = '\0';
        return 0;
    }

    // Format the date and time according to the RFC 7231 format.
    size_t ret = strftime(buffer, len, "%a, %d %b %Y %H:%M:%S GMT", tmp);
    if (unlikely(ret == 0))
        buffer[0] = '\0';

    return ret;
}

size_t rfc7231_datetime_ut(char *buffer, size_t len, usec_t now_ut) {
    return rfc7231_datetime(buffer, len, (time_t) (now_ut / USEC_PER_SEC));
}
libnetdata/datetime/rfc7231.h (new file, 12 lines)
@@ -0,0 +1,12 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#include "../libnetdata.h"

#ifndef NETDATA_RFC7231_H
#define NETDATA_RFC7231_H

#define RFC7231_MAX_LENGTH 30
size_t rfc7231_datetime(char *buffer, size_t len, time_t now_t);
size_t rfc7231_datetime_ut(char *buffer, size_t len, usec_t now_ut);

#endif //NETDATA_RFC7231_H
@@ -64,6 +64,12 @@ static void *rrd_functions_worker_globals_worker_main(void *arg) {
     pthread_mutex_unlock(&wg->worker_mutex);
 
     if(acquired) {
+        ND_LOG_STACK lgs[] = {
+                ND_LOG_FIELD_TXT(NDF_REQUEST, j->cmd),
+                ND_LOG_FIELD_END(),
+        };
+        ND_LOG_STACK_PUSH(lgs);
+
         last_acquired = true;
         j = dictionary_acquired_item_value(acquired);
         j->cb(j->transaction, j->cmd, j->timeout, &j->cancelled);
@@ -426,13 +426,24 @@ static inline void sanitize_json_string(char *dst, const char *src, size_t dst_s
 }
 
 static inline bool sanitize_command_argument_string(char *dst, const char *src, size_t dst_size) {
+    if(dst_size)
+        *dst = '\0';
+
     // skip leading dashes
-    while (src[0] == '-')
+    while (*src == '-')
         src++;
 
-    // escape single quotes
-    while (src[0] != '\0') {
-        if (src[0] == '\'') {
+    while (*src != '\0') {
+        if (dst_size < 1)
+            return false;
+
+        if (iscntrl(*src) || *src == '$') {
+            // remove control characters and characters that are expanded by bash
+            *dst++ = '_';
+            dst_size--;
+        }
+        else if (*src == '\'' || *src == '`') {
+            // escape single quotes
             if (dst_size < 4)
                 return false;
 
@@ -440,14 +451,10 @@ static inline bool sanitize_command_argument_string(char *dst, const char *src,
 
             dst += 4;
             dst_size -= 4;
-        } else {
-            if (dst_size < 1)
-                return false;
-
-            dst[0] = src[0];
-
-            dst += 1;
-            dst_size -= 1;
         }
+        else {
+            *dst++ = *src;
+            dst_size--;
+        }
 
         src++;
@@ -456,6 +463,7 @@ static inline bool sanitize_command_argument_string(char *dst, const char *src,
     // make sure we have space to terminate the string
     if (dst_size == 0)
         return false;
+
     *dst = '\0';
 
     return true;
@@ -531,10 +539,6 @@ static inline int read_single_base64_or_hex_number_file(const char *filename, un
     }
 }
 
-static inline int uuid_memcmp(const uuid_t *uu1, const uuid_t *uu2) {
-    return memcmp(uu1, uu2, sizeof(uuid_t));
-}
-
 static inline char *strsep_skip_consecutive_separators(char **ptr, char *s) {
     char *p = (char *)"";
     while (p && !p[0] && *ptr) p = strsep(ptr, s);
@@ -1269,17 +1269,19 @@ char *fgets_trim_len(char *buf, size_t buf_size, FILE *fp, size_t *len) {
     return s;
 }
 
+// vsnprintfz() returns the number of bytes actually written - after possible truncation
 int vsnprintfz(char *dst, size_t n, const char *fmt, va_list args) {
+    if(unlikely(!n)) return 0;
+
     int size = vsnprintf(dst, n, fmt, args);
     dst[n - 1] = '\0';
 
-    if (unlikely((size_t) size > n)) size = (int)n;
+    if (unlikely((size_t) size >= n)) size = (int)(n - 1);
 
     return size;
 }
 
+// snprintfz() returns the number of bytes actually written - after possible truncation
 int snprintfz(char *dst, size_t n, const char *fmt, ...) {
     va_list args;
 
@@ -1694,53 +1696,6 @@ char *find_and_replace(const char *src, const char *find, const char *replace, c
     return value;
 }
 
-inline int pluginsd_isspace(char c) {
-    switch(c) {
-        case ' ':
-        case '\t':
-        case '\r':
-        case '\n':
-        case '=':
-            return 1;
-
-        default:
-            return 0;
-    }
-}
-
-inline int config_isspace(char c) {
-    switch (c) {
-        case ' ':
-        case '\t':
-        case '\r':
-        case '\n':
-        case ',':
-            return 1;
-
-        default:
-            return 0;
-    }
-}
-
-inline int group_by_label_isspace(char c) {
-    if(c == ',' || c == '|')
-        return 1;
-
-    return 0;
-}
-
-bool isspace_map_pluginsd[256] = {};
-bool isspace_map_config[256] = {};
-bool isspace_map_group_by_label[256] = {};
-
-__attribute__((constructor)) void initialize_is_space_arrays(void) {
-    for(int c = 0; c < 256 ; c++) {
-        isspace_map_pluginsd[c] = pluginsd_isspace((char) c);
-        isspace_map_config[c] = config_isspace((char) c);
-        isspace_map_group_by_label[c] = group_by_label_isspace((char) c);
-    }
-}
-
 bool run_command_and_copy_output_to_stdout(const char *command, int max_line_length) {
     pid_t pid;
     FILE *fp = netdata_popen(command, &pid, NULL);
@@ -201,6 +201,8 @@ extern "C" {
 // ----------------------------------------------------------------------------
 // netdata common definitions
 
+#define _cleanup_(x) __attribute__((__cleanup__(x)))
+
 #ifdef __GNUC__
 #define GCC_VERSION (__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
 #endif // __GNUC__
@@ -685,103 +687,6 @@ static inline BITMAPX *bitmapX_create(uint32_t bits) {
 #define COMPRESSION_MAX_OVERHEAD 128
 #define COMPRESSION_MAX_MSG_SIZE (COMPRESSION_MAX_CHUNK - COMPRESSION_MAX_OVERHEAD - 1)
 #define PLUGINSD_LINE_MAX (COMPRESSION_MAX_MSG_SIZE - 768)
-int pluginsd_isspace(char c);
-int config_isspace(char c);
-int group_by_label_isspace(char c);
-
-extern bool isspace_map_pluginsd[256];
-extern bool isspace_map_config[256];
-extern bool isspace_map_group_by_label[256];
-
-static inline size_t quoted_strings_splitter(char *str, char **words, size_t max_words, bool *isspace_map) {
-    char *s = str, quote = 0;
-    size_t i = 0;
-
-    // skip all white space
-    while (unlikely(isspace_map[(uint8_t)*s]))
-        s++;
-
-    if(unlikely(!*s)) {
-        words[i] = NULL;
-        return 0;
-    }
-
-    // check for quote
-    if (unlikely(*s == '\'' || *s == '"')) {
-        quote = *s; // remember the quote
-        s++;        // skip the quote
-    }
-
-    // store the first word
-    words[i++] = s;
-
-    // while we have something
-    while (likely(*s)) {
-        // if it is an escape
-        if (unlikely(*s == '\\' && s[1])) {
-            s += 2;
-            continue;
-        }
-
-        // if it is a quote
-        else if (unlikely(*s == quote)) {
-            quote = 0;
-            *s = ' ';
-            continue;
-        }
-
-        // if it is a space
-        else if (unlikely(quote == 0 && isspace_map[(uint8_t)*s])) {
-            // terminate the word
-            *s++ = '\0';
-
-            // skip all white space
-            while (likely(isspace_map[(uint8_t)*s]))
-                s++;
-
-            // check for a quote
-            if (unlikely(*s == '\'' || *s == '"')) {
-                quote = *s; // remember the quote
-                s++;        // skip the quote
-            }
-
-            // if we reached the end, stop
-            if (unlikely(!*s))
-                break;
-
-            // store the next word
-            if (likely(i < max_words))
-                words[i++] = s;
-            else
-                break;
-        }
-
-        // anything else
-        else
-            s++;
-    }
-
-    if (likely(i < max_words))
-        words[i] = NULL;
-
-    return i;
-}
-
-#define quoted_strings_splitter_query_group_by_label(str, words, max_words) \
-    quoted_strings_splitter(str, words, max_words, isspace_map_group_by_label)
-
-#define quoted_strings_splitter_config(str, words, max_words) \
-    quoted_strings_splitter(str, words, max_words, isspace_map_config)
-
-#define quoted_strings_splitter_pluginsd(str, words, max_words) \
-    quoted_strings_splitter(str, words, max_words, isspace_map_pluginsd)
-
-static inline char *get_word(char **words, size_t num_words, size_t index) {
-    if (unlikely(index >= num_words))
-        return NULL;
-
-    return words[index];
-}
-
 bool run_command_and_copy_output_to_stdout(const char *command, int max_line_length);
 
@@ -803,6 +708,8 @@ extern char *netdata_configured_host_prefix;
 #define XXH_INLINE_ALL
 #include "xxhash.h"
 
+#include "uuid/uuid.h"
+
 #include "libjudy/src/Judy.h"
 #include "july/july.h"
 #include "os.h"
@@ -812,7 +719,10 @@ extern char *netdata_configured_host_prefix;
 #include "circular_buffer/circular_buffer.h"
 #include "avl/avl.h"
 #include "inlined.h"
+#include "line_splitter/line_splitter.h"
 #include "clocks/clocks.h"
+#include "datetime/iso8601.h"
+#include "datetime/rfc7231.h"
 #include "completion/completion.h"
 #include "popen/popen.h"
 #include "simple_pattern/simple_pattern.h"
@@ -821,7 +731,9 @@ extern char *netdata_configured_host_prefix;
 #endif
 #include "socket/socket.h"
 #include "config/appconfig.h"
+#include "log/journal.h"
 #include "log/log.h"
+#include "buffered_reader/buffered_reader.h"
 #include "procfile/procfile.h"
 #include "string/string.h"
 #include "dictionary/dictionary.h"
libnetdata/line_splitter/Makefile.am (new file, 8 lines)
@@ -0,0 +1,8 @@
# SPDX-License-Identifier: GPL-3.0-or-later

AUTOMAKE_OPTIONS = subdir-objects
MAINTAINERCLEANFILES = $(srcdir)/Makefile.in

dist_noinst_DATA = \
    README.md \
    $(NULL)
libnetdata/line_splitter/README.md (new file, 14 lines)
@@ -0,0 +1,14 @@
<!--
title: "Log"
custom_edit_url: https://github.com/netdata/netdata/edit/master/libnetdata/log/README.md
sidebar_label: "Log"
learn_status: "Published"
learn_topic_type: "Tasks"
learn_rel_path: "Developers/libnetdata"
-->

# Log

The netdata log library supports debug, info, error and fatal error logging.
By default we have an access log, an error log and a collectors log.
libnetdata/line_splitter/line_splitter.c (new file, 69 lines)
@@ -0,0 +1,69 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#include "../libnetdata.h"


bool line_splitter_reconstruct_line(BUFFER *wb, void *ptr) {
    struct line_splitter *spl = ptr;
    if(!spl) return false;

    size_t added = 0;
    for(size_t i = 0; i < spl->num_words ;i++) {
        if(i) buffer_fast_strcat(wb, " ", 1);

        buffer_fast_strcat(wb, "'", 1);
        const char *s = get_word(spl->words, spl->num_words, i);
        buffer_strcat(wb, s?s:"");
        buffer_fast_strcat(wb, "'", 1);
        added++;
    }

    return added > 0;
}

inline int pluginsd_isspace(char c) {
    switch(c) {
        case ' ':
        case '\t':
        case '\r':
        case '\n':
        case '=':
            return 1;

        default:
            return 0;
    }
}

inline int config_isspace(char c) {
    switch (c) {
        case ' ':
        case '\t':
        case '\r':
        case '\n':
        case ',':
            return 1;

        default:
            return 0;
    }
}

inline int group_by_label_isspace(char c) {
    if(c == ',' || c == '|')
        return 1;

    return 0;
}

bool isspace_map_pluginsd[256] = {};
bool isspace_map_config[256] = {};
bool isspace_map_group_by_label[256] = {};

__attribute__((constructor)) void initialize_is_space_arrays(void) {
    for(int c = 0; c < 256 ; c++) {
        isspace_map_pluginsd[c] = pluginsd_isspace((char) c);
        isspace_map_config[c] = config_isspace((char) c);
        isspace_map_group_by_label[c] = group_by_label_isspace((char) c);
    }
}
libnetdata/line_splitter/line_splitter.h (new file, 120 lines)
@@ -0,0 +1,120 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#include "../libnetdata.h"

#ifndef NETDATA_LINE_SPLITTER_H
#define NETDATA_LINE_SPLITTER_H

#define PLUGINSD_MAX_WORDS 30

struct line_splitter {
    size_t count;                       // counts number of lines
    char *words[PLUGINSD_MAX_WORDS];    // an array of pointers for the words in this line
    size_t num_words;                   // the number of pointers used in this line
};

bool line_splitter_reconstruct_line(BUFFER *wb, void *ptr);

static inline void line_splitter_reset(struct line_splitter *line) {
    line->num_words = 0;
}

int pluginsd_isspace(char c);
int config_isspace(char c);
int group_by_label_isspace(char c);

extern bool isspace_map_pluginsd[256];
extern bool isspace_map_config[256];
extern bool isspace_map_group_by_label[256];

static inline size_t quoted_strings_splitter(char *str, char **words, size_t max_words, bool *isspace_map) {
    char *s = str, quote = 0;
    size_t i = 0;

    // skip all white space
    while (unlikely(isspace_map[(uint8_t)*s]))
        s++;

    if(unlikely(!*s)) {
        words[i] = NULL;
        return 0;
    }

    // check for quote
    if (unlikely(*s == '\'' || *s == '"')) {
        quote = *s; // remember the quote
        s++;        // skip the quote
    }

    // store the first word
    words[i++] = s;

    // while we have something
    while (likely(*s)) {
        // if it is an escape
        if (unlikely(*s == '\\' && s[1])) {
            s += 2;
            continue;
        }

        // if it is a quote
        else if (unlikely(*s == quote)) {
            quote = 0;
            *s = ' ';
            continue;
        }

        // if it is a space
        else if (unlikely(quote == 0 && isspace_map[(uint8_t)*s])) {
            // terminate the word
            *s++ = '\0';

            // skip all white space
            while (likely(isspace_map[(uint8_t)*s]))
                s++;

            // check for a quote
            if (unlikely(*s == '\'' || *s == '"')) {
                quote = *s; // remember the quote
                s++;        // skip the quote
            }

            // if we reached the end, stop
            if (unlikely(!*s))
                break;

            // store the next word
            if (likely(i < max_words))
                words[i++] = s;
            else
                break;
        }

        // anything else
        else
            s++;
    }

    if (likely(i < max_words))
        words[i] = NULL;

    return i;
}

#define quoted_strings_splitter_query_group_by_label(str, words, max_words) \
    quoted_strings_splitter(str, words, max_words, isspace_map_group_by_label)

#define quoted_strings_splitter_config(str, words, max_words) \
    quoted_strings_splitter(str, words, max_words, isspace_map_config)

#define quoted_strings_splitter_pluginsd(str, words, max_words) \
    quoted_strings_splitter(str, words, max_words, isspace_map_pluginsd)

static inline char *get_word(char **words, size_t num_words, size_t index) {
    if (unlikely(index >= num_words))
        return NULL;

    return words[index];
}

#endif //NETDATA_LINE_SPLITTER_H
@@ -5,4 +5,5 @@ MAINTAINERCLEANFILES = $(srcdir)/Makefile.in
 
 dist_noinst_DATA = \
     README.md \
+    log2journal.md \
     $(NULL)
@@ -7,8 +7,196 @@ learn_topic_type: "Tasks"
 learn_rel_path: "Developers/libnetdata"
 -->
 
-# Log
+# Netdata Logging
 
-The netdata log library supports debug, info, error and fatal error logging.
-By default we have an access log, an error log and a collectors log.
+This document describes how Netdata generates its own logs, not how Netdata manages and queries logs databases.
+
+## Log sources
+
+Netdata supports the following log sources:
+
+1. **daemon**, logs generated by the Netdata daemon.
+2. **collector**, logs generated by Netdata collectors, both internal and external.
+3. **access**, API requests received by Netdata.
+4. **health**, all alert transitions.
+
+## Log outputs
+
+For each log source, Netdata supports the following output methods:
+
+- **off**, to disable this log source.
+- **journal**, to send the logs to systemd-journal.
+- **syslog**, to send the logs to syslog.
+- **system**, to send the output to `stderr` or `stdout`, depending on the log source.
+- **stdout**, to write the logs to Netdata's `stdout`.
+- **stderr**, to write the logs to Netdata's `stderr`.
+- **filename**, to send the logs to a file.
+
+For `daemon` and `collector` the default is `journal` when systemd-journal is available.
+To decide if systemd-journal is available, Netdata checks whether:
+
+1. `stderr` is connected to systemd-journald,
+2. `/run/systemd/journal/socket` exists, or
+3. `/host/run/systemd/journal/socket` exists (`/host` is configurable in containers).
+
+If any of the above is detected, Netdata selects `journal` for the `daemon` and `collector` sources.
+
+All other sources default to a file.
+
+## Log formats
+
+Netdata supports the following formats for its logs:
+
+- **journal**, automatically selected when logging to systemd-journal.
+- **logfmt**, the default when logging to any output other than `journal`. In this format, Netdata annotates the fields to make them human readable.
+- **json**, to write log lines in JSON format. The output is machine readable, similar to `journal`.
+
+## Log levels
+
+Each time Netdata logs, it assigns a priority to the log entry. It can be one of these (in order of importance):
+
+- **emergency**, a fatal condition; most likely Netdata will exit immediately after,
+- **alert**, a very important issue that may affect how Netdata operates,
+- **critical**, a very important issue that the user should know about, but which Netdata thinks it can survive,
+- **error**, an error condition indicating that Netdata is trying to do something but failing,
+- **warning**, something that may or may not affect the operation of Netdata, but whose outcome cannot be determined at the time Netdata logs it,
+- **notice**, something that does not affect the operation of Netdata, but the user should notice,
+- **info**, the default log level, for information the user should know,
+- **debug**, more verbose logs that can be ignored.
+
+## Logs Configuration
+
+In `netdata.conf`, there are the following settings:
+
+```
+[logs]
+    # logs to trigger flood protection = 600
+    # logs flood protection period = 3600
+    # facility = daemon
+    # level = info
+    # daemon = journal
+    # collector = journal
+    # access = /var/log/netdata/access.log
+    # health = /var/log/netdata/health.log
+```
+
+- `logs to trigger flood protection` and `logs flood protection period` enable log flood protection for the `daemon` and `collector` sources. They can also be configured per log source.
+- `facility` is used only when Netdata logs to syslog.
+- `level` defines the minimum [log level](#log-levels) of logs that will be logged. This setting is applied only to the `daemon` and `collector` sources. It can also be configured per source.
+
+### Configuring log sources
+
+Each of the sources (`daemon`, `collector`, `access`, `health`) accepts the following:
+
+```
+source = {FORMAT},level={LEVEL},protection={LOGS}/{PERIOD}@{OUTPUT}
+```
+
+Where:
+
+- `{FORMAT}` is one of the [log formats](#log-formats),
+- `{LEVEL}` is the minimum [log level](#log-levels) to be logged,
+- `{LOGS}` is the equivalent of `logs to trigger flood protection`, configured per output,
+- `{PERIOD}` is the equivalent of `logs flood protection period`, configured per output,
+- `{OUTPUT}` is one of the [log outputs](#log-outputs).
+
+All parameters can be omitted, except `{OUTPUT}`. If `{OUTPUT}` is the only given parameter, `@` can be omitted.
+
+### Logs rotation
+
+Netdata comes with a `logrotate` configuration to rotate its log files periodically.
+
+The default is usually found in `/etc/logrotate.d/netdata`.
+
+Sending a `SIGHUP` to Netdata instructs it to re-open all its log files.
+
+## Log Fields
+
+Netdata exposes the following fields to its logs:
+
+| journal | logfmt | json | Description |
+|:---:|:---:|:---:|:---:|
+| `_SOURCE_REALTIME_TIMESTAMP` | `time` | `time` | the timestamp of the event |
+| `SYSLOG_IDENTIFIER` | `comm` | `comm` | the program logging the event |
+| `ND_LOG_SOURCE` | `source` | `source` | one of the [log sources](#log-sources) |
+| `PRIORITY`<br/>numeric | `level`<br/>text | `level`<br/>numeric | one of the [log levels](#log-levels) |
+| `ERRNO` | `errno` | `errno` | the numeric value of `errno` |
+| `INVOCATION_ID` | - | - | a unique UUID of the Netdata session, reset on every Netdata restart, inherited by systemd when available |
+| `CODE_LINE` | - | - | the line number of the source code logging this event |
+| `CODE_FILE` | - | - | the filename of the source code logging this event |
+| `CODE_FUNCTION` | - | - | the function name of the source code logging this event |
+| `TID` | `tid` | `tid` | the thread id of the thread logging this event |
+| `THREAD_TAG` | `thread` | `thread` | the name of the thread logging this event |
+| `MESSAGE_ID` | `msg_id` | `msg_id` | see [message IDs](#message-ids) |
+| `ND_MODULE` | `module` | `module` | the Netdata module logging this event |
+| `ND_NIDL_NODE` | `node` | `node` | the hostname of the node the event is related to |
+| `ND_NIDL_INSTANCE` | `instance` | `instance` | the instance of the node the event is related to |
+| `ND_NIDL_CONTEXT` | `context` | `context` | the context the event is related to (this is usually the chart name, as shown on Netdata dashboards) |
+| `ND_NIDL_DIMENSION` | `dimension` | `dimension` | the dimension the event is related to |
+| `ND_SRC_TRANSPORT` | `src_transport` | `src_transport` | when the event happened during a request, this is the request transport |
+| `ND_SRC_IP` | `src_ip` | `src_ip` | when the event happened during an inbound request, this is the IP the request came from |
+| `ND_SRC_PORT` | `src_port` | `src_port` | when the event happened during an inbound request, this is the port the request came from |
+| `ND_SRC_CAPABILITIES` | `src_capabilities` | `src_capabilities` | when the request came from a child, this is the communication capabilities of the child |
+| `ND_DST_TRANSPORT` | `dst_transport` | `dst_transport` | when the event happened during an outbound request, this is the outbound request transport |
+| `ND_DST_IP` | `dst_ip` | `dst_ip` | when the event happened during an outbound request, this is the IP of the request destination |
+| `ND_DST_PORT` | `dst_port` | `dst_port` | when the event happened during an outbound request, this is the port of the request destination |
+| `ND_DST_CAPABILITIES` | `dst_capabilities` | `dst_capabilities` | when the request goes to a parent, this is the communication capabilities of the parent |
+| `ND_REQUEST_METHOD` | `req_method` | `req_method` | when the event happened during an inbound request, this is the method with which the request was received |
+| `ND_RESPONSE_CODE` | `code` | `code` | when responding to a request, this is the response code |
+| `ND_CONNECTION_ID` | `conn` | `conn` | when there is a connection id for an inbound connection, this is the connection id |
+| `ND_TRANSACTION_ID` | `transaction` | `transaction` | the transaction id (UUID) of all API requests |
+| `ND_RESPONSE_SENT_BYTES` | `sent_bytes` | `sent_bytes` | the bytes we sent in API responses |
+| `ND_RESPONSE_SIZE_BYTES` | `size_bytes` | `size_bytes` | the uncompressed bytes of the API responses |
+| `ND_RESPONSE_PREP_TIME_USEC` | `prep_ut` | `prep_ut` | the time needed to prepare a response |
+| `ND_RESPONSE_SENT_TIME_USEC` | `sent_ut` | `sent_ut` | the time needed to send a response |
+| `ND_RESPONSE_TOTAL_TIME_USEC` | `total_ut` | `total_ut` | the total time needed to complete a response |
+| `ND_ALERT_ID` | `alert_id` | `alert_id` | the alert id this event is related to |
+| `ND_ALERT_EVENT_ID` | `alert_event_id` | `alert_event_id` | a sequential number of the alert transition (per host) |
+| `ND_ALERT_UNIQUE_ID` | `alert_unique_id` | `alert_unique_id` | a sequential number of the alert transition (per alert) |
+| `ND_ALERT_TRANSITION_ID` | `alert_transition_id` | `alert_transition_id` | the unique UUID of this alert transition |
+| `ND_ALERT_CONFIG` | `alert_config` | `alert_config` | the alert configuration hash (UUID) |
+| `ND_ALERT_NAME` | `alert` | `alert` | the alert name |
+| `ND_ALERT_CLASS` | `alert_class` | `alert_class` | the alert classification |
+| `ND_ALERT_COMPONENT` | `alert_component` | `alert_component` | the alert component |
+| `ND_ALERT_TYPE` | `alert_type` | `alert_type` | the alert type |
+| `ND_ALERT_EXEC` | `alert_exec` | `alert_exec` | the alert notification program |
+| `ND_ALERT_RECIPIENT` | `alert_recipient` | `alert_recipient` | the alert recipient(s) |
+| `ND_ALERT_VALUE` | `alert_value` | `alert_value` | the current alert value |
+| `ND_ALERT_VALUE_OLD` | `alert_value_old` | `alert_value_old` | the previous alert value |
+| `ND_ALERT_STATUS` | `alert_status` | `alert_status` | the current alert status |
+| `ND_ALERT_STATUS_OLD` | `alert_status_old` | `alert_status_old` | the previous alert status |
+| `ND_ALERT_UNITS` | `alert_units` | `alert_units` | the units of the alert |
+| `ND_ALERT_SUMMARY` | `alert_summary` | `alert_summary` | the summary text of the alert |
+| `ND_ALERT_INFO` | `alert_info` | `alert_info` | the info text of the alert |
+| `ND_ALERT_DURATION` | `alert_duration` | `alert_duration` | the duration the alert was in its previous state |
+| `ND_ALERT_NOTIFICATION_TIMESTAMP_USEC` | `alert_notification_timestamp` | `alert_notification_timestamp` | the timestamp at which the notification delivery is scheduled |
+| `ND_REQUEST` | `request` | `request` | the full request during which the event happened |
+| `MESSAGE` | `msg` | `msg` | the event message |
+
+
+### Message IDs
+
+Netdata assigns specific message IDs to certain events:
+
+- `ed4cdb8f1beb4ad3b57cb3cae2d162fa` when a Netdata child connects to this Netdata,
+- `6e2e3839067648968b646045dbf28d66` when this Netdata connects to a Netdata parent,
+- `9ce0cb58ab8b44df82c4bf1ad9ee22de` when alerts change state,
+- `6db0018e83e34320ae2a659d78019fb7` when notifications are sent.
+
+You can view these events using the Netdata systemd-journal.plugin with the `MESSAGE_ID` filter,
+or using `journalctl` like this:
+
+```bash
+# query children connections
+journalctl MESSAGE_ID=ed4cdb8f1beb4ad3b57cb3cae2d162fa
+
+# query parent connections
+journalctl MESSAGE_ID=6e2e3839067648968b646045dbf28d66
+
+# query alert transitions
+journalctl MESSAGE_ID=9ce0cb58ab8b44df82c4bf1ad9ee22de
+
+# query alert notifications
+journalctl MESSAGE_ID=6db0018e83e34320ae2a659d78019fb7
+```
138
libnetdata/log/journal.c
Normal file
138
libnetdata/log/journal.c
Normal file
|
@ -0,0 +1,138 @@
|
|||
// SPDX-License-Identifier: GPL-3.0-or-later

#include "journal.h"

bool is_path_unix_socket(const char *path) {
    // Check if the path is valid
    if(!path || !*path)
        return false;

    struct stat statbuf;

    // Use stat to check if the file exists and is a socket
    if (stat(path, &statbuf) == -1)
        // The file does not exist or cannot be accessed
        return false;

    // Check if the file is a socket
    if (S_ISSOCK(statbuf.st_mode))
        return true;

    return false;
}

bool is_stderr_connected_to_journal(void) {
    const char *journal_stream = getenv("JOURNAL_STREAM");
    if (!journal_stream)
        return false; // JOURNAL_STREAM is not set

    struct stat stderr_stat;
    if (fstat(STDERR_FILENO, &stderr_stat) < 0)
        return false; // Error in getting stderr info

    // Parse device and inode from JOURNAL_STREAM
    char *endptr;
    long journal_dev = strtol(journal_stream, &endptr, 10);
    if (*endptr != ':')
        return false; // Format error in JOURNAL_STREAM

    long journal_ino = strtol(endptr + 1, NULL, 10);

    return (stderr_stat.st_dev == (dev_t)journal_dev) && (stderr_stat.st_ino == (ino_t)journal_ino);
}

int journal_direct_fd(const char *path) {
    if(!path || !*path)
        path = JOURNAL_DIRECT_SOCKET;

    if(!is_path_unix_socket(path))
        return -1;

    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0) return -1;

    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(struct sockaddr_un));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    // Connect the socket (optional, but can simplify send operations)
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }

    return fd;
}

static inline bool journal_send_with_memfd(int fd, const char *msg, size_t msg_len) {
#if defined(__NR_memfd_create) && defined(MFD_ALLOW_SEALING) && defined(F_ADD_SEALS) && defined(F_SEAL_SHRINK) && defined(F_SEAL_GROW) && defined(F_SEAL_WRITE)
    // Create a memory file descriptor
    int memfd = (int)syscall(__NR_memfd_create, "journald", MFD_ALLOW_SEALING);
    if (memfd < 0) return false;

    // Write data to the memfd
    if (write(memfd, msg, msg_len) != (ssize_t)msg_len) {
        close(memfd);
        return false;
    }

    // Seal the memfd to make it immutable
    if (fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE) < 0) {
        close(memfd);
        return false;
    }

    struct iovec iov = {0};
    struct msghdr msghdr = {0};
    struct cmsghdr *cmsghdr;
    char cmsgbuf[CMSG_SPACE(sizeof(int))];

    msghdr.msg_iov = &iov;
    msghdr.msg_iovlen = 1;
    msghdr.msg_control = cmsgbuf;
    msghdr.msg_controllen = sizeof(cmsgbuf);

    cmsghdr = CMSG_FIRSTHDR(&msghdr);
    cmsghdr->cmsg_level = SOL_SOCKET;
    cmsghdr->cmsg_type = SCM_RIGHTS;
    cmsghdr->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsghdr), &memfd, sizeof(int));

    ssize_t r = sendmsg(fd, &msghdr, 0);

    close(memfd);
    return r >= 0;
#else
    return false;
#endif
}

bool journal_direct_send(int fd, const char *msg, size_t msg_len) {
    // Send the datagram
    if (send(fd, msg, msg_len, 0) < 0) {
        if(errno != EMSGSIZE)
            return false;

        // datagram is too large, fallback to memfd
        if(!journal_send_with_memfd(fd, msg, msg_len))
            return false;
    }

    return true;
}

void journal_construct_path(char *dst, size_t dst_len, const char *host_prefix, const char *namespace_str) {
    if(!host_prefix)
        host_prefix = "";

    if(namespace_str)
        snprintfz(dst, dst_len, "%s/run/systemd/journal.%s/socket",
                  host_prefix, namespace_str);
    else
        snprintfz(dst, dst_len, "%s" JOURNAL_DIRECT_SOCKET,
                  host_prefix);
}
18
libnetdata/log/journal.h
Normal file

@ -0,0 +1,18 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#include "../libnetdata.h"

#ifndef NETDATA_LOG_JOURNAL_H
#define NETDATA_LOG_JOURNAL_H

#define JOURNAL_DIRECT_SOCKET "/run/systemd/journal/socket"

void journal_construct_path(char *dst, size_t dst_len, const char *host_prefix, const char *namespace_str);

int journal_direct_fd(const char *path);
bool journal_direct_send(int fd, const char *msg, size_t msg_len);

bool is_path_unix_socket(const char *path);
bool is_stderr_connected_to_journal(void);

#endif //NETDATA_LOG_JOURNAL_H
3294
libnetdata/log/log.c
File diff suppressed because it is too large
Load diff
@ -9,6 +9,181 @@ extern "C" {

#include "../libnetdata.h"

#define ND_LOG_DEFAULT_THROTTLE_LOGS 1200
#define ND_LOG_DEFAULT_THROTTLE_PERIOD 3600

typedef enum __attribute__((__packed__)) {
    NDLS_UNSET = 0,   // internal use only
    NDLS_ACCESS,      // access.log
    NDLS_ACLK,        // aclk.log
    NDLS_COLLECTORS,  // collectors.log
    NDLS_DAEMON,      // error.log
    NDLS_HEALTH,      // health.log
    NDLS_DEBUG,       // debug.log

    // terminator
    _NDLS_MAX,
} ND_LOG_SOURCES;

typedef enum __attribute__((__packed__)) {
    NDLP_EMERG = LOG_EMERG,
    NDLP_ALERT = LOG_ALERT,
    NDLP_CRIT = LOG_CRIT,
    NDLP_ERR = LOG_ERR,
    NDLP_WARNING = LOG_WARNING,
    NDLP_NOTICE = LOG_NOTICE,
    NDLP_INFO = LOG_INFO,
    NDLP_DEBUG = LOG_DEBUG,
} ND_LOG_FIELD_PRIORITY;

typedef enum __attribute__((__packed__)) {
    // KEEP THESE IN THE SAME ORDER AS in thread_log_fields (log.c)
    // so that it is easy to audit for missing fields

    NDF_STOP = 0,
    NDF_TIMESTAMP_REALTIME_USEC,  // the timestamp of the log message - added automatically
    NDF_SYSLOG_IDENTIFIER,        // the syslog identifier of the application - added automatically
    NDF_LOG_SOURCE,               // DAEMON, COLLECTORS, HEALTH, ACCESS, ACLK - set at the log call
    NDF_PRIORITY,                 // the syslog priority (severity) - set at the log call
    NDF_ERRNO,                    // the ERRNO at the time of the log call - added automatically
    NDF_INVOCATION_ID,            // the INVOCATION_ID of Netdata - added automatically
    NDF_LINE,                     // the source code file line number - added automatically
    NDF_FILE,                     // the source code filename - added automatically
    NDF_FUNC,                     // the source code function - added automatically
    NDF_TID,                      // the thread ID of the thread logging - added automatically
    NDF_THREAD_TAG,               // the thread tag of the thread logging - added automatically
    NDF_MESSAGE_ID,               // for specific events
    NDF_MODULE,                   // for internal plugin modules; all others get the NDF_THREAD_TAG

    NDF_NIDL_NODE,                // the node / rrdhost currently being worked
    NDF_NIDL_INSTANCE,            // the instance / rrdset currently being worked
    NDF_NIDL_CONTEXT,             // the context of the instance currently being worked
    NDF_NIDL_DIMENSION,           // the dimension / rrddim currently being worked

    // web server, aclk and stream receiver
    NDF_SRC_TRANSPORT,            // the transport we received the request, one of: http, https, pluginsd

    // web server and stream receiver
    NDF_SRC_IP,                   // the streaming / web server source IP
    NDF_SRC_PORT,                 // the streaming / web server source Port
    NDF_SRC_CAPABILITIES,         // the stream receiver capabilities

    // stream sender (established links)
    NDF_DST_TRANSPORT,            // the transport we send the request, one of: http, https
    NDF_DST_IP,                   // the destination streaming IP
    NDF_DST_PORT,                 // the destination streaming Port
    NDF_DST_CAPABILITIES,         // the destination streaming capabilities

    // web server, aclk and stream receiver
    NDF_REQUEST_METHOD,           // for http-like requests, the http request method
    NDF_RESPONSE_CODE,            // for http-like requests, the http response code, otherwise a status string

    // web server (all), aclk (queries)
    NDF_CONNECTION_ID,            // the web server connection ID
    NDF_TRANSACTION_ID,           // the web server and API transaction ID
    NDF_RESPONSE_SENT_BYTES,      // for http-like requests, the response bytes
    NDF_RESPONSE_SIZE_BYTES,      // for http-like requests, the uncompressed response size
    NDF_RESPONSE_PREPARATION_TIME_USEC, // for http-like requests, the preparation time
    NDF_RESPONSE_SENT_TIME_USEC,  // for http-like requests, the time to send the response back
    NDF_RESPONSE_TOTAL_TIME_USEC, // for http-like requests, the total time to complete the response

    // health alerts
    NDF_ALERT_ID,
    NDF_ALERT_UNIQUE_ID,
    NDF_ALERT_EVENT_ID,
    NDF_ALERT_TRANSITION_ID,
    NDF_ALERT_CONFIG_HASH,
    NDF_ALERT_NAME,
    NDF_ALERT_CLASS,
    NDF_ALERT_COMPONENT,
    NDF_ALERT_TYPE,
    NDF_ALERT_EXEC,
    NDF_ALERT_RECIPIENT,
    NDF_ALERT_DURATION,
    NDF_ALERT_VALUE,
    NDF_ALERT_VALUE_OLD,
    NDF_ALERT_STATUS,
    NDF_ALERT_STATUS_OLD,
    NDF_ALERT_SOURCE,
    NDF_ALERT_UNITS,
    NDF_ALERT_SUMMARY,
    NDF_ALERT_INFO,
    NDF_ALERT_NOTIFICATION_REALTIME_USEC,
    // NDF_ALERT_FLAGS,

    // put new items here
    // leave the request URL and the message last

    NDF_REQUEST,                  // the request we are currently working on
    NDF_MESSAGE,                  // the log message, if any

    // terminator
    _NDF_MAX,
} ND_LOG_FIELD_ID;

typedef enum __attribute__((__packed__)) {
    NDFT_UNSET = 0,
    NDFT_TXT,
    NDFT_STR,
    NDFT_BFR,
    NDFT_U64,
    NDFT_I64,
    NDFT_DBL,
    NDFT_UUID,
    NDFT_CALLBACK,
} ND_LOG_STACK_FIELD_TYPE;

void nd_log_set_user_settings(ND_LOG_SOURCES source, const char *setting);
void nd_log_set_facility(const char *facility);
void nd_log_set_priority_level(const char *setting);
void nd_log_initialize(void);
void nd_log_reopen_log_files(void);
void chown_open_file(int fd, uid_t uid, gid_t gid);
void nd_log_chown_log_files(uid_t uid, gid_t gid);
void nd_log_set_flood_protection(size_t logs, time_t period);
void nd_log_initialize_for_external_plugins(const char *name);
void nd_log_set_thread_source(ND_LOG_SOURCES source);
bool nd_log_journal_socket_available(void);
ND_LOG_FIELD_ID nd_log_field_id_by_name(const char *field, size_t len);
int nd_log_priority2id(const char *priority);

typedef bool (*log_formatter_callback_t)(BUFFER *wb, void *data);

struct log_stack_entry {
    ND_LOG_FIELD_ID id;
    ND_LOG_STACK_FIELD_TYPE type;
    bool set;
    union {
        const char *txt;
        struct netdata_string *str;
        BUFFER *bfr;
        uint64_t u64;
        int64_t i64;
        double dbl;
        const uuid_t *uuid;
        struct {
            log_formatter_callback_t formatter;
            void *formatter_data;
        } cb;
    };
};

#define ND_LOG_STACK _cleanup_(log_stack_pop) struct log_stack_entry
#define ND_LOG_STACK_PUSH(lgs) log_stack_push(lgs)

#define ND_LOG_FIELD_TXT(field, value) (struct log_stack_entry){ .id = (field), .type = NDFT_TXT, .txt = (value), .set = true, }
#define ND_LOG_FIELD_STR(field, value) (struct log_stack_entry){ .id = (field), .type = NDFT_STR, .str = (value), .set = true, }
#define ND_LOG_FIELD_BFR(field, value) (struct log_stack_entry){ .id = (field), .type = NDFT_BFR, .bfr = (value), .set = true, }
#define ND_LOG_FIELD_U64(field, value) (struct log_stack_entry){ .id = (field), .type = NDFT_U64, .u64 = (value), .set = true, }
#define ND_LOG_FIELD_I64(field, value) (struct log_stack_entry){ .id = (field), .type = NDFT_I64, .i64 = (value), .set = true, }
#define ND_LOG_FIELD_DBL(field, value) (struct log_stack_entry){ .id = (field), .type = NDFT_DBL, .dbl = (value), .set = true, }
#define ND_LOG_FIELD_CB(field, func, data) (struct log_stack_entry){ .id = (field), .type = NDFT_CALLBACK, .cb = { .formatter = (func), .formatter_data = (data) }, .set = true, }
#define ND_LOG_FIELD_UUID(field, value) (struct log_stack_entry){ .id = (field), .type = NDFT_UUID, .uuid = (value), .set = true, }
#define ND_LOG_FIELD_END() (struct log_stack_entry){ .id = NDF_STOP, .type = NDFT_UNSET, .set = false, }

void log_stack_pop(void *ptr);
void log_stack_push(struct log_stack_entry *lgs);

#define D_WEB_BUFFER 0x0000000000000001
#define D_WEB_CLIENT 0x0000000000000002
#define D_LISTENER 0x0000000000000004
@ -46,114 +221,75 @@ extern "C" {

#define D_REPLICATION 0x0000002000000000
#define D_SYSTEM 0x8000000000000000

extern int web_server_is_multithreaded;

extern uint64_t debug_flags;

extern const char *program_name;

extern int stdaccess_fd;
extern FILE *stdaccess;

extern int stdhealth_fd;
extern FILE *stdhealth;

extern int stdcollector_fd;
extern FILE *stderror;

extern const char *stdaccess_filename;
extern const char *stderr_filename;
extern const char *stdout_filename;
extern const char *stdhealth_filename;
extern const char *stdcollector_filename;
extern const char *facility_log;

#ifdef ENABLE_ACLK
extern const char *aclklog_filename;
extern int aclklog_fd;
extern FILE *aclklog;
extern int aclklog_enabled;
#endif

extern int access_log_syslog;
extern int error_log_syslog;
extern int output_log_syslog;
extern int health_log_syslog;

extern time_t error_log_throttle_period;
extern unsigned long error_log_errors_per_period, error_log_errors_per_period_backup;
int error_log_limit(int reset);

void open_all_log_files();
void reopen_all_log_files();

#define LOG_DATE_LENGTH 26
void log_date(char *buffer, size_t len, time_t now);

static inline void debug_dummy(void) {}

void error_log_limit_reset(void);
void error_log_limit_unlimited(void);
void nd_log_limits_reset(void);
void nd_log_limits_unlimited(void);

typedef struct error_with_limit {
    time_t log_every;
    size_t count;
    time_t last_logged;
    usec_t sleep_ut;
} ERROR_LIMIT;

typedef enum netdata_log_level {
    NETDATA_LOG_LEVEL_ERROR,
    NETDATA_LOG_LEVEL_INFO,

    NETDATA_LOG_LEVEL_END
} netdata_log_level_t;

#define NETDATA_LOG_LEVEL_INFO_STR "info"
#define NETDATA_LOG_LEVEL_ERROR_STR "error"
#define NETDATA_LOG_LEVEL_ERROR_SHORT_STR "err"

extern netdata_log_level_t global_log_severity_level;
netdata_log_level_t log_severity_string_to_severity_level(char *level);
char *log_severity_level_to_severity_string(netdata_log_level_t level);
void log_set_global_severity_level(netdata_log_level_t value);
void log_set_global_severity_for_external_plugins();

#define error_limit_static_global_var(var, log_every_secs, sleep_usecs) static ERROR_LIMIT var = { .last_logged = 0, .count = 0, .log_every = (log_every_secs), .sleep_ut = (sleep_usecs) }
#define error_limit_static_thread_var(var, log_every_secs, sleep_usecs) static __thread ERROR_LIMIT var = { .last_logged = 0, .count = 0, .log_every = (log_every_secs), .sleep_ut = (sleep_usecs) }
#define NDLP_INFO_STR "info"

#ifdef NETDATA_INTERNAL_CHECKS
#define netdata_log_debug(type, args...) do { if(unlikely(debug_flags & type)) debug_int(__FILE__, __FUNCTION__, __LINE__, ##args); } while(0)
#define internal_error(condition, args...) do { if(unlikely(condition)) error_int(0, "IERR", __FILE__, __FUNCTION__, __LINE__, ##args); } while(0)
#define internal_fatal(condition, args...) do { if(unlikely(condition)) fatal_int(__FILE__, __FUNCTION__, __LINE__, ##args); } while(0)
#define netdata_log_debug(type, args...) do { if(unlikely(debug_flags & type)) netdata_logger(NDLS_DEBUG, NDLP_DEBUG, __FILE__, __FUNCTION__, __LINE__, ##args); } while(0)
#define internal_error(condition, args...) do { if(unlikely(condition)) netdata_logger(NDLS_DAEMON, NDLP_DEBUG, __FILE__, __FUNCTION__, __LINE__, ##args); } while(0)
#define internal_fatal(condition, args...) do { if(unlikely(condition)) netdata_logger_fatal(__FILE__, __FUNCTION__, __LINE__, ##args); } while(0)
#else
#define netdata_log_debug(type, args...) debug_dummy()
#define internal_error(args...) debug_dummy()
#define internal_fatal(args...) debug_dummy()
#endif

#define netdata_log_info(args...) info_int(0, __FILE__, __FUNCTION__, __LINE__, ##args)
#define collector_info(args...) info_int(1, __FILE__, __FUNCTION__, __LINE__, ##args)
#define infoerr(args...) error_int(0, "INFO", __FILE__, __FUNCTION__, __LINE__, ##args)
#define netdata_log_error(args...) error_int(0, "ERROR", __FILE__, __FUNCTION__, __LINE__, ##args)
#define collector_infoerr(args...) error_int(1, "INFO", __FILE__, __FUNCTION__, __LINE__, ##args)
#define collector_error(args...) error_int(1, "ERROR", __FILE__, __FUNCTION__, __LINE__, ##args)
#define error_limit(erl, args...) error_limit_int(erl, "ERROR", __FILE__, __FUNCTION__, __LINE__, ##args)
#define fatal(args...) fatal_int(__FILE__, __FUNCTION__, __LINE__, ##args)
#define fatal_assert(expr) ((expr) ? (void)(0) : fatal_int(__FILE__, __FUNCTION__, __LINE__, "Assertion `%s' failed", #expr))
#define fatal(args...) netdata_logger_fatal(__FILE__, __FUNCTION__, __LINE__, ##args)
#define fatal_assert(expr) ((expr) ? (void)(0) : netdata_logger_fatal(__FILE__, __FUNCTION__, __LINE__, "Assertion `%s' failed", #expr))

// ----------------------------------------------------------------------------
// normal logging

void netdata_logger(ND_LOG_SOURCES source, ND_LOG_FIELD_PRIORITY priority, const char *file, const char *function, unsigned long line, const char *fmt, ... ) PRINTFLIKE(6, 7);
#define nd_log(NDLS, NDLP, args...) netdata_logger(NDLS, NDLP, __FILE__, __FUNCTION__, __LINE__, ##args)
#define nd_log_daemon(NDLP, args...) netdata_logger(NDLS_DAEMON, NDLP, __FILE__, __FUNCTION__, __LINE__, ##args)
#define nd_log_collector(NDLP, args...) netdata_logger(NDLS_COLLECTORS, NDLP, __FILE__, __FUNCTION__, __LINE__, ##args)

#define netdata_log_info(args...) netdata_logger(NDLS_DAEMON, NDLP_INFO, __FILE__, __FUNCTION__, __LINE__, ##args)
#define netdata_log_error(args...) netdata_logger(NDLS_DAEMON, NDLP_ERR, __FILE__, __FUNCTION__, __LINE__, ##args)
#define collector_info(args...) netdata_logger(NDLS_COLLECTORS, NDLP_INFO, __FILE__, __FUNCTION__, __LINE__, ##args)
#define collector_error(args...) netdata_logger(NDLS_COLLECTORS, NDLP_ERR, __FILE__, __FUNCTION__, __LINE__, ##args)

#define log_aclk_message_bin(__data, __data_len, __tx, __mqtt_topic, __message_name) \
    nd_log(NDLS_ACLK, NDLP_INFO, \
        "direction:%s message:'%s' topic:'%s' json:'%.*s'", \
        (__tx) ? "OUTGOING" : "INCOMING", __message_name, __mqtt_topic, (int)(__data_len), __data)

// ----------------------------------------------------------------------------
// logging with limits

typedef struct error_with_limit {
    SPINLOCK spinlock;
    time_t log_every;
    size_t count;
    time_t last_logged;
    usec_t sleep_ut;
} ERROR_LIMIT;

#define nd_log_limit_static_global_var(var, log_every_secs, sleep_usecs) static ERROR_LIMIT var = { .last_logged = 0, .count = 0, .log_every = (log_every_secs), .sleep_ut = (sleep_usecs) }
#define nd_log_limit_static_thread_var(var, log_every_secs, sleep_usecs) static __thread ERROR_LIMIT var = { .last_logged = 0, .count = 0, .log_every = (log_every_secs), .sleep_ut = (sleep_usecs) }
void netdata_logger_with_limit(ERROR_LIMIT *erl, ND_LOG_SOURCES source, ND_LOG_FIELD_PRIORITY priority, const char *file, const char *function, unsigned long line, const char *fmt, ... ) PRINTFLIKE(7, 8);
#define nd_log_limit(erl, NDLS, NDLP, args...) netdata_logger_with_limit(erl, NDLS, NDLP, __FILE__, __FUNCTION__, __LINE__, ##args)

// ----------------------------------------------------------------------------

void send_statistics(const char *action, const char *action_result, const char *action_data);
void debug_int( const char *file, const char *function, const unsigned long line, const char *fmt, ... ) PRINTFLIKE(4, 5);
void info_int( int is_collector, const char *file, const char *function, const unsigned long line, const char *fmt, ... ) PRINTFLIKE(5, 6);
void error_int( int is_collector, const char *prefix, const char *file, const char *function, const unsigned long line, const char *fmt, ... ) PRINTFLIKE(6, 7);
void error_limit_int(ERROR_LIMIT *erl, const char *prefix, const char *file __maybe_unused, const char *function __maybe_unused, unsigned long line __maybe_unused, const char *fmt, ... ) PRINTFLIKE(6, 7);
void fatal_int( const char *file, const char *function, const unsigned long line, const char *fmt, ... ) NORETURN PRINTFLIKE(4, 5);
void netdata_log_access( const char *fmt, ... ) PRINTFLIKE(1, 2);
void netdata_log_health( const char *fmt, ... ) PRINTFLIKE(1, 2);

#ifdef ENABLE_ACLK
void log_aclk_message_bin( const char *data, const size_t data_len, int tx, const char *mqtt_topic, const char *message_name);
#endif
void netdata_logger_fatal( const char *file, const char *function, unsigned long line, const char *fmt, ... ) NORETURN PRINTFLIKE(4, 5);

# ifdef __cplusplus
}
1015
libnetdata/log/log2journal.c
Normal file
File diff suppressed because it is too large
Load diff

518
libnetdata/log/log2journal.md
Normal file

@ -0,0 +1,518 @@
# log2journal

`log2journal` and `systemd-cat-native` can be used to convert a structured log file, such as the ones generated by web servers, into `systemd-journal` entries.

By combining these tools with the usual UNIX shell tools, you can create advanced log processing pipelines that send any kind of structured text logs to systemd-journald. This is a simple, but powerful and efficient, way to handle log processing.

The process involves the usual piping of shell commands, to get and process the log files in real time.

The overall process looks like this:

```bash
tail -F /var/log/nginx/*.log |\  # outputs log lines
log2journal 'PATTERN' |\         # outputs Journal Export Format
sed -u -e SEARCH-REPLACE-RULES |\ # optional rewriting rules
systemd-cat-native               # send to local/remote journald
```

Let's see the steps:

1. `tail -F /var/log/nginx/*.log`<br/>this command tails all `*.log` files in `/var/log/nginx/`. We use `-F` instead of `-f` to ensure that files will still be tailed after log rotation.
2. `log2journal` is a Netdata program. It reads log entries and extracts fields, according to the PCRE2 pattern it accepts. It can also apply some basic operations to the fields, like injecting new fields or duplicating existing ones. The output of `log2journal` is in the systemd Journal Export Format, and it looks like this:
   ```bash
   KEY1=VALUE1 # << start of the first log line
   KEY2=VALUE2
               # << log lines separator
   KEY1=VALUE1 # << start of the second log line
   KEY2=VALUE2
   ```
3. `sed` is an optional step, shown here as an example. Any kind of processing can be applied at this stage, in case we want to alter the fields in some way. For example, we may want to set the PRIORITY field of systemd-journal to make Netdata dashboards and `journalctl` color the internal server errors. Or we may want to anonymize the logs, to remove sensitive information from them. Or we may even want to remove the variable parts of the requests, to make them uniform. We will see below how such processing can be done.
4. `systemd-cat-native` is a Netdata program. It can send the logs to a local `systemd-journald` (journal namespaces supported), or to a remote `systemd-journal-remote`.
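Because records in the Journal Export Format are separated by empty lines, standard text tools can operate on whole log entries at once. As a quick, hypothetical sketch (not part of the tools above), `awk` in paragraph mode splits the stream into records:

```shell
# Parse Journal Export Format records (blank-line separated) with awk's paragraph mode
printf 'KEY1=VALUE1\nKEY2=VALUE2\n\nKEY1=other\n' |
  awk 'BEGIN { RS=""; FS="\n" } { printf "record %d has %d fields\n", NR, NF }'
# record 1 has 2 fields
# record 2 has 1 fields
```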
## Real-life example

We have an nginx server logging in this format:

```bash
log_format access '$remote_addr - $remote_user [$time_local] '
                  '"$request" $status $body_bytes_sent '
                  '$request_length $request_time '
                  '"$http_referer" "$http_user_agent"';
```

First, let's find the right pattern for `log2journal`. We ask ChatGPT:

```
My nginx log uses this log format:

log_format access '$remote_addr - $remote_user [$time_local] '
                  '"$request" $status $body_bytes_sent '
                  '$request_length $request_time '
                  '"$http_referer" "$http_user_agent"';

I want to use `log2journal` to convert this log for systemd-journal.
`log2journal` accepts a PCRE2 regular expression, using the named groups
in the pattern as the journal fields to extract from the logs.

Prefix all PCRE2 group names with `NGINX_` and use capital characters only.

For the $request, use the field `MESSAGE` (without NGINX_ prefix), so that
it will appear in systemd journals as the message of the log.

Please give me the PCRE2 pattern.
```

ChatGPT replies with this:

```regexp
^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>[^"]+)" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"
```
Let's test it with a sample line (instead of `tail`):

```bash
# echo '1.2.3.4 - - [19/Nov/2023:00:24:43 +0000] "GET /index.html HTTP/1.1" 200 4172 104 0.001 "-" "Go-http-client/1.1"' | log2journal '^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>[^"]+)" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"'
MESSAGE=GET /index.html HTTP/1.1
NGINX_BODY_BYTES_SENT=4172
NGINX_HTTP_REFERER=-
NGINX_HTTP_USER_AGENT=Go-http-client/1.1
NGINX_REMOTE_ADDR=1.2.3.4
NGINX_REMOTE_USER=-
NGINX_REQUEST_LENGTH=104
NGINX_REQUEST_TIME=0.001
NGINX_STATUS=200
NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000

```

As you can see, it extracted all the fields.

The `MESSAGE`, however, contains 3 fields itself: the request method, the URL and the protocol version. Let's ask ChatGPT to extract these too:
```
I see that the MESSAGE has 3 key items in it. The request method (GET, POST,
etc), the URL and HTTP protocol version.

I want to keep the MESSAGE as it is, with all the information in it, but also
extract the 3 items from it as separate fields.

Can this be done?
```

ChatGPT responded with this:

```regexp
^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>(?<NGINX_METHOD>[A-Z]+) (?<NGINX_URL>[^ ]+) HTTP/(?<NGINX_HTTP_VERSION>[^"]+))" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"
```

Let's test this too:

```bash
# echo '1.2.3.4 - - [19/Nov/2023:00:24:43 +0000] "GET /index.html HTTP/1.1" 200 4172 104 0.001 "-" "Go-http-client/1.1"' | log2journal '^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>(?<NGINX_METHOD>[A-Z]+) (?<NGINX_URL>[^ ]+) HTTP/(?<NGINX_HTTP_VERSION>[^"]+))" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"'
MESSAGE=GET /index.html HTTP/1.1 # <<<<<<<<< MESSAGE
NGINX_BODY_BYTES_SENT=4172
NGINX_HTTP_REFERER=-
NGINX_HTTP_USER_AGENT=Go-http-client/1.1
NGINX_HTTP_VERSION=1.1           # <<<<<<<<< VERSION
NGINX_METHOD=GET                 # <<<<<<<<< METHOD
NGINX_REMOTE_ADDR=1.2.3.4
NGINX_REMOTE_USER=-
NGINX_REQUEST_LENGTH=104
NGINX_REQUEST_TIME=0.001
NGINX_STATUS=200
NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000
NGINX_URL=/index.html            # <<<<<<<<< URL

```

Ideally, we would want the 5xx errors to be red in our `journalctl` output. To achieve that we need to add a PRIORITY field to set the log level. Log priorities are numeric and follow the `syslog` priorities. Checking `/usr/include/sys/syslog.h` we can see these:

```c
#define LOG_EMERG   0 /* system is unusable */
#define LOG_ALERT   1 /* action must be taken immediately */
#define LOG_CRIT    2 /* critical conditions */
#define LOG_ERR     3 /* error conditions */
#define LOG_WARNING 4 /* warning conditions */
#define LOG_NOTICE  5 /* normal but significant condition */
#define LOG_INFO    6 /* informational */
#define LOG_DEBUG   7 /* debug-level messages */
```

Avoid setting priority to 0 (`LOG_EMERG`), because these messages will appear on your terminal (the journal uses `wall` to let you know of such events). A good priority for errors is 3 (red in `journalctl`), or 4 (yellow in `journalctl`) for warnings.
|
||||
|
||||
To set the PRIORITY field in the output, we can use `NGINX_STATUS` fields. We need a copy of it, which we will alter later.
|
||||
|
||||
We can instruct `log2journal` to duplicate `NGINX_STATUS`, like this: `log2journal --duplicate=STATUS2PRIORITY=NGINX_STATUS`. Let's try it:
|
||||
|
||||
```bash
# echo '1.2.3.4 - - [19/Nov/2023:00:24:43 +0000] "GET /index.html HTTP/1.1" 200 4172 104 0.001 "-" "Go-http-client/1.1"' | log2journal '^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>(?<NGINX_METHOD>[A-Z]+) (?<NGINX_URL>[^ ]+) HTTP/(?<NGINX_HTTP_VERSION>[^"]+))" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"' --duplicate=STATUS2PRIORITY=NGINX_STATUS
MESSAGE=GET /index.html HTTP/1.1
NGINX_BODY_BYTES_SENT=4172
NGINX_HTTP_REFERER=-
NGINX_HTTP_USER_AGENT=Go-http-client/1.1
NGINX_HTTP_VERSION=1.1
NGINX_METHOD=GET
NGINX_REMOTE_ADDR=1.2.3.4
NGINX_REMOTE_USER=-
NGINX_REQUEST_LENGTH=104
NGINX_REQUEST_TIME=0.001
NGINX_STATUS=200
STATUS2PRIORITY=200 # <<<<<<<<< STATUS2PRIORITY IS HERE
NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000
NGINX_URL=/index.html
```

Now that we have the `STATUS2PRIORITY` field equal to the `NGINX_STATUS`, we can use a `sed` command to change it to the `PRIORITY` field we want. The `sed` command could be:

```bash
sed -u -e 's|STATUS2PRIORITY=5.*|PRIORITY=3|' -e 's|STATUS2PRIORITY=.*|PRIORITY=6|'
```

We use `-u` for unbuffered communication.

This command first changes all 5xx `STATUS2PRIORITY` fields to `PRIORITY=3` (error) and then changes all the rest to `PRIORITY=6` (info). Let's see the whole of it:

```bash
# echo '1.2.3.4 - - [19/Nov/2023:00:24:43 +0000] "GET /index.html HTTP/1.1" 200 4172 104 0.001 "-" "Go-http-client/1.1"' | log2journal '^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>(?<NGINX_METHOD>[A-Z]+) (?<NGINX_URL>[^ ]+) HTTP/(?<NGINX_HTTP_VERSION>[^"]+))" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"' --duplicate=STATUS2PRIORITY=NGINX_STATUS | sed -u -e 's|STATUS2PRIORITY=5.*|PRIORITY=3|' -e 's|STATUS2PRIORITY=.*|PRIORITY=6|'
MESSAGE=GET /index.html HTTP/1.1
NGINX_BODY_BYTES_SENT=4172
NGINX_HTTP_REFERER=-
NGINX_HTTP_USER_AGENT=Go-http-client/1.1
NGINX_HTTP_VERSION=1.1
NGINX_METHOD=GET
NGINX_REMOTE_ADDR=1.2.3.4
NGINX_REMOTE_USER=-
NGINX_REQUEST_LENGTH=104
NGINX_REQUEST_TIME=0.001
NGINX_STATUS=200
PRIORITY=6 # <<<<<<<<< PRIORITY IS HERE
NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000
NGINX_URL=/index.html
```

Similarly, we could duplicate `NGINX_URL` to `NGINX_ENDPOINT` and then process it with `sed` to remove any query string, or to replace IDs in the URL path with constant names, giving us uniform endpoints independent of their parameters.
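For example, such a `sed` step could look like the following sketch (the `NGINX_ENDPOINT` field name and the `/products/ID` normalization are hypothetical, chosen just for illustration):

```bash
# Hypothetical post-processing of a duplicated NGINX_ENDPOINT field:
# rule 1 strips the query string, rule 2 replaces numeric IDs with a constant
echo 'NGINX_ENDPOINT=/products/123/view?page=2' |
  sed -u \
      -e 's|^\(NGINX_ENDPOINT=[^?]*\)?.*$|\1|' \
      -e 's|^NGINX_ENDPOINT=/products/[0-9][0-9]*|NGINX_ENDPOINT=/products/ID|'
# NGINX_ENDPOINT=/products/ID/view
```

The same approach works for any field `log2journal` emits, since every field is a single `KEY=VALUE` line on the stream.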
To complete the example, we can also inject a `SYSLOG_IDENTIFIER` with `log2journal`, using `--inject=SYSLOG_IDENTIFIER=nginx`, like this:

```bash
# echo '1.2.3.4 - - [19/Nov/2023:00:24:43 +0000] "GET /index.html HTTP/1.1" 200 4172 104 0.001 "-" "Go-http-client/1.1"' | log2journal '^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>(?<NGINX_METHOD>[A-Z]+) (?<NGINX_URL>[^ ]+) HTTP/(?<NGINX_HTTP_VERSION>[^"]+))" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"' --duplicate=STATUS2PRIORITY=NGINX_STATUS --inject=SYSLOG_IDENTIFIER=nginx | sed -u -e 's|STATUS2PRIORITY=5.*|PRIORITY=3|' -e 's|STATUS2PRIORITY=.*|PRIORITY=6|'
MESSAGE=GET /index.html HTTP/1.1
NGINX_BODY_BYTES_SENT=4172
NGINX_HTTP_REFERER=-
NGINX_HTTP_USER_AGENT=Go-http-client/1.1
NGINX_HTTP_VERSION=1.1
NGINX_METHOD=GET
NGINX_REMOTE_ADDR=1.2.3.4
NGINX_REMOTE_USER=-
NGINX_REQUEST_LENGTH=104
NGINX_REQUEST_TIME=0.001
NGINX_STATUS=200
PRIORITY=6
NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000
NGINX_URL=/index.html
SYSLOG_IDENTIFIER=nginx # <<<<<<<<< THIS HAS BEEN ADDED
```

Now the message is ready to be sent to a systemd-journal. For this we use `systemd-cat-native`. This command can send such messages to a journal running on the localhost, a local journal namespace, or a `systemd-journal-remote` running on another server. By just appending `| systemd-cat-native` to the command, the message will be sent to the local journal.
```bash
# echo '1.2.3.4 - - [19/Nov/2023:00:24:43 +0000] "GET /index.html HTTP/1.1" 200 4172 104 0.001 "-" "Go-http-client/1.1"' | log2journal '^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>(?<NGINX_METHOD>[A-Z]+) (?<NGINX_URL>[^ ]+) HTTP/(?<NGINX_HTTP_VERSION>[^"]+))" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"' --duplicate=STATUS2PRIORITY=NGINX_STATUS --inject=SYSLOG_IDENTIFIER=nginx | sed -u -e 's|STATUS2PRIORITY=5.*|PRIORITY=3|' -e 's|STATUS2PRIORITY=.*|PRIORITY=6|' | systemd-cat-native
# no output

# let's find the message
# journalctl -o verbose SYSLOG_IDENTIFIER=nginx
Sun 2023-11-19 04:34:06.583912 EET [s=1eb59e7934984104ab3b61f5d9648057;i=115b6d4;b=7282d89d2e6e4299969a6030302ff3e4;m=69b419673;t=60a783417ac72;x=2cec5dde8bf01ee7]
    PRIORITY=6
    _UID=0
    _GID=0
    _BOOT_ID=7282d89d2e6e4299969a6030302ff3e4
    _MACHINE_ID=6b72c55db4f9411dbbb80b70537bf3a8
    _HOSTNAME=costa-xps9500
    _RUNTIME_SCOPE=system
    _TRANSPORT=journal
    _CAP_EFFECTIVE=1ffffffffff
    _AUDIT_LOGINUID=1000
    _AUDIT_SESSION=1
    _SYSTEMD_CGROUP=/user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-59780d3d-a3ff-4a82-a6fe-8d17d2261106.scope
    _SYSTEMD_OWNER_UID=1000
    _SYSTEMD_UNIT=user@1000.service
    _SYSTEMD_USER_UNIT=vte-spawn-59780d3d-a3ff-4a82-a6fe-8d17d2261106.scope
    _SYSTEMD_SLICE=user-1000.slice
    _SYSTEMD_USER_SLICE=app-org.gnome.Terminal.slice
    _SYSTEMD_INVOCATION_ID=6195d8c4c6654481ac9a30e9a8622ba1
    _COMM=systemd-cat-nat
    MESSAGE=GET /index.html HTTP/1.1 # <<<<<<<<< CHECK
    NGINX_BODY_BYTES_SENT=4172 # <<<<<<<<< CHECK
    NGINX_HTTP_REFERER=- # <<<<<<<<< CHECK
    NGINX_HTTP_USER_AGENT=Go-http-client/1.1 # <<<<<<<<< CHECK
    NGINX_HTTP_VERSION=1.1 # <<<<<<<<< CHECK
    NGINX_METHOD=GET # <<<<<<<<< CHECK
    NGINX_REMOTE_ADDR=1.2.3.4 # <<<<<<<<< CHECK
    NGINX_REMOTE_USER=- # <<<<<<<<< CHECK
    NGINX_REQUEST_LENGTH=104 # <<<<<<<<< CHECK
    NGINX_REQUEST_TIME=0.001 # <<<<<<<<< CHECK
    NGINX_STATUS=200 # <<<<<<<<< CHECK
    NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000 # <<<<<<<<< CHECK
    NGINX_URL=/index.html # <<<<<<<<< CHECK
    SYSLOG_IDENTIFIER=nginx # <<<<<<<<< CHECK
    _PID=354312
    _SOURCE_REALTIME_TIMESTAMP=1700361246583912
```

So, the log line, with all its fields parsed, ended up in systemd-journal.

The complete example would look like the following script.
Running this script with the parameter `test` will produce output on the terminal for you to inspect.
Unmatched log entries are added to the journal with PRIORITY=1 (`LOG_ALERT`), so that you can spot them.

We also used the `--filename-key` option of `log2journal`, which parses the filename when `tail` switches output
between files, and adds the field `NGINX_LOG_FILE` with the filename each log line comes from.

Finally, the script also adds the field `NGINX_STATUS_FAMILY`, taking values `2xx`, `3xx`, etc., so that
it is easy to find all the logs of a specific status family.

```bash
#!/usr/bin/env bash

test=0
last=0
send_or_show='./systemd-cat-native'
[ "${1}" = "test" ] && test=1 && last=100 && send_or_show=cat

pattern='(?x)                                   # Enable PCRE2 extended mode
^
(?<NGINX_REMOTE_ADDR>[^ ]+) \s - \s             # NGINX_REMOTE_ADDR
(?<NGINX_REMOTE_USER>[^ ]+) \s                  # NGINX_REMOTE_USER
\[
  (?<NGINX_TIME_LOCAL>[^\]]+)                   # NGINX_TIME_LOCAL
\]
\s+ "
(?<MESSAGE>                                     # MESSAGE
  (?<NGINX_METHOD>[A-Z]+) \s+                   # NGINX_METHOD
  (?<NGINX_URL>[^ ]+) \s+                       # NGINX_URL
  HTTP/(?<NGINX_HTTP_VERSION>[^"]+)             # NGINX_HTTP_VERSION
)
" \s+
(?<NGINX_STATUS>\d+) \s+                        # NGINX_STATUS
(?<NGINX_BODY_BYTES_SENT>\d+) \s+               # NGINX_BODY_BYTES_SENT
"(?<NGINX_HTTP_REFERER>[^"]*)" \s+              # NGINX_HTTP_REFERER
"(?<NGINX_HTTP_USER_AGENT>[^"]*)"               # NGINX_HTTP_USER_AGENT
'

tail -n $last -F /var/log/nginx/*access.log |\
    log2journal "${pattern}" \
        --filename-key=NGINX_LOG_FILE \
        --duplicate=STATUS2PRIORITY=NGINX_STATUS \
        --duplicate=STATUS_FAMILY=NGINX_STATUS \
        --inject=SYSLOG_IDENTIFIER=nginx \
        --unmatched-key=MESSAGE \
        --inject-unmatched=PRIORITY=1 \
    | sed -u \
        -e 's|^STATUS2PRIORITY=5.*$|PRIORITY=3|' \
        -e 's|^STATUS2PRIORITY=.*$|PRIORITY=6|' \
        -e 's|^STATUS_FAMILY=\([0-9]\).*$|NGINX_STATUS_FAMILY=\1xx|' \
        -e 's|^STATUS_FAMILY=.*$|NGINX_STATUS_FAMILY=UNKNOWN|' \
    | $send_or_show
```

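The status-family rewrite in the script above can be checked in isolation. A minimal sketch, feeding two sample fields (one numeric, one not) through the same `sed` rules:

```bash
# Demonstrates the STATUS_FAMILY -> NGINX_STATUS_FAMILY rewrite used in the script:
# a leading digit becomes "<digit>xx", anything else becomes "UNKNOWN"
printf 'STATUS_FAMILY=404\nSTATUS_FAMILY=oops\n' |
  sed -u \
      -e 's|^STATUS_FAMILY=\([0-9]\).*$|NGINX_STATUS_FAMILY=\1xx|' \
      -e 's|^STATUS_FAMILY=.*$|NGINX_STATUS_FAMILY=UNKNOWN|'
# NGINX_STATUS_FAMILY=4xx
# NGINX_STATUS_FAMILY=UNKNOWN
```

Note that rule order matters: once the first expression has rewritten the line, the catch-all second expression no longer matches it.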
## `log2journal` options

```
Netdata log2journal v1.43.0-337-g116dc1bc3

Convert structured log input to systemd Journal Export Format.

Using PCRE2 patterns, extract the fields from structured logs on the standard
input, and generate output according to systemd Journal Export Format

Usage: ./log2journal [OPTIONS] PATTERN

Options:

  --filename-key=KEY
      Add a field with KEY as the key and the current filename as value.
      Automatically detects filenames when piped after 'tail -F',
      and tail matches multiple filenames.
      To inject the filename when tailing a single file, use --inject.

  --unmatched-key=KEY
      Include unmatched log entries in the output with KEY as the field name.
      Use this to include unmatched entries to the output stream.
      Usually it should be set to --unmatched-key=MESSAGE so that the
      unmatched entry will appear as the log message in the journals.
      Use --inject-unmatched to inject additional fields to unmatched lines.

  --duplicate=TARGET=KEY1[,KEY2[,KEY3[,...]]
      Create a new key called TARGET, duplicating the values of the keys
      given. Useful for further processing. When multiple keys are given,
      their values are separated by comma.
      Up to 2048 duplications can be given on the command line, and up to
      10 keys per duplication command are allowed.

  --inject=LINE
      Inject constant fields to the output (both matched and unmatched logs).
      --inject entries are added to unmatched lines too, when their key is
      not used in --inject-unmatched (--inject-unmatched override --inject).
      Up to 2048 fields can be injected.

  --inject-unmatched=LINE
      Inject lines into the output for each unmatched log entry.
      Usually, --inject-unmatched=PRIORITY=3 is needed to mark the unmatched
      lines as errors, so that they can easily be spotted in the journals.
      Up to 2048 such lines can be injected.

  -h, --help
      Display this help and exit.

  PATTERN
      PATTERN should be a valid PCRE2 regular expression.
      RE2 regular expressions (like the ones usually used in Go applications),
      are usually valid PCRE2 patterns too.
      Regular expressions without named groups are ignored.

The maximum line length accepted is 1048576 characters.
The maximum number of fields in the PCRE2 pattern is 8192.

JOURNAL FIELDS RULES (enforced by systemd-journald)

  - field names can be up to 64 characters
  - the only allowed field characters are A-Z, 0-9 and underscore
  - the first character of fields cannot be a digit
  - protected journal fields start with underscore:
    * they are accepted by systemd-journal-remote
    * they are NOT accepted by a local systemd-journald

  For best results, always include these fields:

    MESSAGE=TEXT
      The MESSAGE is the body of the log entry.
      This field is what we usually see in our logs.

    PRIORITY=NUMBER
      PRIORITY sets the severity of the log entry.
      0=emerg, 1=alert, 2=crit, 3=err, 4=warn, 5=notice, 6=info, 7=debug
      - Emergency events (0) are usually broadcast to all terminals.
      - Emergency, alert, critical, and error (0-3) are usually colored red.
      - Warning (4) entries are usually colored yellow.
      - Notice (5) entries are usually bold or have a brighter white color.
      - Info (6) entries are the default.
      - Debug (7) entries are usually grayed or dimmed.

    SYSLOG_IDENTIFIER=NAME
      SYSLOG_IDENTIFIER sets the name of application.
      Use something descriptive, like: SYSLOG_IDENTIFIER=nginx-logs

You can find the most common fields at 'man systemd.journal-fields'.
```

## `systemd-cat-native` options

```
Netdata systemd-cat-native v1.43.0-319-g4ada93a6e

This program reads from its standard input, lines in the format:

KEY1=VALUE1\n
KEY2=VALUE2\n
KEYN=VALUEN\n
\n

and sends them to systemd-journal.

  - Binary journal fields are not accepted at its input
  - Binary journal fields can be generated after newline processing
  - Messages have to be separated by an empty line
  - Keys starting with underscore are not accepted (by journald)
  - Other rules imposed by systemd-journald are imposed (by journald)

Usage:

  ./systemd-cat-native
      [--newline=STRING]
      [--log-as-netdata|-N]
      [--namespace=NAMESPACE] [--socket=PATH]
      [--url=URL [--key=FILENAME] [--cert=FILENAME] [--trust=FILENAME|all]]

The program has the following modes of logging:

  * Log to a local systemd-journald or stderr

    This is the default mode. If systemd-journald is available, logs will be
    sent to systemd, otherwise logs will be printed on stderr, using logfmt
    formatting. Options --socket and --namespace are available to configure
    the journal destination:

    --socket=PATH
        The path of a systemd-journald UNIX socket.
        The program will use the default systemd-journald socket when this
        option is not used.

    --namespace=NAMESPACE
        The name of a configured and running systemd-journald namespace.
        The program will produce the socket path based on its internal
        defaults, to send the messages to the systemd journal namespace.

  * Log as Netdata, enabled with --log-as-netdata or -N

    In this mode the program uses environment variables set by Netdata for
    the log destination. Only log fields defined by Netdata are accepted.
    If the environment variables expected by Netdata are not found, it
    falls back to stderr logging in logfmt format.

  * Log to a systemd-journal-remote TCP socket, enabled with --url=URL

    In this mode, the program will directly sent logs to a remote systemd
    journal (systemd-journal-remote expected at the destination)
    This mode is available even when the local system does not support
    systemd, or even it is not Linux, allowing a remote Linux systemd
    journald to become the logs database of the local system.

    --url=URL
        The destination systemd-journal-remote address and port, similarly
        to what /etc/systemd/journal-upload.conf accepts.
        Usually it is in the form: https://ip.address:19532
        Both http and https URLs are accepted. When using https, the
        following additional options are accepted:

    --key=FILENAME
        The filename of the private key of the server.
        The default is: /etc/ssl/private/journal-upload.pem

    --cert=FILENAME
        The filename of the public key of the server.
        The default is: /etc/ssl/certs/journal-upload.pem

    --trust=FILENAME | all
        The filename of the trusted CA public key.
        The default is: /etc/ssl/ca/trusted.pem
        The keyword 'all' can be used to trust all CAs.

NEWLINES PROCESSING

systemd-journal logs entries may have newlines in them. However the
Journal Export Format uses binary formatted data to achieve this,
making it hard for text processing.

To overcome this limitation, this program allows single-line text
formatted values at its input, to be binary formatted multi-line Journal
Export Format at its output.

To achieve that it allows replacing a given string to a newline.
The parameter --newline=STRING allows setting the string to be replaced
with newlines.

For example by setting --newline='{NEWLINE}', the program will replace
all occurrences of {NEWLINE} with the newline character, within each
VALUE of the KEY=VALUE lines. Once this is done, the program will
switch the field to the binary Journal Export Format before sending the
log event to systemd-journal.
```
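A message for `systemd-cat-native` is just a group of `KEY=VALUE` lines followed by an empty line. A minimal sketch of building one (the `{NL}` marker and the `demo` identifier are arbitrary choices for this example); appending `| systemd-cat-native --newline='{NL}'` would ship it to the local journal with a two-line MESSAGE:

```bash
# Emit one complete log event in the KEY=VALUE format systemd-cat-native reads;
# {NL} is a placeholder that --newline='{NL}' would turn into a real newline
printf 'MESSAGE=line1{NL}line2\nPRIORITY=6\nSYSLOG_IDENTIFIER=demo\n\n'
```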

`libnetdata/log/systemd-cat-native.c` (new file, 781 lines):

// SPDX-License-Identifier: GPL-3.0-or-later

#include "systemd-cat-native.h"
#include "../required_dummies.h"

#ifdef __FreeBSD__
#include <sys/endian.h>
#endif

#ifdef __APPLE__
#include <machine/endian.h>
#endif

static void log_message_to_stderr(BUFFER *msg) {
    CLEAN_BUFFER *tmp = buffer_create(0, NULL);

    for(size_t i = 0; i < msg->len ;i++) {
        if(isprint(msg->buffer[i]))
            buffer_putc(tmp, msg->buffer[i]);
        else {
            buffer_putc(tmp, '[');
            buffer_print_uint64_hex(tmp, msg->buffer[i]);
            buffer_putc(tmp, ']');
        }
    }

    fprintf(stderr, "SENDING: %s\n", buffer_tostring(tmp));
}

static inline buffered_reader_ret_t get_next_line(struct buffered_reader *reader, BUFFER *line, int timeout_ms) {
    while(true) {
        if(unlikely(!buffered_reader_next_line(reader, line))) {
            buffered_reader_ret_t ret = buffered_reader_read_timeout(reader, STDIN_FILENO, timeout_ms, false);
            if(unlikely(ret != BUFFERED_READER_READ_OK))
                return ret;

            continue;
        }
        else {
            // make sure the buffer is NULL terminated
            line->buffer[line->len] = '\0';

            // remove the trailing newlines
            while(line->len && line->buffer[line->len - 1] == '\n')
                line->buffer[--line->len] = '\0';

            return BUFFERED_READER_READ_OK;
        }
    }
}

static inline size_t copy_replacing_newlines(char *dst, size_t dst_len, const char *src, size_t src_len, const char *newline) {
    if (!dst || !src) return 0;

    const char *current_src = src;
    const char *src_end = src + src_len; // Pointer to the end of src
    char *current_dst = dst;
    size_t remaining_dst_len = dst_len;
    size_t newline_len = newline && *newline ? strlen(newline) : 0;

    size_t bytes_copied = 0; // To track the number of bytes copied

    while (remaining_dst_len > 1 && current_src < src_end) {
        if (newline_len > 0) {
            const char *found = strstr(current_src, newline);
            if (found && found < src_end) {
                size_t copy_len = found - current_src;
                if (copy_len >= remaining_dst_len) copy_len = remaining_dst_len - 1;

                memcpy(current_dst, current_src, copy_len);
                current_dst += copy_len;
                *current_dst++ = '\n';
                remaining_dst_len -= (copy_len + 1);
                bytes_copied += copy_len + 1; // +1 for the newline character
                current_src = found + newline_len;
                continue;
            }
        }

        // Copy the remaining part of src to dst
        size_t copy_len = src_end - current_src;
        if (copy_len >= remaining_dst_len) copy_len = remaining_dst_len - 1;

        memcpy(current_dst, current_src, copy_len);
        current_dst += copy_len;
        remaining_dst_len -= copy_len;
        bytes_copied += copy_len;
        break;
    }

    // Ensure the string is null-terminated
    *current_dst = '\0';

    return bytes_copied;
}

static inline void buffer_memcat_replacing_newlines(BUFFER *wb, const char *src, size_t src_len, const char *newline) {
    if(!src) return;

    const char *equal;
    if(!newline || !*newline || !strstr(src, newline) || !(equal = strchr(src, '='))) {
        buffer_memcat(wb, src, src_len);
        buffer_putc(wb, '\n');
        return;
    }

    size_t key_len = equal - src;
    buffer_memcat(wb, src, key_len);
    buffer_putc(wb, '\n');

    char *length_ptr = &wb->buffer[wb->len];
    uint64_t le_size = 0;
    buffer_memcat(wb, &le_size, sizeof(le_size));

    const char *value = ++equal;
    size_t value_len = src_len - key_len - 1;
    buffer_need_bytes(wb, value_len + 1);
    size_t size = copy_replacing_newlines(&wb->buffer[wb->len], value_len + 1, value, value_len, newline);
    wb->len += size;
    buffer_putc(wb, '\n');

    le_size = htole64(size);
    memcpy(length_ptr, &le_size, sizeof(le_size));
}

// ----------------------------------------------------------------------------
// log to a systemd-journal-remote

#ifdef HAVE_CURL
#include <curl/curl.h>

#ifndef HOST_NAME_MAX
#define HOST_NAME_MAX 256
#endif

char global_hostname[HOST_NAME_MAX] = "";
char global_boot_id[UUID_COMPACT_STR_LEN] = "";
#define BOOT_ID_PATH "/proc/sys/kernel/random/boot_id"

#define DEFAULT_PRIVATE_KEY "/etc/ssl/private/journal-upload.pem"
#define DEFAULT_PUBLIC_KEY "/etc/ssl/certs/journal-upload.pem"
#define DEFAULT_CA_CERT "/etc/ssl/ca/trusted.pem"

struct upload_data {
    char *data;
    size_t length;
};

static size_t systemd_journal_remote_read_callback(void *ptr, size_t size, size_t nmemb, void *userp) {
    struct upload_data *upload = (struct upload_data *)userp;
    size_t buffer_size = size * nmemb;

    if (upload->length) {
        size_t copy_size = upload->length < buffer_size ? upload->length : buffer_size;
        memcpy(ptr, upload->data, copy_size);
        upload->data += copy_size;
        upload->length -= copy_size;
        return copy_size;
    }

    return 0;
}

CURL* initialize_connection_to_systemd_journal_remote(const char* url, const char* private_key, const char* public_key, const char* ca_cert, struct curl_slist **headers) {
    CURL *curl = curl_easy_init();
    if (!curl) {
        fprintf(stderr, "Failed to initialize curl\n");
        return NULL;
    }

    *headers = curl_slist_append(*headers, "Content-Type: application/vnd.fdo.journal");
    *headers = curl_slist_append(*headers, "Transfer-Encoding: chunked");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, *headers);
    curl_easy_setopt(curl, CURLOPT_URL, url);
    curl_easy_setopt(curl, CURLOPT_POST, 1L);
    curl_easy_setopt(curl, CURLOPT_READFUNCTION, systemd_journal_remote_read_callback);

    if (strncmp(url, "https://", 8) == 0) {
        if (private_key) curl_easy_setopt(curl, CURLOPT_SSLKEY, private_key);
        if (public_key) curl_easy_setopt(curl, CURLOPT_SSLCERT, public_key);

        if (strcmp(ca_cert, "all") != 0) {
            curl_easy_setopt(curl, CURLOPT_CAINFO, ca_cert);
        } else {
            curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 0L);
        }
    }
    // curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L); // Remove for less verbose output

    return curl;
}

static void journal_remote_complete_event(BUFFER *msg, usec_t *monotonic_ut) {
    usec_t ut = now_monotonic_usec();

    if(monotonic_ut)
        *monotonic_ut = ut;

    buffer_sprintf(msg,
                   ""
                   "__REALTIME_TIMESTAMP=%llu\n"
                   "__MONOTONIC_TIMESTAMP=%llu\n"
                   "_BOOT_ID=%s\n"
                   "_HOSTNAME=%s\n"
                   "\n"
                   , now_realtime_usec()
                   , ut
                   , global_boot_id
                   , global_hostname
    );
}

static CURLcode journal_remote_send_buffer(CURL* curl, BUFFER *msg) {

    // log_message_to_stderr(msg);

    struct upload_data upload = {0};

    if (!curl || !buffer_strlen(msg))
        return CURLE_FAILED_INIT;

    upload.data = (char *) buffer_tostring(msg);
    upload.length = buffer_strlen(msg);

    curl_easy_setopt(curl, CURLOPT_READDATA, &upload);
    curl_easy_setopt(curl, CURLOPT_INFILESIZE_LARGE, (curl_off_t)upload.length);

    return curl_easy_perform(curl);
}

typedef enum {
    LOG_TO_JOURNAL_REMOTE_BAD_PARAMS = -1,
    LOG_TO_JOURNAL_REMOTE_CANNOT_INITIALIZE = -2,
    LOG_TO_JOURNAL_REMOTE_CANNOT_SEND = -3,
    LOG_TO_JOURNAL_REMOTE_CANNOT_READ = -4,
} log_to_journal_remote_ret_t;

static log_to_journal_remote_ret_t log_input_to_journal_remote(const char *url, const char *key, const char *cert, const char *trust, const char *newline, int timeout_ms) {
    if(!url || !*url) {
        fprintf(stderr, "No URL is given.\n");
        return LOG_TO_JOURNAL_REMOTE_BAD_PARAMS;
    }

    if(timeout_ms < 10)
        timeout_ms = 10;

    global_boot_id[0] = '\0';
    char boot_id[1024];
    if(read_file(BOOT_ID_PATH, boot_id, sizeof(boot_id)) == 0) {
        uuid_t uuid;
        if(uuid_parse_flexi(boot_id, uuid) == 0)
            uuid_unparse_lower_compact(uuid, global_boot_id);
        else
            fprintf(stderr, "WARNING: cannot parse the UUID found in '%s'.\n", BOOT_ID_PATH);
    }

    if(global_boot_id[0] == '\0') {
        fprintf(stderr, "WARNING: cannot read '%s'. Will generate a random _BOOT_ID.\n", BOOT_ID_PATH);
        uuid_t uuid;
        uuid_generate_random(uuid);
        uuid_unparse_lower_compact(uuid, global_boot_id);
    }

    if(global_hostname[0] == '\0') {
        if(gethostname(global_hostname, sizeof(global_hostname)) != 0) {
            fprintf(stderr, "WARNING: cannot get system's hostname. Will use internal default.\n");
            snprintfz(global_hostname, sizeof(global_hostname), "systemd-cat-native-unknown-hostname");
        }
    }

    if(!key)
        key = DEFAULT_PRIVATE_KEY;

    if(!cert)
        cert = DEFAULT_PUBLIC_KEY;

    if(!trust)
        trust = DEFAULT_CA_CERT;

    char full_url[4096];
    snprintfz(full_url, sizeof(full_url), "%s/upload", url);

    CURL *curl;
    CURLcode res = CURLE_OK;
    struct curl_slist *headers = NULL;

    curl_global_init(CURL_GLOBAL_ALL);
    curl = initialize_connection_to_systemd_journal_remote(full_url, key, cert, trust, &headers);

    if(!curl)
        return LOG_TO_JOURNAL_REMOTE_CANNOT_INITIALIZE;

    struct buffered_reader reader;
    buffered_reader_init(&reader);
    CLEAN_BUFFER *line = buffer_create(sizeof(reader.read_buffer), NULL);
    CLEAN_BUFFER *msg = buffer_create(sizeof(reader.read_buffer), NULL);

    size_t msg_full_events = 0;
    size_t msg_partial_fields = 0;
    usec_t msg_started_ut = 0;
    size_t failures = 0;
    size_t messages_logged = 0;

    log_to_journal_remote_ret_t ret = 0;

    while(true) {
        buffered_reader_ret_t rc = get_next_line(&reader, line, timeout_ms);
        if(rc == BUFFERED_READER_READ_POLL_TIMEOUT) {
            if(msg_full_events && !msg_partial_fields) {
                res = journal_remote_send_buffer(curl, msg);
                if(res != CURLE_OK) {
                    fprintf(stderr, "journal_remote_send_buffer() failed: %s\n", curl_easy_strerror(res));
                    failures++;
                    ret = LOG_TO_JOURNAL_REMOTE_CANNOT_SEND;
                    goto cleanup;
                }
                else
                    messages_logged++;

                msg_full_events = 0;
                buffer_flush(msg);
            }
        }
        else if(rc == BUFFERED_READER_READ_OK) {
            if(!line->len) {
                // an empty line - we are done for this message
                if(msg_partial_fields) {
                    msg_partial_fields = 0;

                    usec_t ut;
                    journal_remote_complete_event(msg, &ut);
                    if(!msg_full_events)
                        msg_started_ut = ut;

                    msg_full_events++;

                    if(ut - msg_started_ut >= USEC_PER_SEC / 2) {
                        res = journal_remote_send_buffer(curl, msg);
                        if(res != CURLE_OK) {
                            fprintf(stderr, "journal_remote_send_buffer() failed: %s\n", curl_easy_strerror(res));
                            failures++;
                            ret = LOG_TO_JOURNAL_REMOTE_CANNOT_SEND;
                            goto cleanup;
                        }
                        else
                            messages_logged++;

                        msg_full_events = 0;
                        buffer_flush(msg);
                    }
                }
            }
            else {
                buffer_memcat_replacing_newlines(msg, line->buffer, line->len, newline);
                msg_partial_fields++;
            }

            buffer_flush(line);
        }
        else {
            fprintf(stderr, "cannot read input data, failed with code %d\n", rc);
            ret = LOG_TO_JOURNAL_REMOTE_CANNOT_READ;
            break;
        }
    }

    if (msg_full_events || msg_partial_fields) {
        if(msg_partial_fields) {
            msg_partial_fields = 0;
            msg_full_events++;
            journal_remote_complete_event(msg, NULL);
        }

        if(msg_full_events) {
            res = journal_remote_send_buffer(curl, msg);
            if(res != CURLE_OK) {
                fprintf(stderr, "journal_remote_send_buffer() failed: %s\n", curl_easy_strerror(res));
                failures++;
            }
            else
                messages_logged++;

            msg_full_events = 0;
            buffer_flush(msg);
        }
    }

cleanup:
    curl_easy_cleanup(curl);
    curl_slist_free_all(headers);
    curl_global_cleanup();

    return ret;
}

#endif

static int help(void) {
    fprintf(stderr,
            "\n"
            "Netdata systemd-cat-native " PACKAGE_VERSION "\n"
            "\n"
            "This program reads from its standard input, lines in the format:\n"
            "\n"
            "KEY1=VALUE1\\n\n"
            "KEY2=VALUE2\\n\n"
            "KEYN=VALUEN\\n\n"
            "\\n\n"
            "\n"
            "and sends them to systemd-journal.\n"
            "\n"
            "   - Binary journal fields are not accepted at its input\n"
            "   - Binary journal fields can be generated after newline processing\n"
            "   - Messages have to be separated by an empty line\n"
            "   - Keys starting with underscore are not accepted (by journald)\n"
            "   - Other restrictions imposed by systemd-journald also apply\n"
            "\n"
            "Usage:\n"
            "\n"
            "   %s\n"
            "          [--newline=STRING]\n"
            "          [--log-as-netdata|-N]\n"
            "          [--namespace=NAMESPACE] [--socket=PATH]\n"
#ifdef HAVE_CURL
            "          [--url=URL [--key=FILENAME] [--cert=FILENAME] [--trust=FILENAME|all]]\n"
#endif
            "\n"
            "The program has the following modes of logging:\n"
            "\n"
            "   * Log to a local systemd-journald or stderr\n"
            "\n"
            "     This is the default mode. If systemd-journald is available, logs will be\n"
            "     sent to systemd, otherwise logs will be printed on stderr, using logfmt\n"
            "     formatting. Options --socket and --namespace are available to configure\n"
            "     the journal destination:\n"
            "\n"
            "     --socket=PATH\n"
            "          The path of a systemd-journald UNIX socket.\n"
            "          The program will use the default systemd-journald socket when this\n"
            "          option is not used.\n"
            "\n"
            "     --namespace=NAMESPACE\n"
            "          The name of a configured and running systemd-journald namespace.\n"
            "          The program will produce the socket path based on its internal\n"
            "          defaults, to send the messages to the systemd journal namespace.\n"
            "\n"
            "   * Log as Netdata, enabled with --log-as-netdata or -N\n"
            "\n"
            "     In this mode the program uses environment variables set by Netdata for\n"
            "     the log destination. Only log fields defined by Netdata are accepted.\n"
            "     If the environment variables expected by Netdata are not found, it\n"
            "     falls back to stderr logging in logfmt format.\n"
#ifdef HAVE_CURL
            "\n"
            "   * Log to a systemd-journal-remote TCP socket, enabled with --url=URL\n"
            "\n"
            "     In this mode, the program will directly send logs to a remote systemd\n"
            "     journal (systemd-journal-remote expected at the destination).\n"
            "     This mode is available even when the local system does not support\n"
            "     systemd, or even when it is not Linux, allowing a remote Linux systemd\n"
            "     journald to become the logs database of the local system.\n"
            "\n"
            "     Unfortunately systemd-journal-remote does not accept compressed\n"
            "     data over the network, so the stream will be uncompressed.\n"
            "\n"
            "     --url=URL\n"
            "          The destination systemd-journal-remote address and port, similarly\n"
            "          to what /etc/systemd/journal-upload.conf accepts.\n"
            "          Usually it is in the form: https://ip.address:19532\n"
            "          Both http and https URLs are accepted. When using https, the\n"
            "          following additional options are accepted:\n"
            "\n"
            "     --key=FILENAME\n"
            "          The filename of the private key of the server.\n"
            "          The default is: " DEFAULT_PRIVATE_KEY "\n"
            "\n"
            "     --cert=FILENAME\n"
            "          The filename of the public key of the server.\n"
            "          The default is: " DEFAULT_PUBLIC_KEY "\n"
            "\n"
            "     --trust=FILENAME | all\n"
            "          The filename of the trusted CA public key.\n"
            "          The default is: " DEFAULT_CA_CERT "\n"
            "          The keyword 'all' can be used to trust all CAs.\n"
            "\n"
            "     --keep-trying\n"
            "          Keep trying to send the message, if the remote journal is not there.\n"
#endif
            "\n"
            "   NEWLINES PROCESSING\n"
            "   systemd-journal log entries may have newlines in them. However, the\n"
            "   Journal Export Format uses binary formatted data to achieve this,\n"
            "   making it hard for text processing.\n"
            "\n"
            "   To overcome this limitation, this program allows single-line text\n"
            "   formatted values at its input, to be binary formatted multi-line Journal\n"
            "   Export Format at its output.\n"
            "\n"
            "   To achieve that it allows replacing a given string with a newline.\n"
            "   The parameter --newline=STRING allows setting the string to be replaced\n"
            "   with newlines.\n"
            "\n"
            "   For example by setting --newline='{NEWLINE}', the program will replace\n"
            "   all occurrences of {NEWLINE} with the newline character, within each\n"
            "   VALUE of the KEY=VALUE lines. Once this is done, the program will\n"
            "   switch the field to the binary Journal Export Format before sending the\n"
            "   log event to systemd-journal.\n"
            "\n",
            program_name);

    return 1;
}

// ----------------------------------------------------------------------------
// log as Netdata

static void lgs_reset(struct log_stack_entry *lgs) {
    for(size_t i = 1; i < _NDF_MAX ;i++) {
        if(lgs[i].type == NDFT_TXT && lgs[i].set && lgs[i].txt)
            freez((void *)lgs[i].txt);

        lgs[i] = ND_LOG_FIELD_TXT(i, NULL);
    }

    lgs[0] = ND_LOG_FIELD_TXT(NDF_MESSAGE, NULL);
    lgs[_NDF_MAX] = ND_LOG_FIELD_END();
}

static const char *strdupz_replacing_newlines(const char *src, const char *newline) {
    if(!src) src = "";

    size_t src_len = strlen(src);
    char *buffer = mallocz(src_len + 1);
    copy_replacing_newlines(buffer, src_len + 1, src, src_len, newline);
    return buffer;
}

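The `strdupz_replacing_newlines()` helper above delegates the actual substitution to libnetdata's `copy_replacing_newlines()`. As a rough illustration of the intended semantics only (the function name `copy_replacing_marker` and its signature are hypothetical, not the actual libnetdata implementation), the marker-to-newline substitution could be sketched as:

```c
#include <assert.h>
#include <string.h>

// Copy src into dst, replacing every occurrence of the marker string
// (e.g. "{NEWLINE}") with a literal '\n'. Hypothetical sketch only;
// truncates silently if dst is too small.
static void copy_replacing_marker(char *dst, size_t dst_size,
                                  const char *src, const char *marker) {
    size_t mlen = (marker && *marker) ? strlen(marker) : 0;
    size_t out = 0;
    while (*src && out + 1 < dst_size) {
        if (mlen && strncmp(src, marker, mlen) == 0) {
            dst[out++] = '\n';
            src += mlen;
        }
        else
            dst[out++] = *src++;
    }
    dst[out] = '\0';
}
```

With `--newline='{NL}'`, the input value `line1{NL}line2` would become a two-line value before it is serialized to the journal.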
static int log_input_as_netdata(const char *newline, int timeout_ms) {
    struct buffered_reader reader;
    buffered_reader_init(&reader);
    CLEAN_BUFFER *line = buffer_create(sizeof(reader.read_buffer), NULL);

    ND_LOG_STACK lgs[_NDF_MAX + 1] = { 0 };
    ND_LOG_STACK_PUSH(lgs);
    lgs_reset(lgs);

    size_t fields_added = 0;
    size_t messages_logged = 0;
    ND_LOG_FIELD_PRIORITY priority = NDLP_INFO;

    while(get_next_line(&reader, line, timeout_ms) == BUFFERED_READER_READ_OK) {
        if(!line->len) {
            // an empty line - we are done for this message

            nd_log(NDLS_HEALTH, priority,
                   "added %zu fields", // if the user supplied a MESSAGE, this will be ignored
                   fields_added);

            lgs_reset(lgs);
            fields_added = 0;
            messages_logged++;
        }
        else {
            char *equal = strchr(line->buffer, '=');
            if(equal) {
                const char *field = line->buffer;
                size_t field_len = equal - line->buffer;
                ND_LOG_FIELD_ID id = nd_log_field_id_by_name(field, field_len);
                if(id != NDF_STOP) {
                    const char *value = ++equal;

                    if(lgs[id].txt)
                        freez((void *) lgs[id].txt);

                    lgs[id].txt = strdupz_replacing_newlines(value, newline);
                    lgs[id].set = true;

                    fields_added++;

                    if(id == NDF_PRIORITY)
                        priority = nd_log_priority2id(value);
                }
                else {
                    struct log_stack_entry backup = lgs[NDF_MESSAGE];
                    lgs[NDF_MESSAGE] = ND_LOG_FIELD_TXT(NDF_MESSAGE, NULL);

                    nd_log(NDLS_COLLECTORS, NDLP_ERR,
                           "Field '%.*s' is not a Netdata field. Ignoring it.",
                           (int)field_len, field);

                    lgs[NDF_MESSAGE] = backup;
                }
            }
            else {
                struct log_stack_entry backup = lgs[NDF_MESSAGE];
                lgs[NDF_MESSAGE] = ND_LOG_FIELD_TXT(NDF_MESSAGE, NULL);

                nd_log(NDLS_COLLECTORS, NDLP_ERR,
                       "Line does not contain an = sign; ignoring it: %s",
                       line->buffer);

                lgs[NDF_MESSAGE] = backup;
            }
        }

        buffer_flush(line);
    }

    if(fields_added) {
        nd_log(NDLS_HEALTH, priority, "added %zu fields", fields_added);
        messages_logged++;
    }

    return messages_logged ? 0 : 1;
}

// ----------------------------------------------------------------------------
// log to a local systemd-journald

static bool journal_local_send_buffer(int fd, BUFFER *msg) {
    // log_message_to_stderr(msg);

    bool ret = journal_direct_send(fd, msg->buffer, msg->len);
    if (!ret)
        fprintf(stderr, "Cannot send message to systemd journal.\n");

    return ret;
}

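`journal_local_send_buffer()` ships a buffer that has already been serialized. For values containing newlines, the Journal Export Format requires the binary encoding: field name, a newline, a little-endian 64-bit length, the raw data, then a trailing newline. A minimal sketch of that framing (the helper name and the plain `char` buffer are illustrative; netdata uses its own `BUFFER` type, and the sketch omits bounds checking):

```c
#include <stdint.h>
#include <string.h>

// Append one binary Journal Export Format field to buf at offset pos:
//   FIELDNAME '\n' <uint64 little-endian length> <data> '\n'
// Returns the new offset. Sketch only: caller must ensure buf is big enough.
static size_t append_journal_binary_field(char *buf, size_t pos,
                                          const char *field,
                                          const char *data, uint64_t len) {
    size_t flen = strlen(field);
    memcpy(&buf[pos], field, flen);
    pos += flen;
    buf[pos++] = '\n';

    // serialize the length explicitly as little-endian, byte by byte,
    // so the sketch works on any host endianness
    for (int i = 0; i < 8; i++)
        buf[pos++] = (char)((len >> (8 * i)) & 0xFF);

    memcpy(&buf[pos], data, (size_t)len);
    pos += (size_t)len;
    buf[pos++] = '\n';
    return pos;
}
```

This is why the help text says binary fields "can be generated after newline processing": a value that gains a `\n` via `--newline` must switch from the simple `KEY=VALUE` form to this binary form.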
static int log_input_to_journal(const char *socket, const char *namespace, const char *newline, int timeout_ms) {
    char path[FILENAME_MAX + 1];
    int fd = -1;

    if(socket)
        snprintfz(path, sizeof(path), "%s", socket);
    else
        journal_construct_path(path, sizeof(path), NULL, namespace);

    fd = journal_direct_fd(path);
    if (fd == -1) {
        fprintf(stderr, "Cannot open '%s' as a UNIX socket (errno = %d)\n",
                path, errno);
        return 1;
    }

    struct buffered_reader reader;
    buffered_reader_init(&reader);
    CLEAN_BUFFER *line = buffer_create(sizeof(reader.read_buffer), NULL);
    CLEAN_BUFFER *msg = buffer_create(sizeof(reader.read_buffer), NULL);

    size_t messages_logged = 0;
    size_t failed_messages = 0;

    while(get_next_line(&reader, line, timeout_ms) == BUFFERED_READER_READ_OK) {
        if (!line->len) {
            // an empty line - we are done for this message
            if (msg->len) {
                if(journal_local_send_buffer(fd, msg))
                    messages_logged++;
                else {
                    failed_messages++;
                    goto cleanup;
                }
            }

            buffer_flush(msg);
        }
        else
            buffer_memcat_replacing_newlines(msg, line->buffer, line->len, newline);

        buffer_flush(line);
    }

    if (msg && msg->len) {
        if(journal_local_send_buffer(fd, msg))
            messages_logged++;
        else
            failed_messages++;
    }

cleanup:
    return !failed_messages && messages_logged ? 0 : 1;
}

int main(int argc, char *argv[]) {
    clocks_init();
    nd_log_initialize_for_external_plugins(argv[0]);

    int timeout_ms = -1; // wait forever
    bool log_as_netdata = false;
    const char *newline = NULL;
    const char *namespace = NULL;
    const char *socket = getenv("NETDATA_SYSTEMD_JOURNAL_PATH");
#ifdef HAVE_CURL
    const char *url = NULL;
    const char *key = NULL;
    const char *cert = NULL;
    const char *trust = NULL;
    bool keep_trying = false;
#endif

    for(int i = 1; i < argc ;i++) {
        const char *k = argv[i];

        if(strcmp(k, "--help") == 0 || strcmp(k, "-h") == 0)
            return help();

        else if(strcmp(k, "--log-as-netdata") == 0 || strcmp(k, "-N") == 0)
            log_as_netdata = true;

        else if(strncmp(k, "--namespace=", 12) == 0)
            namespace = &k[12];

        else if(strncmp(k, "--socket=", 9) == 0)
            socket = &k[9];

        else if(strncmp(k, "--newline=", 10) == 0)
            newline = &k[10];

#ifdef HAVE_CURL
        else if (strncmp(k, "--url=", 6) == 0)
            url = &k[6];

        else if (strncmp(k, "--key=", 6) == 0)
            key = &k[6];

        else if (strncmp(k, "--cert=", 7) == 0)
            cert = &k[7];

        else if (strncmp(k, "--trust=", 8) == 0)
            trust = &k[8];

        else if (strcmp(k, "--keep-trying") == 0)
            keep_trying = true;
#endif
        else {
            fprintf(stderr, "Unknown parameter '%s'\n", k);
            return 1;
        }
    }

#ifdef HAVE_CURL
    if(log_as_netdata && url) {
        fprintf(stderr, "Cannot log to a systemd-journal-remote URL as Netdata. "
                        "Please either give --url or --log-as-netdata, not both.\n");
        return 1;
    }

    if(socket && url) {
        fprintf(stderr, "Cannot log to a systemd-journal-remote URL using a UNIX socket. "
                        "Please either give --url or --socket, not both.\n");
        return 1;
    }

    if(url && namespace) {
        fprintf(stderr, "Cannot log to a systemd-journal-remote URL using a namespace. "
                        "Please either give --url or --namespace, not both.\n");
        return 1;
    }
#endif

    if(log_as_netdata && namespace) {
        fprintf(stderr, "Cannot log as netdata using a namespace. "
                        "Please either give --log-as-netdata or --namespace, not both.\n");
        return 1;
    }

    if(log_as_netdata)
        return log_input_as_netdata(newline, timeout_ms);

#ifdef HAVE_CURL
    if(url) {
        log_to_journal_remote_ret_t rc;
        do {
            rc = log_input_to_journal_remote(url, key, cert, trust, newline, timeout_ms);
        } while(keep_trying && rc == LOG_TO_JOURNAL_REMOTE_CANNOT_SEND);
    }
#endif

    return log_input_to_journal(socket, namespace, newline, timeout_ms);
}

libnetdata/log/systemd-cat-native.h (new file, +8)
@@ -0,0 +1,8 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#include "../libnetdata.h"

#ifndef NETDATA_SYSTEMD_CAT_NATIVE_H
#define NETDATA_SYSTEMD_CAT_NATIVE_H

#endif //NETDATA_SYSTEMD_CAT_NATIVE_H

@@ -24,7 +24,7 @@ static SOCKET_PEERS netdata_ssl_peers(NETDATA_SSL *ssl) {
 }

 static void netdata_ssl_log_error_queue(const char *call, NETDATA_SSL *ssl, unsigned long err) {
-    error_limit_static_thread_var(erl, 1, 0);
+    nd_log_limit_static_thread_var(erl, 1, 0);

     if(err == SSL_ERROR_NONE)
         err = ERR_get_error();

@@ -103,8 +103,9 @@ static void netdata_ssl_log_error_queue(const char *call, NETDATA_SSL *ssl, unsi
         ERR_error_string_n(err, str, 1024);
         str[1024] = '\0';
         SOCKET_PEERS peers = netdata_ssl_peers(ssl);
-        error_limit(&erl, "SSL: %s() on socket local [[%s]:%d] <-> remote [[%s]:%d], returned error %lu (%s): %s",
-                    call, peers.local.ip, peers.local.port, peers.peer.ip, peers.peer.port, err, code, str);
+        nd_log_limit(&erl, NDLS_DAEMON, NDLP_ERR,
+                     "SSL: %s() on socket local [[%s]:%d] <-> remote [[%s]:%d], returned error %lu (%s): %s",
+                     call, peers.local.ip, peers.local.port, peers.peer.ip, peers.peer.port, err, code, str);

     } while((err = ERR_get_error()));
 }

@@ -179,7 +180,7 @@ void netdata_ssl_close(NETDATA_SSL *ssl) {
 }

 static inline bool is_handshake_complete(NETDATA_SSL *ssl, const char *op) {
-    error_limit_static_thread_var(erl, 1, 0);
+    nd_log_limit_static_thread_var(erl, 1, 0);

     if(unlikely(!ssl->conn)) {
         internal_error(true, "SSL: trying to %s on a NULL connection", op);

@@ -189,22 +190,25 @@ static inline bool is_handshake_complete(NETDATA_SSL *ssl, const char *op) {
     switch(ssl->state) {
         case NETDATA_SSL_STATE_NOT_SSL: {
             SOCKET_PEERS peers = netdata_ssl_peers(ssl);
-            error_limit(&erl, "SSL: on socket local [[%s]:%d] <-> remote [[%s]:%d], attempt to %s on non-SSL connection",
-                        peers.local.ip, peers.local.port, peers.peer.ip, peers.peer.port, op);
+            nd_log_limit(&erl, NDLS_DAEMON, NDLP_WARNING,
+                         "SSL: on socket local [[%s]:%d] <-> remote [[%s]:%d], attempt to %s on non-SSL connection",
+                         peers.local.ip, peers.local.port, peers.peer.ip, peers.peer.port, op);
             return false;
         }

         case NETDATA_SSL_STATE_INIT: {
             SOCKET_PEERS peers = netdata_ssl_peers(ssl);
-            error_limit(&erl, "SSL: on socket local [[%s]:%d] <-> remote [[%s]:%d], attempt to %s on an incomplete connection",
-                        peers.local.ip, peers.local.port, peers.peer.ip, peers.peer.port, op);
+            nd_log_limit(&erl, NDLS_DAEMON, NDLP_WARNING,
+                         "SSL: on socket local [[%s]:%d] <-> remote [[%s]:%d], attempt to %s on an incomplete connection",
+                         peers.local.ip, peers.local.port, peers.peer.ip, peers.peer.port, op);
             return false;
         }

         case NETDATA_SSL_STATE_FAILED: {
             SOCKET_PEERS peers = netdata_ssl_peers(ssl);
-            error_limit(&erl, "SSL: on socket local [[%s]:%d] <-> remote [[%s]:%d], attempt to %s on a failed connection",
-                        peers.local.ip, peers.local.port, peers.peer.ip, peers.peer.port, op);
+            nd_log_limit(&erl, NDLS_DAEMON, NDLP_WARNING,
+                         "SSL: on socket local [[%s]:%d] <-> remote [[%s]:%d], attempt to %s on a failed connection",
+                         peers.local.ip, peers.local.port, peers.peer.ip, peers.peer.port, op);
             return false;
         }

@@ -296,7 +300,7 @@ ssize_t netdata_ssl_write(NETDATA_SSL *ssl, const void *buf, size_t num) {
 }

 static inline bool is_handshake_initialized(NETDATA_SSL *ssl, const char *op) {
-    error_limit_static_thread_var(erl, 1, 0);
+    nd_log_limit_static_thread_var(erl, 1, 0);

     if(unlikely(!ssl->conn)) {
         internal_error(true, "SSL: trying to %s on a NULL connection", op);

@@ -306,8 +310,9 @@ static inline bool is_handshake_initialized(NETDATA_SSL *ssl, const char *op) {
     switch(ssl->state) {
         case NETDATA_SSL_STATE_NOT_SSL: {
             SOCKET_PEERS peers = netdata_ssl_peers(ssl);
-            error_limit(&erl, "SSL: on socket local [[%s]:%d] <-> remote [[%s]:%d], attempt to %s on non-SSL connection",
-                        peers.local.ip, peers.local.port, peers.peer.ip, peers.peer.port, op);
+            nd_log_limit(&erl, NDLS_DAEMON, NDLP_WARNING,
+                         "SSL: on socket local [[%s]:%d] <-> remote [[%s]:%d], attempt to %s on non-SSL connection",
+                         peers.local.ip, peers.local.port, peers.peer.ip, peers.peer.port, op);
             return false;
         }

@@ -317,15 +322,17 @@ static inline bool is_handshake_initialized(NETDATA_SSL *ssl, const char *op) {

         case NETDATA_SSL_STATE_FAILED: {
             SOCKET_PEERS peers = netdata_ssl_peers(ssl);
-            error_limit(&erl, "SSL: on socket local [[%s]:%d] <-> remote [[%s]:%d], attempt to %s on a failed connection",
-                        peers.local.ip, peers.local.port, peers.peer.ip, peers.peer.port, op);
+            nd_log_limit(&erl, NDLS_DAEMON, NDLP_WARNING,
+                         "SSL: on socket local [[%s]:%d] <-> remote [[%s]:%d], attempt to %s on a failed connection",
+                         peers.local.ip, peers.local.port, peers.peer.ip, peers.peer.port, op);
             return false;
         }

         case NETDATA_SSL_STATE_COMPLETE: {
             SOCKET_PEERS peers = netdata_ssl_peers(ssl);
-            error_limit(&erl, "SSL: on socket local [[%s]:%d] <-> remote [[%s]:%d], attempt to %s on an complete connection",
-                        peers.local.ip, peers.local.port, peers.peer.ip, peers.peer.port, op);
+            nd_log_limit(&erl, NDLS_DAEMON, NDLP_WARNING,
+                         "SSL: on socket local [[%s]:%d] <-> remote [[%s]:%d], attempt to %s on an complete connection",
+                         peers.local.ip, peers.local.port, peers.peer.ip, peers.peer.port, op);
             return false;
         }
     }

(One file diff was suppressed because it is too large.)

@@ -134,8 +134,6 @@ size_t netdata_threads_init(void) {
     i = pthread_attr_getstacksize(netdata_threads_attr, &stacksize);
     if(i != 0)
         fatal("pthread_attr_getstacksize() failed with code %d.", i);
-    else
-        netdata_log_debug(D_OPTIONS, "initial pthread stack size is %zu bytes", stacksize);

     return stacksize;
 }

@@ -152,12 +150,12 @@ void netdata_threads_init_after_fork(size_t stacksize) {
     if(netdata_threads_attr && stacksize > (size_t)PTHREAD_STACK_MIN) {
         i = pthread_attr_setstacksize(netdata_threads_attr, stacksize);
         if(i != 0)
-            netdata_log_error("pthread_attr_setstacksize() to %zu bytes, failed with code %d.", stacksize, i);
+            nd_log(NDLS_DAEMON, NDLP_WARNING, "pthread_attr_setstacksize() to %zu bytes, failed with code %d.", stacksize, i);
         else
-            netdata_log_info("Set threads stack size to %zu bytes", stacksize);
+            nd_log(NDLS_DAEMON, NDLP_DEBUG, "Set threads stack size to %zu bytes", stacksize);
     }
     else
-        netdata_log_error("Invalid pthread stacksize %zu", stacksize);
+        nd_log(NDLS_DAEMON, NDLP_WARNING, "Invalid pthread stacksize %zu", stacksize);
 }

 // ----------------------------------------------------------------------------

@@ -183,12 +181,12 @@ void rrd_collector_finished(void);
 static void thread_cleanup(void *ptr) {
     if(netdata_thread != ptr) {
         NETDATA_THREAD *info = (NETDATA_THREAD *)ptr;
-        netdata_log_error("THREADS: internal error - thread local variable does not match the one passed to this function. Expected thread '%s', passed thread '%s'", netdata_thread->tag, info->tag);
+        nd_log(NDLS_DAEMON, NDLP_ERR, "THREADS: internal error - thread local variable does not match the one passed to this function. Expected thread '%s', passed thread '%s'", netdata_thread->tag, info->tag);
     }
     spinlock_lock(&netdata_thread->detach_lock);

     if(!(netdata_thread->options & NETDATA_THREAD_OPTION_DONT_LOG_CLEANUP))
-        netdata_log_info("thread with task id %d finished", gettid());
+        nd_log(NDLS_DAEMON, NDLP_DEBUG, "thread with task id %d finished", gettid());

     rrd_collector_finished();
     sender_thread_buffer_free();

@@ -222,9 +220,9 @@ static void thread_set_name_np(NETDATA_THREAD *nt) {
 #endif

         if (ret != 0)
-            netdata_log_error("cannot set pthread name of %d to %s. ErrCode: %d", gettid(), threadname, ret);
+            nd_log(NDLS_DAEMON, NDLP_WARNING, "cannot set pthread name of %d to %s. ErrCode: %d", gettid(), threadname, ret);
         else
-            netdata_log_info("set name of thread %d to %s", gettid(), threadname);
+            nd_log(NDLS_DAEMON, NDLP_DEBUG, "set name of thread %d to %s", gettid(), threadname);

     }
 }

@@ -247,7 +245,7 @@ void uv_thread_set_name_np(uv_thread_t ut, const char* name) {
     thread_name_get(true);

     if (ret)
-        netdata_log_info("cannot set libuv thread name to %s. Err: %d", threadname, ret);
+        nd_log(NDLS_DAEMON, NDLP_NOTICE, "cannot set libuv thread name to %s. Err: %d", threadname, ret);
 }

 void os_thread_get_current_name_np(char threadname[NETDATA_THREAD_NAME_MAX + 1])

@@ -264,13 +262,13 @@ static void *netdata_thread_init(void *ptr) {
     netdata_thread = (NETDATA_THREAD *)ptr;

     if(!(netdata_thread->options & NETDATA_THREAD_OPTION_DONT_LOG_STARTUP))
-        netdata_log_info("thread created with task id %d", gettid());
+        nd_log(NDLS_DAEMON, NDLP_DEBUG, "thread created with task id %d", gettid());

     if(pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, NULL) != 0)
-        netdata_log_error("cannot set pthread cancel type to DEFERRED.");
+        nd_log(NDLS_DAEMON, NDLP_WARNING, "cannot set pthread cancel type to DEFERRED.");

     if(pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL) != 0)
-        netdata_log_error("cannot set pthread cancel state to ENABLE.");
+        nd_log(NDLS_DAEMON, NDLP_WARNING, "cannot set pthread cancel state to ENABLE.");

     thread_set_name_np(ptr);

@@ -294,13 +292,13 @@ int netdata_thread_create(netdata_thread_t *thread, const char *tag, NETDATA_THR

     int ret = pthread_create(thread, netdata_threads_attr, netdata_thread_init, info);
     if(ret != 0)
-        netdata_log_error("failed to create new thread for %s. pthread_create() failed with code %d", tag, ret);
+        nd_log(NDLS_DAEMON, NDLP_ERR, "failed to create new thread for %s. pthread_create() failed with code %d", tag, ret);

     else {
         if (!(options & NETDATA_THREAD_OPTION_JOINABLE)) {
             int ret2 = pthread_detach(*thread);
             if (ret2 != 0)
-                netdata_log_error("cannot request detach of newly created %s thread. pthread_detach() failed with code %d", tag, ret2);
+                nd_log(NDLS_DAEMON, NDLP_WARNING, "cannot request detach of newly created %s thread. pthread_detach() failed with code %d", tag, ret2);
         }
     }

@@ -318,9 +316,9 @@ int netdata_thread_cancel(netdata_thread_t thread) {
     int ret = pthread_cancel(thread);
     if(ret != 0)
 #ifdef NETDATA_INTERNAL_CHECKS
-        netdata_log_error("cannot cancel thread. pthread_cancel() failed with code %d at %d@%s, function %s()", ret, line, file, function);
+        nd_log(NDLS_DAEMON, NDLP_WARNING, "cannot cancel thread. pthread_cancel() failed with code %d at %d@%s, function %s()", ret, line, file, function);
 #else
-        netdata_log_error("cannot cancel thread. pthread_cancel() failed with code %d.", ret);
+        nd_log(NDLS_DAEMON, NDLP_WARNING, "cannot cancel thread. pthread_cancel() failed with code %d.", ret);
 #endif

     return ret;

@@ -332,7 +330,7 @@ int netdata_thread_cancel(netdata_thread_t thread) {
 int netdata_thread_join(netdata_thread_t thread, void **retval) {
     int ret = pthread_join(thread, retval);
     if(ret != 0)
-        netdata_log_error("cannot join thread. pthread_join() failed with code %d.", ret);
+        nd_log(NDLS_DAEMON, NDLP_WARNING, "cannot join thread. pthread_join() failed with code %d.", ret);

     return ret;
 }

@@ -340,7 +338,7 @@ int netdata_thread_join(netdata_thread_t thread, void **retval) {
 int netdata_thread_detach(pthread_t thread) {
     int ret = pthread_detach(thread);
     if(ret != 0)
-        netdata_log_error("cannot detach thread. pthread_detach() failed with code %d.", ret);
+        nd_log(NDLS_DAEMON, NDLP_WARNING, "cannot detach thread. pthread_detach() failed with code %d.", ret);

     return ret;
 }

libnetdata/uuid/Makefile.am (new file, +8)
@@ -0,0 +1,8 @@
# SPDX-License-Identifier: GPL-3.0-or-later

AUTOMAKE_OPTIONS = subdir-objects
MAINTAINERCLEANFILES = $(srcdir)/Makefile.in

dist_noinst_DATA = \
    README.md \
    $(NULL)

libnetdata/uuid/README.md (new file, +13)
@@ -0,0 +1,13 @@
<!--
title: "UUID"
custom_edit_url: https://github.com/netdata/netdata/edit/master/libnetdata/uuid/README.md
sidebar_label: "UUID"
learn_topic_type: "Tasks"
learn_rel_path: "Developers/libnetdata"
-->

# UUID

Netdata uses libuuid for managing UUIDs.

In this folder are a few custom helpers.

libnetdata/uuid/uuid.c (new file, +179)
@@ -0,0 +1,179 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#include "../libnetdata.h"

void uuid_unparse_lower_compact(const uuid_t uuid, char *out) {
    static const char *hex_chars = "0123456789abcdef";
    for (int i = 0; i < 16; i++) {
        out[i * 2] = hex_chars[(uuid[i] >> 4) & 0x0F];
        out[i * 2 + 1] = hex_chars[uuid[i] & 0x0F];
    }
    out[32] = '\0'; // Null-terminate the string
}

inline int uuid_parse_compact(const char *in, uuid_t uuid) {
    if (strlen(in) != 32)
        return -1; // Invalid input length

    for (int i = 0; i < 16; i++) {
        int high = hex_char_to_int(in[i * 2]);
        int low = hex_char_to_int(in[i * 2 + 1]);

        if (high < 0 || low < 0)
            return -1; // Invalid hexadecimal character

        uuid[i] = (high << 4) | low;
    }

    return 0; // Success
}

int uuid_parse_flexi(const char *in, uuid_t uu) {
    if(!in || !*in)
        return -1;

    size_t hexCharCount = 0;
    size_t hyphenCount = 0;
    const char *s = in;
    int byteIndex = 0;
    uuid_t uuid; // work on a temporary place, to not corrupt the previous value of uu if we fail

    while (*s && byteIndex < 16) {
        if (*s == '-') {
            s++;
            hyphenCount++;

            if (unlikely(hyphenCount > 4))
                // Too many hyphens
                return -2;
        }

        if (likely(isxdigit(*s))) {
            int high = hex_char_to_int(*s++);
            hexCharCount++;

            if (likely(isxdigit(*s))) {
                int low = hex_char_to_int(*s++);
                hexCharCount++;

                uuid[byteIndex++] = (high << 4) | low;
            }
            else
                // Not a valid UUID (expected a pair of hex digits)
                return -3;
        }
        else
            // Not a valid UUID
            return -4;
    }

    if (unlikely(byteIndex < 16))
        // Not enough data to form a UUID
        return -5;

    if (unlikely(hexCharCount != 32))
        // wrong number of hex digits
        return -6;

    if(unlikely(hyphenCount != 0 && hyphenCount != 4))
        // wrong number of hyphens
        return -7;

    // copy the final value
    memcpy(uu, uuid, sizeof(uuid_t));

    return 0;
}


// ----------------------------------------------------------------------------
// unit test

static inline void remove_hyphens(const char *uuid_with_hyphens, char *uuid_without_hyphens) {
    while (*uuid_with_hyphens) {
        if (*uuid_with_hyphens != '-') {
            *uuid_without_hyphens++ = *uuid_with_hyphens;
        }
        uuid_with_hyphens++;
    }
    *uuid_without_hyphens = '\0';
}

int uuid_unittest(void) {
    const int num_tests = 100000;
    int failed_tests = 0;

    int i;
    for (i = 0; i < num_tests; i++) {
        uuid_t original_uuid, parsed_uuid;
        char uuid_str_with_hyphens[UUID_STR_LEN], uuid_str_without_hyphens[UUID_COMPACT_STR_LEN];

        // Generate a random UUID
        switch(i % 2) {
            case 0:
                uuid_generate(original_uuid);
                break;

            case 1:
                uuid_generate_random(original_uuid);
                break;
        }

        // Unparse it with hyphens
        bool lower = false;
        switch(i % 3) {
            case 0:
                uuid_unparse_lower(original_uuid, uuid_str_with_hyphens);
                lower = true;
                break;

            case 1:
                uuid_unparse(original_uuid, uuid_str_with_hyphens);
                break;

            case 2:
                uuid_unparse_upper(original_uuid, uuid_str_with_hyphens);
                break;
        }

        // Remove the hyphens
        remove_hyphens(uuid_str_with_hyphens, uuid_str_without_hyphens);

        if(lower) {
            char test[UUID_COMPACT_STR_LEN];
            uuid_unparse_lower_compact(original_uuid, test);
            if(strcmp(test, uuid_str_without_hyphens) != 0) {
                printf("uuid_unparse_lower_compact() failed, expected '%s', got '%s'\n",
                       uuid_str_without_hyphens, test);
                failed_tests++;
            }
        }

        // Parse the UUID string with hyphens
        int parse_result = uuid_parse_flexi(uuid_str_with_hyphens, parsed_uuid);
        if (parse_result != 0) {
            printf("uuid_parse_flexi() returned -1 (parsing error) for UUID with hyphens: %s\n", uuid_str_with_hyphens);
            failed_tests++;
        } else if (uuid_compare(original_uuid, parsed_uuid) != 0) {
            printf("uuid_parse_flexi() parsed value mismatch for UUID with hyphens: %s\n", uuid_str_with_hyphens);
            failed_tests++;
        }

        // Parse the UUID string without hyphens
        parse_result = uuid_parse_flexi(uuid_str_without_hyphens, parsed_uuid);
        if (parse_result != 0) {
            printf("uuid_parse_flexi() returned -1 (parsing error) for UUID without hyphens: %s\n", uuid_str_without_hyphens);
            failed_tests++;
        }
        else if(uuid_compare(original_uuid, parsed_uuid) != 0) {
            printf("uuid_parse_flexi() parsed value mismatch for UUID without hyphens: %s\n", uuid_str_without_hyphens);
            failed_tests++;
        }

        if(failed_tests)
            break;
    }

    printf("UUID: failed %d out of %d tests.\n", failed_tests, i);
    return failed_tests;
}

libnetdata/uuid/uuid.h (new file, +29)
@@ -0,0 +1,29 @@
// SPDX-License-Identifier: GPL-3.0-or-later

#ifndef NETDATA_UUID_H
#define NETDATA_UUID_H

UUID_DEFINE(streaming_from_child_msgid, 0xed,0x4c,0xdb, 0x8f, 0x1b, 0xeb, 0x4a, 0xd3, 0xb5, 0x7c, 0xb3, 0xca, 0xe2, 0xd1, 0x62, 0xfa);
UUID_DEFINE(streaming_to_parent_msgid, 0x6e, 0x2e, 0x38, 0x39, 0x06, 0x76, 0x48, 0x96, 0x8b, 0x64, 0x60, 0x45, 0xdb, 0xf2, 0x8d, 0x66);
UUID_DEFINE(health_alert_transition_msgid, 0x9c, 0xe0, 0xcb, 0x58, 0xab, 0x8b, 0x44, 0xdf, 0x82, 0xc4, 0xbf, 0x1a, 0xd9, 0xee, 0x22, 0xde);

// this is also defined in alarm-notify.sh.in
UUID_DEFINE(health_alert_notification_msgid, 0x6d, 0xb0, 0x01, 0x8e, 0x83, 0xe3, 0x43, 0x20, 0xae, 0x2a, 0x65, 0x9d, 0x78, 0x01, 0x9f, 0xb7);

#define UUID_COMPACT_STR_LEN 33
void uuid_unparse_lower_compact(const uuid_t uuid, char *out);
int uuid_parse_compact(const char *in, uuid_t uuid);
int uuid_parse_flexi(const char *in, uuid_t uuid);

static inline int uuid_memcmp(const uuid_t *uu1, const uuid_t *uu2) {
    return memcmp(uu1, uu2, sizeof(uuid_t));
}

static inline int hex_char_to_int(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1; // Invalid hexadecimal character
}

#endif //NETDATA_UUID_H

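The `hex_char_to_int()` helper in uuid.h is the building block both parsers use: every UUID byte is assembled from a pair of hex characters. A small self-contained illustration of that step (the `hex_pair_to_byte` wrapper is hypothetical; `hex_char_to_int` is copied verbatim from the header):

```c
#include <assert.h>

// hex_char_to_int() as defined in libnetdata/uuid/uuid.h
static inline int hex_char_to_int(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1; // Invalid hexadecimal character
}

// Assemble one byte from two hex characters, as uuid_parse_compact() and
// uuid_parse_flexi() do for each of the 16 UUID bytes.
// Returns -1 if either character is not a hex digit.
static int hex_pair_to_byte(const char *s) {
    int high = hex_char_to_int(s[0]);
    int low  = hex_char_to_int(s[1]);
    if (high < 0 || low < 0)
        return -1;
    return (high << 4) | low;
}
```

Because `hex_char_to_int()` accepts both cases, the parsers handle upper- and lowercase UUID strings alike, which is what the `uuid_unparse`/`uuid_unparse_upper` cases in the unit test exercise.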
(Some files were not shown because too many files have changed in this diff.)