mirror of https://github.com/netdata/netdata.git synced 2025-04-07 14:55:41 +00:00
netdata_netdata/libnetdata/dictionary/README.md
Costa Tsaousis cb7af25c09
2022-09-19 23:46:13 +03:00

Dictionaries

Netdata dictionaries associate a name with a value:

  • A name can be any string.
  • A value can be anything.

Such a pair of a name and a value constitutes an item (or entry) in the dictionary.

Dictionaries provide an interface to:

  • Add an item to the dictionary
  • Get an item from the dictionary (provided its name)
  • Delete an item from the dictionary (provided its name)
  • Traverse the list of items in the dictionary

Dictionaries are ordered, meaning that the order in which items were added is preserved while traversing them. The caller may reverse this order by passing the flag DICT_OPTION_ADD_IN_FRONT when creating the dictionary.

Dictionaries guarantee uniqueness of all items added to them, meaning that only one item with a given name can exist in the dictionary at any given time.

Dictionaries are very fast in all operations. Keys are indexed with JudyHS (or AVL when libJudy is not available), and a doubly linked list is used for traversal. Deletion is the most expensive operation, usually somewhat slower than insertion.
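
The ordering and uniqueness guarantees can be illustrated with a small self-contained sketch. This is not the Netdata implementation: the `TOY_*` types and `toy_*` functions are hypothetical, and a linear search stands in for the JudyHS/AVL index, so only the observable behavior (unique names, insertion order, optional add-in-front) is modeled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy types; the real dictionary indexes names with JudyHS or AVL. */
typedef struct toy_item {
    char *name;
    int value;
    struct toy_item *prev, *next;
} TOY_ITEM;

typedef struct {
    TOY_ITEM *head, *tail;
    int add_in_front;   /* mimics the effect of DICT_OPTION_ADD_IN_FRONT */
} TOY_DICT;

/* Linear lookup by name (stand-in for the hash table index). */
static TOY_ITEM *toy_find(TOY_DICT *d, const char *name) {
    for (TOY_ITEM *i = d->head; i; i = i->next)
        if (strcmp(i->name, name) == 0) return i;
    return NULL;
}

/* Add a new item, or overwrite the value of an existing one, so names stay unique. */
static TOY_ITEM *toy_set(TOY_DICT *d, const char *name, int value) {
    assert(name && *name);   /* names must not be NULL or empty */
    TOY_ITEM *i = toy_find(d, name);
    if (i) { i->value = value; return i; }

    i = calloc(1, sizeof(*i));
    if (!i) return NULL;
    size_t len = strlen(name) + 1;
    i->name = malloc(len);
    if (!i->name) { free(i); return NULL; }
    memcpy(i->name, name, len);
    i->value = value;

    if (!d->head) {
        d->head = d->tail = i;
    } else if (d->add_in_front) {
        i->next = d->head; d->head->prev = i; d->head = i;
    } else {
        i->prev = d->tail; d->tail->next = i; d->tail = i;
    }
    return i;
}

/* Count items by walking the linked list in traversal order. */
static size_t toy_count(TOY_DICT *d) {
    size_t n = 0;
    for (TOY_ITEM *i = d->head; i; i = i->next) n++;
    return n;
}

/* Free every item and the names the toy dictionary cloned. */
static void toy_destroy(TOY_DICT *d) {
    TOY_ITEM *i = d->head;
    while (i) {
        TOY_ITEM *next = i->next;
        free(i->name);
        free(i);
        i = next;
    }
    d->head = d->tail = NULL;
}
```

Setting "a", then "b", then "a" again leaves two items, with "a" still first and holding the latest value.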

Memory management

Dictionaries come with 2 memory management options:

  • Clone (copy) the name and/or the value to memory allocated by the dictionary.
  • Link the name and/or the value, without allocating any memory for them.

In clone mode, the dictionary guarantees that all operations on the dictionary items will automatically take care of the memory used by the name and/or the value. In case the value is an object that needs to have user allocated memory, the following callback functions can be registered:

  1. dictionary_register_insert_callback() that will be called just after the insertion of an item to the dictionary, or after the replacement of the value of a dictionary item (but while the dictionary is write-locked - if locking is enabled).
  2. dictionary_register_delete_callback() that will be called just prior to the deletion of an item from the dictionary, or prior to the replacement of the value of a dictionary item (but while the dictionary is write-locked - if locking is enabled).
  3. dictionary_register_conflict_callback() that will be called when DICT_OPTION_DONT_OVERWRITE_VALUE is set and another value is attempted to be inserted for the same key.

In link mode, the name and/or the value are just linked to the dictionary item, and it is the user's responsibility to free the memory they use after an item is deleted from the dictionary or when the dictionary is destroyed.

By default, clone mode is used for both the name and the value.

To use link mode for names, add DICT_OPTION_NAME_LINK_DONT_CLONE to the flags when creating the dictionary.

To use link mode for values, add DICT_OPTION_VALUE_LINK_DONT_CLONE to the flags when creating the dictionary.
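
The difference between the two modes can be sketched in a few lines of plain C. The `TOY_SLOT` type and the `toy_slot_*` helpers below are hypothetical and do not exist in Netdata; they only illustrate who owns the memory in each mode, together with the NULL-value and zero-length behavior this document describes for dictionary_set().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for one value, in either clone or link mode. */
typedef struct {
    void *value;
    size_t value_len;
    int linked;   /* non-zero: the memory belongs to the caller */
} TOY_SLOT;

/* Clone mode: copy value_len bytes into memory owned by the slot.
 * A NULL value yields zero-initialized memory; a zero length stores NULL. */
static int toy_slot_set_clone(TOY_SLOT *s, const void *value, size_t value_len) {
    assert(s);
    s->linked = 0;
    s->value_len = value_len;
    s->value = NULL;
    if (value_len == 0) return 0;
    s->value = calloc(1, value_len);
    if (!s->value) return -1;
    if (value) memcpy(s->value, value, value_len);
    return 0;
}

/* Link mode: remember the caller's pointer, allocate nothing. */
static void toy_slot_set_link(TOY_SLOT *s, void *value, size_t value_len) {
    assert(s);
    s->linked = 1;
    s->value = value;
    s->value_len = value_len;
}

/* Release the slot: only cloned memory is freed here. */
static void toy_slot_release(TOY_SLOT *s) {
    if (!s->linked) free(s->value);
    s->value = NULL;
    s->value_len = 0;
}
```

After a clone, later changes to the caller's buffer are not visible through the slot; after a link they are, and the caller remains responsible for freeing the buffer.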

Locks

The dictionary allows both single-threaded operation (no locks - faster) and multi-threaded operation utilizing a read-write lock.

The default is multi-threaded. To enable single-threaded operation, add DICT_OPTION_SINGLE_THREADED to the flags when creating the dictionary.

Hash table operations

The dictionary supports the following hash table operations:

  • dictionary_set() to add an item to the dictionary, or change its value.
  • dictionary_get() to get an item from the dictionary.
  • dictionary_del() to delete an item from the dictionary.

Creation and destruction

Use dictionary_create() to create a dictionary.

Use dictionary_destroy() to destroy a dictionary. When destroyed, a dictionary frees all the memory it has allocated on its own. This can be complemented by the registration of a deletion callback function that can be called upon deletion of each item in the dictionary, which may free additional resources.

dictionary_set()

This call is used to:

  • add an item to the dictionary.
  • reset the value of an existing item in the dictionary.

If resetting is not desired, add DICT_OPTION_DONT_OVERWRITE_VALUE to the flags when creating the dictionary. In this case, dictionary_set() will return the value of the original item found in the dictionary instead of resetting it and the value passed to the call will be ignored. Optionally a conflict callback function can be registered, to manipulate (probably merge or extend) the original value, based on the new value attempted to be added to the dictionary.
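
The three possible outcomes of setting an existing name (overwrite, keep the original, or merge through a conflict callback) can be sketched as follows. The `TOY_ENTRY` type, `toy_entry_set()` and the callback signature are hypothetical and much simpler than the real dictionary callbacks; the sketch only models the decision logic described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical conflict callback: may merge new_value into old_value. */
typedef void (*toy_conflict_cb)(int *old_value, const int *new_value);

typedef struct {
    int has_value;
    int value;
    int dont_overwrite;           /* mimics DICT_OPTION_DONT_OVERWRITE_VALUE */
    toy_conflict_cb on_conflict;  /* optional, may be NULL */
} TOY_ENTRY;

/* Returns a pointer to the value the entry holds after the call. */
static int *toy_entry_set(TOY_ENTRY *e, int new_value) {
    assert(e);
    if (!e->has_value) {
        e->value = new_value;               /* first insertion */
        e->has_value = 1;
    } else if (!e->dont_overwrite) {
        e->value = new_value;               /* default: reset the value */
    } else if (e->on_conflict) {
        e->on_conflict(&e->value, &new_value);  /* let the callback merge */
    }
    /* else: keep the original value and ignore the new one */
    return &e->value;
}

/* Example callback: accumulate the new value into the existing one. */
static void toy_sum_on_conflict(int *old_value, const int *new_value) {
    *old_value += *new_value;
}
```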

For multi-threaded operation, the dictionary_set() calls get an exclusive write lock on the dictionary.

The format is:

value = dictionary_set(dict, name, value, value_len);

Where:

  • dict is a pointer to the dictionary previously created.
  • name is a pointer to a string to be used as the key of this item. The name must not be NULL and must not be an empty string "".
  • value is a pointer to the value associated with this item. In clone mode, if value is NULL, a new memory allocation will be made of value_len size and will be initialized to zero.
  • value_len is the size of the value data. If value_len is zero, no allocation will be done and the dictionary item will permanently have the NULL value.

IMPORTANT
There is also an unsafe version (without locks) of this call. This is to be used when traversing the dictionary in write mode. It should never be called without an active lock on the dictionary, which can only be acquired while traversing.

dictionary_get()

This call is used to get the value of an item, given its name. It utilizes the JudyHS hash table for making the lookup.

For multi-threaded operation, the dictionary_get() call gets a shared read lock on the dictionary.

In clone mode, the value returned is not guaranteed to be valid, as any other thread may delete the item from the dictionary at any time. To ensure the value will be available, use dictionary_get_and_acquire_item(), which uses a reference counter to defer deletes until the item is released.

The format is:

value = dictionary_get(dict, name);

Where:

  • dict is a pointer to the dictionary previously created.
  • name is a pointer to a string to be used as the key of this item. The name must not be NULL and must not be an empty string "".

IMPORTANT
There is also an unsafe version (without locks) of this call. This is to be used when traversing the dictionary. It should never be called without an active lock on the dictionary, which can only be acquired while traversing.

dictionary_del()

This call is used to delete an item from the dictionary, given its name.

If there is a deletion callback registered to the dictionary (dictionary_register_delete_callback()), it is called prior to the actual deletion of the item.

For multi-threaded operation, the dictionary_del() calls get an exclusive write lock on the dictionary.

The format is:

value = dictionary_del(dict, name);

Where:

  • dict is a pointer to the dictionary previously created.
  • name is a pointer to a string to be used as the key of this item. The name must not be NULL and must not be an empty string "".

IMPORTANT
There is also an unsafe version (without locks) of this call. This is to be used when traversing the dictionary, to delete the current item. It should never be called without an active lock on the dictionary, which can only be acquired while traversing.

dictionary_get_and_acquire_item()

This call can be used to search for and get a dictionary item, while ensuring that it will remain available for use until dictionary_acquired_item_release() is called.

This call does not return the value of the dictionary item. It returns an internal pointer to a structure that maintains the reference counter used to protect the actual value. To get the value of the item (the same value as returned by dictionary_get()), the function dictionary_acquired_item_value() has to be called.

Example:

// create the dictionary
DICTIONARY *dict = dictionary_create(DICT_OPTION_NONE);

// add an item to it
dictionary_set(dict, "name", "value", 6);

// find the item we added and acquire it
void *item = dictionary_get_and_acquire_item(dict, "name");

// extract its value
char *value = (char *)dictionary_acquired_item_value(dict, item);

// now value points to the string "value"
printf("I got value = '%s'\n", value);

// release the item, so that it can be deleted
dictionary_acquired_item_release(dict, item);

// destroy the dictionary
dictionary_destroy(dict);

When items are acquired, a reference counter is maintained to keep track of how many users exist for it. If an item with a non-zero number of users is deleted, it is removed from the index, it can be added again to the index (without conflict), and although it exists in the linked-list, it is not offered during traversal. Garbage collection to actually delete the item happens every time a write-locked dictionary is unlocked (just before the unlock) and items are deleted only if no users are using them.

If any item is still acquired when the dictionary is destroyed, the destruction of the dictionary is also deferred until all the acquired items are released. When the dictionary is destroyed like that, all operations on the dictionary fail (traversals do not traverse, insertions do not insert, deletions do not delete, searches do not find any items, etc). Once the last item in the dictionary is released, the dictionary is automatically destroyed too.

Traversal

Dictionaries offer 3 ways to traverse the entire dictionary:

  • walkthrough, implemented by setting a callback function to be called for every item.
  • sorted walkthrough, which first sorts the dictionary and then calls a callback function for every item.
  • foreach, a way to traverse the dictionary with a for-next loop.

All these methods are available in read or write mode. In read mode, only lookups are allowed on the dictionary. In write mode, lookups, insertions and deletions are all allowed.

While traversing the dictionary with any of these methods, all calls to the dictionary have to use the _unsafe versions of the function calls, otherwise deadlocks may arise.

IMPORTANT
The dictionary itself does not check to ensure that a user is actually using the right lock mode (read or write) while traversing the dictionary for each of the unsafe calls.

walkthrough (callback)

There are 4 calls:

  • dictionary_walkthrough_read() and dictionary_sorted_walkthrough_read() that acquire a shared read lock, and they call a callback function for every item of the dictionary. The callback function may use the unsafe versions of the dictionary_get() calls to lookup other items in the dictionary, but it should not attempt to add or remove items to/from the dictionary.
  • dictionary_walkthrough_write() and dictionary_sorted_walkthrough_write() that acquire an exclusive write lock, and they call a callback function for every item of the dictionary. This is to be used when items need to be added to or removed from the dictionary. The write versions can be used to delete any or all the items from the dictionary, including the currently working one. For the sorted version, all items in the dictionary maintain a reference counter, so all deletions are deferred until the sorted walkthrough finishes.

The non-sorted versions traverse the items in the same order they have been added to the dictionary (or the reverse order if the flag DICT_OPTION_ADD_IN_FRONT is set during dictionary creation). The sorted versions sort the items alphabetically by name, and then traverse them in the sorted order.

The callback function returns an int. If this value is negative, traversal of the dictionary is stopped immediately and the negative value is returned to the caller. If the returned value of all callback calls is zero or positive, the walkthrough functions return the sum of the return values of all callbacks. So, if you only want to count how many items satisfy some condition, write a callback function that returns 1 when the item satisfies that condition and 0 when it does not, and the walkthrough function will return how many tested positive.
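
The return-value contract can be sketched over a plain array. The `toy_walkthrough()` function and its callback signature are hypothetical and differ from the real walkthrough API; the sketch only shows the rule that a negative result stops the walk and non-negative results are summed.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { const char *name; int value; } TOY_PAIR;

/* Hypothetical callback type, simplified from the real one. */
typedef int (*toy_walk_cb)(const char *name, int value, void *data);

/* Call cb for every item: stop on the first negative result and return it,
 * otherwise return the sum of all results. */
static int toy_walkthrough(const TOY_PAIR *items, size_t n, toy_walk_cb cb, void *data) {
    assert(cb);
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        int r = cb(items[i].name, items[i].value, data);
        if (r < 0) return r;
        sum += r;
    }
    return sum;
}

/* Counting callback: 1 when the value exceeds the threshold in data, else 0. */
static int toy_is_above(const char *name, int value, void *data) {
    (void)name;
    return value > *(int *)data ? 1 : 0;
}

/* Aborting callback: a negative value stops the walk with -1. */
static int toy_stop_on_negative(const char *name, int value, void *data) {
    (void)name;
    (void)data;
    return value < 0 ? -1 : 1;
}
```

With the values 5, 20 and 30 and a threshold of 10, `toy_walkthrough()` with `toy_is_above` returns 2.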

foreach (for-next loop)

The following is a snippet of such a loop:

MY_ITEM *item;
dfe_start_read(dict, item) {
   printf("hey, I got an item named '%s' with value ptr %p\n", item_name, (void *)item);
}
dfe_done(item);

The item parameter gives the name of the pointer to be used while iterating the items. Any name is accepted.

The item_name is a variable that is automatically created, by concatenating whatever is given as item and _name. So, if you call dfe_start_read(dict, myvar), the name will be myvar_name.

Together, dfe_start_read(dict, item) and dfe_done(item) expand to a single do { ... } while(0) statement, so that the following will work:

MY_ITEM *item;

if(x == 1)
    // do {
    dfe_start_read(dict, item)
        printf("hey, I got an item named '%s' with value ptr %p\n", item_name, (void *)item);
    dfe_done(item);
    // } while(0);
else
    something else;

In the above, the if(x == 1) condition will work as expected. It will do the foreach loop when x is 1, otherwise it will run something else.
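
How a start/done macro pair can form a single statement is sketched below. The `toy_foreach_start()` and `toy_foreach_done()` macros are hypothetical and iterate over a plain array with no locking; they are not the real dfe macros, but they use the same do { ... } while(0) idiom and create the `<item>_name` variable by token pasting.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { const char *name; int value; } TOY_PAIR;

/* Opens a do { for(...) { block and defines <item>_name for the loop body. */
#define toy_foreach_start(array, count, item)                           \
    do {                                                                \
        for (size_t item##_i = 0; item##_i < (count); item##_i++) {     \
            (item) = &(array)[item##_i];                                \
            const char *item##_name = (item)->name;                     \
            (void)item##_name;

/* Closes the blocks opened by toy_foreach_start(); the caller adds the ';'. */
#define toy_foreach_done(item)                                          \
        }                                                               \
    } while (0)

/* The macro pair is one statement, so it can sit between if and else
 * without extra braces. */
static int toy_sum_if(const TOY_PAIR *pairs, size_t count, int enabled) {
    assert(pairs || count == 0);
    int sum = 0;
    const TOY_PAIR *p;

    if (enabled)
        toy_foreach_start(pairs, count, p)
            sum += p->value;
        toy_foreach_done(p);
    else
        sum = -1;

    return sum;
}
```

Because the expansion ends in `while (0)` and the caller supplies the semicolon, the `else` branch attaches to the `if` as intended.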

There are 2 versions of dfe_start:

  • dfe_start_read() that acquires a shared read lock to the dictionary.
  • dfe_start_write() that acquires an exclusive write lock to the dictionary.

While in the loop, depending on the read or write versions of dfe_start, the caller may look up or manipulate the dictionary using the unsafe functions. The rules are the same as for the unsorted walkthrough callback functions.

PS: DFE is Dictionary For Each.

special multi-threaded lockless case

Since the dictionary uses a hash table and a doubly linked list, if the contract between 2 threads is for one to use the hash table functions only (set, get - but no del) and the other to use the traversal ones only, the dictionary allows concurrent use without locks.

This is currently used in statsd:

  • the data collection thread uses only get and set. It never uses del. New items are added at the front of the linked list (DICT_OPTION_ADD_IN_FRONT).
  • the flushing thread only traverses the dictionary up to the point where it stopped last time (it uses a flag to mark that point). It never uses get, set or del.