When signal-cli returns an error that we are not handling yet,
log the precise JSON message that signal-cli returns. This
is for debugging & development: we can look at the logged
messages and see what additional special error handling may be
needed.
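A minimal sketch of the idea (the error codes, function name and
logger setup here are illustrative assumptions, not the actual
signal-cli integration):

```python
import json
import logging

logger = logging.getLogger("signal")

# Hypothetical error codes we already handle specially:
HANDLED_ERROR_CODES = {-1, -3}

def handle_reply(reply: str) -> None:
    # `reply` is one raw JSON reply string from signal-cli.
    doc = json.loads(reply)
    error = doc.get("error")
    if error and error.get("code") not in HANDLED_ERROR_CODES:
        # Log the precise JSON message so we can review it later
        # and decide what special handling to add.
        logger.debug("unexpected signal-cli reply: %s", reply)
```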
Also simplify the retry logic: each retry attempt is now
allowed to use the full 30 seconds. This means a single
webhook delivery can take up to 3 * 30 = 90 seconds.
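Roughly like so (a sketch; the function name, the use of
requests, and the retry-on-any-failure behavior are
assumptions):

```python
import requests  # assumption: the HTTP call could equally be pycurl

def deliver(url: str, payload: bytes, tries: int = 3) -> bool:
    # Each attempt gets the full 30-second timeout, so a single
    # delivery can take up to 3 * 30 = 90 seconds in the worst case.
    for _ in range(tries):
        try:
            response = requests.post(url, data=payload, timeout=30)
            if response.ok:
                return True
        except requests.RequestException:
            pass  # retry on timeouts and connection errors
    return False
```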
If sendalerts receives the `--pool` parameter, it reconfigures
settings.DATABASES to enable db connection pooling
(using psycopg_pool with default parameters).
This lets us use many concurrent worker threads without
running out of database connections. For example, with
`--num-workers 100 --pool`, up to 100 worker threads can run
concurrently, but only 3 threads can get a database connection
from the pool; the rest have to wait. When a worker thread
gives up a connection (by calling `close_old_connections`),
another thread can continue.
A worker thread can give up a db connection before it is fully
finished if it anticipates a long network IO operation ahead.
The Webhook transport does this before making a curl call.
psycopg_pool's default pool size is 4 connections. One
connection is used up by the main thread, so 3 connections
are available for the worker threads.
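A sketch of how the reconfiguration could look, assuming
Django's built-in psycopg 3 pool support (where `"pool": True`
routes connections through psycopg_pool with its defaults); the
actual wiring may differ:

```python
from django.conf import settings

def enable_db_pooling() -> None:
    # Sketch: switch the default database over to psycopg_pool.
    # Assumes the psycopg 3 backend; with "pool": True, Django uses
    # psycopg_pool.ConnectionPool with default parameters (min_size=4,
    # hence the pool of 4 connections mentioned above).
    options = settings.DATABASES["default"].setdefault("OPTIONS", {})
    options["pool"] = True
```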
Previously the flip was marked as processed in process_one_flip
(so on the main thread). The advantage of doing it this way is
that the flip gets marked as processed only when the worker
thread has started and has acquired a db connection.
There is now a smaller pause between a sendalerts process
claiming a flip and actually starting work on it.
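Roughly (a sketch; the field name and the stub are placeholders
for the actual flip-handling code):

```python
from django.utils.timezone import now

def send_alerts_for(flip) -> None:
    ...  # placeholder for the actual notification work

def process_flip(flip) -> None:
    # Runs on a worker thread. The flip is marked as processed here,
    # i.e. only once the thread has started and holds a db connection.
    flip.processed = now()
    flip.save(update_fields=["processed"])
    send_alerts_for(flip)
```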
Webhook requests can take 20+ seconds. During that time we hold
on to a database connection. With this commit, the Webhook transport
closes its DB connection before making a curl call.
With psycopg2 this does not have much effect. But with
psycopg 3 & connection pooling we will be able to use more
sendalerts workers than we have database connections. While one
worker is busy making a slow curl call, another worker can
grab its freed-up connection and do some work.
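In the transport, this could look roughly like the following
(a sketch; the class shape and helper names are assumptions):

```python
from django.db import close_old_connections

class Webhook:
    # Sketch of the transport's notify method; helpers are assumed.
    def notify(self, check) -> None:
        url = self.prepare_url(check)    # assumed helper
        body = self.prepare_body(check)  # assumed helper
        # Give up our DB connection before the slow network call so
        # another worker thread can pick it up while we wait on IO.
        close_old_connections()
        self.post(url, data=body)  # the curl call, via an assumed helper
```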
Django's test runner is not happy with connections closed
mid-test, so I patched out close_old_connections() in affected tests.
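The patching looks something along these lines (the patch
target is an assumption about where the transport code lives):

```python
from unittest.mock import patch
from django.test import TestCase

class NotifyWebhookTestCase(TestCase):
    @patch("hc.api.transports.close_old_connections")
    def test_it_works(self, mock_close) -> None:
        ...  # exercise the Webhook transport as usual
```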
This is a tricky one: the default value for max_workers is
None. But it doesn't mean "unlimited"; in Python 3.8+ it
means `min(32, os.cpu_count() + 4)`.
For example, on an 8-core CPU the effective value would be
min(32, 8 + 4) = 12, and passing anything above 12 to
`--max-workers` would have no effect.
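To spell the stdlib behavior out:

```python
import os
from concurrent.futures import ThreadPoolExecutor

# With max_workers=None, Python 3.8+ computes the limit itself:
effective = min(32, (os.cpu_count() or 1) + 4)
pool = ThreadPoolExecutor()  # same as ThreadPoolExecutor(max_workers=effective)
print(effective)  # 12 on an 8-core machine: min(32, 8 + 4)
```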
The counter was slightly wrong (it counted lost races as sent
notifications). Rather than complicating the code to make it
correct, let's just remove it :-)
* Remove the --no-loop and --no-threads arguments
* Use a threadpool to do multiple sends concurrently
* Add a new `--num-workers` argument. It limits how many flips we grab
from the database and process concurrently.
* Do not prioritize flips with historically low send times any more
(not as important now with concurrent sending, and simpler this way)
* Workers close db connections when they finish
(to keep the number of idle connections low)
Note: concurrent.futures.ThreadPoolExecutor internally has an unbounded
queue; it will accept any number of jobs and keep them queued. We don't
want that. We only want to grab a flip, and commit to processing it,
if we know there's a free worker for it. Therefore we track the
number of jobs in flight using a semaphore (`self.seats`).
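A sketch of the semaphore pattern (class and helper names are
placeholders, not the actual sendalerts code):

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def claim_next_flip():
    # Placeholder: atomically claim one unprocessed flip from the
    # database and return it, or return None if there is no work.
    return None

class SendAlerts:
    def __init__(self, num_workers: int) -> None:
        self.executor = ThreadPoolExecutor(max_workers=num_workers)
        self.seats = threading.Semaphore(num_workers)

    def process_flip(self, flip) -> None:
        try:
            ...  # send the notifications for this flip
        finally:
            self.seats.release()  # this worker seat is free again

    def loop(self) -> None:
        while True:
            self.seats.acquire()  # only grab a flip if a worker is free
            flip = claim_next_flip()
            if flip is None:
                self.seats.release()  # no work; give the seat back
                time.sleep(2)
                continue
            self.executor.submit(self.process_flip, flip)
```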
The API responses already contain ping_url, update_url, resume_url
and pause_url fields, from which the UUID can be extracted, so we
are not exposing new information. But the extraction can be finicky
in, say, shell-scripting scenarios. So, for API users' convenience,
we will now also provide the check's code (UUID) as a separate field.
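For illustration (the field name "uuid", the endpoint and the
API key below are assumptions, not the documented API):

```python
import requests

# Sketch: read each check's code (UUID) straight from the new field
# instead of parsing it out of ping_url.
r = requests.get(
    "https://healthchecks.example.com/api/v1/checks/",
    headers={"X-Api-Key": "your-api-key"},
)
for check in r.json()["checks"]:
    print(check["uuid"])  # previously: extract from check["ping_url"]
```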
Fixes: #1007