mirror of https://github.com/healthchecks/healthchecks.git synced 2025-04-14 17:08:29 +00:00
Commit graph

411 commits

Author SHA1 Message Date
Pēteris Caune
0456a5934a
Add missing type annotation 2025-03-17 09:35:27 +02:00
Pēteris Caune
ca4253324a
Switch to sha256 hashes in email verification and unsub links
But still accept sha1 hashes, to allow the existing unsubscribe
links to still work. I'm planning to remove the sha1 backwards
compatibility in a month or so.
2025-03-17 09:34:05 +02:00
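
The "prefer sha256, still accept sha1" behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes a plain HMAC signature, and the names SECRET, sign() and verify() are hypothetical.

```python
import hashlib
import hmac

SECRET = b"example-secret"  # hypothetical key, for illustration only


def sign(value: str, algorithm: str = "sha256") -> str:
    """Return a hex HMAC of `value` using the named hashlib algorithm."""
    digestmod = getattr(hashlib, algorithm)
    return hmac.new(SECRET, value.encode(), digestmod).hexdigest()


def verify(value: str, signature: str) -> bool:
    """Accept sha256 signatures, falling back to legacy sha1 ones."""
    for algorithm in ("sha256", "sha1"):
        if hmac.compare_digest(sign(value, algorithm), signature):
            return True
    return False
```

New links would be generated with sign(value), while verify() keeps old sha1-signed links valid until the fallback is removed from the tuple.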
Pēteris Caune
0e6a52d7af
Add "stops working Apr 2025" note to LINE Notify integrations
https://notify-bot.line.me/closing-announce
2025-03-10 09:48:55 +02:00
Pēteris Caune
723a20ddc3
Add support for attaching labels to issues
cc: 
2025-02-25 12:30:52 +02:00
Pēteris Caune
6382bd737a
Add GitHub Issues integration (WIP)
cc: 
2025-02-24 16:24:17 +02:00
Pēteris Caune
b5d4f2aa74
Implement S3 outage mitigation
The mitigation is to not attempt GetObject calls if there have
been more than 3 S3 errors in the past minute. The implementation
uses the TokenBucket class that we normally use for rate-limiting.

An example scenario this is trying to avoid is:

* the S3 service becomes unavailable for 10 straight minutes.
  Each S3 request hangs until we hit the configured timeout
  (settings.S3_TIMEOUT)
* A client is frequently requesting the "Get ping's logged body"
  API call. Each call causes one webserver process to become
  busy for S3_TIMEOUT seconds.
* All workers become busy, request backlog fills up, our service
  starts returning 5xx errors.

With the mitigation, during an S3 outage, only the calls that
retrieve ping's logged body will return 503, the rest of the service
will (hopefully) work normally.

Fixes: 
2025-01-13 14:21:42 +02:00
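
The idea of skipping GetObject calls after more than 3 errors in the past minute could look something like the sketch below. It is an in-memory stand-in only: the commit says the real implementation reuses the project's TokenBucket class, which is not reproduced here, and the ErrorBudget class and its method names are hypothetical.

```python
from __future__ import annotations

import time
from collections import deque


class ErrorBudget:
    """Hypothetical in-memory error counter over a sliding time window."""

    def __init__(self, max_errors: int = 3, window: float = 60.0) -> None:
        self.max_errors = max_errors
        self.window = window
        self.errors: deque[float] = deque()

    def record_error(self, now: float | None = None) -> None:
        """Remember the time of a failed S3 request."""
        self.errors.append(time.monotonic() if now is None else now)

    def allow_request(self, now: float | None = None) -> bool:
        """Return False if there were more than max_errors recent errors."""
        now = time.monotonic() if now is None else now
        # Forget errors that have fallen out of the window
        while self.errors and now - self.errors[0] >= self.window:
            self.errors.popleft()
        return len(self.errors) <= self.max_errors
```

A caller would check allow_request() before contacting S3 and return 503 immediately when it is False, so a worker is never tied up waiting for the timeout.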
Pēteris Caune
a0312a04f8
Replace percent-sign formatting expressions with f-strings 2025-01-13 11:22:13 +02:00
Pēteris Caune
7ec1b7fe1c
Fix mypy warning 2024-12-27 14:17:38 +02:00
Pēteris Caune
7d5341c5a7
Add "badge_url" field in Check API responses
Fixes: 
2024-12-27 13:44:49 +02:00
Pēteris Caune
1dbf0751ef
Remove now obsolete Check.prepare_badge_key()
cc: 
2024-12-27 12:56:26 +02:00
Pēteris Caune
bf97e3967c
Make Check.badge_key non-nullable
cc: 
2024-12-27 12:47:24 +02:00
Pēteris Caune
8dab5d8c36
Add default value (uuid.uuid4()) for Check.badge_key
cc: 
2024-12-27 11:36:48 +02:00
Pēteris Caune
ff5b060e86
Move repeating flip reason descriptions to Flip.reason_long() 2024-12-16 14:35:36 +02:00
Pēteris Caune
22268c1484
Move absolute URL construction to hc.lib.urls.absolute_reverse()
absolute_reverse() works the same as django.urls.reverse()
except it generates absolute URLs (starting with http[s]://)
2024-12-03 17:24:27 +02:00
Pēteris Caune
5e848f4976
Add index on api_flip (owner_id, created)
This helps queries in hc.front.views._get_events,
especially for checks that are flipping between up and down
states a lot.
2024-12-03 10:37:01 +02:00
Pēteris Caune
b328c8739f
Reduce the number of Check.get_status() calls 2024-11-14 13:33:21 +02:00
Pēteris Caune
9edae634c7
Add Flip.reason field
cc: 
2024-11-08 10:24:50 +02:00
Pēteris Caune
4907073c55
Remove unneeded quotes 2024-11-06 19:32:44 +02:00
Pēteris Caune
e048ec4c48
Fix "class Foo(object):" -> "class Foo:"
In Python 3 these are equivalent, and shorter is better.
2024-10-29 17:57:50 +02:00
Pēteris Caune
c372e3232f
Update MS Teams legacy webhook retirement date to Jan 2025
Microsoft pushed it forward again:
https://devblogs.microsoft.com/microsoft365dev/retirement-of-office-365-connectors-within-microsoft-teams/
2024-10-25 09:51:58 +03:00
Pēteris Caune
2cb47d3742
Make the sorting of null values in Flip.select_channels() explicit 2024-09-12 10:52:06 +03:00
Pēteris Caune
f241d070e1
Update Flip.select_channels() to sort channels by last_notify_duration
If a check has multiple associated channels, some slow and
some quick, handle the quick ones first.
2024-09-12 10:44:56 +03:00
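
A rough sketch of "quick channels first" with explicit handling of missing durations is shown below. The Channel dataclass is a hypothetical stand-in for the real model, and placing channels with no recorded duration last is an assumption of this sketch; the commit messages do not say which end they sort to.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional


@dataclass
class Channel:
    """Hypothetical stand-in for the real Channel model."""

    name: str
    last_notify_duration: Optional[timedelta]


def sort_channels(channels: List[Channel]) -> List[Channel]:
    """Sort by duration ascending; channels without a duration go last."""
    return sorted(
        channels,
        key=lambda c: (
            c.last_notify_duration is None,  # False sorts before True
            c.last_notify_duration or timedelta(0),
        ),
    )
```

The two-element key keeps None out of the timedelta comparison, which would otherwise raise a TypeError.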
Pēteris Caune
3968a4f9e0
Update MS Teams Connector EOL date 2024-08-27 16:34:59 +03:00
Pēteris Caune
7346994ae8
Fix field name in TypedDict used for type checking 2024-07-18 18:19:01 +03:00
Pēteris Caune
bdb6f18a3d
Add "uuid" field in API responses when read/write key is used
The API responses already contain ping_url, update_url, resume_url,
pause_url fields where the UUID can be extracted from, so we are
not exposing new information. The extraction can be finicky in,
say, shell-scripting scenarios. So for API user convenience we will
now also provide the check's code (UUID) as a separate field.

Fixes: 
2024-07-18 18:15:52 +03:00
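
To illustrate why a separate field is convenient, here is a small sketch of the extraction a client previously had to do. The URL shape in the usage note is made up for illustration, and code_from_ping_url() is a hypothetical helper, not part of the project.

```python
import uuid
from urllib.parse import urlparse


def code_from_ping_url(ping_url: str) -> uuid.UUID:
    """Extract the check's UUID from the last path segment of ping_url."""
    last_segment = urlparse(ping_url).path.rstrip("/").rsplit("/", 1)[-1]
    return uuid.UUID(last_segment)
```

With the new field, a client can read the "uuid" value from the response directly instead of calling something like code_from_ping_url("https://hc.example.org/ping/3c1169a0-7b50-4d2b-9e4c-2a8d8d0d3f7b").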
Pēteris Caune
8054191be3
Remove HipChat, Pagerteam, Zendesk channel kinds
HipChat and Pagerteam products have long been shut down,
the Zendesk integration was never fully implemented.
2024-07-18 16:21:45 +03:00
Pēteris Caune
61bdd975e8
Add "(stops working Oct 2024)" note to the old MS Teams integration 2024-07-18 10:27:51 +03:00
Pēteris Caune
e83f60cc0b
Implement MS Teams Workflows integration
We already have a MS Teams integration but MS Teams is discontinuing
the incoming webhook feature used by this integration:

https://devblogs.microsoft.com/microsoft365dev/retirement-of-office-365-connectors-within-microsoft-teams/

MS Teams now recommends using Workflows to post messages
via webhook. MS Teams does not provide backwards compatibility or
an upgrade path for existing integrations.

This commit adds a new "msteamsw" integration which uses MS Teams
Workflows to post notifications. It also updates the instructions
and illustrations in the "Add MS Teams Integration" page.

cc: 
2024-07-17 13:35:17 +03:00
Pēteris Caune
3e5080d9eb
Remove Ping.body field 2024-07-11 16:34:18 +03:00
Pēteris Caune
997154e3b0
Remove usages of Ping.body 2024-07-11 16:17:21 +03:00
Pēteris Caune
bc8fb90fed
Update Check.ping() to use select_for_update()
Without it, on MariaDB, concurrent pings can lead to a deadlock.
This results in an OperationalError and an HTTP 500 response to the client.

cc: 
2024-07-10 19:50:39 +03:00
Pēteris Caune
324fa10ce7
Fix Check.lock_and_delete() to gracefully handle already deleted check 2024-06-20 15:57:53 +03:00
Pēteris Caune
d486d2db14
Add uniqueness constraint to api_notification.code
This is primarily to make notification lookups by code efficient.
We look up notifications by code in hc.api.views.bounces.

This field has a default value (uuid.uuid4), so any null values
will be filled with random UUIDs during migration.
2024-05-17 10:30:01 +03:00
Pēteris Caune
e683496bed
Move reusable ping formatting code to Ping model 2024-04-19 12:38:20 +03:00
Pēteris Caune
81f202e2ac
Rename notify_flip -> notify 2024-04-12 15:49:47 +03:00
Pēteris Caune
28fdfd1362
Change Channel.notify() signature to take Flip object as an argument
... and pass it to Transport.notify_flip().

This allows us to pass flip-specific information (the flip timestamp,
the new status) to transport classes.
2024-04-12 13:54:16 +03:00
Pēteris Caune
6e130f1749
Change Transport.is_noop() to accept status:str instead of check:Check
I'm planning to change Channel.notify() signature to take a Flip
object as an argument instead of a Check object. This commit
prepares for that change.
2024-04-12 13:23:29 +03:00
Pēteris Caune
aaa8681fec
Update Check.prune() to also delete flip objects.
Check.prune() now deletes flips older than the oldest
retained ping *and* older than 3*31=93 days.
2024-04-11 12:56:28 +03:00
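
The "older than the oldest retained ping *and* older than 93 days" condition amounts to taking the earlier of two timestamps as the cutoff. A minimal sketch, with hypothetical names (FLIP_RETENTION, flip_cutoff) that are not taken from the project:

```python
from datetime import datetime, timedelta

FLIP_RETENTION = timedelta(days=3 * 31)  # 93 days, per the commit message


def flip_cutoff(oldest_retained_ping: datetime, now: datetime) -> datetime:
    """Return the timestamp before which flips may be deleted.

    A flip must satisfy both conditions, so the cutoff is the earlier
    of the two limits.
    """
    return min(oldest_retained_ping, now - FLIP_RETENTION)
```

Using min() means a check with very recent pings still keeps at least 93 days of flips.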
Pēteris Caune
1322bb1123
Add support for per-check status badges
Fixes: 
2024-02-27 12:55:51 +02:00
Pēteris Caune
767c3ae702
Add a management command for pruning all checks 2023-12-21 14:55:05 +02:00
Pēteris Caune
b0f8c730f5
Change query in Check.prune() to work around pg index selection issue
In prune(), we need to look up the earliest ping in the database
for a given check. The old version did:

    ping = self.ping_set.earliest("id")

The new version does:

    ping = self.ping_set.earliest("created")

Both yield the same result, but in the first case Postgres may
decide to use the index for the api_ping.id column and scan
almost the entire table.

In the second case it uses the index for the api_ping.owner_id column,
and scans just the rows associated with the check.
2023-12-21 12:00:05 +02:00
Pēteris Caune
c8897b7026
Improve the handling of StopIteration exceptions
Instead of returning a datetime in far future,
get_grace_start() now returns None (meaning "never").
2023-12-19 14:05:10 +02:00
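
The pattern of turning an exhausted schedule into None rather than a far-future datetime can be sketched as below. This is not the body of get_grace_start(); next_run_or_none() is a hypothetical helper, and the schedule is modelled as a plain iterator of datetimes.

```python
from datetime import datetime
from typing import Iterator, Optional


def next_run_or_none(schedule: Iterator[datetime]) -> Optional[datetime]:
    """Return the next scheduled time, or None if the schedule has ended."""
    try:
        return next(schedule)
    except StopIteration:
        # A "one-shot" schedule has no further occurrences: never
        return None
```

Callers then test for None explicitly instead of comparing against a sentinel date.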
Pēteris Caune
1d6e7297be
Fix get_grace_start to handle StopIteration exceptions
These can happen with "one-shot" OnCalendar schedules,
for example: "2023-12-19 11:30"
2023-12-19 13:29:52 +02:00
Pēteris Caune
fc56cf2635
Add API support for OnCalendar schedules 2023-12-07 14:03:35 +02:00
Pēteris Caune
d65f41d192
Add support for systemd's OnCalendar schedules
(work-in-progress)

cc: 
2023-12-06 15:42:57 +02:00
Pēteris Caune
f8a9077c76
Fix DST handling in Check.get_grace_start() 2023-10-30 11:53:52 +02:00
Pēteris Caune
1eb92c9fe7
Switch to using Pydantic for parsing Gotify configuration 2023-10-26 12:01:42 +03:00
Pēteris Caune
cdac9b3128
Switch to using Pydantic for parsing Trello configuration 2023-10-26 11:57:49 +03:00
Pēteris Caune
dea66b85af
Switch to using Pydantic for parsing ntfy configuration
Also, fix a bug in the "Edit ntfy integration" form,
where the token was not filled in the form on page load.
2023-10-26 10:32:41 +03:00
Pēteris Caune
343e55bd4f
Improve type hints 2023-10-25 18:12:12 +03:00