healthchecks_healthchecks/hc/api/migrations/0095_check_last_start_rid.py
Pēteris Caune 7458770b41
Improve alerting logic when run IDs are used
* Add Check.last_start_rid field
* Fill Check.last_start_rid on every start event
* Clear Check.last_start on every "fail" event
* Clear Check.last_start on a success event if either condition holds:
 - the event's rid matches Check.last_start_rid
 - the event does not specify rid
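
The rules above can be sketched as a small state update. This is an illustration only, not the project's actual code: `CheckState` and `apply_event` are hypothetical names, and the sketch assumes a start event without a rid stores `None` in `last_start_rid`.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
from uuid import UUID, uuid4


@dataclass
class CheckState:
    # Hypothetical stand-in for the two Check fields named above.
    last_start: Optional[datetime] = None
    last_start_rid: Optional[UUID] = None


def apply_event(
    state: CheckState, kind: str, rid: Optional[UUID], now: datetime
) -> None:
    """Update the state for one ping; kind is "start", "success" or "fail"."""
    if kind == "start":
        # Every start event restarts the timer and records its rid.
        state.last_start = now
        state.last_start_rid = rid
    elif kind == "fail":
        # A fail event always clears the timer, whatever its rid.
        state.last_start = None
    elif kind == "success":
        # A success event clears the timer only if it carries no rid
        # or its rid matches the most recent start event.
        if rid is None or rid == state.last_start_rid:
            state.last_start = None
```

With this sketch, a success event carrying a stale rid leaves `last_start` untouched, so the timer for the most recent start keeps running.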

In human terms, the alerting logic will be: we track the
execution time of the most recent "start" event only. It would
take a major redesign to track the execution time of all
concurrent "start" events and send alerts when *any* of them
overshoots the time budget. So, whenever we see a "start" event,
the timer resets.

Example:

* 00:00 client sends start signal with rid=A, timer starts
* 00:10 client sends start signal with rid=B, timer resets
* 00:20 client sends success signal with rid=A, timer
  does not reset because rid A does not match the rid seen in
  the most recent start signal (it was B)
* 00:30 the grace time runs out, the check's status shows
  as started + failed

At this point the check can be reset to a healthy state in 3
different ways:

* send a success signal with rid=B
* send a failure signal with any rid value or without it
* send a success signal without a rid value
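
The example timeline can be replayed with a short script. This is a hedged sketch, not the project's implementation: `replay`, `overdue` and `at` are hypothetical helpers, rids are plain strings for readability, and the 20-minute grace time is an assumption inferred from the 00:10 reset and the 00:30 deadline in the example.

```python
from datetime import datetime, timedelta

# Assumed value: the example resets the timer at 00:10 and alerts at 00:30.
GRACE = timedelta(minutes=20)


def replay(events):
    """Return the time of the start event still being timed, or None.

    events is a list of (time, kind, rid) tuples, where kind is
    "start", "success" or "fail" and rid may be None.
    """
    last_start, last_start_rid = None, None
    for when, kind, rid in events:
        if kind == "start":
            last_start, last_start_rid = when, rid
        elif kind == "fail" or rid is None or rid == last_start_rid:
            # fail: always clears; success: clears on no rid or matching rid.
            last_start = None
    return last_start


def overdue(last_start, now):
    """True if a start is being timed and its grace time has run out."""
    return last_start is not None and now >= last_start + GRACE


def at(minute):
    return datetime(2022, 11, 9, 0, minute)


timeline = [
    (at(0), "start", "A"),
    (at(10), "start", "B"),
    (at(20), "success", "A"),  # ignored: the latest start had rid B
]
print(overdue(replay(timeline), at(30)))  # True

# Each of the three recovery signals clears the running timer.
for recovery in [("success", "B"), ("fail", "A"), ("success", None)]:
    print(recovery, overdue(replay(timeline + [(at(31), *recovery)]), at(31)))
```

The sketch only models the start timer; it says nothing about the overall up/down status that a failure signal would set.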
2022-11-09 19:01:22 +02:00


# Generated by Django 4.1.2 on 2022-11-09 12:57

from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ("api", "0094_ping_rid_alter_channel_kind"),
    ]

    operations = [
        migrations.AddField(
            model_name="check",
            name="last_start_rid",
            field=models.UUIDField(null=True),
        ),
    ]