Rework Lorry alerting

Currently, alerting happens on a per-mirror basis: for each failing mirror, the recipients in its mirror group receive one notification.

It turns out that these notifications often arrive because a whole forge has gone down, so users receive multiple emails for what is a single forge failure.

Alerting has to keep noise low while still highlighting actual failures. I see two approaches:

Batch alerts per alert group

This looks like the easier solution. When we run the send() routine, we iterate through the alert groups, select all unsent alerts that could be sent to each group, and send that group one email with a list of failing mirrors. This cuts down on noise but also hurts readability. If a huge pool of alerts triggered because a forge went down and one alert triggered because something horrible happened, it might be hard to find that one important alert in a haystack of alerts that are also real but unimportant.
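A minimal sketch of the per-group batching, to make the idea concrete. The `Alert` class, its fields, and the `batch_by_group` and `render_email` helpers are hypothetical names for illustration and are not taken from the Lorry codebase. It also assumes each alert belongs to exactly one alert group.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Alert:
    # Hypothetical alert record; the real model may look different.
    mirror: str         # path of the failing mirror
    group: str          # alert group that should be notified
    sent: bool = False  # True once a notification has gone out


def batch_by_group(alerts):
    """Collect all unsent alerts into one list per alert group."""
    batches = defaultdict(list)
    for alert in alerts:
        if not alert.sent:
            batches[alert.group].append(alert)
    return dict(batches)


def render_email(group, batch):
    """Build the body of the single email sent to one alert group."""
    lines = [f"{len(batch)} failing mirror(s) for alert group {group}:"]
    lines += [f"  - {alert.mirror}" for alert in batch]
    return "\n".join(lines)
```

With something like this, send() would send one email per key of `batch_by_group(...)` and then mark every alert in that batch as sent.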

Batch alerts per forge

This solution is a bit harder to implement. Alerts are still separated by alert group, but each alert group receives one email per forge, with that forge's alerts batched together: e.g. one alert for git.kernel.org, one for git.savannah.gnu.org, and so on for every forge.
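A rough sketch of how the per-forge batching could work, assuming the forge can be identified by the host name of a mirror's upstream URL. That assumption, the `(group, upstream_url)` input shape, the example URLs, and the `forge_of` and `batch_by_group_and_forge` helpers are all hypothetical and not taken from the Lorry codebase.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def forge_of(upstream_url):
    """Use the URL's host name as the forge name (an assumption).

    URLs without a parseable host, such as scp-style
    "git@host:path" remotes, fall back to "unknown".
    """
    return urlsplit(upstream_url).hostname or "unknown"


def batch_by_group_and_forge(alerts):
    """Group unsent alerts by (alert group, forge).

    `alerts` is an iterable of (group, upstream_url) pairs.
    Each resulting key corresponds to one email.
    """
    batches = defaultdict(list)
    for group, url in alerts:
        batches[(group, forge_of(url))].append(url)
    return dict(batches)
```

For example, two failing mirrors hosted on git.kernel.org and one hosted on git.savannah.gnu.org, all in the same alert group, would produce two emails for that group instead of three.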
