Reintroduce service finish telemetry
Problem
Service finish telemetry was disabled after it caused event volume to spike from 60k to 2.5M per day. The spike came from crash-looping services, which sent a telemetry event every time they exited, even when the exit code had not changed.
Impact
Without this telemetry, we can't see when services actually fail, which makes debugging harder.
Proposal
Reintroduce service finish telemetry, but emit an event only when a service's exit code changes.
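A minimal sketch of the proposed filtering, written in Python purely for illustration and not taken from the GDK codebase. The class name `FinishTelemetryFilter`, the `emit` callback, and the service names are all hypothetical. It assumes the last exit code per service can be held in memory; a real implementation would likely need to persist that state so it survives restarts of the supervising process.

```python
from typing import Callable, Dict


class FinishTelemetryFilter:
    """Forward a finish event only when a service's exit code changes.

    Hypothetical helper: `emit` stands in for whatever function actually
    sends the telemetry event.
    """

    def __init__(self, emit: Callable[[str, int], None]) -> None:
        self._emit = emit
        # Last exit code seen per service name (in-memory assumption).
        self._last_exit_code: Dict[str, int] = {}

    def record_finish(self, service: str, exit_code: int) -> bool:
        """Record a service exit; return True if an event was emitted."""
        if self._last_exit_code.get(service) == exit_code:
            # Same exit code as last time: suppress the duplicate event.
            return False
        self._last_exit_code[service] = exit_code
        self._emit(service, exit_code)
        return True


if __name__ == "__main__":
    sent = []
    telemetry = FinishTelemetryFilter(lambda name, code: sent.append((name, code)))

    # A crash-looping service exits with the same code three times.
    for _ in range(3):
        telemetry.record_finish("example-service", 1)

    # The exit code changes, so a second event is emitted.
    telemetry.record_finish("example-service", 0)
    print(sent)
```

With this approach a crash-looping service produces one event per distinct exit code transition rather than one per exit, while the first failure and any later change in exit code remain visible.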
Impacted categories
The following categories relate to this issue:
- gdk-reliability - e.g. When a GDK action fails to complete.
- gdk-usability - e.g. Improvements or suggestions around how the GDK functions.
- gdk-performance - e.g. When a GDK action is slow or times out.
Edited by Nao Hashizume