Add last error to service finish telemetry
What does this merge request do and why?
This MR adds the last error to service finish telemetry. It uses GDK::ConfigRedactor to protect sensitive data by removing entire lines that contain sensitive keywords and replacing other sensitive values with "[redacted]".
Related to #2958 (closed)
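The two-step redaction described above (drop whole lines that mention a sensitive keyword, then mask remaining sensitive values) can be sketched roughly as follows. This is an illustration only, not the actual GDK::ConfigRedactor implementation: the module name RedactionSketch, the keyword list, and the URL-credential pattern are all assumptions made for the example.

```ruby
# Hypothetical sketch of the redaction idea; NOT the real GDK::ConfigRedactor.
module RedactionSketch
  # Assumed keyword list; the real redactor may use different keywords.
  SENSITIVE_KEYWORDS = /password|secret|token|private_key/i

  # Assumed example of "other sensitive values": credentials embedded in a
  # URL, i.e. the user:pass part of scheme://user:pass@host.
  URL_CREDENTIALS = %r{(?<=://)[^/\s:@]+:[^/\s@]+(?=@)}

  REDACTED = '[redacted]'

  # Returns the text with keyword-bearing lines removed entirely and
  # URL credentials on the remaining lines replaced by "[redacted]".
  def self.redact(text)
    text.each_line(chomp: true)
        .reject { |line| line.match?(SENSITIVE_KEYWORDS) }
        .map { |line| line.gsub(URL_CREDENTIALS, REDACTED) }
        .join("\n")
  end
end
```

With this sketch, a line such as token = "glrt-abc123" would be dropped from the last error, while dial postgres://gdk:hunter2@localhost:5432 failed would be kept with the credentials masked. Dropping the whole line is the more conservative choice for keyword matches, since the value's format is unknown and partial masking could leak part of it.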
How to set up and validate locally
- Start a crash-looping service (e.g., temporarily misconfigure the runner with mv ~/.gitlab-runner/config.toml ~/.gitlab-runner/config.toml.bak).
- Check for the last_error value in ClickHouse with this query:
SELECT
derived_tstamp,
visitParamExtractString(custom_event_props, 'value') as service,
JSONExtractString(custom_event_props, 'extras', 'exit_code') as exit_code,
JSONExtractString(custom_event_props, 'extras', 'last_error') as last_error
FROM snowplow_events
WHERE custom_event_name = 'Custom service_finish'
AND JSONExtractString(custom_event_props, 'extras', 'exit_code') != '0'
AND derived_tstamp >= '2025-09-09 11:00:00'
AND user_id = '<your-username>'
ORDER BY derived_tstamp DESC
LIMIT 10
Impacted categories
The following categories relate to this merge request:
- gdk-reliability - e.g. When a GDK action fails to complete.
- gdk-usability - e.g. Improvements or suggestions around how the GDK functions.
- gdk-performance - e.g. When a GDK action is slow or times out.
Merge request checklist
- This MR references an issue describing the change.
- This change is backward compatible. If not, please include steps to communicate to our users.
- Tests added for new functionality. If not, please raise an issue to follow-up.
- Documentation added/updated, if needed.
- Announcement added, if change is notable.
- gdk doctor test added, if needed.
Edited by Nao Hashizume