Rollout Plan for Error Tracking backed by ClickHouse
Goal
Progressively roll out the ClickHouse-backed Error Tracking service from staging to production, including an early-adopter program.
Progressive Rollout plan
High Level Steps:
| Step | Description |
|---|---|
| 1 | Enable error tracking API on staging.gitlab.com |
| 2 | Enable error tracking feature flag for staging load testing |
| 3 | Deploy error tracking API on production gitlab.com |
| 4 | Enable error tracking feature flag for production load testing of internal project |
| 5 | Enable error tracking feature flag for production, internal project availability |
| 6 | Enable error tracking feature flag for production, customer opt-in project / closed beta availability |
| 7 | Enable error tracking feature flag for production, open beta availability |
Detailed Rollout Steps:
✅ Staging
- Staging infrastructure is provisioned
- Deploy error tracking API on a custom GitLab instance
- Deploy error tracking API on staging.gitlab.com
- Admin setting configuration (In Progress)
- Create an OAuth application at instance level
- Enable error tracking feature flag for staging.gitlab.com test project (/chatops run feature set --project=observability-team-test/error-tracking-test integrated_error_tracking true --staging)
- Enable ClickHouse error tracking feature flag for staging.gitlab.com test project (/chatops run feature set --project=observability-team-test/error-tracking-test use_click_house_database_for_error_tracking true --staging)
- Review error budgets and performance metrics for anomalies
- Perform load testing on staging (see #1672 for stages)
- Socialize status in #development and #infra-lounge channels and EWIR
✅ Production Prerequisites
- Production infrastructure is provisioned
- Domain configured
- Ensure documentation has been updated: gitlab-org/gitlab!95239 (merged)
- Enable error tracking ingestion API on gitlab.com (#1753)
- Enable error tracking feature flag for gitlab.com test project (/chatops run feature set --project=theoretick/error-tracking-test integrated_error_tracking true)
- Enable ClickHouse error tracking feature flag for gitlab.com test project (/chatops run feature set --project=theoretick/error-tracking-test use_click_house_database_for_error_tracking true)
- Enable error tracking feature flag for gitlab-org group (/chatops run feature set --group=gitlab-org integrated_error_tracking true --production)
- Enable ClickHouse error tracking feature flag for gitlab-org group (/chatops run feature set --group=gitlab-org use_click_house_database_for_error_tracking true --production)
- Enable error tracking feature flag for gitlab.com opt-in internal project release-tools (/chatops run feature set --project=gitlab-org/release-tools integrated_error_tracking true)
- Enable error tracking feature flag for gitlab.com opt-in internal project customers-gitlab-com (/chatops run feature set --project=gitlab-org/customers-gitlab-com integrated_error_tracking true)
- Review error budgets and performance metrics for anomalies
- Ensure Error Tracking support for Java, Python, and JS
- InfraSec review
- Production monitoring
- Enable Error Tracking on Release tools project
- Identify Thanos queries that could be useful for debugging
- Add Runbook entry outlining Thanos queries for SRE on-call
✅ Production Open Beta rollout (Completed 9/29)
- Coordinate a time to enable the flag with the SRE on-call and release managers
  - In #production, mention @sre-oncall and @release-managers. Once an SRE on call and a Release Manager on call confirm, you can proceed with the rollout
- Enable on GitLab.com by running chatops command in #production
- Roll out to 100%
- Cross-post chatops Slack command to #support_gitlab-com (more guidance on when this is necessary in the dev docs) and in your team channel
- Announce on the issue that the flag has been enabled (https://docs.gitlab.com/ee/development/feature_flags/controls.html#cleaning-up) by running chatops command in #production channel
- Review error budgets and performance metrics for anomalies
Rollback Steps
- This feature can be disabled by running the following chatops commands:
  - /chatops run feature set --project=gitlab-org/gitlab use_click_house_database_for_error_tracking false
  - /chatops run feature set --project=gitlab-org/gitlab integrated_error_tracking false
- The error tracking ingestion API can be disabled directly
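
The rollback commands above follow the same shape as every enable command in this plan, so they can be generated rather than typed by hand. The following is a minimal sketch, not part of the actual rollout tooling: the helper names `feature_set_command` and `rollback_commands` are hypothetical, and the only command syntax it assumes is what already appears in this issue (`--project=` or `--group=`, the flag name, `true`/`false`, and an optional `--staging` or `--production` suffix). It also assumes that flags should be disabled in the reverse of the order they were enabled, which matches the order shown in the rollback steps.

```python
from typing import List, Optional

# Order in which the two flags are enabled throughout this plan.
ENABLE_ORDER = [
    "integrated_error_tracking",
    "use_click_house_database_for_error_tracking",
]


def feature_set_command(
    flag: str,
    enabled: bool,
    *,
    project: Optional[str] = None,
    group: Optional[str] = None,
    env: Optional[str] = None,
) -> str:
    """Build a chatops 'feature set' command string (hypothetical helper)."""
    # Exactly one scope must be given: a project path or a group path.
    if (project is None) == (group is None):
        raise ValueError("pass exactly one of project= or group=")
    # Only the environment suffixes seen in this issue are accepted.
    if env not in (None, "staging", "production"):
        raise ValueError(f"unknown environment: {env!r}")

    scope = f"--project={project}" if project is not None else f"--group={group}"
    parts = ["/chatops", "run", "feature", "set", scope, flag,
             "true" if enabled else "false"]
    if env is not None:
        parts.append(f"--{env}")
    return " ".join(parts)


def rollback_commands(project: str, env: Optional[str] = None) -> List[str]:
    """Disable both flags for a project, last-enabled flag first (assumption)."""
    return [
        feature_set_command(flag, False, project=project, env=env)
        for flag in reversed(ENABLE_ORDER)
    ]


if __name__ == "__main__":
    for command in rollback_commands("gitlab-org/gitlab"):
        print(command)
```

Running the script prints the two rollback commands listed above for `gitlab-org/gitlab`. The strings still have to be pasted into the appropriate Slack channel; the sketch only reduces the chance of a typo in a flag name or scope argument.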