Evaluate timed incremental rollout on hosting servers
We're going to ship the delayed job feature https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/21767 for AutoDevOps timed incremental rollout https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/22023.
We're going to verify that it works on our hosting servers in AutoDevOps timed incremental rollout mode.
Evaluation plan
dev.gitlab.org
- Enabled term: 8th, Oct. ~
- Evaluation date: 8th, Oct.
- Wait for dev.gitlab.org daily sync
- Create a sample project with new AutoDevOps deployment strategy to make sure it's fully functional.
- Check health (metrics, logs and crash reports)
staging.gitlab.com
- Enabled term: 10th?, Oct. ~ (It depends on RM's plan)
- Evaluation date: 10th, Oct.
- RC with the new code has been deployed
- Create a sample project with new AutoDevOps deployment strategy to make sure it's fully functional.
- Check health (metrics, logs and crash reports)
gitlab.com
- Enabled term: 10th?, Oct. ~ (It depends on RM's plan)
- Evaluation date: 10th, Oct. ~ 10th, Nov.
- RC with the new code has been deployed
- Create a sample project with new AutoDevOps deployment strategy to make sure it's fully functional.
- Check health (metrics, logs and crash reports)
NOTE:
- This feature is behind the feature flag ci_enable_scheduled_build, but it's enabled by default (See https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/21767#note_106332776)
- This feature doesn't run unless users manually change their deployment strategy in AutoDevOps. Presumably, users will start trying it after we publish the 11.4 release post on 22nd, Oct., and usage of the new Sidekiq workers will gradually increase, so that is the period when we should keep an eye on the servers' health.
Check health (metrics, logs and crash reports)
- Does the new worker Ci::BuildScheduleWorker run properly? This uses the pipeline_processing namespace (priority: 5 (highest)).
- Does the new worker Ci::BuildScheduleWorker put pressure on other workers in the same namespace?
- Are there any stale delayed jobs? Ci::Build.stale_schedule.count should be zero.
- Are there any crash reports related to this feature on Sentry?
- Does StuckCIJobWorker use an Index Scan properly? (Ref: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/21767#note_106811708) (cc @abrandl)
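The health checks in this list could be recorded as a small pass/fail summary per environment. The following is a minimal sketch in plain Ruby, not GitLab code: HealthSnapshot, evaluation_findings and their fields are hypothetical names, and the values are assumed to be collected by hand (for example, the stale count from Ci::Build.stale_schedule.count in a Rails console and the crash report count from Sentry).

```ruby
# Hypothetical container for values collected by hand during an evaluation.
# None of these names exist in GitLab; they are for illustration only.
HealthSnapshot = Struct.new(
  :stale_scheduled_count, # e.g. the result of Ci::Build.stale_schedule.count
  :sentry_error_count,    # crash reports related to this feature
  :worker_ran,            # did Ci::BuildScheduleWorker process jobs?
  keyword_init: true
)

# Returns a list of human-readable findings; an empty list means every
# check recorded in the snapshot passed.
def evaluation_findings(snapshot)
  findings = []
  findings << "Ci::BuildScheduleWorker did not run" unless snapshot.worker_ran
  if snapshot.stale_scheduled_count.positive?
    findings << "#{snapshot.stale_scheduled_count} stale delayed job(s) found"
  end
  if snapshot.sentry_error_count.positive?
    findings << "#{snapshot.sentry_error_count} related crash report(s) on Sentry"
  end
  findings
end

healthy = HealthSnapshot.new(stale_scheduled_count: 0, sentry_error_count: 0, worker_ran: true)
puts evaluation_findings(healthy).empty? ? "OK" : "Needs attention"
```

The index scan and queue pressure questions are left out of the sketch on purpose, since they need a query plan and Sidekiq metrics rather than a single number.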
Feature flag
The feature flag's name is ci_enable_scheduled_build. The new AutoDevOps deployment strategy, timed incremental rollout, is based on the delayed job feature. By disabling ci_enable_scheduled_build, we can effectively revert the timed incremental rollout to manual incremental rollout (this also stops new delayed jobs from being created).
Feature.enabled?('ci_enable_scheduled_build') # Check if it's enabled
Feature.enable('ci_enable_scheduled_build') # Enable the feature
Feature.disable('ci_enable_scheduled_build') # Disable the feature
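The fallback behaviour described in this section can be pictured with a small model. This is only a sketch under stated assumptions: effective_rollout and the strategy symbols (:timed_incremental, :manual_incremental, :continuous) are hypothetical names chosen for illustration and do not mirror GitLab's implementation.

```ruby
# Illustrative model only; the method name and strategy symbols are
# hypothetical. It captures one rule from this issue: with the flag
# disabled, a timed incremental rollout behaves as a manual one.
def effective_rollout(configured_strategy, scheduled_build_enabled)
  if configured_strategy == :timed_incremental && !scheduled_build_enabled
    :manual_incremental
  else
    configured_strategy
  end
end

puts effective_rollout(:timed_incremental, true)   # prints "timed_incremental"
puts effective_rollout(:timed_incremental, false)  # prints "manual_incremental"
puts effective_rollout(:continuous, false)         # prints "continuous"
```

Other strategies pass through unchanged in this model, which matches the note that the feature only runs for users who changed their deployment strategy.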
Edited by Shinya Maeda