Refactor coordinated pipeline to use bridge-jobs
We recently confirmed bridge jobs fit our purposes for coordinated pipelines #1715 (closed). Using bridge jobs has some major benefits:
- It allows us to quickly react to downstream pipeline failures and save on CI minutes by removing the active waiting jobs.
- It simplifies migrating tasks from deployer to release-tools.
Before moving more items from deployer to release-tools, we need to refactor the jobs we have today to use bridge-jobs instead of active waiting.
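For illustration, a bridge job in GitLab CI is a job that uses the trigger keyword instead of a script. The snippet below is a minimal sketch, not the configuration from the merge requests above: the job name, stage, project path, and variable are hypothetical placeholders. With strategy: depend, the bridge job mirrors the status of the downstream pipeline, so no runner has to sit in a polling job waiting for it.

```yaml
# Hypothetical sketch of a bridge job; names and paths are placeholders.
deploy:staging:
  stage: deploy
  variables:
    # Hypothetical variable forwarded to the downstream pipeline.
    DEPLOY_ENVIRONMENT: staging
  trigger:
    # Placeholder path; the real downstream project is not stated here.
    project: example-group/deployer
    branch: master
    # The bridge job succeeds or fails together with the downstream
    # pipeline, replacing a separate "active waiting" job.
    strategy: depend
```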
To do
- Refactor .gitlab-ci.yml gitlab-org/release-tools!1448 (merged)
- Test the refactor of .gitlab-ci.yml to ensure everything is working as intended #1720 (comment 574282416)
- Implement bridge-jobs on the gitlab-ci.yml gitlab-org/release-tools!1452 (merged)
- Testing
Testing
❌ With env variable
- Merge gitlab-org/release-tools!1452 (merged)
- [-] Change BRIDGE_JOBS from false to true on https://ops.gitlab.net/gitlab-org/release/tools/-/settings/ci_cd
- [-] Ensure deployments are triggered individually to our different environments
Results
Using environment variables made the process hard to reason about. Several errors occurred while trying to enable/disable the environment variable.
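As a rough illustration of why a variable toggle is hard to reason about, the sketch below shows one way such a switch could be expressed with rules. This is an assumption about the general shape, not the configuration from gitlab-org/release-tools!1452: apart from BRIDGE_JOBS, every job name, path, and script is hypothetical. Each deployment step needs two mutually exclusive definitions, and any job that depends on one of them has to cope with the other being absent.

```yaml
# Hypothetical sketch; only the BRIDGE_JOBS variable comes from the issue.
deploy:staging:bridge:
  stage: deploy
  rules:
    # Only present in the pipeline when the toggle is on.
    - if: '$BRIDGE_JOBS == "true"'
  trigger:
    project: example-group/deployer  # placeholder path
    strategy: depend

deploy:staging:wait:
  stage: deploy
  rules:
    # Only present in the pipeline when the toggle is off.
    - if: '$BRIDGE_JOBS != "true"'
  script:
    - ./wait-for-deployment.sh  # hypothetical active-waiting script
```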
Details
- release/tools configuration in ops failed https://ops.gitlab.net/gitlab-org/release/tools/-/pipelines/624303
- optional:true was added to prevent the errors in ops gitlab-org/release-tools!1465 (merged)
- Additional configuration caused errors in different processes https://ops.gitlab.net/gitlab-org/release/tools/-/jobs/3967802
- With that it was decided to revert the bridge implementation gitlab-org/release-tools!1466 (merged) and consider a different approach for testing.
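Assuming the optional:true mentioned above refers to the optional key of the needs keyword in GitLab CI, a minimal sketch of its use could look like the following. The job names and script are hypothetical. Marking a need as optional lets the pipeline be created even when the needed job is excluded by its rules.

```yaml
# Hypothetical sketch, assuming optional:true means needs:optional.
deploy:canary:
  stage: deploy
  needs:
    # Without optional: true, pipeline creation fails when this job
    # is not present in the pipeline.
    - job: deploy:staging:bridge
      optional: true
  script:
    - echo "deploying to canary"  # placeholder
```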
✅ With a different branch
- Prepare a branch with only the bridge-jobs configuration gitlab-org/release-tools!1467 (diffs)
- Push gitlab-ci-next-gen to ops https://ops.gitlab.net/gitlab-org/release/tools/-/tree/gitlab-ci-next-gen
- Add gitlab-ci-next-gen to protected branches on ops https://ops.gitlab.net/gitlab-org/release/tools/-/settings/repository
- Add a fake commit to the current auto-deploy branch.
- Modify auto_deploy:tag to use that branch instead of master https://ops.gitlab.net/gitlab-org/release/tools/-/pipeline_schedules/74/edit
- Ensure deployments are triggered individually to our different environments - #1720 (comment 600692414)
Results
- May 31, 2021 - A deployment pipeline using bridge-jobs was successfully triggered. There are some details to fix before the next run. Details on #1720 (comment 589377368)
- June 4, 2021 - Deployments to staging and canary were successfully executed using bridge jobs. Next step is to use the new configuration for a longer period of time. Details on #1720 (comment 593386339)
- June 7, 2021 - A deployment to canary was not triggered automatically after the deployment to staging failed, retried and then succeeded. We're currently investigating why #1720 (comment 594797580)
- June 8, 2021 - It was discovered that the feature flag was not enabled on ops. After enabling it, bridge-jobs behave as expected. #1720 (comment 595802756)
- June 14, 2021 - Deployments have continued as usual for a week without any issue, details on #1720 (comment 600692414). With that, we're ready to merge gitlab-org/release-tools!1467 (merged) into master
Clean up
- Remove gitlab-ci-next-gen from ops
- Remove gitlab-ci-next-gen from protected branches
- Modify auto_deploy:tag to use master
What to do if something goes wrong?
- Restore auto_deploy:tag to use the master branch
- The next package will be built using the regular configuration.
Development log
- May 10th, 2021 - An MR refactoring .gitlab-ci.yml into multiple files was submitted gitlab-org/release-tools!1448 (merged)
- May 11th, 2021 - gitlab-org/release-tools!1448 (merged) was merged
- May 12th, 2021 - Testing was performed to ensure the refactor didn't affect our workflows #1720 (comment 574282416)
- May 14th, 2021 - An MR implementing bridge-jobs was created gitlab-org/release-tools!1452 (merged)
- May 21st, 2021 - gitlab-org/release-tools!1452 (merged) was submitted for review.
- May 27th, 2021 - Testing using the environment variable was not performed due to several errors; see Testing > With env variable
- May 31st, 2021
  - An MR with only the bridge-jobs CI configuration was created gitlab-org/release-tools!1467 (diffs). For testing purposes, we'll adjust auto_deploy:tag to use this branch.
  - Results of the testing here #1720 (comment 589377368)
- June 4th, 2021 - A round of testing was done using bridge-jobs. Staging and canary deployments were performed successfully. Details here #1720 (comment 593386339)
- June 7th, 2021 - A deployment to canary was not triggered automatically after the deployment to staging failed, retried and then succeeded. We're currently investigating why #1720 (comment 594797580)
- June 8th, 2021 - The feature flag that fixed bridge-jobs was not enabled on ops.gitlab.net. After enabling it, the bridge-jobs behave as expected.
- June 14th, 2021 - Deployments have continued as usual. We're moving forward with gitlab-org/release-tools!1467 (merged)
Follow-ups
Edited by Mayra Cabrera