Reviewing the GitLab handbook for any needed changes
Do an announcement and possible Q&A ahead of the final changeover
Do an announcement once the changeover is complete
Changes that have been identified as needing to be done
Documentation about staging-canary as an environment, and how to use it
Add a full page about canary environments and how they work (and how to do some specific tasks) under the environments section of the handbook gitlab-com/www-gitlab-com!99839 (merged)
Update environments page with complete information about staging-canary, making sure to include information about the short names we use (e.g. "gstg-cny") gitlab-com/www-gitlab-com!99435 (merged)
Documentation/runbooks about what to do when various parts of the deploy pipeline fail/go wrong
Update hotpatch documentation to be accurate and provide information on what to do or consider post pipeline-reorder here
Write up announcements relating to start of testing/cutover #2280 (closed)
Acceptance Criteria
Developers at GitLab have accurate documentation about the staging-canary environment, and how to use it
Developers at GitLab have accurate documentation about the new pipeline order, and can determine the order in which their code progresses to production
Release Managers have accurate documentation about what the new pipeline looks like
Release Managers have up-to-date documentation about how to deploy a release to a specific environment, that covers any considerations with the pipeline re-order
Release Managers have up-to-date documentation about rollbacks that covers any considerations with the pipeline re-order
Release Managers have up-to-date documentation about hotpatches that covers any considerations with the pipeline re-order
@rspeicher If you think there is anything else that needs to be added in order to make sure GitLab developers are across the changes and impact of what we are doing, let me know.
Graeme Gillies changed title from Do communication and documentation updates for new auto-deploy pipeline to Stage 4: Do communication and documentation updates for new auto-deploy pipeline
In production#6179 (comment 815845454) we had some extra confusion caused by not many people knowing that the gstg-cny tests are also hitting gstg. It could be a good idea to get a first iteration of the gstg-cny set up and use merged into the handbook to help with future incidents.
Using this issue to track the announcements and links we want to use.
I am thinking of the following
Using issue #2004 (closed) as the main issue for providing context around this change. It has the diagram of what we are doing, why, and the history/context behind the decision
Using the following issue #2235 (closed) to capture feedback/issues/questions about the cutover (to keep everything related to the cutover in 1 place). We could also offer to ask in #g_delivery in slack
Is what we now have in the handbook good enough to capture how things are with the pipeline reorder?
It gives a basic idea, although I suspect that for engineers not familiar with the auto-deploy process, picturing the coordinated pipeline might be complicated. From the issue description:
Add documentation about the new deploy pipeline to the handbook here
It'd be nice to have a general diagram that shows the auto-deploy process. For starters, we could add this diagram along with an explanation into release/docs and link it from the release page.
Using the following issue #2235 (closed) to capture feedback/issues/questions about the cutover (to keep everything related to the cutover in 1 place). We could also offer to ask in #g_delivery in slack
I think we should use a dedicated issue for the announcement and to capture questions if any. #2235 (closed) has testing details and long conversations that might not interest most people.
This is a good start. However, I believe we need to be more specific: the issues pointed out there are lengthy, and I don't expect many people to read through them. We should have a dedicated announcement issue that offers guidance about the changes and highlights the information we want engineers to know, for example:
What changed in the auto-deploy process? => Briefly describe the new order and how it differs from the previous one.
Why are we changing the order? => Explain the mixed-deployment problem we're tackling.
As an engineer, how does this impact me? => There isn't any impact on development velocity and we should be clear about it.
How does this impact QA tests? => It doesn't, QA smoke and full pipelines are still scheduled after each deployment.
Where can I find more information? => Links to all the related issues
Here are some examples of announcements we've made in the past:
@mayra-cabrera thanks for the feedback and linking to previous issues, this gives me the full clarity on what we have done in the past and how (with some more context).
Thanks @ggillies! I noticed that "As an engineer, how does this impact me?" links to the new canary page (being worked on in gitlab-com/www-gitlab-com!99839 (merged)). What do you think about being more transparent/direct about where the canary information can be found?
For a detailed overview of the canary stage and how to access it, please read over the [dedicated canary documentation](link to the handbook).
Updated description to include a concrete list of things that need to be done based off feedback from @amyphillips on the main epic (this list will likely grow and/or change)
Graeme Gillies changed the description
Graeme Gillies marked the checklist item Add a full page about canary environments and how they work (and how to do some specific tasks) under the environments section of the handbook gitlab-com/www-gitlab-com!99839 (merged) as completed
@ggillies and @mayra-cabrera can we add a bit to the acceptance criteria for this issue? Unless documentation already exists (I cannot find any), it would benefit our development teams to understand the need to ensure that their code is backward compatible within stages. Perhaps we could bolster the fact that code should be backward compatible between versions with some added information about the fact that we have stages, canary and main. What do you think?
@skarbek I've updated the issue description with some acceptance criteria. In my mind, I am trying to focus on documentation relating to the pipeline re-order and the new state of things.
As for making sure code is backward compatible, nothing has changed on that front as part of the pipeline re-order. With our current pipeline, developers already have to be aware of this (with production-canary and production). The closest I have found to making developers aware of this is the general GitLab compatibility docs [https://docs.gitlab.com/ee/development/multi_version_compatibility.html], which call out that we support zero-downtime upgrades (meaning multiple versions of the same code running at once). Also, highlighting that the QA suites we have in each environment now cover mixed-deployment testing should hopefully make that clearer to developers as well. Do you think that is enough?
Unfortunately, what I have discovered while doing this documentation is that there are a lot of pieces missing across many parts of it. I'm trying to balance delaying the pipeline reorder in order to fix all the gaps in our documentation against focusing on the documentation that covers what has changed and keeps people moving day to day, and lodging issues to improve the documentation later.
With you being a release manager (and someone in Delivery outside the epic), I'm more than happy to do further documentation to make sure you feel confident you can keep doing RM work around auto-deploys for GitLab.com post reorder.
Graeme Gillies marked the checklist item Update environments page with complete information about staging-canary, making sure to include information about the short names we use (e.g. "gstg-cny") gitlab-com/www-gitlab-com!99435 (merged) as completed
Graeme Gillies marked the checklist item Documentation about staging-canary as an environment, and how to use it as completed
I wonder if we should specifically call out our new labeling order. This is expected given the behavior of pipelines, but for those who are not yet familiar with the pipeline reorder, the labels being added by our tooling are a bit goofy:
@skarbek I'm obviously too deep in this project and the re-order to understand the ask here. To me the order above looks correct? As part of the re-order things go staging-canary, canary, staging/prod. How are the labels goofy (beyond the fact that the re-order itself could be considered goofy with things going to canary before staging)?
Also where/what extra documentation would you like to see here? I've updated the section at #2280 (closed) to highlight that our automation will be putting comments and labels on things differently than before (but should match the new order), is there any other place you would like to see me call this out? Thanks!
Update hotpatch documentation to be accurate and provide information on what to do or consider post pipeline-reorder here
Is there actually anything we need to add here? Looking at patcher and thinking closer, I don't think anything has changed as part of the pipeline re-order
Patches are applied in sorted order. To specify the order prefix the patch with a sort key such as
I recall hot-patches are applied to gstg -> gprd-cny -> production. Do we need to change the order to match the new deployment order (gprd-cny -> gstg -> production)? I think not. Thoughts?
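To make the comparison above easier to picture, here is a minimal sketch of the two orders as plain lists. It is only an illustration built from what is said in this thread, not something taken from the release tooling: the list names, the helper functions, and the use of "gprd" as the short name for production are all assumptions, and staging-canary is placed first per the "staging-canary, canary, staging/prod" description earlier in the thread.

```python
# Hypothetical model of the orders discussed in this thread. Nothing here comes
# from the release tooling; "gprd" as the short name for production is an assumption.
NEW_DEPLOY_ORDER = ["gstg-cny", "gprd-cny", "gstg", "gprd"]
HOTPATCH_ORDER = ["gstg", "gprd-cny", "gprd"]


def environments_before(order, env):
    """Return the environments that receive a change before `env`."""
    if env not in order:
        raise ValueError(f"unknown environment: {env}")
    return order[: order.index(env)]


def differing_pairs(order_a, order_b):
    """Return pairs of shared environments whose relative order differs."""
    shared = [e for e in order_a if e in order_b]
    pairs = []
    for i, first in enumerate(shared):
        for second in shared[i + 1:]:
            # `first` precedes `second` in order_a; flag it if order_b disagrees.
            if order_b.index(first) > order_b.index(second):
                pairs.append((first, second))
    return pairs


if __name__ == "__main__":
    print(environments_before(NEW_DEPLOY_ORDER, "gstg"))
    print(differing_pairs(NEW_DEPLOY_ORDER, HOTPATCH_ORDER))
```

Under these assumptions, the only pair whose relative order differs between the two lists is gprd-cny and gstg, which is the difference the question above is asking about.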
Graeme Gillies marked the checklist item Update hotpatch documentation to be accurate and provide information on what to do or consider post pipeline-reorder here as completed
Graeme Gillies marked the checklist item Documentation/runbooks specifically about the pipeline re-order as completed
Developers at GitLab have accurate documentation about the new pipeline order, and can determine the order in which their code progresses to production
Release Managers have up-to-date documentation about how to deploy a release to a specific environment, that covers any considerations with the pipeline re-order