Executable Runbooks for Releases MVC
Problem to solve
Managing releases in GitLab with a traditional manual or semi-automated release checklist is difficult. We have this problem ourselves at GitLab: for an example of how this is done here today, see how we managed the 11.4 release at gitlab-org/release/tasks#462 (closed) and gitlab-org/release/tasks#460 (closed). Manual tasks are defined in markdown in the issue description, then discussed and checked off as things go. This is a pain for a few reasons:
- Context is minimal. Detailed instructions could be linked to, but they aren't naturally in place in the checklist (it would get too long)
- Clicking on checkboxes is error-prone, and there's no separate way to validate that something actually happened
- It's not possible to see how the plan is changing over time
- It's not possible to measure the performance/efficiency of the plan
We should provide a better way to support these kinds of workflows within GitLab, without breaking pipelines and turning them into manual processes.
The target audience is Release Managers and teams involved in executing releases. The main difference between this audience and a pipeline implementer for a .gitlab-ci.yml is that the authors of these kinds of pipelines are much less technical, and even editing YAML might be a challenge (though they will still need to understand markdown).
Our internal customers are the #production team (for runbooks in general) and #delivery team (for release plans).
We have already delivered Jupyter Runbooks, which allow for mixing code and documentation, via the issue &380. The configure team is also adding the ability to check these into source control via https://gitlab.com/gitlab-org/gitlab-ce/issues/47138. Finally, we are also targeting "draft/in-progress releases" to be created via https://gitlab.com/gitlab-org/gitlab-ce/issues/38105 in this same release, which will create a natural home for an associated release-focused runbook.
What this MVC will introduce is the following:
- The ability to tie a release to an instance of a runbook. We won't enforce that the release is pending, but that will be the main use case.
- Release pages/lists should link to the runbook and show at least minimal status of the runbook (running, complete, etc.). More info is better, but we need to keep to MVC-level scope.
Because runbooks aren't distributed with GitLab, we need to ensure that the button/UX points to the documentation required to get it set up: https://docs.gitlab.com/ee/user/project/clusters/runbooks/
Finally, this feature will be heavier than usual on the documentation side to ensure people know what to do with the MVC version, and also what a good release runbook would look like. @jlenny will help with the documentation on this item, and also write a blog post for awareness/additional help, perhaps focused on how to translate an Excel runbook into a Jupyter runbook.
Why Runbooks for Releases
You can look at a traditional release plan as a kind of state machine for deploying applications. This could be done in something like Excel (which more people are still using than you might expect), or a workflow tool like Electric Flow. In short, a state machine like this lets you see:
- What the overall status is
- Which tasks are on track
- What is failing
Typically, releases will have an overall due date and a work-back plan for delivering. Some items may depend on or block other items. Some of these items may be automation happening in GitLab, some may be issue/MR states, some may be wholly manual tasks. The power of the runbook is that you can combine all of these things into one plan, without building manual jobs and that sort of thing into the automation pipeline.
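To make the state machine idea concrete, here is a minimal sketch in Python (the language a Jupyter runbook cell would typically use). It is an illustration only, not a proposed implementation: the `Task` and `Status` names, the task kinds, and the example plan are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETE = "complete"
    FAILED = "failed"


@dataclass
class Task:
    name: str
    kind: str  # hypothetical labels: "automated", "issue", or "manual"
    depends_on: list = field(default_factory=list)
    status: Status = Status.PENDING


def blocked_tasks(tasks):
    """Names of pending tasks that still wait on an incomplete dependency."""
    by_name = {t.name: t for t in tasks}
    return [
        t.name
        for t in tasks
        if t.status is Status.PENDING
        and any(by_name[d].status is not Status.COMPLETE for d in t.depends_on)
    ]


def plan_status(tasks):
    """Roll individual task states up into one status for the whole plan."""
    if any(t.status is Status.FAILED for t in tasks):
        return Status.FAILED
    if all(t.status is Status.COMPLETE for t in tasks):
        return Status.COMPLETE
    if any(t.status in (Status.RUNNING, Status.COMPLETE) for t in tasks):
        return Status.RUNNING
    return Status.PENDING


# Hypothetical plan mixing an automated step with manual ones.
plan = [
    Task("build packages", "automated", status=Status.COMPLETE),
    Task("QA sign-off", "manual", depends_on=["build packages"], status=Status.RUNNING),
    Task("publish release post", "manual", depends_on=["QA sign-off"]),
]

print(plan_status(plan).value)
print(blocked_tasks(plan))
```

The rolled-up value from `plan_status` is the kind of minimal status (running, complete, etc.) a release page could show next to the runbook link.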
Runbooks vs. Excel
There are several ways in which what we build can be better than using Excel:
- Automatic value stream analysis (how long are things taking, how much is automated, what tasks are taking a long time and could be opportunities to improve efficiency). This is what release orchestration products on the market primarily offer.
- Built-in code execution: push a button directly in the runbook to perform a task, whether implemented inside or outside of GitLab.
- Integration with our releases feature, tying in with capabilities like https://gitlab.com/gitlab-org/gitlab-ce/issues/56030 (evidence collection in releases), or calls out to pipelines. This is powerful, and takes advantage of our 'single-application' nature to offer better features.
- ChatOps integration
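As a rough illustration of the value stream analysis mentioned in the first item above, the sketch below computes total elapsed time, the share of automated tasks, and the slowest task from a list of task records. The record fields (`name`, `started`, `finished`, `automated`) and the sample data are assumptions made for this example; nothing here reflects an existing GitLab data model.

```python
from datetime import datetime, timedelta


def value_stream_summary(records):
    """Summarize how long tasks took and how much of the plan is automated.

    Expects a non-empty list of dicts with the hypothetical keys
    "name", "started", "finished" and "automated".
    """
    durations = {r["name"]: r["finished"] - r["started"] for r in records}
    automated = sum(1 for r in records if r["automated"])
    return {
        "total_time": sum(durations.values(), timedelta()),
        "percent_automated": 100 * automated / len(records),
        "slowest_task": max(durations, key=durations.get),
    }


# Made-up timings for a single release day.
day = datetime(2019, 1, 22)
records = [
    {"name": "build", "automated": True,
     "started": day.replace(hour=9, minute=0), "finished": day.replace(hour=9, minute=20)},
    {"name": "deploy to staging", "automated": True,
     "started": day.replace(hour=9, minute=20), "finished": day.replace(hour=9, minute=35)},
    {"name": "manual QA", "automated": False,
     "started": day.replace(hour=9, minute=35), "finished": day.replace(hour=11, minute=35)},
    {"name": "announce", "automated": False,
     "started": day.replace(hour=11, minute=35), "finished": day.replace(hour=11, minute=45)},
]

summary = value_stream_summary(records)
print(summary["percent_automated"], summary["slowest_task"], summary["total_time"])
```

In this made-up data the manual QA step dominates the elapsed time, which is the kind of signal that would point to an opportunity to improve efficiency.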
What does success look like, and how can we measure that?
- Metrics on runbooks are not included in this iteration, but it should be possible to do things like generate a value stream map for a runbook, or show % automated/not automated and how that is progressing over time, for example. Also possible for the future are embedded approval tasks (requiring approval from specific people).
- More GitLab-native features built into the runbooks (for example, wait for all issues in a milestone to be closed)
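The "wait for all issues in a milestone to be closed" example could look roughly like the sketch below. It assumes each issue is a dict with a `state` field of `"opened"` or `"closed"`, as returned by the GitLab issues API; the `fetch_issues` callable, the polling parameters, and the function names are hypothetical. In a real runbook, `fetch_issues` would call the API for the milestone's issues.

```python
import time


def all_issues_closed(issues):
    """True when every issue dict reports state "closed".

    Note: an empty milestone counts as closed, because all([]) is True.
    """
    return all(issue["state"] == "closed" for issue in issues)


def wait_for_milestone(fetch_issues, poll_seconds=60, max_polls=10, sleep=time.sleep):
    """Poll fetch_issues() until all issues are closed or max_polls is reached.

    fetch_issues is a hypothetical callable returning the milestone's issues;
    sleep is injectable so the function can be exercised without real delays.
    """
    for attempt in range(max_polls):
        if all_issues_closed(fetch_issues()):
            return True
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    return False


# Simulated API responses: one issue still open on the first poll.
responses = iter([
    [{"state": "opened"}, {"state": "closed"}],
    [{"state": "closed"}, {"state": "closed"}],
])
done = wait_for_milestone(lambda: next(responses), sleep=lambda seconds: None)
print(done)
```

A runbook cell built this way would let later steps proceed only once the milestone is clean, without anyone having to tick a checkbox by hand.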
Links / references
- Original discussion: https://docs.google.com/document/d/1QCcJ4M1Wb3i474RDg4rzWc-NmZxKVr6gVwvjw3a3VcA/edit
- Live Runbooks are a similar idea and could be involved: https://blog.amirathi.com/2018/03/27/codify-infra-runbooks-with-jupyter-style-notebook/
- Video of us discussing this issue: https://www.youtube.com/watch?v=ZxDQ9UhjCrU