CI/CD kanban process

Problem to Solve

We want to improve the planning process for CI/CD teams (Verify, Package, and Release) so that there's always a list of items ready to start, but not everything gets assigned out on day one of the milestone. Instead, when one item is finished, one item is started. This will help us:

  • Reduce WIP so that everything doesn't end up finishing on the same day, or backing up on reviewers late in the cycle, resulting in items missing the release at the last minute.
  • Avoid the stack rank becoming scrambled on the first day of the milestone, resulting in items being delivered out of priority order.

Reference

  • Video discussion with Fabian about Geo team process: https://youtu.be/15eJ5UGxtlA
  • Geo team documentation: https://about.gitlab.com/handbook/engineering/development/enablement/geo/planning.html#kanban

Proposal

Everything is still based on the product development timeline: https://about.gitlab.com/handbook/engineering/workflow/#product-development-timeline. This matters so that we don't diverge from calendar milestones that are important to the company. The differences are in exactly what we'll be doing during each phase.

Board Flow

In a Kanban process, it's important that we're all looking at the same board, so everyone will use this one (replacing the group and milestone filters with the appropriate values): https://gitlab.com/groups/gitlab-org/-/boards/364216?scope=all&utf8=%E2%9C%93&state=opened&milestone_title=12.5&label_name[]=group%3A%3Acontinuous%20integration. Note that, for those familiar with the Geo process, one difference in CI/CD is that all milestone planning is done on a single board.
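As a rough illustration of swapping the group and milestone filters, the board link can be rebuilt from its query parameters. This is only a sketch: the helper name `board_url` and the constant `BOARD_BASE` are made up for this example, and the parameter names are simply copied from the link above rather than taken from any documented API.

```python
from urllib.parse import quote, urlencode

# Base address of the shared board, copied from the link above.
BOARD_BASE = "https://gitlab.com/groups/gitlab-org/-/boards/364216"


def board_url(group: str, milestone: str) -> str:
    """Build the shared board link for one group and one milestone.

    Hypothetical helper: it only reproduces the query string of the link
    in this proposal with different filter values.
    """
    params = {
        "scope": "all",
        "utf8": "✓",
        "state": "opened",
        "milestone_title": milestone,
        "label_name[]": f"group::{group}",
    }
    # quote (not quote_plus) encodes spaces as %20, and safe="[]" keeps the
    # brackets in "label_name[]" readable, matching the original link.
    return BOARD_BASE + "?" + urlencode(params, safe="[]", quote_via=quote)


print(board_url("continuous integration", "12.5"))
```

Calling `board_url` with another group name or milestone title gives the same board with different filters, so every team still looks at one shared board.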

The key integration point where the magic happens is in these two columns:

  • workflow::scheduling contains items that have been pulled over during the scheduling call as a buffer for what's going to be ready to work on next. Any finishing touches are added while an item is in this column, before it is moved to workflow::ready for development.
  • workflow::ready for development contains items that are fully ready for engineers to begin working on and can be picked up to start at any time.

Both of these columns have a WIP limit that is up to each engineering team to decide.

Planning

Planning works the same as it does today, with a couple of key differences:

  • During the week of capacity and technical discussions with engineering/UX, items are not taken out of the stack rank and shuffled between developers. We still need to determine roughly how much we want to include in the kickoff from all the items in the open column; during this week, those items will get the Deliverable label but they will not be assigned out. How we will determine this is still an open question, but is ultimately up to each EM to decide.
  • At the end of this week, PM will take all the extra open items (at least the ones that aren't Community contribution, pure documentation, Quality team tasks, etc.) and move those back to the backlog since they didn't make planning.

Day-to-Day Operation

After the above planning is completed, we can have the kickoff as normal. When development starts for the new milestone, things proceed as follows:

  • PM will continuously ensure that there are always sufficient items in the workflow::scheduling column, ready to be picked, and that they are always stack ranked in order of importance.
  • Any time an engineer needs something new to work on, they will take the top item from workflow::scheduling that they are capable of working on and move it to the workflow::ready for development column.

This continues until all the items in the milestone are completed or we reach the end of the milestone. As mentioned above, both of these columns should have a WIP limit that is defined by the engineering team.
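The pull rule described above can be sketched in a few lines. This is a minimal illustration, not a description of any GitLab feature: the `Item` class, its `skills` field, and the `pull_next` function are hypothetical, and how `current_wip` is counted is left to the caller because each team sets its own limit.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    skills: set[str]  # hypothetical: what an engineer needs to work on this item


def pull_next(scheduling, ready, engineer_skills, current_wip, wip_limit):
    """Move the highest-ranked suitable item from scheduling to ready.

    `scheduling` is the stack-ranked workflow::scheduling column (most
    important first) and `ready` is workflow::ready for development.
    Returns the pulled item, or None if nothing can be pulled.
    """
    if current_wip >= wip_limit:
        return None  # at the WIP limit: finish something before starting more
    for item in scheduling:
        if item.skills <= engineer_skills:  # engineer is capable of this item
            scheduling.remove(item)
            ready.append(item)
            return item
    return None  # nothing suitable in the buffer; PM needs to top it up


scheduling = [
    Item("Fix runner cache", {"go"}),
    Item("Pipeline graph tweak", {"ruby"}),
    Item("Artifact expiry", {"ruby"}),
]
ready = []
picked = pull_next(scheduling, ready, {"ruby"}, current_wip=0, wip_limit=2)
print(picked.title)
```

In this example the engineer skips the first item because it needs a skill they do not have, and takes the next one down the stack rank. The rest of the order is left untouched, which is the point of pulling rather than assigning everything up front.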

Scheduling Call

Each PM/EM pair should have a weekly Scheduling Call that the team is also welcome to attend. The purpose of the call is to make sure that expectations are aligned. The PM can see what the team has done that week and what they are currently working on, and the EM can review what's coming next. Together, everyone should agree on the priorities for the week, and both sides have an opportunity to understand why specific issues are important, or not. The call also helps to surface blocked issues and issues that are taking longer than expected. Write a summary of the call and share it with the team, along with the recording, so that everyone is in the loop.

Reference

Columns

The complete set of columns is defined as follows:

  • Open contains items not handled by the engineering team (Community contribution, pure documentation, Quality team tasks, etc.) or, before planning is complete, the list of potential items to include.
  • Deliverable represents the cut-line after planning. Items included here are likely (but not guaranteed) to be picked up at some point in the milestone.
  • Stretch should not really be used.
  • workflow::planning breakdown contains items that are being analyzed for how they can be broken down into pieces for delivery.
  • workflow::scheduling is populated by PM and contains the next several items that are most important to start.
  • workflow::ready for development contains items that an engineer has selected to own and has pulled in themselves. Note that all subsequent columns (except Closed) count as the same WIP.
  • workflow::in dev means development has started.
  • workflow::ready for review means it is waiting for a reviewer.
  • workflow::in review means the review is now in progress.
  • workflow::verification means verification is now underway.
  • workflow::staging means the item has been deployed to staging.
  • workflow::canary means the item has been deployed to canary.
  • workflow::production means the item has been deployed to production.
  • Closed contains items that have made it all the way to production and are validated as meeting the success criteria/definition of done.
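To make the shared WIP count concrete, here is a small sketch. It assumes one reading of the note above: that workflow::ready for development and every column after it, up to but not including Closed, share a single WIP count. The function names and the example numbers are hypothetical.

```python
# Board columns in the order listed above.
COLUMNS = [
    "Open",
    "Deliverable",
    "Stretch",
    "workflow::planning breakdown",
    "workflow::scheduling",
    "workflow::ready for development",
    "workflow::in dev",
    "workflow::ready for review",
    "workflow::in review",
    "workflow::verification",
    "workflow::staging",
    "workflow::canary",
    "workflow::production",
    "Closed",
]


def wip_columns(columns=COLUMNS):
    """Columns assumed to share one WIP count: ready for development up to Closed."""
    start = columns.index("workflow::ready for development")
    end = columns.index("Closed")
    return columns[start:end]


def current_wip(item_counts):
    """Sum the number of items across all columns that count as in progress."""
    return sum(item_counts.get(column, 0) for column in wip_columns())


# Hypothetical snapshot of a board: column name -> number of items.
snapshot = {
    "workflow::scheduling": 4,
    "workflow::ready for development": 2,
    "workflow::in dev": 3,
    "workflow::in review": 1,
    "Closed": 10,
}
print(current_wip(snapshot))
```

Under this reading, an item keeps counting against the limit until it is closed, so work waiting on review or deployment still discourages starting something new.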

Related Boards

  • Boards for the validation track (used by PMs/UX) look like https://gitlab.com/groups/gitlab-org/-/boards/1351258?label_name[]=group%3A%3Acontinuous%20integration
Edited Oct 31, 2019 by Jason Yavorsky