Key Result: Lower MTTP from days to single digit hours
Dependencies
- Child epic Use GitLab environments & deployments for deploying to GitLab.com
- Child epic Move security development from dev.gitlab.org to gitlab.com
- Issue MTTP: https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/7647
Plan
As the next step in the plan for more frequent deploys, we want to ensure that all MRs merged into master are propagated to GitLab.com within 1 to 9 hours. This can be achieved with the current tooling by creating a new auto-deploy branch every day.
To get there, we first need to wrap up the work on the dependencies listed above, for the following reasons:
- 90% - delivery#620 (closed)
- If we deploy even more frequently, developers, SREs and product managers need to be able to easily see when an MR was deployed to each environment. This is why work on gitlab-org&1936 is currently in progress.
- MTTP is currently measured manually in https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/7647. To know how quickly we roll out changes, this measurement must be in place before we roll out any other changes; otherwise we cannot tell whether more frequent deploys are having any positive impact.
- 60% - &121 (closed)
- Security releases are currently blocking. We need to ensure that the work done as part of migrating security development to gitlab.com lets us freely merge security patches and deploy to GitLab.com without any slowdown.
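To illustrate the measurement discussed above, here is a minimal sketch of how MTTP could be computed automatically. It assumes MTTP is the mean time between an MR being merged into master and that MR reaching GitLab.com. The function name, the record shape, and the sample timestamps are all hypothetical and are not taken from the existing tooling or dashboards.

```python
from datetime import datetime
from statistics import mean


def mttp_hours(records):
    """Return the mean merge-to-production time in hours.

    `records` is an iterable of (merged_at, deployed_at) datetime pairs.
    MRs that have not been deployed yet (deployed_at is None) are skipped.
    This record shape is an assumption made for illustration.
    """
    durations = [
        (deployed_at - merged_at).total_seconds() / 3600
        for merged_at, deployed_at in records
        if deployed_at is not None
    ]
    if not durations:
        raise ValueError("no deployed merge requests to measure")
    return mean(durations)


# Hypothetical sample data, not real deployment records.
sample = [
    (datetime(2020, 1, 6, 9, 0), datetime(2020, 1, 6, 15, 0)),   # 6 hours
    (datetime(2020, 1, 6, 12, 0), datetime(2020, 1, 6, 15, 0)),  # 3 hours
    (datetime(2020, 1, 7, 8, 0), None),                          # not deployed yet
]

print(mttp_hours(sample))  # 4.5
```

Skipping undeployed MRs keeps the figure well defined, but it also means MRs stuck in the queue do not count until they ship, so a real measurement would need to decide how to treat them.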
Retrospective
- Exposing deployment status in MRs proved to be more complex than anticipated. This is mostly due to the way deployment environments were created, and the fact that external environments were never envisioned to be included in the product. This required unexpected prep work, as well as cleanup tasks that will have to be executed at a future point to prevent major regressions.
- Metrics data in https://app.periscopedata.com/app/gitlab/573702/WIP:-Delivery-team-PIs is not 100% precise. This is partly because of the way our application is deployed (in bundles), and partly because of the way we deploy high-priority fixes (fixes jump the queue, but they get overwritten by a new deployment when they go into a different bundle). To address this we are tackling gitlab-org/gitlab#36130 (closed).
- We are not yet fully ready as an organisation to rapidly increase our deployment cadence:
- Deployment decisions need to be based on the output of full QA results instead of only smoke tests, to prevent incidents such as https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/8729. The Quality department has a rapid action group to stabilise the full test suite runs: gitlab-org/gitlab#39208 (closed).
- Unit test runtimes are high, and they directly affect fix turnaround times. Duration is tracked in https://app.periscopedata.com/app/gitlab/564156/Engineering-Productivity---Pipeline?widget=7501112&udv=0 and it is trending down, but we need to cut the p90 of ~70 minutes by at least half. This is because pipelines must run twice on each commit: once while the MR is being developed for merging, and again when the same commit is being prepared for deployment.
- Security releases are incredibly complex, which forced us to pivot to solving only one part of the equation with &121 (closed). This allowed the Development department to decrease the amount of work they need to do for any given security release, but did not decrease the delay that security releases impose on all processes.
Corrective actions
According to https://app.periscopedata.com/app/gitlab/573702/WIP:-Delivery-team-PIs?widget=7481390&udv=0, our average MTTP is hovering around 100 hours. To bring that trend down, we will:
- Increase the branch creation frequency from once per week to twice per week, delivery#627 (closed), and allow the Quality rapid action, gitlab-org/gitlab#39208 (closed), to complete in parallel before we increase the frequency further.
- Drive the changes necessary to safely automate deployments to production, replacing the manual approval process: delivery#579 (closed).
- Focus on removing the blocking nature of security releases: &109 (closed).