FY21-Q3 Infrastructure Department OKRs
Objective (IACV): Continually improving spend efficiency for GitLab.com (98%)
- Key Result: GitLab.com on financial plan (Infrastructure) => 100% On target
- Key Result: Work with CI/CD - Verify team to improve insight into CI efficiency, pt 2 => 90%
- Key Result: Work with Package Team to implement service-level spend metrics, pt 2 => 100%
- Key Result: Work with Gitaly Team to implement service-level spend metrics, pt 2 => 100%
- Key Result: Design and Implement cloud account, tag, label hierarchy & requirements => 100%
Objective (Product): Reliability Improvements - Supporting Consistent Weighted SLA of 99.95% (53%)
- Key Result: Updated roadmap for DR => 100%
- Key Result: Elevate MTTD to KPI => 5%
- Key Result: Elevate MTTR to KPI => 5%
- Key Result: Remove manual deployment approval in favour of automated system health metrics approval => 85%
- **Key Result: All stateless services on unmodified Helm Chart** => 3/6 50%
- Key Result: (Q2Continue) Dogfooding - Migrate all public dashboards to GitLab monitoring => OBE
- Key Result: Dogfooding - Incident Management on Monitor:Health => 100%
Objective (Team): Improve long term knowledge and alignment of Infrastructure (100%)
- Key Result: Provide clear career paths for RE roles to engage and retain our team => 100%
- Key Result: Dogfooding - Improve runbooks experience using Jupyter notebooks => 100%
- Key Result: All managers to complete DIB training => 10/10 100%
[Note: KRs in bold roll through to EVP Eng KRs (#8303, closed)]
Retrospective
Good
- We generally accomplished a lot against our IACV targets. We made the actual financial targets and also made progress on the stage group spend efficiency metrics, which had been slow to progress in Q2.
- We now have a clearer picture of the overall SaaS product strategy, including insight into how DR capabilities will iterate in FY22 and how we will present these to customers.
- We have improved automation for releases, resulting in consistently lower MTTP and putting us in a good position to continue iterating on this in Q4.
- We have established a better foundation for our job roles and career paths, which will help as we go into performance reviews.
Bad
- Unplanned work had a significant impact, mostly on the SRE teams and related to incidents. We continued to plan too much work into this quarter.
- We're continuing to make choices which are misaligned with the stated handbook priorities. A key example is that we accomplished quite a lot of project (and KR) work this quarter, yet we have a growing backlog of Incident Reviews and even more Corrective Actions.
- While we made progress with the spend metrics for some stage groups, we still have more to do for others, and this work tends to take longer than we'd like due to underlying instrumentation deficiencies. On the bright side, actually doing the work fills in the gaps.
Try
- Various engagements with dogfooding have led us to put more focus on how and what we're dogfooding, as well as bringing visibility to the ongoing dogfooding work that can be overlooked when it isn't a department-level KR.
Edited by Steve Loyd