Error Budget Exception Request for Import

Liam McAndrew requested to merge lmcandrew-master-patch-53362 into master

As per https://gitlab.com/gitlab-org/error-budget-reports/-/issues/3#note_755299340

To set better expectations regarding the Error Budget for groupimport, I propose we allow an exception request that sets a revised target of 99.85% until the end of Q1'23 (April 30th 2022).
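For a sense of scale, the revised target can be turned into an amount of allowed budget spend. The sketch below is only illustrative: it assumes the budget is measured over a 28-day rolling window, which this request does not state, and the function name is hypothetical.

```python
def allowed_failure_minutes(target_percent: float, window_days: int = 28) -> float:
    """Minutes of budget spend permitted in the window for a given target.

    window_days=28 is an assumption, not something stated in this request.
    """
    total_minutes = window_days * 24 * 60
    return total_minutes * (100.0 - target_percent) / 100.0


# Revised target proposed in this request.
revised = allowed_failure_minutes(99.85)
print(f"99.85% over 28 days allows about {revised:.1f} minutes of spend")
```

Under that assumption, a 99.85% target allows roughly 60.5 minutes of spend per window.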

Why?

The primary contributor towards groupimport spend is sidekiq apdex failures [data]. The architecture of our Importers/Exporters was never designed to split data into small jobs. Instead, the design often relies heavily on nested transactions, which can be as complex as the data being imported/exported. The complexity of these transactions, and therefore the time needed to handle them, directly influences the sidekiq apdex.

The group are directly addressing long database transactions in gitlab-org/gitlab#343458 (closed), which is a Q1'23 OKR gitlab-org/manage/general-discussion#17447 (closed). However, resolving this issue is not trivial, and a proof-of-concept will be worked on first to get a better understanding of the effort required. It is also worth noting that addressing database transactions alone won't solve the apdex failures: once nested transactions are split up, they will also need to be processed in separate jobs.
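To illustrate why job duration matters here, the sketch below treats apdex as the fraction of jobs that finish within a duration threshold. This is a simplification: the threshold, the durations, and the function name are all hypothetical, and they are not measurements from groupimport.

```python
def apdex(durations_s: list[float], threshold_s: float) -> float:
    """Fraction of jobs that finished within the threshold (simplified apdex)."""
    if not durations_s:
        raise ValueError("need at least one job duration")
    satisfied = sum(1 for d in durations_s if d <= threshold_s)
    return satisfied / len(durations_s)


THRESHOLD_S = 300.0  # hypothetical per-job duration threshold

# One job that handles a whole import inside a single nested transaction.
monolithic = [900.0]

# The same total work, assuming it can be split evenly across ten jobs.
split = [90.0] * 10

print(apdex(monolithic, THRESHOLD_S))  # the single long job misses the threshold
print(apdex(split, THRESHOLD_S))       # every small job is within the threshold
```

The even split is the optimistic case. In practice the gain depends on how cleanly the nested transactions can be separated, which is what the proof-of-concept is meant to establish.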

Given the above, it is hard to predict when the problem will be solved. groupimport are also extremely capacity constrained¹, with just 1 BE IC from 1st February. Improving the Error Budget is the top focus for the group.

¹ _This exception request isn't the solution to the capacity problem. We are actively trying to hire into the team and we are considering all options to increase capacity quickly (for example, a headcount reset)._

Edited by Liam McAndrew
