Imagine running e2e tests in pipelines took just 20 minutes. How do we get there?
Context
In my 1:1 with @kwiebers last week, he described a walkthrough he did with @sgregory2 to compile info on one example pipeline/job, to understand where time was being spent and where we might improve package-and-qa. It is illuminating.
- Example: the journey to start running tests in https://gitlab.com/gitlab-org/gitlab-qa-mirror/-/jobs/2338839982, from gitlab-org/gitlab!85239 (merged). Here’s the critical path of dependencies:
- compile-production-assets started in the GitLab MR pipeline and took 16:59
- build-assets-images took 1:41
- Package-and-qa started
- fetch-assets took 1:45
- trigger:package took 33:19
- trigger:gitlab-docker took 5:30
- GitLab qa pipeline started
- ee:actioncable started initializing the environment at 23:14
- ee:actioncable started running tests at 23:22:09
- Tests completed at 23:26:00
Summary: It took around 68 minutes to enable 4 minutes of test execution.
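To make the arithmetic behind that summary easy to re-check, here is a minimal sketch that adds up the timings listed above. It assumes that `23:14` means `23:14:00`, that the listed jobs ran strictly one after another, and that there was no queue or wait time between them (the real pipeline almost certainly had some). The helper names are made up for illustration.

```python
from datetime import datetime, timedelta


def parse_duration(text: str) -> timedelta:
    """Parse an 'MM:SS' job duration as listed above."""
    minutes, seconds = text.split(":")
    return timedelta(minutes=int(minutes), seconds=int(seconds))


def parse_clock(text: str) -> datetime:
    """Parse an 'HH:MM' or 'HH:MM:SS' wall-clock time (missing seconds become 0)."""
    fmt = "%H:%M:%S" if text.count(":") == 2 else "%H:%M"
    return datetime.strptime(text, fmt)


# Job durations on the critical path, copied from the list above.
JOB_DURATIONS = {
    "compile-production-assets": "16:59",
    "build-assets-images": "1:41",
    "fetch-assets": "1:45",
    "trigger:package": "33:19",
    "trigger:gitlab-docker": "5:30",
}

# Time spent building and packaging before the QA pipeline starts.
build_time = sum((parse_duration(d) for d in JOB_DURATIONS.values()), timedelta())

# ee:actioncable timestamps (assumption: "23:14" is 23:14:00).
env_setup = parse_clock("23:22:09") - parse_clock("23:14")
test_time = parse_clock("23:26:00") - parse_clock("23:22:09")

setup_total = build_time + env_setup
total = setup_total + test_time

print(f"build and package:  {build_time}")
print(f"environment setup:  {env_setup}")
print(f"everything pre-test: {setup_total}")
print(f"test execution:     {test_time}")
print(f"setup share of total: {setup_total / total:.0%}")
```

Under those assumptions the listed steps come to 59:14 of build and packaging plus 8:09 of environment setup, so about 67 minutes before 3:51 of test execution. That lines up with the "around 68 minutes" figure; any queue time between jobs would push it higher.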
Discussion prompt
@gl-quality/qe-sub-dept @gitlab-org/quality/engineering-productivity
Is this an outlier, or is this roughly average/expected timing?
Some concerns:
- I believe the package part of this is only run within package-and-qa, and Omnibus build success is not tested elsewhere. Is this correct?
- Within the GitLab qa pipeline, each job does roughly the same thing - spinning up an environment, albeit each slightly different from those in other jobs - which proportionally takes much longer than the tests themselves. We pay for that duplicated setup in every job.
- We of course want to lean on selective test execution to customize test feedback, but even if we do, we're targeting either the total job count without reducing the time to completion, or the last 4 minutes of these jobs and not the first 68 minutes.
- We cannot currently run most orchestrated tests directly in review apps due to these tests' specific needs (e.g. Jira integration).
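The concern about selective test execution can be put in numbers. The sketch below is only illustrative: it uses the rounded figures from the summary (68 minutes before tests start, 4 minutes of tests) and assumes selective execution shrinks only the test phase of a job and leaves everything before it untouched. `best_case_total` is a hypothetical helper, not part of any existing tooling.

```python
def best_case_total(setup_min: float, test_min: float, test_fraction_kept: float) -> float:
    """Total minutes for one job if only the test phase shrinks to the given fraction."""
    return setup_min + test_min * test_fraction_kept


SETUP_MIN = 68.0  # rounded pre-test time from the summary
TEST_MIN = 4.0    # rounded test execution time from the summary
TARGET_MIN = 20.0  # the aspirational target from the title

for kept in (1.0, 0.5, 0.0):
    total = best_case_total(SETUP_MIN, TEST_MIN, kept)
    print(f"keep {kept:.0%} of tests -> {total:.0f} min (target {TARGET_MIN:.0f} min)")
```

Even if selection skipped every test in the job, the job would still take about 68 minutes under these assumptions, so the 20-minute target cannot be reached by trimming test execution alone.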
Looking at these timings, I don't know how we can achieve large-scale improvement in the time it takes from starting testing activities to getting the results. However, I'm not the expert; you all are. Can we get start-to-finish time down to 20 minutes given the current structure? Is it possible to make more than incremental gains in speed? Do we need to radically rethink how we approach these tests so that we can test earlier, faster, and with more reliability?
I chose 20 minutes to inspire radical thinking. We may never get that low, but small, incremental improvements to the ~75 minutes it takes now are not the impact we should be seeking. We need an ambitious vision for the future to work towards.