Rails 6.0 introduced parallel testing, but we can't use that with RSpec, so we'd need to use the parallel_tests gem.
This would let us fully utilize our runners and potentially save a lot of machine time.
If we have dual-core runners, we could theoretically cut our machine time by almost 50% by splitting into only 25 jobs while keeping the running time almost the same.
That is huge, and I think it's worth an experiment to see how hard it is to get something like that set up.
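As a rough back-of-the-envelope sketch of that estimate: the starting count of 50 jobs, the 60 seconds of fixed setup per job, and the total test time below are illustrative assumptions, not measurements from our pipeline. The model also assumes two processes on a dual-core runner scale perfectly, which real runs would not quite achieve.

```ruby
# Idealized model of CI machine time. All numbers are illustrative assumptions.
# A job occupies a whole runner for its wall-clock duration, regardless of
# how many of the runner's cores it actually keeps busy.
def job_wall_time(total_test_seconds:, jobs:, processes_per_job:, setup_seconds:)
  # Tests are spread evenly across jobs, then across processes inside each job.
  setup_seconds + total_test_seconds.to_f / (jobs * processes_per_job)
end

def machine_time(jobs:, **opts)
  jobs * job_wall_time(jobs: jobs, **opts)
end

# Hypothetical workload: 50 slices of 20 minutes of test time each.
total = 50 * 20 * 60

# Today (assumed): 50 single-process jobs.
before = machine_time(total_test_seconds: total, jobs: 50,
                      processes_per_job: 1, setup_seconds: 60)
# Proposal: 25 jobs, each running 2 processes on a dual-core runner.
after = machine_time(total_test_seconds: total, jobs: 25,
                     processes_per_job: 2, setup_seconds: 60)

savings = 1 - after / before
puts format("machine time: %.0fs -> %.0fs (%.0f%% saved)", before, after, savings * 100)
```

In this idealized model the wall time per job is unchanged while the number of runners is halved. Core contention and uneven test distribution are why the realistic figure is "almost" 50% rather than exactly 50%.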
@rymai I think this issue could be closed, right? We now use the parallel: CI job parameter, so I don't think using parallel_tests would add any further improvements?
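For context, when a job uses parallel:, GitLab CI sets CI_NODE_INDEX (1-based) and CI_NODE_TOTAL in each copy of the job. A minimal sketch of how a job could pick its share of spec files from those variables is below. The helper name spec_files_for_node and the round-robin split are hypothetical and only for illustration; this is not a description of how our pipeline actually distributes specs.

```ruby
# Hypothetical helper: choose this job's share of spec files by round-robin.
# node_index is 1-based, matching GitLab CI's CI_NODE_INDEX.
def spec_files_for_node(files, node_index:, node_total:)
  files.sort
       .each_with_index
       .select { |_file, i| i % node_total == node_index - 1 }
       .map(&:first)
end

# Outside of CI the variables are unset, so default to a single node.
node_index = Integer(ENV.fetch("CI_NODE_INDEX", "1"))
node_total = Integer(ENV.fetch("CI_NODE_TOTAL", "1"))

files = Dir.glob("spec/**/*_spec.rb")
puts spec_files_for_node(files, node_index: node_index, node_total: node_total)
```

Each job then only uses one process on its runner, which is the gap a gem like parallel_tests would be trying to close.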
There is an rspec core issue here tracking the feasibility of adding this, and some interesting things have come up. Two gems have surfaced:
turbo_tests, extracted from Discourse because of the same need
flatware, which is more geared toward Cucumber tests but might be helpful
Perhaps these could be investigated and we can branch this off into its own epic? It would be nice to dodge "fatal: remote error: GitLab is currently unable to handle this request due to load." with more regularity.
> It would be nice to dodge "fatal: remote error: GitLab is currently unable to handle this request due to load." with more regularity
Could you please elaborate on this? I thought that if we run more tests in parallel, it will increase the load on GitLab.com, making it more difficult to handle requests.
I also don't really understand how running tests in parallel within a single job is better than running them in parallel across multiple jobs, as we do now. If our runners have multiple cores, then that could indeed utilize more CPU. But I thought they're all virtualized, so maybe it's not too different from using more parallel jobs?
Given the complexity (mostly around the database and other persistent state), I might try using artifacts to pass the repository first; I think that will be simpler.
Yeah, if it's only to lessen load on Gitaly, I think it's not worth the complexity.
> I might try using artifacts to pass the repository first then
We already implemented something like this before using CI_PRE_CLONE_SCRIPT to fetch the repo from a GCS bucket. But we removed it when we implemented caching at the Gitaly level. #39134 (comment 804417099)
I think the main benefit here is reducing the feedback loop during development. Improving CI will be a side effect. We have powerful multi-core machines but still execute single-threaded tests, which is quite painful.
> We already implemented something like this before using CI_PRE_CLONE_SCRIPT to fetch the repo from a GCS bucket. But we removed it when we implemented caching at the Gitaly level. #39134 (comment 804417099)
Yeah, I'm aware, and I wonder whether we want to bring something similar back.
> I think the main benefit here is reducing the feedback loop during development. Improving CI will be a side effect. We have powerful multi-core machines but still execute single-threaded tests, which is quite painful.
This sounds like the main goal is for local development? In that case we need to update GDK as well. Since the issue title says "on the CI", I was always under the impression that this was about speeding up CI (as well as the epic about pipeline).
> This sounds like the main goal is for local development? In that case we need to update GDK as well. Since the issue title says "on the CI", I was always under the impression that this was about speeding up CI (as well as the epic about pipeline).
If we introduce parallelization with parallel_tests, turbo_tests, or some other gem, then it should work in both CI and GDK with minimal effort. My intention was to highlight the importance of this change and to remind us to keep local development in mind. I don't think it matters much which epic we put this issue in.
It does not contain scripts to set up multiple databases, including PostgreSQL (Edited: Actually, it probably does contain scripts to set up multiple databases: https://github.com/grosser/parallel_tests#create-additional-databases), Redis, Gitaly, and so on, as mentioned in the issue description. Setting this up in CI and in GDK can be quite different as well.
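To illustrate the kind of per-process isolation this involves: parallel_tests exposes each worker's slot through the TEST_ENV_NUMBER environment variable, which by default is an empty string for the first process and "2", "3", and so on for the rest. A minimal sketch of deriving separate resources from it follows. The helper names, the app_test database name, the Redis database numbering, and the tmp directory layout are all hypothetical placeholders, not our actual configuration.

```ruby
# parallel_tests sets TEST_ENV_NUMBER per process: "" for the first
# process, then "2", "3", ... by default.
def worker_number(env = ENV)
  raw = env.fetch("TEST_ENV_NUMBER", "")
  raw.empty? ? 1 : Integer(raw)
end

# Hypothetical naming scheme: every base name here is a placeholder.
def worker_resources(env = ENV, base_db: "app_test", base_redis_db: 0)
  n = worker_number(env)
  {
    # Same suffix convention parallel_tests suggests for database.yml.
    database: "#{base_db}#{env.fetch('TEST_ENV_NUMBER', '')}",
    # One logical Redis database per worker.
    redis_db: base_redis_db + n - 1,
    # A separate directory per worker, e.g. for repository storage.
    tmp_dir: File.join("tmp", "tests", "worker-#{n}")
  }
end

p worker_resources
```

Anything else that holds persistent state (Gitaly storage, uploads, caches) would need a similar per-worker split, which is where most of the setup complexity comes from.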
We should start from either CI or GDK. They solve very different problems, and we need to know which problem we're trying to solve in order to decide which to try first and whether it actually solves that problem.
Specifically, I am unsure what this would solve for CI. Local development is a separate concern, and if that's the main goal here, we can repurpose the issue (or create a new issue so that we don't mix ideas).
> Given the complexity (mostly around database and other persistent states), I might try using artifacts to pass the repository first then I think this will be simpler.
It's not relevant to this issue, but since it's brought up, I made a merge request to do this for RSpec jobs: !140330 (merged)