Speed up CI feedback sent to GitHub for CI
During the last repository mirroring incident we received a lot of reports that it heavily affected the CI for GitHub feature.
While there are multiple problems that we need to think about here (including problems that are not gitlab-ce~8338241 specific but system-wide, like how Sidekiq and Redis are used and how their stability can affect different parts of the application), there seems to be one thing that we should be able to improve quickly, making the user experience much better.
Let's copy here part of the report from gitlab-ee#10844:
The delay between pushing code and having the build start is annoying to start with, but what makes it worse is that the GitLab CI check on the GitHub PR:
- Doesn't actually show up until the build starts, which means that a PR that breaks the build might spend 25 minutes with a tick next to it because the GitLab build hasn't started yet.
- If a new commit gets pushed to the PR, the GitLab CI check actually disappears.
So what is the problem here?
Let's consider a scenario:
- We have a GitHub project with GitLab CI/CD configured.
- We have an open PR with some changes and the last CI report from GitLab showing success.
- We push a commit that introduces a bug which our CI tests will catch, so the Pipeline will be reported as failed.
So what happens after the buggy commit is pushed?
- GitHub sends a webhook request to GitLab mentioning a new commit that was made.
- GitLab, if it received and processed the request properly, schedules a background task for the project mirror update and responds to GitHub with a 200 OK HTTP response.
- On GitLab we now get a delay which depends on repository size, Sidekiq configuration, current system load, ongoing issues etc. Basically, a repository mirror update is not an immediate operation!
- On the GitHub side, however, we have a PR that is updated with the new commit (so with the bug!) but the checks are still showing the previous state. This means that the reviewer can merge the PR, because nothing prevents them from doing so. What's more, the reviewer can miss the fact that the checks are pointing at the previous commit and be fully convinced that the current PR state is valid.
- When the repository on the GitLab side is updated, other hooks are started. If the project contains the .gitlab-ci.yml file, a background task that creates and starts the CI Pipeline is started. Again, it may be delayed by the configuration and state of Sidekiq.
- When the Pipeline is started, it's transitioned from the created to the pending state. This triggers pipeline hooks. The project.execute_services() call iterates over all configured project services and, since this is CI/CD for GitHub, it triggers the GitHub integration service, which updates the status. We can find that with status_message.status GitLab translates its own status (taken from the current Pipeline status) to GitHub's status.
- Since the Pipeline is in one of the initial statuses (created, pending, running or manual), GitHub receives its pending status and changes the PR check to a yellow one that should block the PR merge (note: probably, I haven't used it for a long time - Tomasz) and definitely gives the reviewer clear information: CI is working and you should wait for the result.
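To make the status translation from the last two steps concrete, here is a minimal sketch. It is not the actual GitHub service code: the method name github_state_for and the GITHUB_STATES constant are hypothetical, and only the "initial statuses map to pending" row comes from the description above. The other rows are assumptions, constrained only by the four states that GitHub's commit status API accepts.

```ruby
# States accepted by GitHub's commit status API.
GITHUB_STATES = %w[pending success failure error].freeze

# Hypothetical helper: translate a GitLab Pipeline status into a GitHub state.
def github_state_for(gitlab_status)
  case gitlab_status.to_s
  when 'created', 'pending', 'running', 'manual'
    'pending' # described above: all initial statuses are reported as pending
  when 'success'
    'success' # assumption
  when 'failed'
    'failure' # assumption
  else
    'error'   # assumption: fallback for anything else (e.g. canceled)
  end
end
```

The point is that GitHub only learns about pending once a Pipeline exists, because the translation takes a Pipeline status as its input.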
As can be seen, after GitLab receives information from GitHub about a new commit, a lot of things happen before it sets the initial pending status. The operation is mostly delayed by:
- Two Sidekiq schedule->execute operations: first one for project mirroring, second one for creating and starting the Pipeline.
- The project mirroring itself, which in the case of big repositories can take a long time (and is vulnerable to any networking/CPU/disk/etc. problems).
How we can improve this
Let's also quote a proposal from the user's report, which was independently raised during today's call that we had with @erushton:
- Most simply, is it possible to show a yellow (in progress) Gitlab CI check indicator on a Github PR as soon as a new commit is pushed, even if the repository is still syncing?
What we could do is move the status update much earlier than it is now. In short, when GitLab receives the initial request from GitHub and project mirroring is scheduled, we could schedule another background job that will update the remote status. As can be seen in the GitHub service code, the only required pieces of information are the SHA and the status, which initially can simply be set manually to pending.
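A rough sketch of that idea, under several assumptions: FakeQueue is an in-memory stand-in for Sidekiq, handle_push_webhook and the job names are hypothetical, and the webhook is assumed to be a GitHub push event (whose payload carries the new head SHA in the after field). The context and description values are placeholders, not what GitLab really sends.

```ruby
require 'json'

# Hypothetical in-memory stand-in for Sidekiq: jobs are only recorded.
class FakeQueue
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  def enqueue(name, args)
    @jobs << { name: name, args: args }
  end
end

# Body for GitHub's "create a commit status" endpoint
# (POST /repos/:owner/:repo/statuses/:sha). Placeholder values.
def pending_status_payload
  {
    state: 'pending',
    context: 'ci/gitlab',
    description: 'Waiting for GitLab to sync the repository'
  }
end

# Hypothetical webhook handler: schedules the mirror update (as today) and,
# additionally, an immediate "pending" status update for the pushed SHA.
def handle_push_webhook(raw_body, queue, github_integration_active:)
  payload = JSON.parse(raw_body)
  sha = payload.fetch('after') # head SHA after the push

  queue.enqueue(:update_mirror, { project: payload.dig('repository', 'full_name') })
  if github_integration_active
    queue.enqueue(:update_remote_status, { sha: sha, body: pending_status_payload })
  end

  200 # GitHub still gets its response right away
end
```

The status job depends neither on the mirror update nor on Pipeline creation, so it only has to wait for a single Sidekiq schedule->execute round trip.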
There is, however, one problem that needs to be taken into consideration. At this moment GitLab knows that the GitHub integration is added (which is the requirement to be able to send any updates to GitHub), and it can even know that this specific project was configured as CI/CD for GitHub. However, the CI Pipeline will be created only if the project contains a valid .gitlab-ci.yml file.
If the file is not present (e.g. the project never had it or the commit removes the file), a manual status update to pending will never be updated to success or failure, because there will never be a CI Pipeline that would have a success/failure transition. This problem needs to be addressed somehow.
If the file is invalid, GitLab will create a Pipeline with information that it has an invalid YAML file, which should trigger the transition and the hooks and finally update GitHub with an error status. However, this would need to be tested.
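The three cases can be summarised with a small model. This is only an illustration of the reasoning above, not GitLab code: final_github_state and the :valid / :missing / :invalid symbols are hypothetical names, and the :invalid branch encodes the expected behaviour, which still has to be verified.

```ruby
# Hypothetical model of what GitHub ends up showing after an early "pending"
# status, depending on the state of .gitlab-ci.yml in the pushed commit.
def final_github_state(config_state, tests_pass: true)
  state = 'pending' # set manually as soon as the webhook arrives

  case config_state
  when :valid
    # A Pipeline is created and its transitions update the status.
    state = tests_pass ? 'success' : 'failure'
  when :invalid
    # Expected (untested) behaviour: a Pipeline carrying the YAML error
    # is created and the hooks report an error status.
    state = 'error'
  when :missing
    # No Pipeline is ever created, so nothing overwrites "pending".
  end

  state
end
```

Only the :missing case leaves the check stuck at pending, so that is the case that needs an explicit follow-up status update.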
