Spike - Inconsistent external pull request pipelines
Summary
When using pipelines for external pull requests, we're seeing inconsistent behavior: sometimes a pipeline is triggered and other times it is not, even with the same .gitlab-ci.yml file. The sender does receive a 200 response from Cloudflare, so the payload appears to at least be reaching us.
For pipelines that are successfully triggered on the GitLab side, we see PostReceive > UpdateExternalPullRequestsWorker > Ci::InitialPipelineProcessWorker, as expected, within 5 minutes of the pull request being opened and/or the mirror being updated.
For unsuccessful pipelines, we see none of these workers within 5 minutes of the expected time. Initially, we thought the rules might have come into play and that the pipeline was failing silently, but in that case we should still see UpdateExternalPullRequestsWorker run. Since no workers run at all, the failure appears to occur before the .gitlab-ci.yml file is parsed and evaluated.
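The check described above can be sketched as a small script, purely as an illustration of the triage logic. It assumes log entries have already been exported as `(timestamp, worker_class)` tuples; that shape, and the `missing_workers` helper itself, are hypothetical and do not reflect the real log schema or any GitLab API.

```python
from datetime import datetime, timedelta

# Worker chain named in the summary above.
EXPECTED_CHAIN = [
    "PostReceive",
    "UpdateExternalPullRequestsWorker",
    "Ci::InitialPipelineProcessWorker",
]


def missing_workers(log_entries, event_time, window=timedelta(minutes=5)):
    """Return the expected workers not seen within `window` of `event_time`.

    `log_entries` is an iterable of (timestamp, worker_class) tuples.
    This input shape is an assumption made for the sketch.
    """
    seen = {
        worker
        for timestamp, worker in log_entries
        if abs(timestamp - event_time) <= window
    }
    # Preserve chain order so the first missing worker is easy to spot.
    return [worker for worker in EXPECTED_CHAIN if worker not in seen]


if __name__ == "__main__":
    # Hypothetical timestamps, for illustration only.
    opened_at = datetime(2021, 11, 1, 12, 0, 0)
    entries = [(opened_at + timedelta(seconds=15), "PostReceive")]
    print(missing_workers(entries, opened_at))
```

An empty result corresponds to the working case. A result containing all three workers corresponds to the failing case in this report, where nothing runs at all, as opposed to a rules problem, where UpdateExternalPullRequestsWorker would still be expected to appear.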
Additional context in Slack (Internal)
Steps to reproduce
I was not able to reproduce this directly in my test project. Opening a pull request on GitHub with the same type of rules triggered the pipeline as expected on the GitLab side. For an example project, see below.
Example Project
ZD Ticket: https://gitlab.zendesk.com/agent/tickets/245591 (Internal)
Follow-up ZD Ticket: https://gitlab.zendesk.com/agent/tickets/291741 (Internal)
What is the current bug behavior?
Opening a pull request on GitHub does not always trigger a pipeline in GitLab when CI is configured for external pull requests.
What is the expected correct behavior?
Opening a pull request with the appropriate configuration should trigger a pipeline in GitLab or expose a failure on the front end and in logs.
Relevant logs and/or screenshots
Please see internal notes for full details.
- (Working) https://log.gprd.gitlab.net/goto/4f282a9d9c3abae2211b0a4e17d9caba
- (Not Working) https://log.gprd.gitlab.net/goto/ef088069c0277bf7a6e79b73821691ae
- (POST from GitHub) https://log.gprd.gitlab.net/goto/b118c9d7e5e25e6289b270fcd01ea943
UPDATE:
More logs showing the problem continuing for the same customer several months later:
https://log.gprd.gitlab.net/goto/5b45c340-d6eb-11ec-aade-19e9974a7229
https://log.gprd.gitlab.net/goto/66b43ef0-d6eb-11ec-aade-19e9974a7229
Output of checks
This happens on GitLab.com 14.5.0-pre 97ee889c
Results of GitLab environment info
(For installations with omnibus-gitlab package run and paste the output of: `sudo gitlab-rake gitlab:env:info`)
(For installations from source run and paste the output of: `sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)
Results of GitLab application Check
(For installations with omnibus-gitlab package run and paste the output of: `sudo gitlab-rake gitlab:check SANITIZE=true`)
(For installations from source run and paste the output of: `sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true`)
(we will only investigate if the tests are passing)
