Artifact download sporadically fails with 401 for some CI jobs
Summary
Some jobs in our pipelines fail intermittently when downloading artifacts from prior jobs. Retries often work around the issue temporarily, but the problem persists and disrupts our integrations. The failures show no discernible pattern and have persisted across several versions of GitLab Runner and of GitLab itself, which suggests that the runner version is not the root cause.
We suspect the job token might be getting revoked prematurely, but the sporadic nature of the issue complicates diagnosis.
We are seeking insights into potential causes or solutions and are open to collaboration with the community for further troubleshooting.
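To test the suspicion that the job token is revoked prematurely, a job could probe its own token shortly before the artifact download. The following is a minimal sketch, not something we have run in our pipelines. It assumes the instance exposes the Jobs API endpoint `GET /api/v4/job` and accepts the token in a `JOB-TOKEN` header; the function names are hypothetical.

```python
import urllib.error
import urllib.request


def build_job_token_request(api_url: str, job_token: str) -> urllib.request.Request:
    """Build a GET request asking the API which job owns `job_token`."""
    # Assumption: GET <api_url>/job with a JOB-TOKEN header is available
    # on the GitLab version in use.
    return urllib.request.Request(
        f"{api_url.rstrip('/')}/job",
        headers={"JOB-TOKEN": job_token},
        method="GET",
    )


def job_token_status(api_url: str, job_token: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the API gives for this token."""
    request = build_job_token_request(api_url, job_token)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Error responses (for example 401) arrive as exceptions.
        return err.code
```

Inside a job, `job_token_status(os.environ["CI_API_V4_URL"], os.environ["CI_JOB_TOKEN"])` would use the predefined CI variables. A 401 here while the job is still running would support the premature-revocation theory; a 200 followed by a failed download would point elsewhere.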
What we have tried to reproduce this issue:
- Enable debug mode on the runner
- Enable CI_DEBUG_TRACE in the job
- Ensure the secrets.json file is in sync across all nodes
Steps to reproduce
The issue cannot be reproduced reliably.
Example Project
What is the current bug behavior?
Some jobs fail randomly with
ERROR: Downloading artifacts from coordinator... unauthorized host=gitlab.example.com id=57265799 responseStatus=401 Unauthorized status=401 Unauthorized token=64_Lmcdx
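Since the failures show no obvious pattern, it may help to collect these error lines from many job logs and extract their fields for comparison. The sketch below parses the `key=value` pairs from a line shaped like the one above; the function names are hypothetical, and it assumes other runner error lines follow the same layout.

```python
import re

# Matches key=value pairs where the value may contain spaces
# (e.g. "responseStatus=401 Unauthorized"), stopping before the next "key=".
FIELD_RE = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def parse_runner_error(line: str) -> dict:
    """Return the key=value fields found in one runner log line."""
    return {key: value.strip() for key, value in FIELD_RE.findall(line)}


def unauthorized_job_ids(lines) -> list:
    """Collect the `id` field of every line whose status starts with 401."""
    ids = []
    for line in lines:
        fields = parse_runner_error(line)
        if fields.get("status", "").startswith("401") and "id" in fields:
            ids.append(fields["id"])
    return ids
```

The resulting job IDs can then be cross-referenced with job start times, runner hosts, or pipeline states to look for a correlation.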
What is the expected correct behavior?
Artifacts should be downloaded with the job token
Relevant logs and/or screenshots
This section was moved to an internal comment.
GitLab CI Yaml used for investigation
include:
  - project: 'devops/ci-templates'
    ref: master
    file:
      - '/templates/Bash/Bash.gitlab-ci.yml'

.generator:
  extends: .bash
  variables:
    COUNT: 10
    BUFFER_SIZE: 1M
    CI_DEBUG_TRACE: "true"
  script:
    - env
    - dd if=/dev/urandom of=${CI_JOB_NAME}.txt bs="1M" count=10
  artifacts:
    untracked: false
    when: on_success
    expire_in: "1 day"
    paths:
      - ${CI_JOB_NAME}.txt
  tags:
    - debug

job1:
  extends: .generator

job2:
  extends: .generator
  needs:
    - job: job1
      artifacts: true

job3:
  extends: .generator
  needs:
    - job: job1
      artifacts: true
    - job: job2
      artifacts: true