Use primary DB for authenticating artifacts downloads
What does this MR do and why?
A CI job downloads job artifacts from the /api/v4/jobs/:id/artifacts
endpoint. Previously the endpoint used any replica to authenticate the
current user via a job token, but that token depends on the job record
being in the database. However, there are no guarantees that the
replica has an up-to-date record of that job. As a result, users could
see intermittent 401 errors due to replication lag.
To avoid this, use the primary database when authenticating the build.
This commit adds a ci_job_artifacts_use_primary_to_authenticate
feature flag to roll this out.
Note that the runner API attempts to select an up-to-date replica for the job that produced the artifacts, but it has no good way of determining the job ID that originated the request for downloading artifacts. In addition, the user authentication happens before the replica selection happens.
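The failure mode described above can be illustrated with a toy model. This is only a sketch: FakeDatabase, authenticate_download, and the use_primary keyword are hypothetical names made up for illustration, not GitLab's load balancer or the code changed in this MR. It assumes a job token can only be resolved if the job row is visible in the database being queried.

```ruby
# Toy model only: hypothetical classes, not GitLab's real load balancer.
class FakeDatabase
  def initialize
    @jobs = {}
  end

  # Store a job row keyed by job ID, with its token as the value.
  def insert_job(id, token)
    @jobs[id] = token
  end

  # Return the job ID that owns the token, or nil if the row is not visible.
  def find_job_by_token(token)
    @jobs.key(token)
  end
end

# Returns an HTTP-like status for an artifacts download request.
# use_primary stands in for the feature flag added by this MR.
def authenticate_download(token, primary:, replica:, use_primary:)
  db = use_primary ? primary : replica
  db.find_job_by_token(token) ? 200 : 401
end

primary = FakeDatabase.new
replica = FakeDatabase.new

# The job is written to the primary; the lagging replica has not replayed it yet.
primary.insert_job(42, "job-token")

stale_status = authenticate_download("job-token", primary: primary, replica: replica, use_primary: false)
fresh_status = authenticate_download("job-token", primary: primary, replica: replica, use_primary: true)

puts "replica: #{stale_status}, primary: #{fresh_status}"
```

In this model the replica lookup returns 401 until the row is replicated, while the primary lookup succeeds immediately, which is the behavior the feature flag is meant to provide.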
Relates to #466138 (closed)
Changelog: fixed
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
How to set up and validate locally
- Create a PostgreSQL replica (https://medium.com/@umairhassan27/setting-up-postgresql-replication-on-slave-server-a-step-by-step-guide-1ff36bb9a47f). You can do this on your GDK or on an Omnibus instance.
- I took a shortcut and used some of the features in GitLab Geo (https://docs.gitlab.com/ee/administration/geo/setup/database.html) for Omnibus GitLab. In my instance, I added:
postgresql['listen_address'] = '127.0.0.1'
postgresql['port'] = 5432
postgresql['sql_replication_password'] = '950233c0dfc2f39c64cf30457c3b7f1e'
postgresql['md5_auth_cidr_addresses'] = ['127.0.0.1/32', '192.168.2.1/32']
Run gitlab-ctl reconfigure. This creates a gitlab_replicator account with the password password.
- I created a pg_basebackup in my home directory under the dbreplica directory:
/opt/gitlab/embedded/bin/pg_basebackup -h localhost -D dbreplica -U gitlab_replicator -v -P --wal-method=stream
- Since DB load balancing requires hosts using the same port (5432), I created a dummy Ethernet device under IP 192.168.2.1:
sudo ip link add eth_dummy type dummy
sudo ip address add 192.168.2.1/24 dev eth_dummy
- Once that completed, I edited dbreplica/postgresql.conf and added a 30-second delay:
primary_conninfo = 'host=127.0.0.1 port=5432 user=gitlab_replicator password=password'
recovery_min_apply_delay = '30s'
listen_addresses = '192.168.2.1'
hot_standby = on
- Then I ran touch dbreplica/standby.signal.
- Run postgres -D dbreplica to start up the replica.
- With the replica up, I added this to /etc/gitlab/gitlab.rb:
gitlab_rails['db_load_balancing'] = { 'hosts' => ['192.168.2.1'] }
- Run gitlab-ctl reconfigure and gitlab-ctl restart puma.
- Confirm that the host is detected by GitLab Rails:
# grep "Host is online" /var/log/gitlab/gitlab-rails/database_load_balancing.log
{"severity":"INFO","time":"2024-06-09T06:37:06.379Z","correlation_id":"13aab9ee8aaba21a9e925fa693851355","event":"host_online","message":"Host is online after replica status check","db_host":"192.168.2.1","db_port":null}
- On the GitLab server, create a CI pipeline that has two jobs: one that creates artifacts, and another that downloads them:
image: ruby:latest
stages:
  - test
  - deploy
test:
  stage: test
  script:
    - echo "hello" > test.txt
  cache:
    paths:
      - test.txt
  artifacts:
    paths:
      - test.txt
deploy:
  stage: deploy
  script:
    - echo "Test deploy"
- There's a good chance the deploy job will hit a 401 Unauthorized error. If it doesn't fail, retry it.
- Enable the feature flag in gitlab-rails console: Feature.enable(:ci_job_artifacts_use_primary_to_authenticate)
- Retry the deploy job several times and verify that the job passes.
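The "Host is online" check in the steps above can also be scripted instead of using grep. The following is a minimal sketch, assuming the log is JSON-lines with the event and db_host fields shown in the sample entry; online_hosts is a hypothetical helper name, not part of GitLab.

```ruby
require "json"

# Hypothetical helper: returns the db_host of every "host_online" event
# found in the given log lines. Lines that are not JSON objects are skipped.
def online_hosts(lines)
  lines.filter_map do |line|
    begin
      entry = JSON.parse(line)
    rescue JSON::ParserError
      next
    end
    next unless entry.is_a?(Hash)

    entry["db_host"] if entry["event"] == "host_online"
  end
end

# Sample entry copied from the log output shown in the steps above.
sample = '{"severity":"INFO","time":"2024-06-09T06:37:06.379Z","correlation_id":"13aab9ee8aaba21a9e925fa693851355","event":"host_online","message":"Host is online after replica status check","db_host":"192.168.2.1","db_port":null}'

hosts = online_hosts([sample, "not json"])
puts hosts.inspect
```

On a real instance, the lines could be read with File.foreach("/var/log/gitlab/gitlab-rails/database_load_balancing.log"), assuming the file is readable by the current user.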