
Improve the contention of pipeline ref creation (write-ref RPC)

Shinya Maeda requested to merge reduce-pipeline-ref-creation into master

What does this MR do and why?

This MR reduces contention in Persistent Pipeline Ref creation. The process flow before and after the change is as follows:

Before:

  1. Creating a pipeline ref when a job starts running.
  2. Deleting a pipeline ref when a pipeline has finished.

The problem is step 1: it sends one write-ref RPC per job. For example, a pipeline with 100 jobs executes 100 RPCs, which causes contention.

After:

  1. Creating a pipeline ref when a pipeline is created.
  2. Deleting a pipeline ref when a pipeline has finished.
  3. Re-creating a pipeline ref when a job starts running if a pipeline ref does not exist.

As long as a pipeline runs automatically from start to finish, only one write-ref RPC is executed. To cover less common cases, such as manually playing or retrying a job after the pipeline has already finished (so its ref has been deleted), we fall back to the previous behavior and re-create the ref when the job starts running.
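
The difference in write-ref volume between the two flows can be illustrated with a small simulation. This is a minimal sketch, not the code in this MR: `FakeRefStore`, `run_pipeline_before`, `run_pipeline_after`, and `retry_job` are hypothetical names, the store is an in-memory stand-in for the write-ref RPC, and only writes are counted (the existence check is assumed to be a separate, cheaper read).

```ruby
# Hypothetical in-memory stand-in for the ref storage; it only counts writes.
class FakeRefStore
  attr_reader :write_rpcs

  def initialize
    @refs = {}
    @write_rpcs = 0
  end

  def exist?(name)
    @refs.key?(name)
  end

  # Simulates one write-ref RPC.
  def create(name, sha)
    @write_rpcs += 1
    @refs[name] = sha
  end

  def delete(name)
    @refs.delete(name)
  end
end

def ref_name(pipeline_id)
  "refs/pipelines/#{pipeline_id}"
end

# Before: every job start writes the ref; the ref is deleted at pipeline finish.
def run_pipeline_before(store, pipeline_id, sha, job_count)
  job_count.times { store.create(ref_name(pipeline_id), sha) }
  store.delete(ref_name(pipeline_id))
end

# After: one write at pipeline creation; a job start writes only if the ref is missing.
def run_pipeline_after(store, pipeline_id, sha, job_count)
  store.create(ref_name(pipeline_id), sha)
  job_count.times do
    store.create(ref_name(pipeline_id), sha) unless store.exist?(ref_name(pipeline_id))
  end
  store.delete(ref_name(pipeline_id))
end

# Fallback: a job retried after the pipeline finished re-creates the missing ref.
def retry_job(store, pipeline_id, sha)
  store.create(ref_name(pipeline_id), sha) unless store.exist?(ref_name(pipeline_id))
end

before_store = FakeRefStore.new
run_pipeline_before(before_store, 1, "abc123", 100)
puts "before: #{before_store.write_rpcs} write-ref RPCs" # before: 100 write-ref RPCs

after_store = FakeRefStore.new
run_pipeline_after(after_store, 1, "abc123", 100)
puts "after: #{after_store.write_rpcs} write-ref RPCs"   # after: 1 write-ref RPCs

retry_job(after_store, 1, "abc123")
puts "after retry: #{after_store.write_rpcs} write-ref RPCs" # after retry: 2 write-ref RPCs
```

In this model the retry costs exactly one extra write, which matches the fallback described above.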

Related #352750 (closed)

Manual QA

  • Result: Passed
  • Date: Thu 21 Apr 2022 08:46:55 AM UTC

Enable the feature flag:

[12] pry(main)> ::Feature.enabled?(:ci_reduce_persistent_ref_writes)
=> true

In each test case, 100 pipelines are created to check for race conditions:

for i in {1..100}
do
  # Use the current Unix timestamp as a unique suffix for the branch and file names.
  identifier=$(date +%s) &&
  git checkout main &&
  git checkout -b "feature-$identifier" &&
  echo 'a' > "$identifier.txt" &&
  git add . &&
  git commit -m "Test MR $identifier" &&
  git push origin "feature-$identifier" -o merge_request.create
done

Test Case 1: Run the pipeline with the following .gitlab-ci.yml:

build:
    script: echo

100 successful pipelines in database records:

[5] pry(main)> Ci::Pipeline.where('created_at > ?', Ci::Build.find(1326).updated_at).success.count
   (1.0ms)  SELECT COUNT(*) FROM "ci_pipelines" WHERE (created_at > '2022-04-20 07:42:07.568888') AND ("ci_pipelines"."status" IN ('manual')) /*application:console,db_config_name:ci,line:/devkitkat/services/rails/cache/ruby/2.7.0/gems/marginalia-1.10.0/lib/marginalia/comment.rb:25:in `block in construct_comment'*/
=> 100

No left-over advertised pipeline refs:

shinya@shinya-B550-VISION-D:~/workspace/test/pipeline-playground$ g ls-remote origin | grep pipeline

Test Case 2: Manual pipelines. Run the pipeline with the following .gitlab-ci.yml:

build:
    script: echo
    when: manual
    allow_failure: false

100 manual pipelines in database records:

[5] pry(main)> Ci::Pipeline.where('created_at > ?', 1.day.ago).manual.count
   (1.0ms)  SELECT COUNT(*) FROM "ci_pipelines" WHERE (created_at > '2022-04-20 07:42:07.568888') AND ("ci_pipelines"."status" IN ('manual')) /*application:console,db_config_name:ci,line:/devkitkat/services/rails/cache/ruby/2.7.0/gems/marginalia-1.10.0/lib/marginalia/comment.rb:25:in `block in construct_comment'*/
=> 100

No left-over advertised pipeline refs:

shinya@shinya-B550-VISION-D:~/workspace/test/pipeline-playground$ g ls-remote origin | grep pipeline

Test Case 3: Play those manual pipelines

user = User.first
Ci::Build.where('created_at > ?', 1.day.ago).manual.each { |b| b.play(user) }

Some pipeline refs were created during the job runs:

shinya@shinya-B550-VISION-D:~/workspace/test/pipeline-playground$ g ls-remote origin | grep pipeline
5834d15dbafd157c0b40e4eb46ac9f22fbb3d96e	refs/pipelines/348
8495e9bbe78347712e1aeea35ec127c8ea731716	refs/pipelines/349
9c25bc8ab8a2877864872446146b12d7b1ecee8b	refs/pipelines/350

And deleted after the execution:

[14] pry(main)> Ci::Build.where('created_at > ?', 1.day.ago).success.count
   (0.9ms)  SELECT COUNT(*) FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build' AND (created_at > '2022-04-20 08:29:01.006692') AND ("ci_builds"."status" IN ('success')) /*application:console,db_config_name:ci,line:/devkitkat/services/rails/cache/ruby/2.7.0/gems/marginalia-1.10.0/lib/marginalia/comment.rb:25:in `block in construct_comment'*/
=> 100

No left-over advertised pipeline refs:

shinya@shinya-B550-VISION-D:~/workspace/test/pipeline-playground$ g ls-remote origin | grep pipeline
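
The left-over check above could also be scripted instead of inspected by eye. The following is a minimal sketch, assuming the usual `git ls-remote` output of one `<sha><TAB><ref>` pair per line; `pipeline_refs` is a hypothetical helper, and the `refs/heads/main` line in the sample uses a placeholder SHA.

```ruby
# Returns the ref names under refs/pipelines/ found in `git ls-remote` output.
def pipeline_refs(ls_remote_output)
  ls_remote_output.each_line.filter_map do |line|
    _sha, ref = line.split
    ref if ref&.start_with?("refs/pipelines/")
  end
end

# Sample output: the pipeline refs are the ones observed in Test Case 3;
# the refs/heads/main entry is a placeholder for illustration.
sample = <<~OUT
  0000000000000000000000000000000000000000\trefs/heads/main
  5834d15dbafd157c0b40e4eb46ac9f22fbb3d96e\trefs/pipelines/348
  8495e9bbe78347712e1aeea35ec127c8ea731716\trefs/pipelines/349
  9c25bc8ab8a2877864872446146b12d7b1ecee8b\trefs/pipelines/350
OUT

leftovers = pipeline_refs(sample)
puts leftovers.empty? ? "no left-over pipeline refs" : leftovers.join("\n")
```

Against a real repository, the input would come from something like `pipeline_refs(`git ls-remote origin`)`; an empty result corresponds to the empty `grep` output shown above.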

How to set up and validate locally

See above

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Shinya Maeda
