Skip to content

Sidekiq: Prevent `ArchiveTraceWorker` from holding client transactions open

What does this MR do?

Issue #322142 (closed)

At a high level this MR prevents transactions from being open for a long time while file upload completes. This has caused infra incidents in the past.

Technical Details

By default, the version of carrierwave that we are using generates an after_save callback on the model which has the effect of running file upload inside a transaction and adding new callbacks to ensure file storage of the artifact trace happens before the job artifact record is saved inside a transaction.

This MR disables those callbacks that are causing long transactions and instead moves file storage (upload) so that it happenes before save via setting a new before_save callback.

To verify that it it no longer holding open the transaction while the file is stored I added some logging to the file storage via:

    def store_file!
      Gitlab::AppLogger.info('storing file')
      puts 'storing file'

      super

      Gitlab::AppLogger.info('done storing file')
      puts 'done storing file'
    end

Logs

You can see in the logs that the transaction is opened after the file storage is completed. Abbreviated logs:

2021-08-11T19:47:04.424Z: storing file
2021-08-11T19:47:04.505Z: done storing file
...
TRANSACTION (0.1ms)  BEGIN
...
Ci::JobArtifact Create

Full logs:

  Ci::JobArtifact Load (0.5ms)  SELECT "ci_job_artifacts".* FROM "ci_job_artifacts" WHERE "ci_job_artifacts"."project_id" = 29 AND "ci_job_artifacts"."file_type" = 3 AND "ci_job_artifacts"."job_id" = 771 AND "ci_job_artifacts"."file_location" = 3 LIMIT 1 /*application:sidekiq,correlation_id:6a4b039e5726ccc8367cae8766d30a88,jid:ea27284dc414c3175ddd9c0a,endpoint_id:ArchiveTraceWorker,line:/lib/gitlab/ci/trace.rb:254:in `block in create_build_trace!'*/
  ↳ lib/gitlab/ci/trace.rb:254:in `block in create_build_trace!'

==> log/application.log <==
2021-08-11T19:47:04.424Z: storing file

==> log/application_json.log <==
{"severity":"INFO","time":"2021-08-11T19:47:04.424Z","correlation_id":"6a4b039e5726ccc8367cae8766d30a88","message":"storing file"}

==> log/application.log <==
2021-08-11T19:47:04.505Z: done storing file

==> log/application_json.log <==
{"severity":"INFO","time":"2021-08-11T19:47:04.505Z","correlation_id":"6a4b039e5726ccc8367cae8766d30a88","message":"done storing file"}

==> log/development.log <==
  TRANSACTION (0.1ms)  BEGIN /*application:sidekiq,correlation_id:6a4b039e5726ccc8367cae8766d30a88,jid:ea27284dc414c3175ddd9c0a,endpoint_id:ArchiveTraceWorker,line:/lib/gitlab/ci/trace.rb:262:in `block in create_build_trace!'*/
  ↳ lib/gitlab/ci/trace.rb:262:in `block in create_build_trace!'
  Ci::JobArtifact Create (4.1ms)  INSERT INTO "ci_job_artifacts" ("project_id", "job_id", "file_type", "size", "created_at", "updated_at", "file", "file_sha256", "file_location") VALUES (29, 771, 3, 1188, '2021-08-11 19:47:04.505258', '2021-08-11 19:47:04.505258', 'job.log', '\x65343164666532366132363361633864346364663965363434616439396565393132366636366464376162393636326334376161346232613332306566653136', 3) RETURNING "id" /*application:sidekiq,correlation_id:6a4b039e5726ccc8367cae8766d30a88,jid:ea27284dc414c3175ddd9c0a,endpoint_id:ArchiveTraceWorker,line:/lib/gitlab/ci/trace.rb:262:in `block in create_build_trace!'*/
  ↳ lib/gitlab/ci/trace.rb:262:in `block in create_build_trace!'
  Ci::JobArtifact Update (0.6ms)  UPDATE "ci_job_artifacts" SET "file_store" = 2 WHERE "ci_job_artifacts"."id" = 245 /*application:sidekiq,correlation_id:6a4b039e5726ccc8367cae8766d30a88,jid:ea27284dc414c3175ddd9c0a,endpoint_id:ArchiveTraceWorker,line:/app/models/concerns/file_store_mounter.rb:19:in `update_file_store'*/
  ↳ app/models/concerns/file_store_mounter.rb:19:in `update_file_store'
  TRANSACTION (0.8ms)  COMMIT /*application:sidekiq,correlation_id:6a4b039e5726ccc8367cae8766d30a88,jid:ea27284dc414c3175ddd9c0a,endpoint_id:ArchiveTraceWorker,line:/lib/gitlab/ci/trace.rb:262:in `block in create_build_trace!'*/

Screenshots or Screencasts (strongly suggested)

New style paths in object storage(local mini-io) Screen_Shot_2021-08-11_at_4.21.08_PM

How to test?

Ensure gdk has object storage configured and has numerous old style traces in object storage by running jobs on master.

Switch to the new branch and run jobs. Access to newly run job logs should work as well as access to older job logs.

You can also verify the file storage is happening outide of the transaction by adding the store_file! function from the technical details section and running a job while tailing the logs.

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Edited by Allison Browne

Merge request reports