Fix missing pipeline e-mails when job logs moved to object storage

Stan Hu requested to merge sh-pipeline-notification-fix into master

As discussed in #195430 (closed), a race condition can occur between a build finishing and the failed pipeline e-mail going out. This is what was happening before:

  1. Build finishes and causes a pipeline failure. The pipeline failure transition causes the pipeline to run PipelineNotificationWorker, which schedules an ActiveJob to e-mail a user with a failed pipeline ID.

  2. BuildFinishedWorker schedules ArchiveTraceWorker to move the job log, stored locally on a filesystem, to object storage.

  3. The ActiveJob runs and loads the pipeline and failed build logs. Some of those logs have already been moved to object storage, but the ActiveJob holds a stale record and attempts to load the file from the filesystem.

  4. The file doesn't exist, so an Errno::ENOENT exception is raised.

To fix this, we now attempt to refresh the job from the database if we encounter this exception.
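The refresh-and-retry idea can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the code in this merge request: `FakeJobStore`, `JobRecord`, `archive_trace`, and `read_trace_with_refresh` are hypothetical names, the "database" is an in-memory hash, and object storage is simulated by a second local directory.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for the jobs table: maps a job id to its log path.
class FakeJobStore
  def initialize
    @paths = {}
  end

  def save(id, path)
    @paths[id] = path
  end

  def path_for(id)
    @paths.fetch(id)
  end
end

# Hypothetical job record that caches the log path at load time,
# the way an in-memory model object can go stale.
class JobRecord
  def initialize(store, id)
    @store = store
    @id = id
    @trace_path = store.path_for(id)
  end

  # Re-read the log location from the store, discarding the cached value.
  def reload
    @trace_path = @store.path_for(@id)
    self
  end

  def read_trace
    File.read(@trace_path)
  end
end

# Simulates archiving: move the log, then record its new location.
def archive_trace(store, id, archive_dir)
  new_path = File.join(archive_dir, "job-#{id}.log")
  FileUtils.mv(store.path_for(id), new_path)
  store.save(id, new_path)
end

# Read the log; on Errno::ENOENT, refresh the record once and retry.
# A second failure is not rescued and propagates to the caller.
def read_trace_with_refresh(job)
  job.read_trace
rescue Errno::ENOENT
  job.reload.read_trace
end

Dir.mktmpdir do |dir|
  local_dir = File.join(dir, "local")
  archive_dir = File.join(dir, "archive")
  FileUtils.mkdir_p([local_dir, archive_dir])

  store = FakeJobStore.new
  path = File.join(local_dir, "job-1.log")
  File.write(path, "build failed\n")
  store.save(1, path)

  stale_job = JobRecord.new(store, 1)   # loaded before archiving
  archive_trace(store, 1, archive_dir)  # log moves while the record is held

  puts read_trace_with_refresh(stale_job)
end
```

Retrying only once keeps a genuinely missing file visible as an error instead of hiding it behind a loop.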

Edited by Stan Hu
