
Make pipeline schedule worker resilient

What does this MR do?

Make pipeline schedule worker resilient

Currently, the pipeline schedule worker is unstable because it is sometimes killed for excessive memory consumption. A more detailed report is available in https://gitlab.com/gitlab-org/gitlab-ce/issues/61955 and gitlab-com/gl-infra/production#805 (closed).

To make the worker's performance consistent, this MR makes the following changes:

  1. next_run_at is always a real value, which is based on the PipelineScheduleWorker's cron schedule (currently 19 * * * * by default on gitlab.com). This prevents duplicate pipelines from being created when Sidekiq retries a job that was killed by the Memory Killer.
  2. Remove the exclusive lock. Its purpose is already covered by the real_next_run change. Worse, the lock is backfiring, as described in https://gitlab.com/gitlab-org/gitlab-ce/issues/61955.
  3. Use RunPipelineScheduleWorker to spread memory consumption across multiple Sidekiq jobs.
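
The idea behind items 1 and 3 can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this MR: the `Schedule` struct, its `interval` field, `real_next_run`, and `enqueue_due_schedules` are hypothetical names. A plain array stands in for enqueuing one RunPipelineScheduleWorker job per schedule. The worker is assumed to fire at minute 19 of every hour (UTC), and advancing next_run_at before enqueuing is an assumption of the sketch.

```ruby
# Hypothetical sketch; names and structure are illustrative only.

# Minute of the hour at which the worker's cron ("19 * * * *") fires.
WORKER_MINUTE = 19

# A simplified schedule: a fixed interval in seconds stands in for a cron
# expression (assumption made to keep the sketch dependency-free).
Schedule = Struct.new(:id, :interval, :next_run_at)

# Returns the first worker tick at or after the schedule's ideal run time,
# i.e. the time at which the pipeline can actually be created.
def real_next_run(ideal_next_run, worker_minute: WORKER_MINUTE)
  t = ideal_next_run.getutc # getutc returns a copy; it does not mutate the argument
  tick = Time.utc(t.year, t.month, t.day, t.hour, worker_minute)
  tick < t ? tick + 3600 : tick
end

# Picks up every due schedule, moves its next_run_at to the next real tick,
# and records one job per schedule. `queue` stands in for Sidekiq.
def enqueue_due_schedules(schedules, now, queue)
  schedules.select { |s| s.next_run_at <= now }.each do |s|
    # Advance first, so that a retry of this worker at the same `now`
    # no longer sees the schedule as due.
    s.next_run_at = real_next_run(now + s.interval)
    queue << s.id
  end
end
```

Under these assumptions, running `enqueue_due_schedules` twice with the same `now` (as a Sidekiq retry would) enqueues each due schedule only once, because next_run_at has already moved to a future worker tick.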

Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/61955 and gitlab-com/gl-infra/production#805 (closed)

Does this MR meet the acceptance criteria?

Conformity

Performance and testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by Shinya Maeda
