Deprecate supporting null `scheduling_type`
Summary
When implementing empty needs, we tried to make it backward compatible. It means we considered existing ci_builds
without scheduling_type
, and implemented populating those on the fly. That's because our code always needs to ensure scheduling_type
existence of processables.
We're currently creating roughly 30000 keys in Redis for avoiding repeated backfills (ExclusiveLock with 1 hour expiration without ensuring key removal. 30000 is a average number of pipeline creation per hour). This should be removed once we've confirmed that all legacy builds have been archived. This cannot be achieved any time soon as gitlab.com doesn't start archiving builds yet.
In this issue, we will:
- Deprecate handling null
scheduling_type
ofci_builds
and hide behind a FF. - Add background migration that populates null
scheduling_type
s.
Past discussion
The following discussions from !22246 (merged) should be addressed:
-
We need to populate the data for two different purposes:
- Retrying pipeline
- Sidekiq
PipelineProcessWorker
jobs that enqueued before deploy but run after deploy.
First, we thought that we could populate in PipelineProcessWorker and RetryPipelineService. Then we decided that why not put this logic in only
ProcessPipelineService
.Populating in
PipelineProcessWorker
should be temporary, however we can not be sure for self-hosted installations.Also, we cannot remove
populate_scheduling_type!
from retrying pipeline unless we ensure all builds have ascheduling_type
, which will never happen without a big migration.That's why we decided to have a permanent logic like this (maybe until a breaking change version).
-
@dosuken123 discussion
Thank you for explanation. So we're effectively doing a lazy background migration on runtime. I think it's fine as old builds will eventually degenerated and becomes an unprocessable job, so probably we can drop the old code somewhere in the future.
What I was asking is to create an issue to explain this background and put the link in the comment in code as I feel it's a bit too early to determine it as a permanent logic. Maybe as you say, we can drop immediately when we're allowed to introduce a breaking change at a major version update. I think what's important here is to allows anyone to track the status instead of burying the problem.