Run CI/CD pipelines on a schedule - Implement Cron project hash and special intervals


Problem to solve

  • With scheduled pipeline runs, as implemented e.g. in gitlab-ce#30882 and first proposed in gitlab-ce#2989, the load on shared runners can peak at certain times.
  • Many projects need at least daily builds for integration tests, but an expression such as 0 0 * * *, entered naively by hand for many projects, leads to a stampede of pipelines at midnight.
  • The current implementation (as of GitLab 12.0.1) replaces the "daily" selection with 04:00 UTC. This not only causes a stampede but also indirectly triggers multiple unwanted pipeline starts for a single schedule (gitlab-ce#61141).
  • Additionally, a failing dev pipeline on weekends will most of the time only produce an email notification that nobody responds to.
  • Other CI systems such as Jenkins have a dedicated syntax (the H symbol) to avoid this.

Proposal

  • Implement a project hash: when H is entered in the minute or hour field, hash "group name/project name" and distribute the execution accordingly over 60 values for minutes and 24 values for hours.
  • This ensures that if all projects entered H H * * *, each project's pipeline would run predictably at the same time every day, while the pipelines of all projects would be spread over 24*60 slots.
  • Add a special identifier @daily, which would simply mean H H * * *.
  • Add a special identifier @weekdaily, which would simply mean H H * * 1-5.
  • Add a special identifier @nightly, which would simply mean H H(19-23),H(0-6) * * *, so the pipeline would run outside office hours, i.e. between 19:00 and 06:59.
  • Add a special identifier @off_hours, which would simply mean H H(19-23),H(0-6) * * 1-5.
  • Implement H at least for minutes and hours; I do not see a use case for day, month, or weekday.
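
To make the proposal more concrete, here is a minimal Python sketch of how H could be resolved. It is not GitLab's implementation. The function names, the use of SHA-256, and the per-field salt are all assumptions made for illustration. The H(a-b) range form follows the examples above. Each H token is resolved independently, so under this reading H(19-23),H(0-6) yields two hours per night; whether the proposal intends that or a single slot within the combined range is left open.

```python
import hashlib
import re

# Assumption: H is only meaningful for these two fields, as proposed.
FIELD_RANGES = {"minute": (0, 59), "hour": (0, 23)}

# Matches a bare H or a ranged H(a-b); weekday names such as THU do not match.
H_TOKEN = re.compile(r"\bH\b(?:\((\d+)-(\d+)\))?")


def project_hash(project_path: str, salt: str) -> int:
    """Stable integer derived from "group name/project name" (hypothetical helper)."""
    digest = hashlib.sha256(f"{salt}:{project_path}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def resolve_field(field: str, name: str, project_path: str) -> str:
    """Replace every H token in one cron field with a hash-derived number."""
    lo_default, hi_default = FIELD_RANGES[name]

    def replace(match: re.Match) -> str:
        lo = int(match.group(1)) if match.group(1) is not None else lo_default
        hi = int(match.group(2)) if match.group(2) is not None else hi_default
        if not lo_default <= lo <= hi <= hi_default:
            raise ValueError(f"invalid range for {name}: {match.group(0)}")
        # Salting with the field name keeps minute and hour uncorrelated.
        return str(lo + project_hash(project_path, name) % (hi - lo + 1))

    return H_TOKEN.sub(replace, field)


def resolve_expression(expression: str, project_path: str) -> str:
    """Resolve H in the minute and hour fields of a five-field cron expression."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected five cron fields")
    if any(re.search(r"\bH\b", f) for f in fields[2:]):
        raise ValueError("H is only supported for minutes and hours")
    fields[0] = resolve_field(fields[0], "minute", project_path)
    fields[1] = resolve_field(fields[1], "hour", project_path)
    return " ".join(fields)


if __name__ == "__main__":
    # The same project path always resolves to the same time of day.
    print(resolve_expression("H H * * *", "group name/project name"))
    print(resolve_expression("H H(19-23),H(0-6) * * 1-5", "group name/project name"))
```

Because the result depends only on the project path, it can be computed when the schedule is saved and stored as an ordinary cron expression.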

Unclear:

  • Are @daily and @nightly good identifiers? People could expect execution at midnight, because some cron implementations define them that way. Maybe once_per_day or once_per_night would be better, with the time randomized relative to the location/time zone of the user who owns the job?

Links / references

Documentation blurb

  • To avoid a stampede of daily pipelines running at midnight, use H H * * * as the expression (entering @daily achieves the same).
  • H is automatically replaced by a number in the range 0-59 for minutes and 0-23 for hours. The number is derived from the hash of "group name/project name", so the values are spread evenly across projects.
  • Entering H H(19-23),H(0-6) * * 1-5 runs your pipeline Monday to Friday outside office hours, i.e. between 19:00 and 06:59 (entering @off_hours achieves the same).
  • Note that H is only supported for minutes and hours.
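
The effect described in this blurb can be checked with a short Python sketch. It is an illustration only: the alias table mirrors the identifiers proposed above, while the helper names, the SHA-256 hash, and the synthetic project paths are assumptions. Here a single hash is mapped to one of the 24*60 minute-of-day slots, which is one possible reading of "H H".

```python
import hashlib
from collections import Counter

# Identifiers and expansions as listed in the proposal.
ALIASES = {
    "@daily": "H H * * *",
    "@weekdaily": "H H * * 1-5",
    "@nightly": "H H(19-23),H(0-6) * * *",
    "@off_hours": "H H(19-23),H(0-6) * * 1-5",
}


def expand_alias(expression: str) -> str:
    """Return the H-based expression for a known alias, else the input unchanged."""
    return ALIASES.get(expression.strip(), expression)


def daily_slot(project_path: str) -> tuple[int, int]:
    """Map a project path to a stable (hour, minute) slot out of 24*60."""
    digest = hashlib.sha256(project_path.encode("utf-8")).digest()
    slot = int.from_bytes(digest[:8], "big") % (24 * 60)
    hour, minute = divmod(slot, 60)
    return hour, minute


def pipelines_per_hour(project_paths) -> Counter:
    """Count how many projects would start their daily pipeline in each hour."""
    return Counter(daily_slot(path)[0] for path in project_paths)


if __name__ == "__main__":
    # Synthetic project paths, purely for illustration.
    projects = [f"group-{i % 50}/project-{i}" for i in range(10_000)]
    counts = pipelines_per_hour(projects)
    for hour in range(24):
        print(f"{hour:02d}:00  {counts[hour]}")
```

With 0 0 * * * every one of these projects would start in the same minute; with hashed slots the per-hour counts should come out roughly equal.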