Run CI/CD pipelines on a schedule - Implement Cron project hash and special intervals
### Problem to solve

* With "Scheduled Pipeline runs", as implemented e.g. in gitlab-ce#30882 and first proposed in gitlab-ce#2989, the load on shared runners could peak at certain times.
* Many projects need at least daily builds for integration tests, but something like `0 0 * * *` leads to a stampede of pipelines at midnight when it is naively entered by hand for many projects.
* The current implementation (as of GitLab 12.0.1) replaces the "daily" selection with 04:00 UTC. This not only causes a stampede but also indirectly causes unwanted starts of multiple pipelines for one schedule (gitlab-ce#61141).
* Additionally, a failing dev pipeline on weekends will most of the time only lead to an email notification that nobody responds to.
* Other CI systems like Jenkins have specific syntax to avoid this.

### Proposal

* Implement a project hash: when `H` is entered for minute or hour, hash "group name/project name" and distribute the execution accordingly over 60 slots for minutes and 24 slots for hours.
* This ensures that when all projects enter `H H * * *`, the pipeline of each project runs predictably at the same time of day, while the projects as a whole are distributed over 24*60 slots.
* Add a special identifier `@daily`, which would simply mean `H H * * *`.
* Add a special identifier `@weekdaily`, which would simply mean `H H * * 1-5`.
* Add a special identifier `@nightly`, which would simply mean `H H(19-23),H(0-6) * * *`, so the pipeline would run "after office hours", i.e. from 19:00 to 06:59.
* Add a special identifier `@off_hours`, which would simply mean `H H(19-23),H(0-6) * * 1-5`.
* Implement `H` at least for minutes and hours; for day/month/weekday I do not see a use case.

Unclear:

* Are `@daily` and `@nightly` good identifiers? People could expect execution at midnight because some cron implementations use them like this. Maybe once_per_day or once_per_night would be better, randomized relative to the location/time zone of the user who owns the job?

### Links / references

* Sample implementation for Puppet: https://github.com/pradels/puppet-parser/blob/master/lib/puppet/puppet/parser/functions/fqdn_rand.rb
* Jenkins help page: https://github.com/jenkinsci/jenkins/blob/master/core/src/main/resources/hudson/triggers/TimerTrigger/help-spec.html
* Deterministic "random" numbers for Ansible: https://gist.github.com/ptman/9bd8223272e2c0e27b2b

### Documentation blurb

* To avoid a stampede of daily pipelines running at midnight, use `H H * * *` as the expression (the same is achieved by entering `@daily`).
* `H` is automatically replaced by a number in the range 0-59 for minutes and 0-23 for hours. The number is evenly distributed based on the hash of "group name/project name".
* Entering `H H(19-23),H(0-6) * * 1-5` runs your pipeline Monday to Friday "after office hours", i.e. between 19:00 and 06:59 (the same is achieved by entering `@off_hours`).
* Note that `H` is only supported for minutes and hours.
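
### Illustrative sketch

To make the proposal more concrete, here is a minimal sketch in Python of how `H` and the special identifiers could be resolved. It is not GitLab's implementation. The function names (`stable_hash`, `resolve_field`, `resolve_cron`), the use of SHA-256, and the salting scheme are all assumptions made for illustration. Each `H` token in a comma-separated list is resolved on its own, so `H(19-23),H(0-6)` becomes two concrete hours; whether that matches the intended semantics is left open here.

```python
import hashlib
import re

# Alias table built from the identifiers proposed in this issue.
ALIASES = {
    "@daily": "H H * * *",
    "@weekdaily": "H H * * 1-5",
    "@nightly": "H H(19-23),H(0-6) * * *",
    "@off_hours": "H H(19-23),H(0-6) * * 1-5",
}

# Default ranges for the two fields that support H: minute, then hour.
FIELD_RANGES = [(0, 59), (0, 23)]

# Matches a bare "H" or a ranged "H(low-high)".
H_TOKEN = re.compile(r"H(?:\((\d+)-(\d+)\))?")


def stable_hash(project_path: str, salt: str) -> int:
    """Deterministic integer derived from the project path (SHA-256 is an assumption)."""
    digest = hashlib.sha256(f"{project_path}:{salt}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def resolve_field(field: str, index: int, project_path: str) -> str:
    """Replace every H token in one cron field with a project-specific number."""
    low_default, high_default = FIELD_RANGES[index]

    def substitute(match: re.Match) -> str:
        low = int(match.group(1)) if match.group(1) else low_default
        high = int(match.group(2)) if match.group(2) else high_default
        if not (low_default <= low <= high <= high_default):
            raise ValueError(f"range out of bounds in {match.group(0)!r}")
        # Salt with the field index and token text so that the minute and
        # hour values of one project are not tied to each other.
        salt = f"{index}:{match.group(0)}"
        return str(low + stable_hash(project_path, salt) % (high - low + 1))

    return H_TOKEN.sub(substitute, field)


def resolve_cron(expression: str, project_path: str) -> str:
    """Expand an alias if present, then resolve H in the minute and hour fields."""
    expression = ALIASES.get(expression.strip(), expression)
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected five cron fields")
    for index in (0, 1):
        fields[index] = resolve_field(fields[index], index, project_path)
    # A standalone H in day/month/weekday is rejected ("THU" is not affected).
    if any(re.search(r"\bH\b", field) for field in fields[2:]):
        raise ValueError("H is only supported for minutes and hours")
    return " ".join(fields)


if __name__ == "__main__":
    for path in ("group-a/project-1", "group-a/project-2", "group-b/project-1"):
        print(path, "->", resolve_cron("@off_hours", path))
```

Because the value depends only on "group name/project name", a given project always gets the same slot, while different projects are spread across the available slots. Expressions without `H`, such as `0 4 * * *`, pass through unchanged.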