Because our group cannot afford many runners, we use pipeline schedules to build the project every 30 minutes.
We found that even though the latest version of the code has already been built, GitLab still rebuilds it 30 minutes later.
Proposal
When setting up a pipeline schedule, there would be a checkbox that lets the user choose whether GitLab should skip the run if the latest code has already been built.
Workaround
Create a job in the .pre stage of the pipeline that uses the Pipeline API to check whether the latest completed pipeline for the same ref had the same SHA as the current one. If so, the pipeline has already run for this commit, so the job fails, causing the rest of the jobs in the pipeline to be skipped.
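A minimal sketch of the check such a .pre job could run, assuming Python is available on the runner. The function names and the use of `status=success` as the meaning of "completed" are assumptions for illustration; the endpoint is the GitLab "list project pipelines" API, and the token is expected to come from a CI/CD variable you define yourself.

```python
import json
import urllib.parse
import urllib.request


def already_built(pipelines, current_sha, current_pipeline_id):
    """Return True if the newest earlier pipeline ran on current_sha.

    `pipelines` is expected newest-first, as returned by the API
    when sorted by id in descending order.
    """
    for pipeline in pipelines:
        if pipeline["id"] == current_pipeline_id:
            continue  # ignore the pipeline this job belongs to
        # Only the most recent earlier pipeline is compared.
        return pipeline["sha"] == current_sha
    return False


def fetch_successful_pipelines(api_url, project_id, ref, token):
    """Fetch recent successful pipelines for `ref` (assumption: success == built)."""
    query = urllib.parse.urlencode(
        {"ref": ref, "status": "success", "order_by": "id", "sort": "desc", "per_page": 5}
    )
    request = urllib.request.Request(
        f"{api_url}/projects/{project_id}/pipelines?{query}",
        headers={"PRIVATE-TOKEN": token},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

In the job script, the inputs would come from the predefined variables `CI_API_V4_URL`, `CI_PROJECT_ID`, `CI_COMMIT_REF_NAME`, `CI_COMMIT_SHA` and `CI_PIPELINE_ID`, and the script would exit with a non-zero status when `already_built` returns True.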
@SharuruZ So if the SHA is identical to that used for the previous scheduled pipeline's builds, then the new scheduled pipeline should not be carried out?
Thanks @SharuruZ for the proposal, it looks really interesting.
One main concern is that we normally try to avoid adding configuration options when possible, but in this case I'm not sure that having this as the only way to act is correct. There are probably scenarios where users want to run the pipeline even if nothing has changed, and also in this case looking at the SHA is a partial solution since we can have changes in secret variables or external dependencies or registry images that are used during jobs.
There's probably a tricky way to avoid running a full pipeline every time: add a job at the beginning that checks and stores the latest SHA somewhere and, if nothing has changed, stops the whole pipeline (but I understand an option would really be better for your needs).
In this case this could also save wasted resources on .com, so it could be worth considering maybe a project-level option in Settings > Pipelines (only for schedules, regular runs will do it anyway).
@dosuken123 are you able to extract some statistics on .com about pipeline schedules that run on the same SHA? It could be a good starting point for a discussion about a new option.
We have some nightly scheduled pipelines that take a long time to run and produce fairly heavy snapshots on our artifact server, but the branch that the pipeline runs on sometimes does not change that frequently. Ultimately, these nightly scheduled pipelines that operate on the same commit are a waste of our internally hosted CI infrastructure's resources.
I would like something similar to this. Instead of only running a scheduled job if the repository has changed, I would like to only run a scheduled job if that pipeline hasn't run by any method in the past 30 days. Basically, I want to only run a scheduled job on stale branches.
There are two use cases for this:
The first is to do cleanup/teardown/alerts of stale branches/environments.
The second is to rebuild golden images based on upstream dependencies. Just because a repository hasn't changed doesn't mean that the base OS, packages, security updates, etc. haven't changed.
The only way to currently do this would be to have a daily scheduled job spin up, check the latest pipeline status via the API, and then abort if the last run was recent. I'm not sure the pipeline API keeps enough history to show when the last successful pipeline run was, and it doesn't have details on individual jobs, only on the pipeline as a whole.
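The "abort if the last run was recent" decision could be sketched roughly as follows. This assumes the pipeline list has already been fetched from the "list project pipelines" endpoint, whose entries carry an `updated_at` timestamp; the function names and the 30-day default are illustrative only.

```python
from datetime import datetime, timedelta, timezone


def parse_gitlab_time(value):
    """Parse an ISO 8601 timestamp such as '2024-01-31T12:00:00.000Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_stale(pipelines, now, max_age=timedelta(days=30)):
    """Return True if no pipeline in the list was updated within max_age of now."""
    latest = max(
        (parse_gitlab_time(p["updated_at"]) for p in pipelines),
        default=None,
    )
    # No pipeline history at all is treated as stale.
    return latest is None or now - latest > max_age
```

A daily scheduled job would call `is_stale(pipelines, datetime.now(timezone.utc))` and continue only when it returns True. Whether the API retains enough history for this remains the open question raised above.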
Why interested: They want to be able to only start a pipeline if there has been a change since the last pipeline has run. That way, they can fully automate their build process
Hello @thaoyeager, a Premium customer (internal only) is strongly interested in this feature.
They have complex and long CI jobs, and they want to optimize the usage of the runners. (And I can connect them with you if you need more information about their usages)
A couple of valid reasons not to do this have been presented, but couldn't people in those situations simply leave the checkbox unchecked? Implementing this would help many people tremendously, and I don't see why adding it as an option would hurt anybody.
On my side, this feature would be useful not so much from the resource conservation perspective (we can have code inside the pipeline jobs that checks if there's something to do and bails out quickly), but from an API cleanliness point of view.
We have scheduled pipelines, plus some internal tooling that uses the GitLab API to visualise the most recent pipelines on the most recent commits, to fit our workflows. The scheduled pipelines run every N minutes to check if there's something to do, but sometimes no new commits are merged for hours, so the scheduled pipeline runs over and over on the same commit. (A restrictive cron schedule is not a very good solution here, because it's difficult to know upfront during which hours the PRs will be merged.)
This simply adds noise and complexity to the tooling to filter out the pipelines that were effectively no-op.
Having a simple checkbox to "do not run the schedule if no new commits" would be very useful to avoid the noise.
I haven't validated it yet, but thinking a bit more about it, the repeated scheduled pipelines could perhaps be skipped programmatically, although in a somewhat convoluted way:
A scheduled pipeline, once successful (in .post phase?), creates & force-pushes a git tag named LATEST_SCHEDULE_RUN pointing to the current sha ($CI_COMMIT_SHA).
All jobs in that scheduled pipeline have a rule which makes them show up only if there are any changes compared to that tag:
Also needing this feature. Our job runs quite long (3 hours, it's a game project), so it's rather annoying that it runs when it's not needed, blocking other jobs from completing since our concurrency limit is 1.
+1 to this: we have a long nightly pipeline (~12 hours), so avoiding rerunning it on same commit is very desirable.
FYI: Jenkins does this automatically.
It would be beneficial if we can have this.
For us (@AOMediaCodec), this is a very important and useful feature for nightly testing and for reliable migration from Jenkins to GitLab.
but if we are trying to run it from the pipeline of the schedule, it's not really reliable since the value will already be replaced by the time the pipeline runs, and we still need to basically have 1 .pre job to check this and skip everything else instead of being able to not create the pipeline at all.
Currently, we already have a workaround. Basically, we track the current COMMIT_SHA in a variable on the pipeline schedule via the GitLab API, and then have rules like this:
rules:
  - if: $PREVIOUS_COMMIT_SHA != $CI_COMMIT_SHA
But since GitLab 14.8, a schedule can only be edited by its owner.
This is quite tricky for us. The bot account whose personal access token we use to call the GitLab API has to take ownership of the schedule before it can record the current COMMIT_SHA in the schedule variable. The original owner then has to take ownership back whenever they need to edit the schedule variable, and they must be a repository maintainer to be able to do so.
It would be really helpful if there's a built-in option for this.
There's also a downside to this approach: when a new scheduled pipeline starts before the previous one has finished, $PREVIOUS_COMMIT_SHA might not have been updated on the schedule yet, so the new pipeline will still run on the same commit SHA even though that commit is already being built.
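For reference, the tracking step of this workaround (writing the current SHA into the schedule variable) might look roughly like the sketch below. It uses the "edit pipeline schedule variable" endpoint; the helper name and the idea of passing the schedule ID in explicitly are assumptions, since there is no predefined CI variable for the schedule ID.

```python
import urllib.parse
import urllib.request


def build_update_request(api_url, project_id, schedule_id, key, value, token):
    """Build a PUT request that sets a pipeline schedule variable to `value`."""
    url = (
        f"{api_url}/projects/{project_id}"
        f"/pipeline_schedules/{schedule_id}/variables/{key}"
    )
    data = urllib.parse.urlencode({"value": value}).encode()
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        # The token must belong to the schedule owner (see the ownership issue above).
        headers={"PRIVATE-TOKEN": token},
    )
```

A final job would build the request with `key="PREVIOUS_COMMIT_SHA"` and `value` set to `$CI_COMMIT_SHA`, then send it with `urllib.request.urlopen`. Because the update only happens at the end of the pipeline, this sketch has the same overlap problem described above.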