Check repository status when using pipeline schedules

@SharuruZ So if the SHA is identical to that used for the previous scheduled pipeline's builds, then the new scheduled pipeline should not be carried out?

/cc @dosuken123

IIRC we discussed this somewhere, but I couldn't find the relevant issue.

As @markglenfletcher suggested, we can check the commit-sha for this skipping logic. Something like,

diff --git a/app/workers/pipeline_schedule_worker.rb b/app/workers/pipeline_schedule_worker.rb
index 7b485b3363..f6541307e3 100644
--- a/app/workers/pipeline_schedule_worker.rb
+++ b/app/workers/pipeline_schedule_worker.rb
@@ -11,6 +11,10 @@ class PipelineScheduleWorker
           next
         end

+        if schedule.last_pipeline.sha == GetHeadSha(schedule.ref)
+          next
+        end
+
         Ci::CreatePipelineService.new(schedule.project,
                                       schedule.owner,
                                       ref: schedule.ref)

But let's first ask the production team whether there are any concerns.
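To make the edge cases of that check concrete, here is a minimal standalone sketch in plain Ruby. `Pipeline`, `Schedule` and `skip_scheduled_run?` are hypothetical stand-ins rather than the real models, and `head_sha` is assumed to have been resolved from the schedule's ref by the caller. The main point is that a schedule which has never run should not be skipped.

```ruby
# Hypothetical stand-ins for the real models; only the fields the check needs.
Pipeline = Struct.new(:sha)
Schedule = Struct.new(:ref, :last_pipeline)

# Returns true when the schedule's last pipeline already ran on head_sha.
# A schedule with no previous pipeline, or an unresolvable ref, is never skipped.
def skip_scheduled_run?(schedule, head_sha)
  last = schedule.last_pipeline
  return false if last.nil? || head_sha.nil?

  last.sha == head_sha
end
```

With this shape, the worker would call `next if skip_scheduled_run?(schedule, head_sha)` before creating the pipeline.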

/cc @bikebilly

Thanks @SharuruZ for the proposal, it looks really interesting.

One main concern is that we normally try to avoid adding configuration options when possible, but in this case I'm not sure that having this as the only way to act is correct. There are probably scenarios where users want to run the pipeline even if nothing has changed, and also in this case looking at the SHA is a partial solution since we can have changes in secret variables or external dependencies or registry images that are used during jobs.

There's probably a tricky way to avoid running a full pipeline every time: add a job at the beginning that checks the latest SHA against one stored somewhere, and stops the whole pipeline if nothing has changed (but I understand a built-in option would suit your needs much better).
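A rough sketch of what such a first job could do, assuming the last SHA is kept in a plain file that persists between pipelines (for example through a cache or a shared volume, which is an assumption on my side). The helper names are made up for illustration.

```ruby
# Reads the SHA recorded by the last completed scheduled run, or nil if none.
def last_recorded_sha(store_path)
  return nil unless File.exist?(store_path)

  sha = File.read(store_path).strip
  sha.empty? ? nil : sha
end

# True when the current commit differs from the recorded one (or none is recorded).
def should_run?(store_path, current_sha)
  last_recorded_sha(store_path) != current_sha
end

# Records the SHA once the pipeline has done its real work.
def record_run(store_path, current_sha)
  File.write(store_path, "#{current_sha}\n")
end
```

In a job, `current_sha` would come from `$CI_COMMIT_SHA`, and `record_run` would only be called after the rest of the pipeline has finished. How the first job then stops the remaining jobs is left out here on purpose.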

In this case this could also save wasted resources on .com, so it could be worth considering maybe a project-level option in Settings > Pipelines (only for schedules, regular runs will do it anyway).

@dosuken123 are you able to extract some statistics on .com about pipeline schedules that run on the same SHA? It could be a good starting point for a discussion about a new option.

Prospect interested in this feature https://gitlab.my.salesforce.com/0016100001VFWZb?srPos=0&srKp=001

Internal ZD: https://gitlab.zendesk.com/agent/tickets/107971

We have some nightly scheduled pipelines that take a long time to run and produce fairly heavy snapshots on our artifact server, but the branch the pipeline runs on sometimes does not change very often. Ultimately, these nightly scheduled pipelines that operate on the same commit are a waste of our internally hosted CI infrastructure's resources.

added customer label

This might be resolved by https://gitlab.com/gitlab-org/gitlab-ce/issues/32741 effectively.

I would like something similar to this. Instead of only running a scheduled job if the repository has changed, I would like to only run a scheduled job if that pipeline hasn't run by any method in the past 30 days. Basically, I want to only run a scheduled job on stale branches.

There are two use cases for this:

The first is to do cleanup/teardown/alerts of stale branches/environments.
The second is to rebuild golden images based on upstream dependencies. Just because a repository hasn't changed doesn't mean that the base OS, packages, security updates, etc. haven't changed.

The only way to do this currently would be to have a daily scheduled job spin up, check the latest pipeline status via the API, and then abort if the last run was recent. I'm not sure the pipeline API keeps enough history to see when the last successful pipeline run was, and it only has details on the pipeline as a whole, not on individual jobs.
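The staleness decision itself could be sketched like this, assuming the job has already fetched a list of pipelines for the branch and that each entry carries `status` and an ISO 8601 `created_at`, as the pipeline listing does. The function names and the 30-day default are only illustrative.

```ruby
require "time"

# Returns the creation time of the most recent pipeline, optionally limited
# to a single status such as "success". Returns nil when nothing matches.
def last_run_time(pipelines, status: nil)
  candidates = status ? pipelines.select { |p| p["status"] == status } : pipelines
  candidates.map { |p| Time.iso8601(p["created_at"]) }.max
end

# True when no matching pipeline was created within the last max_age_days.
def stale?(pipelines, now:, max_age_days: 30, status: nil)
  last = last_run_time(pipelines, status: status)
  return true if last.nil?

  (now - last) > max_age_days * 24 * 60 * 60
end
```

The daily job would then continue only when `stale?(pipelines, now: Time.now)` is true. This still depends on the API returning enough history to cover the window, which is the open question above.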

added auto updated potential proposal labels

added [deprecated] Accepting merge requests label

added Category:Continuous Integration label

added ~"group::pipeline execution" label

removed [deprecated] Accepting merge requests label

changed milestone to %Backlog

Mentioned by a Commercial customer as being important: https://gitlab.my.salesforce.com/0016100001F2Gsw (Internal)

mentioned in issue #196744

Editing as I posted the wrong feedback earlier

Customer: https://gitlab.my.salesforce.com/0016100001ecpOy
Why interested: They want to be able to only start a pipeline if there has been a change since the last pipeline has run. That way, they can fully automate their build process
How important to them: Very important
Questions: When will this be available?
PM to mention: @bikebilly (I think)

added ~"section::ops" label

Hello @thaoyeager, a Premium customer (internal only) is strongly interested in this feature. They have complex, long-running CI jobs and want to optimize their use of the runners. (I can connect them with you if you need more information about their usage.)

A GitLab Ultimate self-managed customer with 2500 seats is asking for this feature.

Link to request: https://gitlab.my.salesforce.com/00161000004yxj9
Why interested: Want ability to skip jobs where possible
Current solution for this problem: NA
Impact to the customer of not having this: Longer times for retries / etc
PM to mention: @samdbeckham (I think)

Thanks for the mention @cupini I'm an EM for ~"group::continuous integration" so it's @jreporter you're looking for here.

Yep that's me thanks @samdbeckham

Premium customer interested in this feature

A couple of valid reasons not to do this have been presented, but couldn't people in those situations simply leave the checkbox unchecked? Implementing this would help many people tremendously, and I don't see how adding it as an option would hurt anybody.

On my side, this feature would be useful not really from the resource conservation perspective (we can have code inside the pipeline jobs which checks if there's something to do, and bail out quickly), but from API pureness point of view.

We have scheduled pipelines, plus some internal tooling that uses the GitLab API to visualise the most recent pipelines on the most recent commits, to fit our workflows. The scheduled pipelines run every N minutes to check if there's something to do, but sometimes no new commits are merged for hours, so the scheduled pipeline runs over and over on the same commit. (A restrictive cron schedule is not a very good solution here, because it's difficult to know upfront during which hours the PRs will be merged.)

This simply adds noise and complexity to the tooling to filter out the pipelines that were effectively no-op.
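For illustration, the kind of filtering this forces on the tooling looks roughly like the sketch below. It assumes each entry has `id`, `ref`, `sha` and `source` fields as returned by the pipeline listing; the function name is made up.

```ruby
# Keeps every non-scheduled pipeline, but only the first scheduled pipeline
# (lowest id) for each ref/sha pair. Later scheduled runs on the same commit
# are treated as no-ops and dropped.
def drop_repeated_scheduled_runs(pipelines)
  seen = {}
  pipelines.sort_by { |p| p["id"] }.select do |p|
    next true unless p["source"] == "schedule"

    key = [p["ref"], p["sha"]]
    first = !seen.key?(key)
    seen[key] = true
    first
  end
end
```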

Having a simple checkbox to "do not run the schedule if no new commits" would be very useful to avoid the noise.

~~I haven't yet validated it, but thinking a bit more about it, the repeated scheduled pipelines could perhaps be skipped programmatically, although in a bit convoluted way:~~

~~A scheduled pipeline, once successful (in .post phase?), creates & force-pushes a git tag named LATEST_SCHEDULE_RUN pointing to the current sha ($CI_COMMIT_SHA).~~
~~All jobs in that scheduled pipeline have a rule which makes them show up only if there are any changes compared to that tag:~~

  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
      changes:
        paths:
          - '**/*'
        compare_to: 'refs/tags/LATEST_SCHEDULE_RUN'

~~(this requires GitLab 15.3 with a flag, or 15.5 without flag) https://docs.gitlab.com/ee/ci/yaml/#ruleschangescompare_to~~

Edit: As per these docs, in scheduled pipelines, rules:changes always evaluates to true, so it wouldn't work.

Also needing this feature. Our job runs quite long (3 hours, it's a game project), so it's rather annoying that it runs when it's not needed, blocking other jobs from completing since our concurrency limit is 1.

+1 to this: we have a long nightly pipeline (~12 hours), so avoiding rerunning it on same commit is very desirable. FYI: Jenkins does this automatically.

It would be beneficial if we can have this. For us (@AOMediaCodec), this is a very important and useful feature for nightly testing and for reliable migration from Jenkins to GitLab.

PS: We are GitLab Ultimate customers.

Another GitLab Premium customer would benefit from this; more details in Zendesk #454318 (internal only).

I'd also like to have this feature, as we also have a very long pipeline that costs a lot of resources.

We have tried to use the last_pipeline field from the pipeline schedules API (Get a single pipeline schedule):

{
    "last_pipeline": {
        "id": 332,
        "sha": "0e788619d0b5ec17388dffb973ecd505946156db",
        "ref": "main",
        "status": "pending"
    }
}

but when we query it from within the scheduled pipeline itself, it's not really reliable, since the value has already been replaced by the time the pipeline runs. We also still need a .pre job to check this and skip everything else, instead of being able to avoid creating the pipeline at all.
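One way around the replaced `last_pipeline` value might be to list the pipelines for the ref instead and ignore the pipeline that is doing the asking. The sketch below assumes the entries carry `id`, `sha` and `source`, and that the current pipeline's id is available (for example from `$CI_PIPELINE_ID`). It is only a sketch of the comparison, not something we have validated.

```ruby
# True when some other scheduled pipeline in the list already ran on sha.
# The pipeline identified by current_pipeline_id is excluded from the search.
def already_ran_on_sha?(pipelines, current_pipeline_id:, sha:)
  pipelines.any? do |p|
    p["id"] != current_pipeline_id &&
      p["source"] == "schedule" &&
      p["sha"] == sha
  end
end
```

This still needs a job to run the check, so it does not prevent the pipeline from being created.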

We currently have a workaround: we track the current COMMIT_SHA in a pipeline schedule variable, which we update via the GitLab API. The jobs then have rules like this:

  rules:
    - if: $PREVIOUS_COMMIT_SHA != $CI_COMMIT_SHA
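For reference, the variable update can be sketched as below. It builds the request for the "edit a pipeline schedule variable" endpoint without sending it. The base URL, project id, schedule id and token are placeholders, and `build_sha_update_request` is a made-up helper name.

```ruby
require "net/http"
require "uri"

# Builds (but does not send) a PUT request that sets the schedule variable
# named by key to the given SHA. All identifiers passed in are placeholders.
def build_sha_update_request(api_url:, project_id:, schedule_id:, sha:, token:,
                             key: "PREVIOUS_COMMIT_SHA")
  uri = URI("#{api_url}/projects/#{project_id}/pipeline_schedules/#{schedule_id}/variables/#{key}")
  request = Net::HTTP::Put.new(uri)
  request["PRIVATE-TOKEN"] = token
  request.set_form_data("value" => sha)
  [uri, request]
end
```

Sending it would be something like `Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |http| http.request(request) }`, typically from a final job with `sha` taken from `$CI_COMMIT_SHA`.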

But since GitLab 14.8, a schedule can only be edited by its owner.

This is quite tricky for us: the bot account whose personal access token we use to call the GitLab API has to take ownership of the schedule before it can update the tracked COMMIT_SHA in the schedule's variables. The original owner then has to take ownership back before they can edit the schedule's variables, and they need to be a repo maintainer to be able to take ownership at all.

It would be really helpful if there's a built-in option for this.

There's also a downside to this approach: when a new scheduled pipeline starts before the previous one has finished, $PREVIOUS_COMMIT_SHA might not yet have been updated in the schedule's variables, so the new pipeline still runs on the same commit SHA even though a run on that commit has already started.

I have a GitLab Premium self-managed customer very interested in this issue:

Link to request: Salesforce internal only
Priority: ~"customer priority::6"
Why interested: Ensures that they only generate builds when there are changes and removes confusion of having builds that contain no changes.
PM to mention: @jreporter for visibility and prioritization
