Add option to skip scheduled pipelines after downtime
Release notes
GitLab now provides an option to skip execution of scheduled pipelines that were missed during server downtime. This prevents maintenance tasks from running outside their intended time windows.
Problem to solve
As a GitLab administrator, I want to prevent scheduled pipelines that were missed during server downtime from automatically executing after service restoration.
Currently, when a GitLab server is restored after downtime, any scheduled pipelines that were supposed to run during the downtime period will execute shortly after the server comes back online. This can cause maintenance operations to run at inappropriate times, potentially affecting production environments.
Intended users
User experience goal
The user should be able to configure GitLab to skip missed scheduled pipelines after a server restoration, with the ability to set this preference at the instance level and for specific pipelines.
Proposal
Add a new configuration option that allows administrators to control the behavior of scheduled pipelines after server restoration:
- Add an instance-wide setting in Admin Area > Settings > CI/CD called "Skip missed scheduled pipelines after server restart" (default: disabled).
  - Add a sub-field for the setting to configure a threshold (default: 1 hour).
- Add an override in the project when creating or editing a pipeline schedule under Build > Pipeline schedules, called "Skip missed scheduled pipelines after server restart" (default: disabled).
  - Add a sub-field for the setting to configure a threshold (default: 1 hour).
- Add a new API parameter for scheduled pipeline creation/editing to set this preference per pipeline schedule.
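As a rough illustration of the API bullet, the sketch below builds the request for creating a pipeline schedule with the new preference attached. `description`, `ref`, and `cron` are existing attributes of the pipeline schedules API; `skip_missed_after_restart` and `skip_missed_threshold_seconds` are hypothetical names invented here for the proposed parameter and its threshold, and `gitlab.example.com` is a placeholder. Leaving the proposed fields out is assumed to mean "no per-schedule override".

```python
from urllib.parse import quote, urlencode

GITLAB_URL = "https://gitlab.example.com"  # placeholder instance URL


def build_schedule_request(project_path, description, ref, cron,
                           skip_missed=None, threshold_seconds=None):
    """Return (url, form_body) for a POST that creates a pipeline schedule.

    The two skip_* fields are HYPOTHETICAL names for the option proposed
    in this issue; they are not part of the current API.
    """
    project_id = quote(project_path, safe="")  # URL-encode "group/project"
    url = f"{GITLAB_URL}/api/v4/projects/{project_id}/pipeline_schedules"
    params = {"description": description, "ref": ref, "cron": cron}
    # Assumption: omitting the proposed fields means "no per-schedule override".
    if skip_missed is not None:
        params["skip_missed_after_restart"] = "true" if skip_missed else "false"
    if threshold_seconds is not None:
        params["skip_missed_threshold_seconds"] = str(threshold_seconds)
    return url, urlencode(params)
```

The function only assembles the URL and form body, so it can be inspected without contacting a server; sending it would additionally need an access token header.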
When enabled, GitLab will check the timestamp of when a scheduled pipeline was supposed to run against the current time. If the difference exceeds a configurable threshold, the pipeline will be skipped and logged rather than executed.
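The threshold comparison described above could look roughly like the following. This is a language-neutral sketch written in Python, not GitLab's actual implementation; `should_skip` and `DEFAULT_THRESHOLD` are names made up for illustration, and the one-hour default is taken from the proposal.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_THRESHOLD = timedelta(hours=1)  # default threshold from the proposal


def should_skip(scheduled_at, now, threshold=DEFAULT_THRESHOLD):
    """Return True when the run is overdue by more than `threshold`."""
    return (now - scheduled_at) > threshold


# A job scheduled for 02:00 UTC, checked at two possible restoration times.
scheduled = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
restored_late = datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc)
restored_soon = datetime(2024, 1, 1, 2, 20, tzinfo=timezone.utc)

print(should_skip(scheduled, restored_late))  # overdue by 7.5 hours
print(should_skip(scheduled, restored_soon))  # overdue by 20 minutes
```

With the default threshold, the first call reports that the run should be skipped and the second that it should still execute. A run that is overdue by exactly the threshold is not skipped in this sketch; whether the boundary should be inclusive is left open by the proposal.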
User journey:
- Administrator configures the setting according to their preference
- Server experiences downtime (planned or unplanned)
- During downtime, scheduled pipelines are missed
- When server is restored, GitLab checks the configuration
- If enabled, GitLab skips execution of missed pipelines and logs the skipped events
- If disabled, GitLab executes missed pipelines (current behavior)
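The journey above can be sketched as a single pass over the due schedules after restoration. Everything here is illustrative: `SkipPolicy`, `Schedule`, and `process_due_schedules` are hypothetical names, and the rule that a per-schedule override fully replaces the instance policy is an assumption, since the proposal does not spell out how the two levels combine.

```python
import logging
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

logger = logging.getLogger("scheduled_pipelines")


@dataclass
class SkipPolicy:
    enabled: bool = False                      # proposal default: disabled
    threshold: timedelta = timedelta(hours=1)  # proposal default: 1 hour


@dataclass
class Schedule:
    name: str
    next_run_at: datetime
    override: Optional[SkipPolicy] = None  # per-schedule override, if any


def process_due_schedules(schedules, instance_policy, now):
    """Split the schedules that are due into (to_run, skipped)."""
    to_run, skipped = [], []
    for schedule in schedules:
        if schedule.next_run_at > now:
            continue  # not due yet, nothing was missed
        # Assumption: an override replaces the instance policy entirely.
        policy = schedule.override if schedule.override is not None else instance_policy
        overdue = now - schedule.next_run_at
        if policy.enabled and overdue > policy.threshold:
            logger.info("Skipped missed schedule %r (overdue by %s)", schedule.name, overdue)
            skipped.append(schedule)
        else:
            to_run.append(schedule)  # current behavior: execute the missed run
    return to_run, skipped
```

Passing `now` in as an argument, rather than reading the clock inside the function, keeps the decision easy to exercise in tests that simulate long outages.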
Further details
Use cases:
- Preventing maintenance operations from running outside maintenance windows
- Avoiding resource-intensive operations during business hours
- Ensuring time-sensitive operations (like deployments) don't run at inappropriate times
Benefits:
- Increased control over automation timing
- Reduced risk during disaster recovery scenarios
- Better predictability of system behavior after restoration
Permissions and Security
This feature would respect existing permissions:
- No impact to members with no access (0)
- No impact to Guest (10) members
- No impact to Reporter (20) members
- Impact to Developer (30) members: can see the setting at project level
- Impact to Maintainer (40) members: can configure the setting at project level
- Impact to Owner (50) members: can configure the setting at instance and project level
No security concerns are anticipated as this feature only affects the timing of pipeline execution, not the content or permissions of the pipelines themselves.
Documentation
Documentation updates needed:
- Update CI/CD scheduled pipelines documentation to explain the new setting
- Update server administration documentation to include this option in backup/restore procedures
- Add information about the new setting in the API documentation for scheduled pipelines
Availability & Testing
This feature poses minimal risk to availability as it's an opt-in configuration that affects only the timing of scheduled pipeline execution.
Test coverage needed:
- Unit tests for the new configuration option and time threshold logic
- Integration tests to verify behavior when server time changes significantly
- End-to-end tests simulating server restart scenarios with various configuration settings
Available Tier
Free and above, as scheduled pipelines are a free feature.
Feature Usage Metrics
Track:
- Number of instances with the setting enabled
- Number of projects with the setting enabled
- Number of scheduled pipelines skipped due to this setting
- Number of times administrators change this setting
What does success look like, and how can we measure that?
Success metrics:
- Reduction in support tickets related to unexpected pipeline executions after server restarts
- Positive feedback from administrators managing GitLab instances with critical scheduled pipelines
Acceptance criteria:
- Administrators can enable/disable the feature at instance level
- Project Maintainers and above can override the instance setting for an individual pipeline schedule (Developers can view it, per Permissions and Security)
- When enabled, scheduled pipelines that were missed during downtime are not executed after server restoration if past the threshold
- Skipped pipelines are properly logged for audit purposes
- The feature works consistently across self-managed and GitLab.com environments
What is the type of buyer?
This feature targets IT Operations buyers who are responsible for maintaining GitLab instances and ensuring system reliability.
Is this a cross-stage feature?
Yes, this feature affects both the CI/CD stage (scheduled pipelines functionality) and the Configure stage (instance administration). The relevant PMs from both stages should be consulted.
What is the competitive advantage or differentiation for this feature?
This feature enhances GitLab's disaster recovery capabilities by providing administrators with more control over automation behavior after system restoration. It addresses a specific pain point in enterprise environments where maintenance windows and scheduled operations are strictly regulated.
Links / references
Related to: #28405 (somewhat related but not addressing the same issue)
Code that determines when a scheduled pipeline is due to run: https://gitlab.com/gitlab-org/gitlab/-/blob/df01858216e720acb4491a69a3bc6227a1f6ae1e/app/models/concerns/schedulable.rb#L7