
WIP: Resolve "Run CI/CD pipelines on a schedule"

This MR is only for planning milestones for Idea 1 in #2989 (closed) and clarifying the specification. Each step should be a separate MR.

Step 1: Remove legacy code


  • Check whether legacy code is still lurking (e.g. the whenever gem)
  • Remove the legacy code. Test to make sure the change doesn't break the current architecture and dependencies.

Step 2: Backend


Step 2.1: Database

table  column          type     data
(TBD)  trigger_type    string   external (API) or scheduled (cron)
(TBD)  cron            string   e.g. 30 18 * * *
(TBD)  cron_time_zone  string   e.g. Europe/Istanbul
(TBD)  target_ref      string   e.g. compile-linux-dist-*
(TBD)  condition_type  string   always or if_changed
(TBD)  active          boolean  literally pause/unpause

Step 2.2: Controller

  • app/controllers/projects/triggers_controller.rb

  • Create a new trigger with a cron job (#new, #create, with sidekiq-cron) (TODO: Elaborate on this process)

  • Edit a trigger (#edit, #update)

  • Remove a trigger (#destroy)

  • Invoke a trigger with cronjob immediately (#test_cronjob)

  • Support passing job variables (TODO: confirm whether this needs to be classified)

Step 3: Frontend


Step 3.1: Registration of a new Trigger ("Settings" -> "CI/CD Pipelines" -> "Triggers")

  • Click Add trigger button -> Show a new trigger registration form

  • "Trigger description" (Already existed)

  • "Trigger type" (Radiobutton: External Trigger(API) or Scheduled Trigger(Cron))

  • If "Scheduled Trigger" is chosen, expand the following items:

  • "Schedules" (text field, e.g. 30 18 * * *, with a syntax check). There are also three buttons: "Nightly builds", "Sunday night", and "Last day of a month". Clicking one of them fills in this field automatically.

  • "Time zone" (combobox, for gitlab.com users. Choose a country, e.g. Europe/Istanbul)

  • "Target ref" (text field with wildcard (*) support, e.g. compile-linux-dist-*. If there are no matches, show an error message.)

  • "Conditions" (combobox: "always" or "if there was a new change on the branch". Default: "always")

  • "Variables" (text field, e.g. "variables[RUN_NIGHTLY_BUILD]=true")

  • Activate/deactivate a trigger (only in #edit)
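The wildcard behaviour proposed for "Target ref" could be sketched in plain Ruby as follows. This is only an illustration under assumptions: `matching_refs` and `validate_target_ref!` are hypothetical helper names, the branch list is made up, and `File.fnmatch` is used as a stand-in for whatever matching GitLab would actually apply.

```ruby
# Hypothetical helper: returns the branch names that match a wildcard pattern.
# File.fnmatch treats "*" as "any run of characters" (shell-style globbing).
def matching_refs(pattern, branch_names)
  branch_names.select { |name| File.fnmatch(pattern, name) }
end

# Hypothetical validation mirroring the "show an error message" requirement:
# raises when the pattern matches no branch, otherwise returns the matches.
def validate_target_ref!(pattern, branch_names)
  matches = matching_refs(pattern, branch_names)
  raise ArgumentError, "No branch matches '#{pattern}'" if matches.empty?

  matches
end

# Made-up branch list, for illustration only.
branches = %w[master compile-linux-dist-x86 compile-linux-dist-arm docs]
selected = validate_target_ref!("compile-linux-dist-*", branches)
```

A limit such as "fewer than 5 targets" (see the concerns below) could be enforced in the same helper by checking `matches.size`.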

Example (image: Untitled__2__copy). Reference: Buddy

Step 3.2: List triggers

In a row, there are

  • Status

  • Active/Inactive

  • Cron format

  • Last used (Already existed)

  • Link to the last invoked pipeline

  • Button

  • Delete a trigger (Already existed.)

  • Edit a trigger (Already existed. Edit parameters except "Trigger type")

Example (image: Untitled__2__copy_2). Reference: TravisCI

Note / Concerns


  • Limitations for gitlab.com, e.g. one project has only one scheduled trigger; users cannot set an interval of less than one day; the target ref should match fewer than 5 targets.
  • Performance. The number of scheduler processes and builds will increase significantly.
  • No real-time status updates like Buddy in the first iteration.
  • This layout may need to be reworked by a UI/UX developer.


Closed by avatar (Feb 26, 2025 8:48am UTC)

Activity

  • @dosuken123

    ci_trigger_requests is to be removed :)

  • Grzegorz Bizon added ~19173 ~164274 labels


  • @dosuken123 Let me know in case of questions! Thanks for picking that! :thumbsup: :hearts:

  • Author Maintainer

    Some specs are still under investigation or discussion on #2989 (closed).

  • mentioned in issue pages/nikola#2

  • Author Maintainer

    @ayufan @grzesiek May I ask why ci_trigger_requests will be removed? Or is there any related MR? (Couldn't find any..)

  • Author Maintainer

    Which is better for handling sidekiq-worker and scheduled-triggers?

    Idea 1: One sidekiq-cron worker manages one scheduled trigger

    Every time a scheduled trigger is registered, create a sidekiq-cron worker (Sidekiq::Cron::Job.create) at the same time.

    fig.1 (image: how_to_use_sidekiq_copy)

    Idea 2: One sidekiq-cron worker manages all scheduled triggers

    Create only one sidekiq-cron worker, which runs every 5 minutes. Each time perform is executed, compare the current time to each schedule. If a schedule matches (or is near), process the scheduled trigger.

    fig.2 (image: how_to_use_sidekiq_copy_2)

  • Author Maintainer

    New background architecture (Draft): https://cacoo.com/diagrams/YkzHGiMGObhQlgXG-962E2.png

  • @dosuken123 I think that the plan is to simplify implementation of pipeline triggering, which might involve removing classes / moving code around, but since this is not done, and we still have Ci::TriggerRequest we can use it. When time comes for the refactoring we will need to take scheduled pipelines into account.

  • We should not touch nor extend ci_trigger_request.

    This is how I see it:

    1. We allow defining multiple triggers per project.
    2. The ref is not a wildcard; it is an exact match.
    3. We support only always.
    4. We do not trigger a new pipeline if the previous pipeline from the trigger is still running.
    5. We do not expose active; it can be added later.
    6. We store next_run_at in the DB; it will be calculated from the cron expression and stored by the worker that executes the pipeline.
    7. The next_run_at needs to be not less than 1h from now.
    8. Ignore cron_time_zone for now and assume that everything is in the time zone of the user that created the trigger.
    DB structure (mostly aligned with your proposal):

    ci_triggers:
      trigger_type: external / scheduled # we add a migration that adds a default column with `external`
      ref: `ref/master` # not needed for external; if specified for external, it is verified against the external trigger; it is mandatory for scheduled
      cron: "8 * * * *"
      next_run_at: "date/time" # column with index

    Worker

    We add a new cron job worker, that will look like this:

    class StuckCiJobsWorker
      include Sidekiq::Worker
      include CronjobQueue

      def perform
        return unless try_obtain_lease

        # Pick up every scheduled trigger whose next run time has passed.
        Ci::Trigger.scheduled.where("next_run_at < ?", Time.now).find_each do |trigger|
          begin
            Ci::CreateTriggerRequestService.new.execute(trigger.project, trigger, trigger.ref)
          rescue => e
            Rails.logger.error "#{trigger.id}: Failed to trigger job: #{e.message}"
          ensure
            # Always reschedule, even when creating the pipeline failed.
            trigger.schedule_next_run!
          end
        end
      end
    end
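The begin/rescue/ensure structure of the worker above can be tried out without Rails or Sidekiq. The following is a minimal in-memory sketch: `FakeTrigger`, `run_due_triggers` and the one-hour reschedule are hypothetical stand-ins, not GitLab code. It shows that a failing trigger is logged and still rescheduled, and that it does not prevent the other triggers from running.

```ruby
# Hypothetical in-memory stand-in for a scheduled trigger row.
# `broken` simulates a trigger whose pipeline creation raises.
FakeTrigger = Struct.new(:id, :next_run_at, :broken) do
  def schedule_next_run!(now)
    # Placeholder: real code would derive this from the cron expression.
    self.next_run_at = now + 3600
  end
end

# Processes every trigger that is due; returns the ids that started.
def run_due_triggers(triggers, now, log: [])
  started = []
  triggers.select { |t| t.next_run_at < now }.each do |trigger|
    begin
      raise "boom" if trigger.broken # simulated failure

      started << trigger.id
    rescue => e
      log << "#{trigger.id}: Failed to trigger job: #{e.message}"
    ensure
      # Runs on success and on failure alike.
      trigger.schedule_next_run!(now)
    end
  end
  started
end

now = Time.utc(2017, 3, 1, 12, 0, 0)
ok = FakeTrigger.new(1, now - 60, false)
failing = FakeTrigger.new(2, now - 60, true)
not_due = FakeTrigger.new(3, now + 600, false)
log = []
started = run_due_triggers([ok, failing, not_due], now, log: log)
```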

    We update Ci::CreateTriggerRequestService with:

    def execute(project, trigger, ref, variables = nil)
      # we need to find pipeline for that ref and that trigger, ignore if it's running

      trigger_request = trigger.trigger_requests.create(variables: variables)

      pipeline = Ci::CreatePipelineService.new(project, trigger.owner, ref: ref)
        .execute(ignore_skip_ci: true, trigger_request: trigger_request)

      trigger_request if pipeline.persisted?
    end
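The comment in the service above ("find pipeline for that ref and that trigger, ignore if it's running") is left unimplemented. One possible shape of that guard is sketched below in plain Ruby. `FakePipeline`, `IN_PROGRESS_STATUSES`, and both method names are hypothetical, and the set of statuses that count as "running" is an assumption.

```ruby
# Hypothetical in-memory stand-in for a pipeline record.
FakePipeline = Struct.new(:trigger_id, :ref, :status)

# Assumption: these statuses mean the previous pipeline has not finished.
IN_PROGRESS_STATUSES = %w[created pending running].freeze

# True when an unfinished pipeline exists for the same trigger and ref.
def pipeline_in_progress?(pipelines, trigger_id, ref)
  pipelines.any? do |pipeline|
    pipeline.trigger_id == trigger_id &&
      pipeline.ref == ref &&
      IN_PROGRESS_STATUSES.include?(pipeline.status)
  end
end

# Creates a new pipeline unless one is still in progress; returns the new
# pipeline, or nil when the run was skipped.
def create_pipeline_unless_running(pipelines, trigger_id, ref)
  return nil if pipeline_in_progress?(pipelines, trigger_id, ref)

  pipeline = FakePipeline.new(trigger_id, ref, "pending")
  pipelines << pipeline
  pipeline
end

pipelines = [FakePipeline.new(1, "master", "running"),
             FakePipeline.new(2, "master", "success")]
skipped = create_pipeline_unless_running(pipelines, 1, "master")
created = create_pipeline_unless_running(pipelines, 2, "master")
```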

    How I see the steps

    1. Add DB changes, add new cronjob worker, prepare all backend, on trigger list show only external triggers for now,
    2. Add API for scheduled triggers,
    3. Add new UI for triggers: 1. remove the token, 2. show external or scheduled, 3. show Last Run and Next Run, 4. do not show cron specification. Cron job definition and the token will be shown when you go to Edit. We can consider giving a button to copy token if needed. This is for @dimitrieh to figure out.
    Edited by Kamil Trzciński
  • For 9.1 we should be able to do 1.; maybe, but unlikely, 2.; 3. seems to be impossible.

    Edited by Kamil Trzciński
  • Author Maintainer

    @ayufan Thank you for the direction! I'll work on the 1st step. By the way, should we also support variables?

  • Not in 1., maybe later.

  • mentioned in issue #2989 (closed)

  • Shinya Maeda mentioned in merge request !10133 (merged)


  • Author Maintainer

    @ayufan I have a question.

    1. Ignore cron_time_zone for now and assume that everything is in time zone of user that created a trigger,

    I could understand this if the users table had a time_zone column or similar, but it seems that time_zone is not yet persisted in the GitLab database. Looking up time_zone in config/gitlab.yml doesn't work on gitlab.com. So I still think we need to persist the data in the database; otherwise a new worker can't calculate next_run_at repeatedly.

  • Author Maintainer

    @ayufan About where we persist this data: I was thinking that extending ci_triggers for STI would be a good idea, but scheduled trigger data is not that similar to ci_triggers, and the data will keep growing in the future (e.g. active, variables, condition, etc.). So I recommend creating another table such as ci_scheduled_triggers. What do you think?

  • I both like and dislike this, because we somehow need to indicate that a pipeline was created from a scheduled trigger.

    Aligning ci_triggers would also later allow us to extend external triggers with active, condition and variables, which may also make sense.

  • Author Maintainer

    About external triggers.

    curl --request POST \
         --form token=TOKEN \
         --form ref=master \
         --form "variables[UPLOAD_TO_S3]=true" \
         https://gitlab.example.com/api/v4/projects/9/trigger/pipeline

    (from Example)
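For comparison, the same external trigger call could be built from Ruby's standard library. This is a sketch only: `build_trigger_request` is a hypothetical helper, the host, project id and token are the placeholder values from the curl example, and the request is built but not sent.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: builds (but does not send) the POST request that the
# curl command above describes. Returns the URI, the request and the form.
def build_trigger_request(base_url, project_id, token, ref, variables = {})
  uri = URI.join(base_url, "/api/v4/projects/#{project_id}/trigger/pipeline")

  # Same fields as the --form options: token, ref and variables[KEY]=value.
  form = [["token", token], ["ref", ref]]
  variables.each { |key, value| form << ["variables[#{key}]", value.to_s] }

  request = Net::HTTP::Post.new(uri)
  request.set_form(form, "multipart/form-data") # curl --form is multipart
  [uri, request, form]
end

uri, request, form = build_trigger_request(
  "https://gitlab.example.com", 9, "TOKEN", "master", { "UPLOAD_TO_S3" => true }
)

# Sending is left commented out, since gitlab.example.com is a placeholder:
# Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
```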

    active, condition and variables can be handled by users: active = start/stop calling the API, condition = combine with the pipeline API, variables = POST parameter "variables[blah]". I can't imagine that storing this data would be useful. Rather, keeping the parameters in one place, the API call, seems more meaningful.

    Scheduled triggers, on the other hand, are handled by sidekiq-cron, so the user needs to provide that information to sidekiq-cron.


    Currently, I'm implementing it with the following structures.

    ci_scheduled_triggers (column)  ci_triggers (column)  type
    project_id                      project_id            integer
    deleted_at                      deleted_at            datetime
    created_at                      created_at            datetime
    updated_at                      updated_at            datetime
    owner_id                        owner_id              integer
    description                     description           string
    (No need)                       token                 string
    cron                            (No need)             string
    cron_time_zone                  (No need)             string
    next_run_at                     (No need)             datetime
    last_run_at                     (No need)             datetime
    ref                             (No need)             string

    I tried hard to merge this into one table, but technically the two kinds of triggers are quite different, so I'd suggest having separate tables.

    @godfat Sorry for bothering you :sweat_smile: Could you also give me some advice? Is it better to use STI or separate tables?

  • @dosuken123 Feel free to ping me! :) Sorry I am not following this closely, but reading through above few comments, I think they're quite different so we should use a separate table. Also, I feel that they should not be called scheduled triggers, because they could be recurring, and in that case they don't look like triggers.

    On the other hand, from my past experience, polymorphic association is definitely a bad thing, and STI with more than one non-shared columns is often introducing problems in the future. While it's attracting (well, because we would have less repetition) but just like inheritance in OO, they're too powerful that we need to be very sure that we want to use it, otherwise it would not justify for the cost. (larger table is also harder to scale)

  • Author Maintainer

    @godfat Thank you for the advice! You seem to know a lot about database structure, so I wanted to ask you about this. While I was considering this structure, I read an article saying that it's bad to use STI if I add multiple new non-shared columns.

    Reference: https://devblast.com/b/single-table-inheritance-with-rails-4-part-1

    STI should be used if your submodels will share the same attributes but need different behavior. If you plan to add 10 columns only used by one submodel, using different tables might be a better solution

    While it's attracting (well, because we would have less repetition) but just like inheritance in OO

    This is off topic, but I have only just learned about object-oriented database systems. :sweat_smile: Now I'm reading this and this. I worked with C#.NET for a long while but never used one; it's kind of interesting to store objects directly.

    Also, I feel that they should not be called scheduled triggers, because they could be recurring, and in that case they don't look like triggers.

    Yes, I feel this too. This feature is named scheduled "trigger", but it's far from the original (API-based) "trigger" feature.

  • @dosuken123 Since I am now more leaning toward functional programming, and I know much more about RDBMS now, I no longer have any interests in OODBMS :P I don't feel that's the way to go.

    I can see where the name triggers comes from, and I also understand that they do share some common concepts. However, I am still a bit worried that if we use the same name, we could get confused in the future when one evolves independently of the other.

    Actually I also feel that the name trigger is a bit confusing... What about PipelineSchedule? It would be clear that it's a schedule for generating pipelines.

  • Author Maintainer

    @godfat In addition, in some respects it also makes sense to call it "scheduled trigger". As I summarized here, a scheduling feature basically starts from an API implementation (crontab + API). This is slightly awkward for operators who are not comfortable with crontab -e, but at least it makes scheduling possible. Later, some CI systems evolved to have a built-in scheduling feature, which lets users configure it through a rich GUI without touching a terminal.

    GitLab now has an API named "trigger"; if it evolves, it should inherit the name "trigger", I think. But the problem is that the implementations will be quite different. So releasing the feature as "scheduled trigger" while implementing it as PipelineSchedule may make sense, or may cause more chaos... :see_no_evil:

  • @dosuken123 That's really an awesome summary. I could see that "scheduled triggers" would be a natural evolution, so I am not really against it. However I think I would still prefer to have "pipeline schedules" as the name, together with "pipeline triggers". So "pipeline schedules" are schedules which could possibly create a pipeline on schedule, and "pipeline triggers" are triggers which we could pull anytime to create a pipeline. I don't think this would be confusing. (and of course, the feature and implementation should have the same name)

    Edited by Lin Jen-Shin
  • @dosuken123

    Maybe, but I also see that we will have for external triggers. We would use ref to limit what you can execute.

    - ref
    - last_run_at

    But also, having two separate objects makes it a little harder to annotate ci_pipelines, as we would have to have both trigger_id and scheduled_id. So pretty much we would only have something extra like cron and cron_time_zone.

    Edited by Kamil Trzciński
  • Author Maintainer

    @ayufan OK. Then let me create another MR using STI, separate from MR !10133 (merged), which uses separate tables.

  • Author Maintainer

    I'm closing this because it's outdated.

  • closed

  • Shinya Maeda mentioned in merge request !10510 (closed)

