Skip to content
Snippets Groups Projects

Create deployments for ready-to-run jobs only

What does this MR do and why?

This change is part of the epic for Reduce main database load by optimizing `deploy... (&10169)

This ensures that deployment records are only created for deploy jobs that are ready to run. This means that deploy jobs that are still at the created state will not have deployment records upfront, helping to reduce the load on the main database.

For further details, please check the Expected behavior section related issue, and the illustration of the expected behavior in the Screenshots below.

Screenshots or screen recordings

GitLab CI configuration example

Given the following GitLab CI configuration with jobs in different stages:

  • [stage-0] test job
  • 4 deploy jobs
    • [stage-1] review runs automatically after test
    • [stage-2] staging is set as when: manual and allow_failure: false, meaning it must succeed before jobs in the next stages can run
    • [stage-3] canary is set as when: manual and by default allow_failure: true
    • [stage-4] production-a is set as when: manual (and by default allow_failure: true)
    • [stage-4] production-b is set as when: manual and needs: [canary] (and by default allow_failure: true)
=== EXPAND THIS SECTION FOR THE GITLAB CI CONFIG ===
stages: [stage-0, stage-1, stage-2, stage-3, stage-4]

test:
  stage: stage-0
  script: sleep 120

review:
  stage: stage-1
  environment:
    name: review
    action: start
  script: sleep 60

staging:
  stage: stage-2
  environment:
    name: staging
    action: start
  script: sleep 60
  when: manual
  allow_failure: false

canary:
  stage: stage-3
  environment:
    name: canary
    action: start
  script: exit 0
  when: manual

production-a:
  stage: stage-4
  environment:
    name: production-a
    action: start
  script: exit 0
  when: manual

production-b:
  stage: stage-4
  environment:
    name: production-b
    action: start
  script: exit 0
  when: manual
  needs: [canary]

Current behavior

This is the behavior before the change in this MR is applied / the Feature Flag is disabled

=== EXPAND THIS SECTION FOR THE EXAMPLE TABLE ===
Pipeline Deployments created Screenshot
test job is running Deployment records of all deploy jobs are already created FF_disabled
review job is running Deployment record for review deploy job is already created see above
staging job is played manually Deployment record for staging deploy job is already created see above
production-a job is played manually Deployment record for production-a deploy job is already created see above
production-b job is played manually Deployment record for production-a deploy job is already created see above

Expected behavior

This is the expected behavior after the change / the Feature Flag is enabled

=== EXPAND THIS SECTION FOR THE EXAMPLE TABLE ===
Pipeline Deployments created Screenshot
test job is running No deployment records created yet FF_enabled_1_test_running
test job done; review deploy job is running Deployment record created for review job FF_enabled_2_test_done_and_review_running
review job done; staging deploy job is played Deployment record created for staging job

Since staging is set as allow_failure: false, the subsequent deployment jobs are not yet "ready" because they need staging to succeed.
FF_enabled_3_review_done_and_staging_running
staging job done Deployment records for canary, production-a, and production-b are created

At this point, canary is ready to run since staging has succeeded.
Since canary is set by default as allow_failure: true, the subsequent production-a and production-b jobs are also considered ready to run.
FF_enabled_4_staging_done
canary job is running Deployment for canary is already created FF_enabled_6_canary_running
production-a and production-b jobs are played and about to be picked up Deployment records for production-a and production-b are already created FF_enabled_7_canary_done_and_production_pending

Note: since production-b is set as needs: [canary], its corresponding deployment should ideally not be created until canary succeeds. This will be addressed in a follow-up issue.

How to set up and validate locally

  1. Create a project or use an existing project for testing

  2. Enable the create_deployment_only_for_processable_jobs Feature Flag in the rails console

    Feature.enable(:create_deployment_only_for_processable_jobs)
  3. Use the example above for the configuration in your .gitlab-ci.yml

  4. Run the pipeline / observe the running pipeline and verify that it follows the Expected behavior in the example above

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #391291 (closed)

Edited by Pam Artiaga

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Contributor

    Reviewer roulette

    Changes that require review have been detected!

    Please refer to the table below for assigning reviewers and maintainers suggested by Danger in the specified category:

    Category Reviewer Maintainer
    backend ahuntsman current availability (UTC-5, 16 hours behind @partiaga) sashi_kumar current availability (UTC+0, 11 hours behind @partiaga)
    ~"Verify" Reviewer review is optional for ~"Verify" stanhu current availability (UTC-7, 18 hours behind @partiaga)

    Please check reviewer's status!

    • available Reviewer is available!
    • unavailable Reviewer is unavailable!

    Feel free to override these selections if you think someone else would be better-suited or use the GitLab Review Workload Dashboard to find other available reviewers.

    To read more on how to use the reviewer roulette, please take a look at the Engineering workflow and code review guidelines. Please consider assigning a reviewer or maintainer who is a domain expert in the area of the merge request.

    Once you've decided who will review this merge request, assign them as a reviewer! Danger does not automatically notify them for you.

    If needed, you can retry the :repeat: danger-review job that generated this comment.

    Generated by :no_entry_sign: Danger

  • Pam Artiaga added 1 commit

    added 1 commit

    Compare with previous version

  • Pam Artiaga
    • Author Maintainer
      Resolved by Pam Artiaga

      Making the proposed change in the AtomicProcessingService does not result in the expected behavior detailed in the issue.

      In the AtomicProcessingService spec, we have this example CI config:

      CI config
      stages: [stage-0, stage-1, stage-2, stage-3]
      
      test:
        stage: stage-0
        script: exit 0
      
      review:
        stage: stage-1
        environment:
          name: review
          action: start
        script: exit 0
      
      staging:
        stage: stage-2
        environment:
          name: staging
          action: start
        script: exit 0
        when: manual
      
      canary:
        stage: stage-3
        environment:
          name: canary
          action: start
        script: exit 0
        when: manual
      
      production:
        stage: stage-3
        environment:
          name: production
          action: start
        script: exit 0
        when: manual
        needs: [canary]

      The expectation detailed in the issue seem to suggest that the manual deploy jobs (in this case, staging, canary, and production) change status from created to manual one after the other.

      But what actually happens is that all manual jobs seem to become created to manual at the same time, after the non-manual jobs in the previous stages are done. The only situation where a manual job does not become created to manual is if it needs another job, in which case it becomes skipped.

      So, basing on what's described in the issue, the expectation for the pipeline above would be:

      expectations based on issue description

      This expectation does not match the actual state transitions happening.

      # 1 - process test job, which becomes 'pending'
      process_pipeline
      succeed_pending
      
      # 2 - process review job, which becomes 'pending'
      #     the review deployment record is created here
      process_pipeline
      succeed_pending
      
      # 3 - process staging job, which becomes 'manual'
      #     the staging deployment record is created here
      process_pipeline
      play_manual_job('staging') # staging job becomes 'pending'
      succeed_pending
      
      # 4 - process canary and production jobs
      #     with canary becoming 'manual' and production becoming 'skipped'
      #     the canary deployment record is created here
      process_pipeline
      play_manual_job('canary')
      succeed_pending
      
      # 5 - process production job, which now becomes 'manual' with canary succeeding
      #     the production deployment is only created here
      process_pipeline

      But, basing on the actual state transitions of the jobs, the expectation should be as follows:

      expectations based on actual state transition

      This is also the logic in the specs for this MR

      # 1 - process test job, which becomes 'pending'
      process_pipeline
      succeed_pending
      
      # 2 - process review job, which becomes 'pending'
      #     the review deployment record is created here
      process_pipeline
      succeed_pending
      
      # 3 - process staging, canary, and production jobs
      #     staging becomes 'manual', so we expect the staging deployment record to be created
      #   - both canary and production are also processed even if they are at a later stage
      #     canary becomes 'manual', so we expect the canary deployment record to be created
      #     production becomes 'skipped' because it *needs* canary
      #        we expect the production deployment record to NOT be created, but it is still created
      process_pipeline
      play_manual_job('staging') # staging job becomes 'pending'
      play_manual_job('canary') # canary job becomes 'pending'
      succeed_pending # staging and canary job becomes 'successful'
      
      # 4 - process production job, which now becomes 'manual' with canary being successful
      #     we expect the production deploy to only be created here when it becomes 'manual'
      #     but it was already created in step 3 even when it was 'skipped'
      process_pipeline

      Now, the specs still fail because we are expecting the production job to only have a deployment when it becomes 'manual', but a deployment was actually already created when it become 'skipped'. (See details in !132835 (comment 1581552248))

      In order for the expectations to succeed, we would need to call the Deployments::CreateForJobService in the Ci::ProcessBuildService, but we can't do this because Ci::ProcessBuildService is called within Gitlab::OptimisticLocking. (See this commit for illustration)

      Edited by Pam Artiaga
  • Contributor

    Allure report

    allure-report-publisher generated test report!

    e2e-test-on-gdk: :white_check_mark: test report for 51f9eeec

    expand test summary
    +-----------------------------------------------------------------------+
    |                            suites summary                             |
    +------------------+--------+--------+---------+-------+-------+--------+
    |                  | passed | failed | skipped | flaky | total | result |
    +------------------+--------+--------+---------+-------+-------+--------+
    | Manage           | 13     | 0      | 1       | 0     | 14    | ✅     |
    | Govern           | 35     | 0      | 0       | 0     | 35    | ✅     |
    | Verify           | 8      | 0      | 0       | 0     | 8     | ✅     |
    | Create           | 45     | 0      | 1       | 0     | 46    | ✅     |
    | Data Stores      | 18     | 0      | 4       | 0     | 22    | ✅     |
    | Plan             | 55     | 0      | 0       | 0     | 55    | ✅     |
    | Monitor          | 4      | 0      | 0       | 0     | 4     | ✅     |
    | Package          | 0      | 0      | 1       | 0     | 1     | ➖     |
    | Framework sanity | 0      | 0      | 1       | 0     | 1     | ➖     |
    +------------------+--------+--------+---------+-------+-------+--------+
    | Total            | 178    | 0      | 8       | 0     | 186   | ✅     |
    +------------------+--------+--------+---------+-------+-------+--------+

    e2e-package-and-test: :heavy_minus_sign: test report for 51f9eeec

    expand test summary
    +------------------------------------------------------------+
    |                       suites summary                       |
    +-------+--------+--------+---------+-------+-------+--------+
    |       | passed | failed | skipped | flaky | total | result |
    +-------+--------+--------+---------+-------+-------+--------+
    +-------+--------+--------+---------+-------+-------+--------+
    | Total | 0      | 0      | 0       | 0     | 0     | ➖     |
    +-------+--------+--------+---------+-------+-------+--------+
  • Pam Artiaga marked this merge request as draft

    marked this merge request as draft

  • Pam Artiaga added 1234 commits

    added 1234 commits

    • 5c9bfd68...dfc3ddbf - 1229 commits from branch master
    • bfad9a1e - Create deployments for ready-to-run jobs only
    • e9472f50 - Add tests for AtomicProcessingService
    • 3bc1e064 - Update puts statements
    • 727e0e26 - Update specs to consider allow_failure setting
    • 8764dc81 - Remove puts, add back tests for disable FF

    Compare with previous version

  • Pam Artiaga added 1 commit

    added 1 commit

    Compare with previous version

  • Pam Artiaga resolved all threads

    resolved all threads

  • Pam Artiaga marked this merge request as ready

    marked this merge request as ready

  • Pam Artiaga changed the description

    changed the description

  • Pam Artiaga changed the description

    changed the description

  • Pam Artiaga changed the description

    changed the description

  • Pam Artiaga changed the description

    changed the description

  • Pam Artiaga
  • Pam Artiaga requested review from @shinya.maeda

    requested review from @shinya.maeda

  • Shinya Maeda
  • Shinya Maeda
  • Pam Artiaga added 3 commits

    added 3 commits

    Compare with previous version

  • Pam Artiaga changed the description

    changed the description

  • Shinya Maeda approved this merge request

    approved this merge request

  • :wave: @shinya.maeda, thanks for approving this merge request.

    This is the first time the merge request has been approved. To ensure full test coverage, a new pipeline will be started shortly.

    For more info, please refer to the following links:

  • Shinya Maeda requested review from @minac and removed review request for @shinya.maeda

    requested review from @minac and removed review request for @shinya.maeda

  • Mehmet Emin INAC removed review request for @minac

    removed review request for @minac

  • Pam Artiaga added 1 commit

    added 1 commit

    • 65113f35 - Account for FF being switched midway through a process

    Compare with previous version

  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Please register or sign in to reply
    Loading