
Change life cycle of `deployments` records in order to make it a stateful object

Shinya Maeda requested to merge stateful_deployments into master

What does this MR do?

This MR changes the life cycle of deployments records so that each deployment becomes a stateful object.

How are deployments records created today?

Before this MR, ci_pipelines, ci_builds, environments and deployments records are created in the following life cycle.

The creation flow is

  1. User pushed a new commit or manually executed a pipeline on master
  2. GitLab-rails reads gitlab-ci.yml
  3. GitLab-rails creates one ci_pipelines record and multiple ci_builds records
  4. GitLab-rails iterates created ci_builds records one by one, and if a job is supposed to deploy (i.e. environment keyword is specified), it creates a corresponding environments record
  5. GitLab-Runner processes pending jobs
  6. Every time a job (which is supposed to deploy) succeeds, GitLab-rails creates one deployments record. This deployments record is always considered successful.
  7. merge_request_metrics.first_deployed_to_production_at will be updated by deployment.deployed_at only if the code has been deployed to production

The problems of current architecture

  • Deployment models rely on Ci::Build to check the deployment status/result. However, Deployment models should be more independent, because Ci::Build is not the only deployable object: users will also be able to create a deployment object with an external CD service.
  • Currently, we compute a virtual deployment status (Ci::Build#deployment_status) per request. This status should be persisted so that relevant records can be selected efficiently.

How will deployments records be created in this MR?

Basically, we should create a deployments record when a ci_builds record is created. The deployments.status value will then be tightly synchronized with ci_builds.status.

The deployment statuses can be represented as

state :created, value: 0   # A deployment will happen
state :running, value: 1   # A deployment is happening
state :success, value: 2   # A deployment succeeded
state :failed, value: 3    # A deployment failed
state :canceled, value: 4  # A deployment was canceled
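
As a rough illustration of how these statuses could behave as a state machine, here is a minimal, dependency-free Ruby sketch. The status names and integer values come from the list above; the class name `DeploymentSketch`, the event names and the transition table are assumptions made for this example and are not taken from the MR's code.

```ruby
# Hypothetical sketch of the deployment state machine in plain Ruby.
# Only the status names and values are from the MR description.
class DeploymentSketch
  STATUSES = { created: 0, running: 1, success: 2, failed: 3, canceled: 4 }.freeze

  # Assumed transitions: the statuses each event may be fired from,
  # and the status it leads to.
  TRANSITIONS = {
    run:     { from: [:created],           to: :running },
    succeed: { from: [:created, :running], to: :success },
    drop:    { from: [:created, :running], to: :failed },
    cancel:  { from: [:created, :running], to: :canceled }
  }.freeze

  attr_reader :status

  def initialize
    @status = :created # a deployment starts out as "will happen"
  end

  # Integer that would be stored in the deployments.status column.
  def status_value
    STATUSES.fetch(@status)
  end

  # Applies an event; returns true if the transition was allowed.
  def fire(event)
    rule = TRANSITIONS.fetch(event)
    return false unless rule[:from].include?(@status)

    @status = rule[:to]
    true
  end
end
```

In this sketch, success, failed and canceled are terminal: no event lists them under `from`, so `fire` returns false once a deployment has finished.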

The creation flow will be

  1. User pushed a new commit or manually executed a pipeline on master
  2. GitLab-rails reads gitlab-ci.yml
  3. GitLab-rails creates one ci_pipelines record and multiple ci_builds records
  4. GitLab-rails iterates created ci_builds records one by one, and if a job is supposed to deploy (i.e. environment keyword is specified), it creates a corresponding environments record and a deployments record. The initial deployment status is created.
  5. GitLab-Runner processes pending jobs
  6. Every time a job (which is supposed to deploy) is updated, GitLab-rails updates the associated deployments.status value. The status will transit created -> running -> success/failed/canceled.
  7. merge_request_metrics.first_deployed_to_production_at will be updated by deployment.deployed_at only if the code has been deployed to production
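
Step 6 of the flow above can be sketched in plain Ruby as follows. This is only an illustration under assumptions: `SketchBuild`, `SketchDeployment`, `SYNCED_STATUSES` and `sync_deployment_status` are hypothetical names, and the choice to ignore job statuses other than running, success, failed and canceled is an assumption rather than something the MR description states.

```ruby
# Hypothetical stand-ins for the ci_builds and deployments records.
SketchDeployment = Struct.new(:status)
SketchBuild = Struct.new(:status, :deployment)

# Assumed set of job statuses that are mirrored onto the deployment.
SYNCED_STATUSES = %w[running success failed canceled].freeze

# Copies the job status onto the associated deployment, if there is one.
# Returns the new deployment status, or nil when nothing was changed.
def sync_deployment_status(build)
  deployment = build.deployment
  return nil if deployment.nil?                           # job does not deploy
  return nil unless SYNCED_STATUSES.include?(build.status) # e.g. still pending

  deployment.status = build.status
end
```

Calling `sync_deployment_status` after every job status change keeps the persisted deployment status in line with the job, so nothing has to be computed per request.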

What are the goodies of the new architecture?

  • Deployment model will be a more independent object. This allows us to work on https://gitlab.com/gitlab-org/gitlab-ce/issues/47118 easily.
  • Computation for virtual deployment status is no longer necessary. We can just refer to the deployments.status column.
  • In the deployments index page, users will be able to filter deployments rows by status. For instance, they can see only failed deployments.

Migrate data in regular migration.

In this MR, we add five database migrations.

  1. db/migrate/20181015155839_add_finished_at_to_deployments.rb
  2. db/migrate/20181016141739_add_status_to_deployments.rb
  3. db/migrate/20181022135539_add_index_on_status_to_deployments.rb
  4. db/migrate/20181023144439_add_partial_index_for_legacy_successful_deployments.rb
  5. db/post_migrate/20181030135124_fill_empty_finished_at_in_deployments.rb

In particular, the database reviewer should take a closer look at 20181016141739_add_status_to_deployments.rb and 20181030135124_fill_empty_finished_at_in_deployments.rb.

20181016141739_add_status_to_deployments.rb adds a status column to the deployments table and sets it to 2 (i.e. success) by default. We've already discussed the timing in Slack, and we figured out that the migration takes about 1 minute on a production replica.

20181030135124_fill_empty_finished_at_in_deployments.rb fills each empty finished_at column with the row's created_at value. (If a row has no value in the finished_at column, the deployment finished at created_at.) This migration also goes through all rows of the deployments table; however, the timing should be similar, since the approach is similar to the 20181016141739_add_status_to_deployments.rb migration.
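
The backfill rule can be modelled in memory with a few lines of plain Ruby. This is not the migration itself, which runs against the deployments table; `fill_empty_finished_at`, the hash-based rows and the batch size are assumptions chosen only to show the rule "copy created_at into finished_at where finished_at is empty".

```ruby
# Hypothetical in-memory model of the finished_at backfill.
# Rows are plain hashes with :created_at and :finished_at keys.
def fill_empty_finished_at(rows, batch_size: 2)
  updated = 0

  # Walk the rows in small batches, as a migration over a large
  # table typically would, and touch only rows with no finished_at.
  rows.each_slice(batch_size) do |batch|
    batch.each do |row|
      next unless row[:finished_at].nil?

      row[:finished_at] = row[:created_at]
      updated += 1
    end
  end

  updated # number of rows that were filled
end
```

Rows that already have a finished_at value are left untouched, so running the function a second time changes nothing.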

Why so many has_many :deployments, -> { success }, ?

Currently, the Deployment relation is referenced from Project, Environment and Ci::Build. Those objects still assume that all deployments records are successful. In order to avoid breaking the current logic, we should use the -> { success } filter in each association.

Because of that, the generated SQL will look like this.

SELECT "deployments".* FROM "deployments" WHERE "deployments"."project_id" = 15 AND "deployments"."status" = 2; # Find successful deployments under the project
SELECT "deployments".* FROM "deployments" WHERE "deployments"."environment_id" = 31 AND "deployments"."status" = 2; # Find successful deployments under the environment

Those queries are tuned with the following indexes.

Indexes:
    ...
    "index_deployments_on_environment_id_and_status" btree (environment_id, status)
    "index_deployments_on_project_id_and_status" btree (project_id, status)
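
The effect of the -> { success } filter described above can be mimicked in plain Ruby. This is a sketch only: `DeploymentRow`, `SUCCESS_STATUS` and `successful_deployments` are hypothetical names, and the real associations are ActiveRecord scopes that produce the SQL shown earlier rather than in-memory filtering.

```ruby
# Hypothetical in-memory analogue of the success-scoped associations.
SUCCESS_STATUS = 2 # value of the success state

DeploymentRow = Struct.new(:project_id, :environment_id, :status)

# Returns only successful rows, optionally narrowed to one project
# and/or one environment, like the two SELECT statements above.
def successful_deployments(rows, project_id: nil, environment_id: nil)
  rows.select do |row|
    row.status == SUCCESS_STATUS &&
      (project_id.nil? || row.project_id == project_id) &&
      (environment_id.nil? || row.environment_id == environment_id)
  end
end
```

Because every lookup pairs an owner id with a status, the two composite indexes listed above match the shape of these queries.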

What are the relevant issue numbers?

This MR should wait until the EE-port MR has been merged

This MR was caught by ee-compat-check as there is a potential conflict with the EE repo.

Does this MR meet the acceptance criteria?

Edited by Shinya Maeda
