Skip to content

Add action to deployments

What does this MR do?

I think we should not use deployments table for stop actions. Maybe we should introduce a new table

FIX 1: Add action to deployment

In https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/22380, we made Deployment a stateful object, however, we only store deployments record only if it starts an environment, we don't store deployments when it stops environment today.

In this MR, we're going to add action column to deployments table. If the value is start, it means that the deployment record is for starting an environment. On the other hand, if the value is stop, the deployment record is for stopping an environment.

This gives us a lot of flexibility as it gives us the following contexts case by case.

deployments.status deployments.action description
created start The deployment will happen in the future
running start The deployment is happening
success start The deployment succeeded
failed start The deployment failed
canceled start The deployment was canceled (e.g. pipeline was skipped)
created stop It will stop an environment in the future
running stop It is stopping an environment
success stop It successfully stopped an environment
failed stop It failed to stop an environment
canceled stop It was canceled to stop an environment

This idea was originally proposed by @ayufan

FIX 2: Removing has_many :deployments, -> { success } from Project and Environment models

Currently, we implicitly fetch successful deployments, however, it should be explicitly fetched in order to avoid the confusion between the real latest record and the successful lastest record.

This is a very important fix that currently AtomicIID initialize deployment-IID from the latest successful deployment. This should be any deployments.

FIX 3: Optimize indexes for the new queries

Most of the cases we search success and start deployments records, for example, showing deployments index page, environment page, etc, so that WHERE status = 2 AND start = 1 is frequently used in this MR.

I picked up top 5 queries that were affected by this change.

1. environment.deployments.deployed
  SELECT "deployments".* FROM "deployments" WHERE "deployments"."environment_id" = 37 AND "deployments"."status" = 2 AND "deployments"."action" = 1;

1. project.deployments.deployed
  SELECT "deployments".* FROM "deployments" WHERE "deployments"."project_id" = 15 AND "deployments"."status" = 2 AND "deployments"."action" = 1;

1. environment.deployments.start.find_by_sha(sha)
  SELECT  "deployments".* FROM "deployments" WHERE "deployments"."environment_id" = 34 AND "deployments"."action" = 1 AND "deployments"."sha" = '0c2ea7d19b24e2a1eb24d9165c0da0fd1cb25595' LIMIT 1;

1. Environment.where(project_id: [mr.source_project_id, mr.target_project_id]).available.with_deployment(mr.diff_head_sha)
  SELECT "environments".* FROM "environments" WHERE "environments"."project_id" IN (15, 15) AND ("environments"."state" IN ('available')) AND (EXISTS (SELECT 1 FROM "deployments" WHERE "deployments"."action" = 1 AND (deployments.environment_id = environments.id) AND "deployments"."sha" = '0c2ea7d19b24e2a1eb24d9165c0da0fd1cb25595'));

1. @project.deployments.deployed.where("finished_at > ?", @from).count
  SELECT COUNT(*) FROM "deployments" WHERE "deployments"."project_id" = 15 AND "deployments"."status" = 2 AND "deployments"."action" = 1 AND (finished_at > '2018-11-06 13:24:08.770981');

The first two queries were optimized with the following indexes

# Index
    "index_deployments_on_project_id_and_action_and_status_and_id" btree (project_id, action, status, id)
    "index_deployments_on_project_id_and_action_and_status_and_iid" btree (project_id, action, status, iid)
    "index_deployments_on_env_deployed_for_id" btree (environment_id, action, status, id)
    "index_deployments_on_env_deployed_for_iid" btree (environment_id, action, status, iid)
# Query plan
gitlabhq_development_ce=# explain analyze SELECT "deployments".* FROM "deployments" WHERE "deployments"."environment_id" = 37 AND "deployments"."status" = 2 AND "deployments"."action" = 1;
                                                                               QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using index_deployments_on_environment_id_and_action_and_status on deployments  (cost=0.28..8.30 rows=1 width=126) (actual time=0.013..0.013 rows=1 loops=1)
   Index Cond: ((environment_id = 37) AND (action = 1) AND (status = 2))
 Planning time: 0.106 ms
 Execution time: 0.041 ms
(4 rows)

The third query was optimized with the following index.

# Index
"index_deployments_on_environment_id_and_action_and_sha" btree (environment_id, action, sha)
# Query plan
gitlabhq_development_ce=# explain analyze SELECT  "deployments".* FROM "deployments" WHERE "deployments"."environment_id" = 34 AND "deployments"."action" = 1 AND "deployments"."sha" = '0c2ea7d19b24e2a1eb24d9165c0da0fd1cb25595' LIMIT 1;
                                                                                 QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.28..8.30 rows=1 width=126) (actual time=0.015..0.015 rows=1 loops=1)
   ->  Index Scan using index_deployments_on_environment_id_and_action_and_sha on deployments  (cost=0.28..8.30 rows=1 width=126) (actual time=0.015..0.015 rows=1 loops=1)
         Index Cond: ((environment_id = 34) AND (action = 1) AND ((sha)::text = '0c2ea7d19b24e2a1eb24d9165c0da0fd1cb25595'::text))
 Planning time: 0.121 ms
 Execution time: 0.034 ms
(5 rows)

The fourth query was also optimized with the above index.

# Query plan

gitlabhq_development_ce=# explain analyze SELECT "environments".* FROM "environments" WHERE "environments"."project_id" IN (15, 15) AND ("environments"."state" IN ('available')) AND (EXISTS (SELECT 1 FROM "deployments" WHERE "deployments"."action" = 1 AND (deployments.environment_id = environments.id) AND "deployments"."sha" = '0c2ea7d19b24e2a1eb24d9165c0da0fd1cb25595'));
                                                                                  QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=0.28..17.74 rows=1 width=105) (actual time=0.035..0.039 rows=1 loops=1)
   ->  Seq Scan on environments  (cost=0.00..1.42 rows=1 width=105) (actual time=0.008..0.009 rows=7 loops=1)
         Filter: ((project_id = ANY ('{15,15}'::integer[])) AND ((state)::text = 'available'::text))
         Rows Removed by Filter: 30
   ->  Index Only Scan using index_deployments_on_environment_id_and_action_and_sha on deployments  (cost=0.28..8.30 rows=1 width=4) (actual time=0.004..0.004 rows=0 loops=7)
         Index Cond: ((environment_id = environments.id) AND (action = 1) AND (sha = '0c2ea7d19b24e2a1eb24d9165c0da0fd1cb25595'::text))
         Heap Fetches: 1
 Planning time: 0.393 ms
 Execution time: 0.056 ms
(9 rows)

This query iterates on each environment for index_deployments_on_environment_id_and_action_and_sha conditioned deployments. Probably, we should follow up to simplify this query. But furthermore, I feel that we should redesign ci_environment_status as we still have N+1 problem even today.

The fifth query was optimised with the following index.

# Index
index_deployments_on_deployed_for_finished_at" btree (project_id, action, status, finished_at)
# Query plan

gitlabhq_development_ce=# explain analyze SELECT COUNT(*) FROM "deployments" WHERE "deployments"."project_id" = 15 AND "deployments"."status" = 2 AND "deployments"."action" = 1 AND (finished_at > '2018-11-06 13:24:08.770981');
                                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=8.30..8.31 rows=1 width=8) (actual time=0.026..0.026 rows=1 loops=1)
   ->  Index Only Scan using index_deployments_on_deployed_for_finished_at on deployments  (cost=0.28..8.30 rows=1 width=0) (actual time=0.017..0.020 rows=7 loops=1)
         Index Cond: ((project_id = 15) AND (action = 1) AND (status = 2) AND (finished_at > '2018-11-06 13:24:08.770981+09'::timestamp with time zone))
         Heap Fetches: 7
 Planning time: 0.393 ms
 Execution time: 0.045 ms
(6 rows)

Fix ci_environments_status to get the real latest deployment

When ci_environments_status is accessed, it returns the latest deployment through existing refs (refs/environments/${env_name}/deployments/${deployment_iid}). This means we cannot get running/failed/etc deployments, but only success deployments, because the ref is created when the deployment succeeded.

To get the real latest deployment record (and its status), we have to ask database, instead of gitaly.

In order to understand and retain the original behavior with ref search in gitaly, we broke down how it works today.

Here is the process flow when ci_environments_status internal endpoint is accessed.

  1. Get the head pipeline of the merge request
  2. Get job IDs from the pipeline and find environments
  3. Iterate environments
  4. => Find ref from the given SHA (via gitaly)
  5. => parse deployment IID
  6. => Find the deployment record with the given IID

So why do we have to access actual git ref? Because it has the following implications.

  • It searches a ref including the given sha, instead of the exactly matched sha.
  • Environments that were created in a forked project should appear in the target project

But please think about we recently made deployments as a stateful object. The corresponding deployments record will be created when a pipeline is created. Given pipeline is created right after user pushed a new commit, we can directly ask deployments table if there are any relevant deployment with the given sha, today.

With the reasoning, I changed the routine to

  1. Get the diff head sha of the merge request
  2. Find environments which have a deployment with the given sha (This supports forked projects as well)
  3. Iterate environments
  4. => Find the deployment record with the given sha

What are the relevant issue numbers?

Does this MR meet the acceptance criteria?

Edited by Shinya Maeda

Merge request reports