Skip to content
Snippets Groups Projects

Draft: PoC for removing partitioned data using foreign keys

Closed Marius Bobin requested to merge 358943-fk-data-removal-poc into master
5 unresolved threads

What does this MR do and why?

:exclamation: :exclamation: :exclamation: Not to be merged :exclamation: :exclamation: :exclamation:

Related to #358943 (closed)

This partitions all of the tables described in https://docs.gitlab.com/ee/architecture/blueprints/ci_data_decay/pipeline_partitioning.html#why-do-we-need-to-partition-cicd-data and creates 5 partitions for each. It also adds some horrible hacks to allow the execution of Ci::CreatePipelineService for data insertion:

# Configure which partition to use for data storage: 100..105
[1] pry(main)> Thread.current[:current_partition_id] = 103
=> 103
[2] pry(main)> Ci::CreatePipelineService.new(Project.find(75), User.first, { ref: 'main', variables_attributes: [ {key: 'GIT_STRATEGY', value: 'none' } ] }).execute!(:web, {})
=> #<ServiceResponse:0x00007ff272967960
 @http_status=:ok,
 @message=nil,
 @payload=
  #<Ci::Pipeline:0x00007ff286766300
   id: 544,
   ref: "main",
   sha: "bc47a984eb5f8b5712a420ac1c043396e05f973f",
   before_sha: "0000000000000000000000000000000000000000",
   created_at: Thu, 14 Jul 2022 12:44:39.769820000 UTC +00:00,
   updated_at: Thu, 14 Jul 2022 12:44:39.769820000 UTC +00:00,
   tag: false,
   yaml_errors: nil,
   committed_at: nil,
   project_id: 75,
   status: "created",
   started_at: nil,
   finished_at: nil,
   duration: nil,
   user_id: 1,
   lock_version: 0,
   auto_canceled_by_id: nil,
   pipeline_schedule_id: nil,
   source: "web",
   config_source: "repository_source",
   protected: true,
   failure_reason: nil,
   iid: 57,
   merge_request_id: nil,
   source_sha: nil,
   target_sha: nil,
   external_pull_request_id: nil,
   ci_ref_id: 73,
   locked: "artifacts_locked",
   partition_id: 103>,
 @status=:success>

Data is inserted in the right partition:

gitlabhq_development_ci=# select id, iid, partition_id, project_id from p_ci_pipelines_103;
 id  | iid | partition_id | project_id
-----+-----+--------------+------------
 520 |  33 |          103 |         75
 521 |  34 |          103 |         75
 522 |  35 |          103 |         75
 544 |  57 |          103 |         75
(4 rows)

gitlabhq_development_ci=# select id, name, commit_id, partition_id from p_ci_builds_103;
  id   |  name   | commit_id | partition_id
-------+---------+-----------+--------------
 26751 | build1  |       520 |          103
 26752 | test1   |       520 |          103
 26753 | test2   |       520 |          103
 26754 | deploy1 |       520 |          103
 26755 | build1  |       521 |          103
 26756 | test1   |       521 |          103
 26757 | test2   |       521 |          103
 26758 | deploy1 |       521 |          103
 26759 | build1  |       522 |          103
 26760 | test1   |       522 |          103
 26761 | test2   |       522 |          103
 26762 | deploy1 |       522 |          103
 26847 | build1  |       544 |          103
 26848 | test1   |       544 |          103
 26849 | test2   |       544 |          103
 26850 | deploy1 |       544 |          103
(16 rows)

Screenshots or screen recordings

These are strongly recommended to assist reviewers and reduce the time to merge your change.

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Marius Bobin

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
338 350 scope :for_id, -> (id) { where(id: id) }
339 351 scope :for_iid, -> (iid) { where(iid: iid) }
340 352 scope :for_project, -> (project_id) { where(project_id: project_id) }
341 scope :created_after, -> (time) { where('ci_pipelines.created_at > ?', time) }
342 scope :created_before_id, -> (id) { where('ci_pipelines.id < ?', id) }
353 scope :created_after, -> (time) { where("#{quoted_table_name}.created_at > ?", time) }
354 scope :created_before_id, -> (id) { where("#{quoted_table_name}.id < ?", id) }
  • Author Maintainer

    I think we need to add indexes and FK constraints for each partition:

    gitlabhq_development_ci=# \d+ p_ci_pipelines
                                                                Partitioned table "public.p_ci_pipelines"
              Column          |            Type             | Collation | Nullable |                 Default                  | Storage  | Stats target | Description
    --------------------------+-----------------------------+-----------+----------+------------------------------------------+----------+--------------+-------------
     id                       | integer                     |           | not null | nextval('ci_pipelines_id_seq'::regclass) | plain    |              |
     ref                      | character varying           |           |          |                                          | extended |              |
     sha                      | character varying           |           |          |                                          | extended |              |
     before_sha               | character varying           |           |          |                                          | extended |              |
     created_at               | timestamp without time zone |           |          |                                          | plain    |              |
     updated_at               | timestamp without time zone |           |          |                                          | plain    |              |
     tag                      | boolean                     |           |          | false                                    | plain    |              |
     yaml_errors              | text                        |           |          |                                          | extended |              |
     committed_at             | timestamp without time zone |           |          |                                          | plain    |              |
     project_id               | integer                     |           |          |                                          | plain    |              |
     status                   | character varying           |           |          |                                          | extended |              |
     started_at               | timestamp without time zone |           |          |                                          | plain    |              |
     finished_at              | timestamp without time zone |           |          |                                          | plain    |              |
     duration                 | integer                     |           |          |                                          | plain    |              |
     user_id                  | integer                     |           |          |                                          | plain    |              |
     lock_version             | integer                     |           |          | 0                                        | plain    |              |
     auto_canceled_by_id      | integer                     |           |          |                                          | plain    |              |
     pipeline_schedule_id     | integer                     |           |          |                                          | plain    |              |
     source                   | integer                     |           |          |                                          | plain    |              |
     config_source            | integer                     |           |          |                                          | plain    |              |
     protected                | boolean                     |           |          |                                          | plain    |              |
     failure_reason           | integer                     |           |          |                                          | plain    |              |
     iid                      | integer                     |           |          |                                          | plain    |              |
     merge_request_id         | integer                     |           |          |                                          | plain    |              |
     source_sha               | bytea                       |           |          |                                          | extended |              |
     target_sha               | bytea                       |           |          |                                          | extended |              |
     external_pull_request_id | bigint                      |           |          |                                          | plain    |              |
     ci_ref_id                | bigint                      |           |          |                                          | plain    |              |
     locked                   | smallint                    |           | not null | 1                                        | plain    |              |
     partition_id             | bigint                      |           | not null | 100                                      | plain    |              |
    Partition key: LIST (partition_id)
    Indexes:
        "p_ci_pipelines_id_partition_id" UNIQUE CONSTRAINT, btree (id, partition_id)
    Partitions: ci_pipelines FOR VALUES IN ('100'),
                p_ci_pipelines_101 FOR VALUES IN ('101'),
                p_ci_pipelines_102 FOR VALUES IN ('102'),
                p_ci_pipelines_103 FOR VALUES IN ('103'),
                p_ci_pipelines_104 FOR VALUES IN ('104'),
                p_ci_pipelines_105 FOR VALUES IN ('105')
    
    gitlabhq_development_ci=# \d+ p_ci_pipelines_103
                                                                    Table "public.p_ci_pipelines_103"
              Column          |            Type             | Collation | Nullable |                 Default                  | Storage  | Stats target | Description
    --------------------------+-----------------------------+-----------+----------+------------------------------------------+----------+--------------+-------------
     id                       | integer                     |           | not null | nextval('ci_pipelines_id_seq'::regclass) | plain    |              |
     ref                      | character varying           |           |          |                                          | extended |              |
     sha                      | character varying           |           |          |                                          | extended |              |
     before_sha               | character varying           |           |          |                                          | extended |              |
     created_at               | timestamp without time zone |           |          |                                          | plain    |              |
     updated_at               | timestamp without time zone |           |          |                                          | plain    |              |
     tag                      | boolean                     |           |          | false                                    | plain    |              |
     yaml_errors              | text                        |           |          |                                          | extended |              |
     committed_at             | timestamp without time zone |           |          |                                          | plain    |              |
     project_id               | integer                     |           |          |                                          | plain    |              |
     status                   | character varying           |           |          |                                          | extended |              |
     started_at               | timestamp without time zone |           |          |                                          | plain    |              |
     finished_at              | timestamp without time zone |           |          |                                          | plain    |              |
     duration                 | integer                     |           |          |                                          | plain    |              |
     user_id                  | integer                     |           |          |                                          | plain    |              |
     lock_version             | integer                     |           |          | 0                                        | plain    |              |
     auto_canceled_by_id      | integer                     |           |          |                                          | plain    |              |
     pipeline_schedule_id     | integer                     |           |          |                                          | plain    |              |
     source                   | integer                     |           |          |                                          | plain    |              |
     config_source            | integer                     |           |          |                                          | plain    |              |
     protected                | boolean                     |           |          |                                          | plain    |              |
     failure_reason           | integer                     |           |          |                                          | plain    |              |
     iid                      | integer                     |           |          |                                          | plain    |              |
     merge_request_id         | integer                     |           |          |                                          | plain    |              |
     source_sha               | bytea                       |           |          |                                          | extended |              |
     target_sha               | bytea                       |           |          |                                          | extended |              |
     external_pull_request_id | bigint                      |           |          |                                          | plain    |              |
     ci_ref_id                | bigint                      |           |          |                                          | plain    |              |
     locked                   | smallint                    |           | not null | 1                                        | plain    |              |
     partition_id             | bigint                      |           | not null | 100                                      | plain    |              |
    Partition of: p_ci_pipelines FOR VALUES IN ('103')
    Partition constraint: ((partition_id IS NOT NULL) AND (partition_id = '103'::bigint))
    Indexes:
        "p_ci_pipelines_103_id_partition_id_key" UNIQUE CONSTRAINT, btree (id, partition_id)
    Access method: heap
    
    gitlabhq_development_ci=# select id, iid, partition_id, project_id from p_ci_pipelines_103;
     id  | iid | partition_id | project_id
    -----+-----+--------------+------------
     520 |  33 |          103 |         75
     521 |  34 |          103 |         75
     522 |  35 |          103 |         75
     544 |  57 |          103 |         75
    (4 rows)
    
    gitlabhq_development_ci=# delete from p_ci_pipelines_103 where id = 544;
    DELETE 1
    gitlabhq_development_ci=# select id, name, commit_id, partition_id from p_ci_builds_103 where commit_id = 544;
      id   |  name   | commit_id | partition_id
    -------+---------+-----------+--------------
     26847 | build1  |       544 |          103
     26848 | test1   |       544 |          103
     26849 | test2   |       544 |          103
     26850 | deploy1 |       544 |          103
    (4 rows)
  • Author Maintainer

    After adding the constraints, cascade deletion works:

    gitlabhq_development_ci=# ALTER TABLE ONLY p_ci_pipelines ADD UNIQUE (id, partition_id);
    ALTER TABLE
    
    gitlabhq_development_ci=# ALTER TABLE p_ci_builds ADD CONSTRAINT fk_d3130c9a7f FOREIGN KEY (commit_id, partition_id) REFERENCES p_ci_pipelines(id, partition_id) ON DELETE CASCADE;
    ALTER TABLE
    gitlabhq_development_ci=# select id, iid, partition_id, project_id from p_ci_pipelines_103;
     id  | iid | partition_id | project_id
    -----+-----+--------------+------------
     520 |  33 |          103 |         75
     521 |  34 |          103 |         75
     522 |  35 |          103 |         75
    (3 rows)
    
    gitlabhq_development_ci=# select id, name, commit_id, partition_id from p_ci_builds_103 where commit_id = 522;
      id   |  name   | commit_id | partition_id
    -------+---------+-----------+--------------
     26759 | build1  |       522 |          103
     26760 | test1   |       522 |          103
     26761 | test2   |       522 |          103
     26762 | deploy1 |       522 |          103
    (4 rows)
    
    gitlabhq_development_ci=# delete from p_ci_pipelines_103 where id = 522;
    DELETE 1
    
    gitlabhq_development_ci=# select id, name, commit_id, partition_id from p_ci_builds_103 where commit_id = 522;
     id | name | commit_id | partition_id
    ----+------+-----------+--------------
    (0 rows)
    • Author Maintainer

      Deleting a pipeline by id from p_ci_pipelines deletes the corresponding builds from the right partition:

      gitlabhq_development_ci=# select id, iid, partition_id, project_id from p_ci_pipelines_103;
       id  | iid | partition_id | project_id
      -----+-----+--------------+------------
       520 |  33 |          103 |         75
       521 |  34 |          103 |         75
      (2 rows)
      
      gitlabhq_development_ci=# select id, name, commit_id, partition_id from p_ci_builds_103 where commit_id = 521;
        id   |  name   | commit_id | partition_id
      -------+---------+-----------+--------------
       26755 | build1  |       521 |          103
       26756 | test1   |       521 |          103
       26757 | test2   |       521 |          103
       26758 | deploy1 |       521 |          103
      (4 rows)
      
      gitlabhq_development_ci=# delete from p_ci_pipelines where id = 521;
      DELETE 1
      
      gitlabhq_development_ci=# select id, name, commit_id, partition_id from p_ci_builds_103 where commit_id = 521;
       id | name | commit_id | partition_id
      ----+------+-----------+--------------
      (0 rows)
    • Author Maintainer

      When the partition_id is not provided it goes though each partition and it triggers FK events when needed:

      gitlabhq_development_ci=# explain analyze delete from p_ci_pipelines where id = 518;
                                                                                 QUERY PLAN
      -----------------------------------------------------------------------------------------------------------------------------------------------------------------
       Delete on p_ci_pipelines  (cost=0.15..12.98 rows=6 width=6) (actual time=0.060..0.062 rows=0 loops=1)
         Delete on ci_pipelines
         Delete on p_ci_pipelines_101
         Delete on p_ci_pipelines_102
         Delete on p_ci_pipelines_103
         Delete on p_ci_pipelines_104
         Delete on p_ci_pipelines_105
         ->  Index Scan using ci_pipelines_pkey on ci_pipelines  (cost=0.15..2.17 rows=1 width=6) (actual time=0.013..0.013 rows=0 loops=1)
               Index Cond: (id = 518)
         ->  Index Scan using p_ci_pipelines_101_id_partition_id_key on p_ci_pipelines_101  (cost=0.14..2.16 rows=1 width=6) (actual time=0.005..0.006 rows=0 loops=1)
               Index Cond: (id = 518)
         ->  Index Scan using p_ci_pipelines_102_id_partition_id_key on p_ci_pipelines_102  (cost=0.14..2.16 rows=1 width=6) (actual time=0.009..0.010 rows=1 loops=1)
               Index Cond: (id = 518)
         ->  Index Scan using p_ci_pipelines_103_id_partition_id_key on p_ci_pipelines_103  (cost=0.14..2.16 rows=1 width=6) (actual time=0.007..0.007 rows=0 loops=1)
               Index Cond: (id = 518)
         ->  Index Scan using p_ci_pipelines_104_id_partition_id_key on p_ci_pipelines_104  (cost=0.14..2.16 rows=1 width=6) (actual time=0.006..0.006 rows=0 loops=1)
               Index Cond: (id = 518)
         ->  Index Scan using p_ci_pipelines_105_id_partition_id_key on p_ci_pipelines_105  (cost=0.14..2.16 rows=1 width=6) (actual time=0.005..0.005 rows=0 loops=1)
               Index Cond: (id = 518)
       Planning Time: 0.580 ms
       Trigger for constraint p_ci_builds_commit_id_partition_id_fkey2 on p_ci_pipelines_102: time=0.198 calls=1
       Execution Time: 0.327 ms
      (22 rows)

      With partition_id:

      gitlabhq_development_ci=# explain analyze delete from p_ci_pipelines where id = 517 and partition_id = 102;
                                                                                 QUERY PLAN
      -----------------------------------------------------------------------------------------------------------------------------------------------------------------
       Delete on p_ci_pipelines  (cost=0.14..2.16 rows=1 width=6) (actual time=0.039..0.040 rows=0 loops=1)
         Delete on p_ci_pipelines_102
         ->  Index Scan using p_ci_pipelines_102_id_partition_id_key on p_ci_pipelines_102  (cost=0.14..2.16 rows=1 width=6) (actual time=0.023..0.023 rows=1 loops=1)
               Index Cond: ((id = 517) AND (partition_id = 102))
       Planning Time: 0.161 ms
       Trigger for constraint p_ci_builds_commit_id_partition_id_fkey2 on p_ci_pipelines_102: time=0.238 calls=1
       Execution Time: 0.302 ms
      (7 rows)

      After fiddling with the config and log levels it seems that the trigger constraints propagate the partition_id value:

      2022-07-14 16:51:30.782 EEST [67558] CONTEXT:  SQL statement "DELETE FROM "public"."p_ci_builds" WHERE $1 OPERATOR(pg_catalog.=) "commit_id" AND $2 OPERATOR(pg_catalog.=) "partition_id""
      2022-07-14 16:51:30.782 EEST [67558] STATEMENT:  explain analyze delete from p_ci_pipelines where id = 516 and partition_id = 102;
      Wall of text from the log
      2022-07-14 16:51:30.779 EEST [67558] DEBUG:  StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
      2022-07-14 16:51:30.779 EEST [67558] LOG:  statement: explain analyze delete from p_ci_pipelines where id = 516 and partition_id = 102;
      2022-07-14 16:51:30.780 EEST [67558] LOG:  parse tree:
      2022-07-14 16:51:30.780 EEST [67558] DETAIL:     {QUERY
                 :commandType 5
                 :querySource 0
                 :canSetTag true
                 :utilityStmt ?
                 :resultRelation 0
                 :hasAggs false
                 :hasWindowFuncs false
                 :hasTargetSRFs false
                 :hasSubLinks false
                 :hasDistinctOn false
                 :hasRecursive false
                 :hasModifyingCTE false
                 :hasForUpdate false
                 :hasRowSecurity false
                 :cteList <>
                 :rtable <>
                 :jointree <>
                 :targetList <>
                 :override 0
                 :onConflict <>
                 :returningList <>
                 :groupClause <>
                 :groupingSets <>
                 :havingQual <>
                 :windowClause <>
                 :distinctClause <>
                 :sortClause <>
                 :limitOffset <>
                 :limitCount <>
                 :rowMarks <>
                 :setOperations <>
                 :constraintDeps <>
                 :withCheckOptions <>
                 :stmt_location 0
                 :stmt_len 80
                 }
      
      2022-07-14 16:51:30.780 EEST [67558] STATEMENT:  explain analyze delete from p_ci_pipelines where id = 516 and partition_id = 102;
      2022-07-14 16:51:30.780 EEST [67558] LOG:  rewritten parse tree:
      2022-07-14 16:51:30.780 EEST [67558] DETAIL:  (
                 {QUERY
                 :commandType 5
                 :querySource 0
                 :canSetTag true
                 :utilityStmt ?
                 :resultRelation 0
                 :hasAggs false
                 :hasWindowFuncs false
                 :hasTargetSRFs false
                 :hasSubLinks false
                 :hasDistinctOn false
                 :hasRecursive false
                 :hasModifyingCTE false
                 :hasForUpdate false
                 :hasRowSecurity false
                 :cteList <>
                 :rtable <>
                 :jointree <>
                 :targetList <>
                 :override 0
                 :onConflict <>
                 :returningList <>
                 :groupClause <>
                 :groupingSets <>
                 :havingQual <>
                 :windowClause <>
                 :distinctClause <>
                 :sortClause <>
                 :limitOffset <>
                 :limitCount <>
                 :rowMarks <>
                 :setOperations <>
                 :constraintDeps <>
                 :withCheckOptions <>
                 :stmt_location 0
                 :stmt_len 80
                 }
              )
      
      2022-07-14 16:51:30.780 EEST [67558] STATEMENT:  explain analyze delete from p_ci_pipelines where id = 516 and partition_id = 102;
      2022-07-14 16:51:30.780 EEST [67558] LOG:  plan:
      2022-07-14 16:51:30.780 EEST [67558] DETAIL:     {PLANNEDSTMT
                 :commandType 4
                 :queryId 0
                 :hasReturning false
                 :hasModifyingCTE false
                 :canSetTag true
                 :transientPlan false
                 :dependsOnRole false
                 :parallelModeNeeded false
                 :jitFlags 0
                 :planTree
                    {MODIFYTABLE
                    :startup_cost 0.14
                    :total_cost 2.16
                    :plan_rows 1
                    :plan_width 6
                    :parallel_aware false
                    :parallel_safe false
                    :plan_node_id 0
                    :targetlist <>
                    :qual <>
                    :lefttree <>
                    :righttree <>
                    :initPlan <>
                    :extParam (b)
                    :allParam (b)
                    :operation 4
                    :canSetTag true
                    :nominalRelation 1
                    :rootRelation 1
                    :partColsUpdated false
                    :resultRelations (i 2)
                    :resultRelIndex 0
                    :rootResultRelIndex 0
                    :plans (
                       {INDEXSCAN
                       :startup_cost 0.14
                       :total_cost 2.16
                       :plan_rows 1
                       :plan_width 6
                       :parallel_aware false
                       :parallel_safe false
                       :plan_node_id 1
                       :targetlist (
                          {TARGETENTRY
                          :expr
                             {VAR
                             :varno 2
                             :varattno -1
                             :vartype 27
                             :vartypmod -1
                             :varcollid 0
                             :varlevelsup 0
                             :varnoold 2
                             :varoattno -1
                             :location -1
                             }
                          :resno 1
                          :resname ctid
                          :ressortgroupref 0
                          :resorigtbl 0
                          :resorigcol 0
                          :resjunk true
                          }
                       )
                       :qual <>
                       :lefttree <>
                       :righttree <>
                       :initPlan <>
                       :extParam (b 0)
                       :allParam (b 0)
                       :scanrelid 2
                       :indexid 11295617
                       :indexqual (
                          {OPEXPR
                          :opno 96
                          :opfuncid 65
                          :opresulttype 16
                          :opretset false
                          :opcollid 0
                          :inputcollid 0
                          :args (
                             {VAR
                             :varno 65002
                             :varattno 1
                             :vartype 23
                             :vartypmod -1
                             :varcollid 0
                             :varlevelsup 0
                             :varnoold 2
                             :varoattno 1
                             :location -1
                             }
                             {CONST
                             :consttype 23
                             :consttypmod -1
                             :constcollid 0
                             :constlen 4
                             :constbyval true
                             :constisnull false
                             :location 54
                             :constvalue 4 [ 4 2 0 0 0 0 0 0 ]
                             }
                          )
                          :location 52
                          }
                          {OPEXPR
                          :opno 416
                          :opfuncid 474
                          :opresulttype 16
                          :opretset false
                          :opcollid 0
                          :inputcollid 0
                          :args (
                             {VAR
                             :varno 65002
                             :varattno 2
                             :vartype 20
                             :vartypmod -1
                             :varcollid 0
                             :varlevelsup 0
                             :varnoold 2
                             :varoattno 30
                             :location -1
                             }
                             {CONST
                             :consttype 23
                             :consttypmod -1
                             :constcollid 0
                             :constlen 4
                             :constbyval true
                             :constisnull false
                             :location 77
                             :constvalue 4 [ 102 0 0 0 0 0 0 0 ]
                             }
                          )
                          :location 75
                          }
                       )
                       :indexqualorig (
                          {OPEXPR
                          :opno 96
                          :opfuncid 65
                          :opresulttype 16
                          :opretset false
                          :opcollid 0
                          :inputcollid 0
                          :args (
                             {VAR
                             :varno 2
                             :varattno 1
                             :vartype 23
                             :vartypmod -1
                             :varcollid 0
                             :varlevelsup 0
                             :varnoold 2
                             :varoattno 1
                             :location -1
                             }
                             {CONST
                             :consttype 23
                             :consttypmod -1
                             :constcollid 0
                             :constlen 4
                             :constbyval true
                             :constisnull false
                             :location 54
                             :constvalue 4 [ 4 2 0 0 0 0 0 0 ]
                             }
                          )
                          :location 52
                          }
                          {OPEXPR
                          :opno 416
                          :opfuncid 474
                          :opresulttype 16
                          :opretset false
                          :opcollid 0
                          :inputcollid 0
                          :args (
                             {VAR
                             :varno 2
                             :varattno 30
                             :vartype 20
                             :vartypmod -1
                             :varcollid 0
                             :varlevelsup 0
                             :varnoold 2
                             :varoattno 30
                             :location -1
                             }
                             {CONST
                             :consttype 23
                             :consttypmod -1
                             :constcollid 0
                             :constlen 4
                             :constbyval true
                             :constisnull false
                             :location 77
                             :constvalue 4 [ 102 0 0 0 0 0 0 0 ]
                             }
                          )
                          :location 75
                          }
                       )
                       :indexorderby <>
                       :indexorderbyorig <>
                       :indexorderbyops <>
                       :indexorderdir 1
                       }
                    )
                    :withCheckOptionLists <>
                    :returningLists <>
                    :fdwPrivLists (<>)
                    :fdwDirectModifyPlans (b)
                    :rowMarks <>
                    :epqParam 0
                    :onConflictAction 0
                    :arbiterIndexes <>
                    :onConflictSet <>
                    :onConflictWhere <>
                    :exclRelRTI 0
                    :exclRelTlist <>
                    }
                 :rtable (
                    {RTE
                    :alias <>
                    :eref
                       {ALIAS
                       :aliasname p_ci_pipelines
                       :colnames ("id" "ref" "sha" "before_sha" "created_at" "updated_at" "t
                       ag" "yaml_errors" "committed_at" "project_id" "status" "started_at" "
                       finished_at" "duration" "user_id" "lock_version" "auto_canceled_by_id
                       " "pipeline_schedule_id" "source" "config_source" "protected" "failur
                       e_reason" "iid" "merge_request_id" "source_sha" "target_sha" "externa
                       l_pull_request_id" "ci_ref_id" "locked" "partition_id")
                       }
                    :rtekind 0
                    :relid 11294925
                    :relkind p
                    :rellockmode 3
                    :tablesample <>
                    :lateral false
                    :inh true
                    :inFromCl false
                    :requiredPerms 10
                    :checkAsUser 0
                    :selectedCols (b 8 37)
                    :insertedCols (b)
                    :updatedCols (b)
                    :extraUpdatedCols (b)
                    :securityQuals <>
                    }
                    {RTE
                    :alias <>
                    :eref
                       {ALIAS
                       :aliasname p_ci_pipelines
                       :colnames ("id" "ref" "sha" "before_sha" "created_at" "updated_at" "t
                       ag" "yaml_errors" "committed_at" "project_id" "status" "started_at" "
                       finished_at" "duration" "user_id" "lock_version" "auto_canceled_by_id
                       " "pipeline_schedule_id" "source" "config_source" "protected" "failur
                       e_reason" "iid" "merge_request_id" "source_sha" "target_sha" "externa
                       l_pull_request_id" "ci_ref_id" "locked" "partition_id")
                       }
                    :rtekind 0
                    :relid 11295609
                    :relkind r
                    :rellockmode 3
                    :tablesample <>
                    :lateral false
                    :inh false
                    :inFromCl false
                    :requiredPerms 0
                    :checkAsUser 0
                    :selectedCols (b 8 37)
                    :insertedCols (b)
                    :updatedCols (b)
                    :extraUpdatedCols (b)
                    :securityQuals <>
                    }
                 )
                 :resultRelations (i 2)
                 :rootResultRelations (i 1)
                 :subplans <>
                 :rewindPlanIDs (b)
                 :rowMarks <>
                 :relationOids (o 11294925 11295609)
                 :invalItems <>
                 :paramExecTypes (o 0)
                 :utilityStmt <>
                 :stmt_location 0
                 :stmt_len 0
                 }
      
      2022-07-14 16:51:30.780 EEST [67558] STATEMENT:  explain analyze delete from p_ci_pipelines where id = 516 and partition_id = 102;
      2022-07-14 16:51:30.781 EEST [67558] LOG:  parse tree:
      2022-07-14 16:51:30.781 EEST [67558] DETAIL:     {QUERY
                 :commandType 4
                 :querySource 0
                 :canSetTag true
                 :utilityStmt <>
                 :resultRelation 1
                 :hasAggs false
                 :hasWindowFuncs false
                 :hasTargetSRFs false
                 :hasSubLinks false
                 :hasDistinctOn false
                 :hasRecursive false
                 :hasModifyingCTE false
                 :hasForUpdate false
                 :hasRowSecurity false
                 :cteList <>
                 :rtable (
                    {RTE
                    :alias <>
                    :eref
                       {ALIAS
                       :aliasname p_ci_builds
                       :colnames ("status" "finished_at" "trace" "created_at" "updated_at" "
                       started_at" "runner_id" "coverage" "commit_id" "name" "options" "allo
                       w_failure" "stage" "trigger_request_id" "stage_idx" "tag" "ref" "user
                       _id" "type" "target_url" "description" "project_id" "erased_by_id" "e
                       rased_at" "artifacts_expire_at" "environment" "when" "yaml_variables"
                        "queued_at" "token" "lock_version" "coverage_regex" "auto_canceled_b
                       y_id" "retried" "protected" "failure_reason" "scheduled_at" "token_en
                       crypted" "upstream_pipeline_id" "resource_group_id" "waiting_for_reso
                       urce_at" "processed" "scheduling_type" "id" "stage_id" "partition_id"
                       )
                       }
                    :rtekind 0
                    :relid 11294979
                    :relkind p
                    :rellockmode 3
                    :tablesample <>
                    :lateral false
                    :inh true
                    :inFromCl false
                    :requiredPerms 10
                    :checkAsUser 0
                    :selectedCols (b 16 53)
                    :insertedCols (b)
                    :updatedCols (b)
                    :extraUpdatedCols (b)
                    :securityQuals <>
                    }
                 )
                 :jointree
                    {FROMEXPR
                    :fromlist (
                       {RANGETBLREF
                       :rtindex 1
                       }
                    )
                    :quals
                       {BOOLEXPR
                       :boolop and
                       :args (
                          {OPEXPR
                          :opno 96
                          :opfuncid 65
                          :opresulttype 16
                          :opretset false
                          :opcollid 0
                          :inputcollid 0
                          :args (
                             {PARAM
                             :paramkind 0
                             :paramid 1
                             :paramtype 23
                             :paramtypmod -1
                             :paramcollid 0
                             :location 41
                             }
                             {VAR
                             :varno 1
                             :varattno 9
                             :vartype 23
                             :vartypmod -1
                             :varcollid 0
                             :varlevelsup 0
                             :varnoold 1
                             :varoattno 9
                             :location 67
                             }
                          )
                          :location 44
                          }
                          {OPEXPR
                          :opno 410
                          :opfuncid 467
                          :opresulttype 16
                          :opretset false
                          :opcollid 0
                          :inputcollid 0
                          :args (
                             {PARAM
                             :paramkind 0
                             :paramid 2
                             :paramtype 20
                             :paramtypmod -1
                             :paramcollid 0
                             :location 83
                             }
                             {VAR
                             :varno 1
                             :varattno 46
                             :vartype 20
                             :vartypmod -1
                             :varcollid 0
                             :varlevelsup 0
                             :varnoold 1
                             :varoattno 46
                             :location 109
                             }
                          )
                          :location 86
                          }
                       )
                       :location 79
                       }
                    }
                 :targetList <>
                 :override 0
                 :onConflict <>
                 :returningList <>
                 :groupClause <>
                 :groupingSets <>
                 :havingQual <>
                 :windowClause <>
                 :distinctClause <>
                 :sortClause <>
                 :limitOffset <>
                 :limitCount <>
                 :rowMarks <>
                 :setOperations <>
                 :constraintDeps <>
                 :withCheckOptions <>
                 :stmt_location 0
                 :stmt_len 0
                 }
      
      2022-07-14 16:51:30.781 EEST [67558] CONTEXT:  SQL statement "DELETE FROM "public"."p_ci_builds" WHERE $1 OPERATOR(pg_catalog.=) "commit_id" AND $2 OPERATOR(pg_catalog.=) "partition_id""
      2022-07-14 16:51:30.781 EEST [67558] STATEMENT:  explain analyze delete from p_ci_pipelines where id = 516 and partition_id = 102;
      2022-07-14 16:51:30.781 EEST [67558] LOG:  rewritten parse tree:
      2022-07-14 16:51:30.781 EEST [67558] DETAIL:  (
                 {QUERY
                 :commandType 4
                 :querySource 0
                 :canSetTag true
                 :utilityStmt <>
                 :resultRelation 1
                 :hasAggs false
                 :hasWindowFuncs false
                 :hasTargetSRFs false
                 :hasSubLinks false
                 :hasDistinctOn false
                 :hasRecursive false
                 :hasModifyingCTE false
                 :hasForUpdate false
                 :hasRowSecurity false
                 :cteList <>
                 :rtable (
                    {RTE
                    :alias <>
                    :eref
                       {ALIAS
                       :aliasname p_ci_builds
                       :colnames ("status" "finished_at" "trace" "created_at" "updated_at" "
                       started_at" "runner_id" "coverage" "commit_id" "name" "options" "allo
                       w_failure" "stage" "trigger_request_id" "stage_idx" "tag" "ref" "user
                       _id" "type" "target_url" "description" "project_id" "erased_by_id" "e
                       rased_at" "artifacts_expire_at" "environment" "when" "yaml_variables"
                        "queued_at" "token" "lock_version" "coverage_regex" "auto_canceled_b
                       y_id" "retried" "protected" "failure_reason" "scheduled_at" "token_en
                       crypted" "upstream_pipeline_id" "resource_group_id" "waiting_for_reso
                       urce_at" "processed" "scheduling_type" "id" "stage_id" "partition_id"
                       )
                       }
                    :rtekind 0
                    :relid 11294979
                    :relkind p
                    :rellockmode 3
                    :tablesample <>
                    :lateral false
                    :inh true
                    :inFromCl false
                    :requiredPerms 10
                    :checkAsUser 0
                    :selectedCols (b 16 53)
                    :insertedCols (b)
                    :updatedCols (b)
                    :extraUpdatedCols (b)
                    :securityQuals <>
                    }
                 )
                 :jointree
                    {FROMEXPR
                    :fromlist (
                       {RANGETBLREF
                       :rtindex 1
                       }
                    )
                    :quals
                       {BOOLEXPR
                       :boolop and
                       :args (
                          {OPEXPR
                          :opno 96
                          :opfuncid 65
                          :opresulttype 16
                          :opretset false
                          :opcollid 0
                          :inputcollid 0
                          :args (
                             {PARAM
                             :paramkind 0
                             :paramid 1
                             :paramtype 23
                             :paramtypmod -1
                             :paramcollid 0
                             :location 41
                             }
                             {VAR
                             :varno 1
                             :varattno 9
                             :vartype 23
                             :vartypmod -1
                             :varcollid 0
                             :varlevelsup 0
                             :varnoold 1
                             :varoattno 9
                             :location 67
                             }
                          )
                          :location 44
                          }
                          {OPEXPR
                          :opno 410
                          :opfuncid 467
                          :opresulttype 16
                          :opretset false
                          :opcollid 0
                          :inputcollid 0
                          :args (
                             {PARAM
                             :paramkind 0
                             :paramid 2
                             :paramtype 20
                             :paramtypmod -1
                             :paramcollid 0
                             :location 83
                             }
                             {VAR
                             :varno 1
                             :varattno 46
                             :vartype 20
                             :vartypmod -1
                             :varcollid 0
                             :varlevelsup 0
                             :varnoold 1
                             :varoattno 46
                             :location 109
                             }
                          )
                          :location 86
                          }
                       )
                       :location 79
                       }
                    }
                 :targetList <>
                 :override 0
                 :onConflict <>
                 :returningList <>
                 :groupClause <>
                 :groupingSets <>
                 :havingQual <>
                 :windowClause <>
                 :distinctClause <>
                 :sortClause <>
                 :limitOffset <>
                 :limitCount <>
                 :rowMarks <>
                 :setOperations <>
                 :constraintDeps <>
                 :withCheckOptions <>
                 :stmt_location 0
                 :stmt_len 0
                 }
              )
      
      2022-07-14 16:51:30.781 EEST [67558] CONTEXT:  SQL statement "DELETE FROM "public"."p_ci_builds" WHERE $1 OPERATOR(pg_catalog.=) "commit_id" AND $2 OPERATOR(pg_catalog.=) "partition_id""
      2022-07-14 16:51:30.781 EEST [67558] STATEMENT:  explain analyze delete from p_ci_pipelines where id = 516 and partition_id = 102;
      2022-07-14 16:51:30.782 EEST [67558] LOG:  plan:
      2022-07-14 16:51:30.782 EEST [67558] DETAIL:     {PLANNEDSTMT
                 :commandType 4
                 :queryId 0
                 :hasReturning false
                 :hasModifyingCTE false
                 :canSetTag true
                 :transientPlan false
                 :dependsOnRole false
                 :parallelModeNeeded false
                 :jitFlags 0
                 :planTree
                    {MODIFYTABLE
                    :startup_cost 0.14
                    :total_cost 4.05
                    :plan_rows 1
                    :plan_width 6
                    :parallel_aware false
                    :parallel_safe false
                    :plan_node_id 0
                    :targetlist <>
                    :qual <>
                    :lefttree <>
                    :righttree <>
                    :initPlan <>
                    :extParam (b)
                    :allParam (b)
                    :operation 4
                    :canSetTag true
                    :nominalRelation 1
                    :rootRelation 1
                    :partColsUpdated false
                    :resultRelations (i 2)
                    :resultRelIndex 0
                    :rootResultRelIndex 0
                    :plans (
                       {INDEXSCAN
                       :startup_cost 0.14
                       :total_cost 4.05
                       :plan_rows 1
                       :plan_width 6
                       :parallel_aware false
                       :parallel_safe false
                       :plan_node_id 1
                       :targetlist (
                          {TARGETENTRY
                          :expr
                             {VAR
                             :varno 2
                             :varattno -1
                             :vartype 27
                             :vartypmod -1
                             :varcollid 0
                             :varlevelsup 0
                             :varnoold 2
                             :varoattno -1
                             :location -1
                             }
                          :resno 1
                          :resname ctid
                          :ressortgroupref 0
                          :resorigtbl 0
                          :resorigcol 0
                          :resjunk true
                          }
                       )
                       :qual (
                          {OPEXPR
                          :opno 96
                          :opfuncid 65
                          :opresulttype 16
                          :opretset false
                          :opcollid 0
                          :inputcollid 0
                          :args (
                             {CONST
                             :consttype 23
                             :consttypmod -1
                             :constcollid 0
                             :constlen 4
                             :constbyval true
                             :constisnull false
                             :location -1
                             :constvalue 4 [ 4 2 0 0 0 0 0 0 ]
                             }
                             {VAR
                             :varno 2
                             :varattno 9
                             :vartype 23
                             :vartypmod -1
                             :varcollid 0
                             :varlevelsup 0
                             :varnoold 2
                             :varoattno 9
                             :location -1
                             }
                          )
                          :location 44
                          }
                       )
                       :lefttree <>
                       :righttree <>
                       :initPlan <>
                       :extParam (b 0)
                       :allParam (b 0)
                       :scanrelid 2
                       :indexid 11295638
                       :indexqual (
                          {OPEXPR
                          :opno 410
                          :opfuncid 467
                          :opresulttype 16
                          :opretset false
                          :opcollid 0
                          :inputcollid 0
                          :args (
                             {VAR
                             :varno 65002
                             :varattno 2
                             :vartype 20
                             :vartypmod -1
                             :varcollid 0
                             :varlevelsup 0
                             :varnoold 2
                             :varoattno 46
                             :location -1
                             }
                             {CONST
                             :consttype 20
                             :consttypmod -1
                             :constcollid 0
                             :constlen 8
                             :constbyval true
                             :constisnull false
                             :location -1
                             :constvalue 8 [ 102 0 0 0 0 0 0 0 ]
                             }
                          )
                          :location 86
                          }
                       )
                       :indexqualorig (
                          {OPEXPR
                          :opno 410
                          :opfuncid 467
                          :opresulttype 16
                          :opretset false
                          :opcollid 0
                          :inputcollid 0
                          :args (
                             {VAR
                             :varno 2
                             :varattno 46
                             :vartype 20
                             :vartypmod -1
                             :varcollid 0
                             :varlevelsup 0
                             :varnoold 2
                             :varoattno 46
                             :location -1
                             }
                             {CONST
                             :consttype 20
                             :consttypmod -1
                             :constcollid 0
                             :constlen 8
                             :constbyval true
                             :constisnull false
                             :location -1
                             :constvalue 8 [ 102 0 0 0 0 0 0 0 ]
                             }
                          )
                          :location 86
                          }
                       )
                       :indexorderby <>
                       :indexorderbyorig <>
                       :indexorderbyops <>
                       :indexorderdir 1
                       }
                    )
                    :withCheckOptionLists <>
                    :returningLists <>
                    :fdwPrivLists (<>)
                    :fdwDirectModifyPlans (b)
                    :rowMarks <>
                    :epqParam 0
                    :onConflictAction 0
                    :arbiterIndexes <>
                    :onConflictSet <>
                    :onConflictWhere <>
                    :exclRelRTI 0
                    :exclRelTlist <>
                    }
                 :rtable (
                    {RTE
                    :alias <>
                    :eref
                       {ALIAS
                       :aliasname p_ci_builds
                       :colnames ("status" "finished_at" "trace" "created_at" "updated_at" "
                       started_at" "runner_id" "coverage" "commit_id" "name" "options" "allo
                       w_failure" "stage" "trigger_request_id" "stage_idx" "tag" "ref" "user
                       _id" "type" "target_url" "description" "project_id" "erased_by_id" "e
                       rased_at" "artifacts_expire_at" "environment" "when" "yaml_variables"
                        "queued_at" "token" "lock_version" "coverage_regex" "auto_canceled_b
                       y_id" "retried" "protected" "failure_reason" "scheduled_at" "token_en
                       crypted" "upstream_pipeline_id" "resource_group_id" "waiting_for_reso
                       urce_at" "processed" "scheduling_type" "id" "stage_id" "partition_id"
                       )
                       }
                    :rtekind 0
                    :relid 11294979
                    :relkind p
                    :rellockmode 3
                    :tablesample <>
                    :lateral false
                    :inh true
                    :inFromCl false
                    :requiredPerms 10
                    :checkAsUser 0
                    :selectedCols (b 16 53)
                    :insertedCols (b)
                    :updatedCols (b)
                    :extraUpdatedCols (b)
                    :securityQuals <>
                    }
                    {RTE
                    :alias <>
                    :eref
                       {ALIAS
                       :aliasname p_ci_builds
                       :colnames ("status" "finished_at" "trace" "created_at" "updated_at" "
                       started_at" "runner_id" "coverage" "commit_id" "name" "options" "allo
                       w_failure" "stage" "trigger_request_id" "stage_idx" "tag" "ref" "user
                       _id" "type" "target_url" "description" "project_id" "erased_by_id" "e
                       rased_at" "artifacts_expire_at" "environment" "when" "yaml_variables"
                        "queued_at" "token" "lock_version" "coverage_regex" "auto_canceled_b
                       y_id" "retried" "protected" "failure_reason" "scheduled_at" "token_en
                       crypted" "upstream_pipeline_id" "resource_group_id" "waiting_for_reso
                       urce_at" "processed" "scheduling_type" "id" "stage_id" "partition_id"
                       )
                       }
                    :rtekind 0
                    :relid 11295631
                    :relkind r
                    :rellockmode 3
                    :tablesample <>
                    :lateral false
                    :inh false
                    :inFromCl false
                    :requiredPerms 0
                    :checkAsUser 0
                    :selectedCols (b 16 53)
                    :insertedCols (b)
                    :updatedCols (b)
                    :extraUpdatedCols (b)
                    :securityQuals <>
                    }
                 )
                 :resultRelations (i 2)
                 :rootResultRelations (i 1)
                 :subplans <>
                 :rewindPlanIDs (b)
                 :rowMarks <>
                 :relationOids (o 11294979 11295631)
                 :invalItems <>
                 :paramExecTypes (o 0)
                 :utilityStmt <>
                 :stmt_location 0
                 :stmt_len 0
                 }
      
      2022-07-14 16:51:30.782 EEST [67558] CONTEXT:  SQL statement "DELETE FROM "public"."p_ci_builds" WHERE $1 OPERATOR(pg_catalog.=) "commit_id" AND $2 OPERATOR(pg_catalog.=) "partition_id""
      2022-07-14 16:51:30.782 EEST [67558] STATEMENT:  explain analyze delete from p_ci_pipelines where id = 516 and partition_id = 102;
      2022-07-14 16:51:30.782 EEST [67558] DEBUG:  CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 8106593/1/2
      2022-07-14 16:51:30.898 EEST [64971] DEBUG:  snapshot of 0+0 running transaction ids (lsn 9/E7DE07E8 oldest xid 8106594 latest complete 8106593 next xid 8106594)
      
      Database config used to capture the logs
      log_directory = 'pg_log'
      log_filename = 'postgresql-dateformat.log'
      log_statement = 'all'
      logging_collector = on
      debug_print_parse = on
      debug_print_rewritten = on
      debug_print_plan = on
      log_min_messages = 'DEBUG5'
    • Deleting a pipeline by id from p_ci_pipelines deletes the corresponding builds from the right partition

      But does it need to scan all partitions if p_ci_builds are also partitioned? Meaning that if we don't provide partitioning key PostgreSQL will still need to scan all partitions, right @mbobin? I think that it highlights the importance of having uniform partition_id assign to a pipeline and related resources.

    • Author Maintainer

      @grzesiek

      But does it need to scan all partitions if p_ci_builds are also partitioned?

      No, it doesn't scan all partitions because we need to include the partition key in the definition: FOREIGN KEY (commit_id, partition_id) REFERENCES p_ci_pipelines(id, partition_id).

      with both tables partitioned, I've executed a query to delete a pipeline. Notice that in this case it iterates through all pipeline partitions and triggers the FK removal only on the partition that holds the data(p_ci_pipelines_103):

      gitlabhq_development_ci=# explain analyze delete from p_ci_pipelines where id = 564;
                                                                                 QUERY PLAN
      -----------------------------------------------------------------------------------------------------------------------------------------------------------------
       Delete on p_ci_pipelines  (cost=0.15..12.98 rows=6 width=6) (actual time=0.045..0.046 rows=0 loops=1)
         Delete on ci_pipelines
         Delete on p_ci_pipelines_101
         Delete on p_ci_pipelines_102
         Delete on p_ci_pipelines_103
         Delete on p_ci_pipelines_104
         Delete on p_ci_pipelines_105
         ->  Index Scan using ci_pipelines_pkey on ci_pipelines  (cost=0.15..2.17 rows=1 width=6) (actual time=0.011..0.011 rows=0 loops=1)
               Index Cond: (id = 564)
         ->  Index Scan using p_ci_pipelines_101_id_partition_id_key on p_ci_pipelines_101  (cost=0.14..2.16 rows=1 width=6) (actual time=0.002..0.002 rows=0 loops=1)
               Index Cond: (id = 564)
         ->  Index Scan using p_ci_pipelines_102_id_partition_id_key on p_ci_pipelines_102  (cost=0.14..2.16 rows=1 width=6) (actual time=0.004..0.004 rows=0 loops=1)
               Index Cond: (id = 564)
         ->  Index Scan using p_ci_pipelines_103_id_partition_id_key on p_ci_pipelines_103  (cost=0.14..2.16 rows=1 width=6) (actual time=0.006..0.006 rows=1 loops=1)
               Index Cond: (id = 564)
         ->  Index Scan using p_ci_pipelines_104_id_partition_id_key on p_ci_pipelines_104  (cost=0.14..2.16 rows=1 width=6) (actual time=0.004..0.004 rows=0 loops=1)
               Index Cond: (id = 564)
         ->  Index Scan using p_ci_pipelines_105_id_partition_id_key on p_ci_pipelines_105  (cost=0.14..2.16 rows=1 width=6) (actual time=0.004..0.004 rows=0 loops=1)
               Index Cond: (id = 564)
       Planning Time: 23.025 ms
       Trigger for constraint p_ci_builds_commit_id_partition_id_fkey3 on p_ci_pipelines_103: time=0.268 calls=1
       Execution Time: 0.451 ms
      (22 rows)
      

      And in the low level logs we can see that it uses both attributes(commit_id, partition_id) from the KF definition to delete the builds data, meaning that it will scan only p_ci_builds_103 (partition_id=103):

      2022-07-18 15:24:32.439 EEST [98692] CONTEXT:  SQL statement "DELETE FROM "public"."p_ci_builds" WHERE $1 OPERATOR(pg_catalog.=) "commit_id" AND $2 OPERATOR(pg_catalog.=) "partition_id""
      2022-07-18 15:24:32.439 EEST [98692] STATEMENT:  explain analyze delete from p_ci_pipelines where id = 564;

      If we do provide a partition_id for pipelines deletion it will scan only that partition.

      Does this answer your question?

    • Author Maintainer

      I think we will need to add the partition_id column to all the tables that reference a partitioned table to make the FK work and to fetch the referenced record without scanning all partitions.

    • It answers my question @mbobin. I wonder how difficult it will actually be to redefine foreign keys, as we would need to also replace indexes, right? We know it might be difficult in the current setup.

    • Author Maintainer

      @grzesiek I think we'll have to add partition_id to at least 23 tables and update their FK definitions.

      [45] pry(main)> ci_tables = ::Gitlab::Database::GitlabSchema.tables_to_schema.select { |k,v| v == :gitlab_ci }.keys.map { |t| Class.new(Ci::ApplicationRecord) { |c| c.table_name = t } } ; nil
      => nil
      [46] pry(main)> ci_tables.size
      => 54
      [53] pry(main)> ci_tables.select { |t| (t.column_names & ['build_id', 'job_id', 'stage_id', 'pipeline_id', 'commit_id']).any? }.size
      => 23
      [54] pry(main)> ci_tables.select { |t| (t.column_names & ['build_id', 'job_id', 'stage_id', 'pipeline_id', 'commit_id']).any? }.map(&:table_name)
      => ["ci_build_needs",
       "ci_build_pending_states",
       "ci_build_report_results",
       "ci_builds",
       "ci_builds_metadata",
       "ci_builds_runner_session",
       "ci_build_trace_chunks",
       "ci_build_trace_metadata",
       "ci_job_artifacts",
       "ci_job_variables",
       "ci_pending_builds",
       "ci_pipeline_artifacts",
       "ci_pipeline_chat_data",
       "ci_pipeline_messages",
       "ci_pipelines_config",
       "ci_pipeline_variables",
       "ci_resources",
       "ci_running_builds",
       "ci_sources_pipelines",
       "ci_sources_projects",
       "ci_stages",
       "ci_trigger_requests",
       "ci_unit_test_failures"]
    • Please register or sign in to reply
    • Author Maintainer

      @grzesiek my first thoughts here are that we'll need to redefine the PK and FK constraints to include the partition_id and other required indices. But after that the cascading FK deletion should just work within the specified partitions.

    • Author Maintainer

      To sum up:

      • we need to add an UNIQUE (id, partition_id) constraint on all partitioned tables
      • we need to add partition_id to all tables that reference a partitioned table. This might complicate things later on when we decide to partition those tables because we'd need to move data around, but it should be similar to the plan for moving existing pipelines data to partition_id < 100 partitions
      • update foreign key definitions to use (relation_id, partition_id) and indexes where necessary

      @grzesiek do we have enough information to close #358943 (closed)?

      /cc @morefice

    • do we have enough information to close #358943 (closed)?

      @mbobin Yes! I believe that we can close this, but perhaps let's do that after we update the partitioning design document. I believe that we need to update this page https://docs.gitlab.com/ee/architecture/blueprints/ci_data_decay/pipeline_partitioning.html with what we've learned from the PoC!

      Let's discuss the ideas about how to adjust our direction in the diff of that merge request, because we initially planned to add partition_id only to a few tables. /cc @morefice

    • Author Maintainer

      MR opened at !92781 (merged)

    • Awesome work @mbobin

      Thanks for the heads up, catching up everything that has been discussed in this MR :eyes:

      Should we create the associated issues related to our latest discoveries?

      update foreign key definitions to use (relation_id, partition_id) and indexes where necessary

      This will be fun :see_no_evil:

      Edited by Max Orefice
    • Please register or sign in to reply
  • Allure report

    allure-report-publisher generated test report!

    review-qa-blocking: :x: test report for 57e429f3

    expand test summary
    +-----------------------------------------------------------------------------------------+
    |                                     suites summary                                      |
    +------------------------------------+--------+--------+---------+-------+-------+--------+
    |                                    | passed | failed | skipped | flaky | total | result |
    +------------------------------------+--------+--------+---------+-------+-------+--------+
    | Create                             | 0      | 23     | 2       | 23    | 25    | ❌     |
    | Plan                               | 0      | 47     | 1       | 47    | 48    | ❌     |
    | Verify                             | 0      | 12     | 1       | 12    | 13    | ❌     |
    | Manage                             | 3      | 37     | 0       | 40    | 40    | ❌     |
    | Version sanity check               | 0      | 0      | 1       | 0     | 1     | ➖     |
    | Configure                          | 0      | 0      | 1       | 0     | 1     | ➖     |
    | Feature flag handler sanity checks | 9      | 0      | 0       | 9     | 9     | ❗     |
    | Package                            | 0      | 0      | 1       | 0     | 1     | ➖     |
    | Protect                            | 0      | 2      | 0       | 2     | 2     | ❌     |
    | Secure                             | 0      | 2      | 0       | 2     | 2     | ❌     |
    +------------------------------------+--------+--------+---------+-------+-------+--------+
    | Total                              | 12     | 123    | 7       | 135   | 142   | ❌     |
    +------------------------------------+--------+--------+---------+-------+-------+--------+
  • Marius Bobin mentioned in merge request !92535 (merged)

    mentioned in merge request !92535 (merged)

  • Marius Bobin added 1 commit

    added 1 commit

    Compare with previous version

  • 10 10 deployments: 2,
    11 11 milestones: 3,
    12 12 epics: 4,
    13 ci_pipelines: 5,
    13 p_ci_pipelines: 5,
  • Marius Bobin mentioned in merge request !92781 (merged)

    mentioned in merge request !92781 (merged)

  • closed

  • Max Orefice mentioned in merge request !96815 (merged)

    mentioned in merge request !96815 (merged)

  • Marius Bobin mentioned in epic &7522

    mentioned in epic &7522

  • James Heimbuck added Category:Continuous Integration label and removed 1 deleted label

    added Category:Continuous Integration label and removed 1 deleted label

  • mentioned in merge request mbobin/headway!1 (closed)

  • Please register or sign in to reply
    Loading