Add index to ci_job_artifacts
Ref: #362640 (closed)
What does this MR do and why?
Following up !88427 (merged) where we introduced a new index on ci_job_artifacts: (project_id, id)
during low traffic this weekend.
This MR is adding a migration which will create our index for users outside of gitlab.com
as explained in our documentation.
Why are we doing this?
This index will be used to order job artifacts by id
instead of created_at
to recalculate job artifacts size which is currently broken for some projects.
Database queries
✅ Verification that our index exists on production
Table "public.ci_job_artifacts"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
-------------------+--------------------------+-----------+----------+----------------------------------------------+----------+--------------+-------------
project_id | integer | | not null | | plain | |
file_type | integer | | not null | | plain | |
size | bigint | | | | plain | |
created_at | timestamp with time zone | | not null | | plain | |
updated_at | timestamp with time zone | | not null | | plain | |
expire_at | timestamp with time zone | | | | plain | |
file | character varying | | | | extended | |
file_store | integer | | | 1 | plain | |
file_sha256 | bytea | | | | extended | |
file_format | smallint | | | | plain | |
file_location | smallint | | | | plain | |
id | bigint | | not null | nextval('ci_job_artifacts_id_seq'::regclass) | plain | |
job_id | bigint | | not null | | plain | |
locked | smallint | | | 2 | plain | |
original_filename | text | | | | extended | |
Indexes:
"ci_job_artifacts_pkey" PRIMARY KEY, btree (id)
"index_ci_job_artifacts_on_job_id_and_file_type" UNIQUE, btree (job_id, file_type)
"index_ci_job_artifacts_for_terraform_reports" btree (project_id, id) WHERE file_type = 18
"index_ci_job_artifacts_id_for_terraform_reports" btree (id) WHERE file_type = 18
"index_ci_job_artifacts_on_expire_at_and_job_id" btree (expire_at, job_id)
"index_ci_job_artifacts_on_expire_at_for_removal" btree (expire_at) WHERE locked = 0 AND expire_at IS NOT NULL
"index_ci_job_artifacts_on_file_store" btree (file_store)
"index_ci_job_artifacts_on_file_type_for_devops_adoption" btree (file_type, project_id, created_at) WHERE file_type = ANY (ARRAY[5, 6, 8, 23])
"index_ci_job_artifacts_on_id_project_id_and_created_at" btree (project_id, created_at, id)
"index_ci_job_artifacts_on_id_project_id_and_file_type" btree (project_id, file_type, id)
"index_ci_job_artifacts_on_project_id" btree (project_id)
+ "index_ci_job_artifacts_on_project_id_and_id" btree (project_id, id)
"index_ci_job_artifacts_on_project_id_for_security_reports" btree (project_id) WHERE file_type = ANY (ARRAY[5, 6, 7, 8])
"tmp_index_ci_job_artifacts_on_expire_at_where_locked_unknown" btree (expire_at, job_id) WHERE locked = 2 AND expire_at IS NOT NULL
"tmp_index_ci_job_artifacts_on_id_where_trace_and_expire_at" btree (id) WHERE file_type = 3 AND (expire_at = ANY (ARRAY['2021-04-22 00:00:00+00'::timestamp with time zone, '2021-05-22 00:00:00+00'::timestamp with time zone, '2021-06-22 00:00:00+00'::timestamp with time zone, '2022-01-22 00:00:00+00'::timestamp with time zone, '2022-02-22 00:00:00+00'::timestamp with time zone, '2022-03-22 00:00:00+00'::timestamp with time zone, '2022-04-22 00:00:00+00'::timestamp with time zone]))
Check constraints:
"check_27f0f6dbab" CHECK (file_store IS NOT NULL)
"check_85573000db" CHECK (char_length(original_filename) <= 512)
Foreign-key constraints:
"fk_rails_c5137cb2c1" FOREIGN KEY (job_id) REFERENCES ci_builds(id) ON DELETE CASCADE
Referenced by:
TABLE "ci_build_trace_metadata" CONSTRAINT "fk_21d25cac1a" FOREIGN KEY (trace_artifact_id) REFERENCES ci_job_artifacts(id) ON DELETE CASCADE
TABLE "ci_job_artifact_states" CONSTRAINT "fk_rails_80a9cba3b2" FOREIGN KEY (job_artifact_id) REFERENCES ci_job_artifacts(id) ON DELETE CASCADE
Access method: heap
Migration output
ci: == 20220530074027 AddProjectIdIndexToJobArtifacts: migrating ==================
ci: -- transaction_open?()
ci: -> 0.0000s
ci: -- index_exists?(:ci_job_artifacts, [:project_id, :id], {:name=>"index_ci_job_artifacts_on_project_id_and_id", :algorithm=>:concurrently})
ci: -> 0.0053s
ci: -- execute("SET statement_timeout TO 0")
ci: -> 0.0001s
ci: -- add_index(:ci_job_artifacts, [:project_id, :id], {:name=>"index_ci_job_artifacts_on_project_id_and_id", :algorithm=>:concurrently})
ci: -> 0.0015s
ci: -- execute("RESET statement_timeout")
ci: -> 0.0001s
ci: == 20220530074027 AddProjectIdIndexToJobArtifacts: migrated (0.0112s) =========
-------------------------- ROLLBACK --------------------------------
ci: == 20220530074027 AddProjectIdIndexToJobArtifacts: reverting ==================
ci: -- transaction_open?()
ci: -> 0.0000s
ci: -- indexes(:ci_job_artifacts)
ci: -> 0.0062s
ci: -- execute("SET statement_timeout TO 0")
ci: -> 0.0001s
ci: -- remove_index(:ci_job_artifacts, {:algorithm=>:concurrently, :name=>"index_ci_job_artifacts_on_project_id_and_id"})
ci: -> 0.0010s
ci: -- execute("RESET statement_timeout")
ci: -> 0.0001s
ci: == 20220530074027 AddProjectIdIndexToJobArtifacts: reverted (0.0116s) =========
NEW query without index
SELECT "ci_job_artifacts"."id", "ci_job_artifacts"."size"
FROM "ci_job_artifacts"
WHERE "ci_job_artifacts"."project_id" = 278964
AND "ci_job_artifacts"."id" > 500000
AND "ci_job_artifacts"."id" <= 700000
ORDER BY "ci_job_artifacts"."id" ASC
LIMIT 1000;
NEW query with index
SELECT "ci_job_artifacts"."id", "ci_job_artifacts"."size"
FROM "ci_job_artifacts"
WHERE "ci_job_artifacts"."project_id" = 278964
AND "ci_job_artifacts"."id" > 500000
AND "ci_job_artifacts"."id" <= 700000
ORDER BY "ci_job_artifacts"."id" ASC
LIMIT 1000;
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Edited by Max Orefice