Skip to content

Optimize artifact management page queries

Albert requested to merge 387765-optimize-query-artifact-management-page into master

What does this MR do and why?

This MR improves the performance of the job artifact management page.

A few changes were made to improve the performance of this page:

  1. Update graphQL query so it can use existing ci_builds indices.
  • count is removed as this times out without adding a new index on ci_builds
  • statuses is removed as only we are interested in any job that has any artifact.
  1. Change scope to select job artifact of any type.

The GraphQL query and resolver was initially added in !104304 (merged).

Improvement result

Query to list ci_builds and query plan before the change:

Query to list ci_builds and query plan after the change:

SELECT
    ci_builds.*
FROM
    ci_builds
WHERE
    ci_builds.type = 'Ci::Build' AND
    ci_builds.project_id = 278964 AND
    ci_builds.status IN ( 'created' ) AND
    EXISTS (
        SELECT
            1
        FROM
            ci_job_artifacts
        WHERE
            ci_builds.id = ci_job_artifacts.job_id
    )
ORDER BY
    ci_builds.id DESC
LIMIT 21;
Limit  (cost=1.28..469.44 rows=21 width=1251) (actual time=0.281..0.905 rows=21 loops=1)
  ->  Nested Loop Semi Join  (cost=1.28..208767794.85 rows=9364762 width=1251) (actual time=0.280..0.902 rows=21 loops=1)
        ->  Index Scan Backward using index_ci_builds_on_project_id_and_id on ci_builds  (cost=0.70..127965237.51 rows=91862977 width=1251) (actual time=0.018..0.415 rows=198 loops=1)
              Index Cond: (project_id = 278964)
              Filter: (((status)::text <> 'created'::text) AND ((type)::text = 'Ci::Build'::text))
              Rows Removed by Filter: 19
        ->  Index Only Scan using index_ci_job_artifacts_on_job_id_and_file_type on ci_job_artifacts  (cost=0.58..2.51 rows=7 width=8) (actual time=0.002..0.002 rows=0 loops=198)
              Index Cond: (job_id = ci_builds.id)
              Heap Fetches: 21
Planning Time: 1.145 ms
Execution Time: 0.959 ms

Note that the query still performs a filter Filter: (((ci_builds.status)::text <> 'created'::text) AND ((ci_builds.type)::text = 'Ci::Build'::text)).

This is unavoidable without adding a new index on ci_builds. Given that we have addressed the hot spot in the previous query, which is the filter on job artifact type, and that this feature is behind a feature flag, we can move forward with this change for now.

For more detail on the change in query performance, refer to the postgres.ai links above.

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

How to set up and validate locally

  1. Enable the artifacts_management_page feature flag
    Feature.enable(:artifacts_management_page)
  2. Create a pipeline that has both jobs with artifacts and jobs without artifacts.
  3. Visit the project's artifacts page `http://gdk.test:3000/project/path/-/artifacts.
  4. The page should list jobs with their artifacts.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #387765 (closed)

Edited by Albert

Merge request reports