Skip to content

Fix N+1 when loading Candidate Artifacts

What does this MR do and why?

This is a follow up on !101442 (merged), fixing the N+1 introduced.

Database

Migrations

Query:

SELECT
    packages_packages.*
FROM
    "packages_packages"
    INNER JOIN ml_candidates ON packages_packages.name = (concat('ml_candidate_', ml_candidates.id))
WHERE
    "ml_candidates"."id" IN (15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 1)
ORDER BY
    "packages_packages"."id" ASC
LIMIT 1000

Query Plan:

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/14140/commands/49511

FAQ

  1. Why not....
  • a simple Foreign Key?

The artifact is uploaded through MLFlow Client. GitLab gives the client an endpoint to the generic package, which is used for upload, but we have no hook to create the association once the package is created. Using the name of the package is how we managed to create a "virtual foreign key"

  • a scope, similar to including_metrics_and_params?

In theory the following code works when fetching the artifact for a single candidate:

has_one :package, 
        ->(candidate) { unscope(where: :candidate_id).find_by(project_id: candidate.project_id,
                                                                  name: candidate.package_name,
                                                                  version: candidate.package_version) },
        class_name: "Packages::Package"

But ActiveRecord doesn't support preloading instance dependent associations.

Another option:

has_one :package, -> do
      joins("INNER JOIN ml_candidates ON name = CONCAT('ml_candidate_', ml_candidates.iid)").unscope(:where)
    end, class_name: 'Packages::Package'

This almost works, but it generates the following query:

SELECT "packages_packages".* FROM "packages_packages" INNER JOIN "ml_candidates" ON name = CONCAT('ml_candidate_', ml_candidates.iid) WHERE "packages_packages"."candidate_id" IN ('1', '2', '3', '4', '5');

When it should be

SELECT "packages_packages".* FROM "packages_packages" INNER JOIN "ml_candidates" ON name = CONCAT('ml_candidate_', ml_candidates.iid) WHERE "ml_candidates"."candidate_id" IN ('1', '2', '3', '4', '5');

How to set up and validate locally

Tests should show the error is fixed

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #370480 (closed)

Edited by Eduardo Bonet

Merge request reports