Classify and fan out cross-database query issues

Next step of #337077 (closed).

For each failing spec in https://gitlab.com/gitlab-org/gitlab/-/blob/8d6e6e652e61aab61f06dc3e07f169de0a0a23d7/.cross-join-allowlist.yml, create an issue and fan it out to the relevant group.

Background

In #337077 (closed), we disallowed any query that joins tables across the main: and ci: databases. For example, SELECT * FROM projects INNER JOIN ci_builds ON ci_builds.project_id = projects.id is forbidden.
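To make the rule concrete, here is a minimal sketch of how a query could be classified as cross-database. It is not the check implemented in #337077: the table-to-database mapping and all method names are hypothetical, and the regex-based table extraction is a deliberate simplification that would miss subqueries, quoted identifiers, and CTEs.

```ruby
# Hypothetical mapping of tables to databases; the real mapping is not part of this issue.
TABLE_DATABASE = {
  'projects'     => :main,
  'users'        => :main,
  'ci_builds'    => :ci,
  'ci_pipelines' => :ci
}.freeze

# Naive extraction of the table names that follow FROM or JOIN.
def referenced_tables(sql)
  sql.scan(/\b(?:FROM|JOIN)\s+([a-z_][a-z0-9_]*)/i).flatten.map(&:downcase).uniq
end

# Databases touched by the query; tables missing from the mapping are ignored.
def databases_for(sql)
  referenced_tables(sql).map { |table| TABLE_DATABASE[table] }.compact.uniq
end

# A query is a cross-database join when its tables live in more than one database.
def cross_database_join?(sql)
  databases_for(sql).size > 1
end
```

With this sketch, the example query above is flagged because projects maps to main: and ci_builds maps to ci:, while a join between ci_builds and ci_pipelines is not.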

There are of course existing queries that cross databases. These are allowlisted by allowing the specs that produce them; the list is in spec/support/database/cross-join-allowlist.yml.

In some cases the spec allowlist is probably too broad, so we switch to an alternative allowlist method, allow_cross_joins_across_databases, which allows a single call site instead of a whole spec. This also lets us annotate the call site with a link to the issue where the cross-database query will be fixed.
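The difference between allowing a whole spec and allowing a single call site can be sketched as below. This is a stand-in, not GitLab's implementation: the CrossJoinGuard module, the report_cross_join! hook, the url: keyword, and the example URL are all assumptions made for illustration.

```ruby
# Hypothetical stand-in for a block-scoped allowlist; not GitLab's actual code.
module CrossJoinGuard
  class CrossJoinError < StandardError; end

  # Runs the block with cross-database joins permitted.
  # `url` (an assumed parameter name) records the issue tracking the fix.
  def self.allow_cross_joins_across_databases(url:)
    previous = Thread.current[:cross_joins_allowed_url]
    Thread.current[:cross_joins_allowed_url] = url
    yield
  ensure
    # Restore the previous state so the allowance never leaks past the block.
    Thread.current[:cross_joins_allowed_url] = previous
  end

  def self.allowed?
    !Thread.current[:cross_joins_allowed_url].nil?
  end

  # Called wherever a cross-database join has been detected.
  def self.report_cross_join!(sql)
    return if allowed?

    raise CrossJoinError, "cross-database join: #{sql}"
  end
end

# Usage: only the wrapped call site is allowed, and it carries its issue link.
CrossJoinGuard.allow_cross_joins_across_databases(url: 'https://example.com/issues/1') do
  CrossJoinGuard.report_cross_join!('SELECT * FROM projects INNER JOIN ci_builds ON ...')
end
```

Because the allowance is restored in an ensure clause, a cross-database query issued outside the block still raises, which is what makes this approach narrower than allowlisting the whole spec.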

Problem

There are 418 allowlisted specs, which means there are at least that many cross-database queries. Some of these queries may not have an issue opened yet, though the majority should already have been discovered in the PoC MR.

Even so, the more call sites we annotate with allow_cross_joins_across_databases, the less we need to rely on the PoC MR, and the sooner we can stop rebasing it every so often, which is painful.

Proposal

  1. Gather all candidate call sites - DONE using !69037 (closed)
  2. Open relevant issues, and also add allow_cross_joins_across_databases where possible. (see #337077 (closed)) - DONE
  3. Add ::Gitlab::Database.allow_cross_joins_across_databases to the offending call sites. - DONE
  4. Write a way to automatically regenerate the .cross-join-allowlist.yml allowlist so that we can shrink it over time. - IN PROGRESS. We can use !69037 (closed) for now.
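Step 4 could be sketched roughly as follows. This is not the approach taken in !69037: it assumes the allowlist is a flat YAML list of spec paths and that some earlier step (not shown) has already collected the specs that still trigger un-annotated cross-database queries. Both method names are hypothetical.

```ruby
require 'yaml'

# Builds the new allowlist content from the specs that still fail the check.
# Assumes the file is a flat YAML list of spec paths.
def regenerate_allowlist(failing_specs)
  failing_specs.map(&:strip).reject(&:empty?).uniq.sort.to_yaml
end

# Lists entries present in the old allowlist but no longer needed,
# which is a quick way to confirm that the list is shrinking.
def removed_entries(old_yaml, new_yaml)
  (YAML.safe_load(old_yaml) || []) - (YAML.safe_load(new_yaml) || [])
end
```

Sorting and de-duplicating the entries keeps the regenerated file stable between runs, so a diff of the allowlist only shows specs that were actually added or removed.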

New issues opened

Existing issue linked

Edited by Thong Kuah