Skip to content

Improve label search with issuable optimization

What does this MR do?

This MR addresses the label finder problem which was discovered when the optimized_issuable_label_filter feature flag was enabled on PRD (#259719 (closed))

The original query is quite slow (cached execution): https://explain.depesz.com/s/QsW3

Note: the change is behind a feature flag. Later we'd like to enable it on production for a short period of time to measure the performance.

New query:

  • Skips the ORDER BY, we don't care about the order.
  • Uses UNION ALL since having duplicated results is acceptable (unlikely though)
  • Matches the index on group_id and project_id.
    • For this I had to replace an existing index which was "broken".
SELECT "labels"."title",
       "labels"."id"
FROM (
        (SELECT "labels".*
         FROM "labels"
         WHERE "labels"."project_id" IS NULL
           AND "labels"."title" IN ('bug')
           AND "labels"."group_id" IN
             (WITH RECURSIVE "base_and_descendants" AS (
                                                          (SELECT "namespaces".*
                                                           FROM "namespaces"
                                                           WHERE "namespaces"."type" = 'Group'
                                                             AND "namespaces"."id" = 9970)
                                                        UNION
                                                          (SELECT "namespaces".*
                                                           FROM "namespaces",
                                                                "base_and_descendants"
                                                           WHERE "namespaces"."type" = 'Group'
                                                             AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "id"
              FROM "base_and_descendants" AS "namespaces"))
      UNION ALL
        (SELECT "labels".*
         FROM "labels"
         WHERE "labels"."group_id" is NULL
           AND "labels"."title" IN ('bug')
           AND "labels"."project_id" IN
             (SELECT "projects"."id"
              FROM "projects"
              WHERE "projects"."namespace_id" IN
                  (WITH RECURSIVE "base_and_descendants" AS (
                                                               (SELECT "namespaces".*
                                                                FROM "namespaces"
                                                                WHERE "namespaces"."type" = 'Group'
                                                                  AND "namespaces"."id" = 9970)
                                                             UNION
                                                               (SELECT "namespaces".*
                                                                FROM "namespaces",
                                                                     "base_and_descendants"
                                                                WHERE "namespaces"."type" = 'Group'
                                                                  AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "id"
                   FROM "base_and_descendants" AS "namespaces")))) labels

The uncached measurements were <1s on db-lab.

Index replacement

I noticed that we have a weird index definition for group_id, title:

"index_labels_on_group_id_and_title" btree (group_id, title) WHERE project_id = NULL::integer

Inspecting the index, it seems that it's empty:

Schema |                Name                | Type  | Owner  | Table  |    Size    | Description
--------+------------------------------------+-------+--------+--------+------------+-------------
 public | index_labels_on_group_id_and_title | index | gitlab | labels | 8192 bytes |
(1 row)

Verifying that the index is not in use:

 schemaname | tablename |             indexname              |  num_rows  | table_size | index_size | unique | number_of_scans | tuples_read | tuples_fetched
------------+-----------+------------------------------------+------------+------------+------------+--------+-----------------+-------------+----------------
 public     | labels    | index_labels_on_group_id_and_title | 1.2925e+07 | 1276 MB    | 8192 bytes | N      |               0 |           0 |              0
(1 row)

Up:

== 20201120125953 ReplaceUnusedLabelsIndex: migrating =========================
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:labels, [:group_id, :title], {:where=>"project_id IS NULL", :name=>"index_on_labels_group_id_and_title_with_null_project_id", :algorithm=>:concurrently})
   -> 0.0027s
-- add_index(:labels, [:group_id, :title], {:where=>"project_id IS NULL", :name=>"index_on_labels_group_id_and_title_with_null_project_id", :algorithm=>:concurrently})
   -> 0.0062s
-- transaction_open?()
   -> 0.0000s
-- indexes(:labels)
   -> 0.0032s
-- remove_index(:labels, {:algorithm=>:concurrently, :name=>"index_labels_on_group_id_and_title"})
   -> 0.0019s
== 20201120125953 ReplaceUnusedLabelsIndex: migrated (0.0149s) ================

Down:

== 20201120125953 ReplaceUnusedLabelsIndex: reverting =========================
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:labels, [:group_id, :title], {:where=>"project_id = NULL::integer", :name=>"index_labels_on_group_id_and_title", :algorithm=>:concurrently})
   -> 0.0036s
-- execute("SET statement_timeout TO 0")
   -> 0.0006s
-- add_index(:labels, [:group_id, :title], {:where=>"project_id = NULL::integer", :name=>"index_labels_on_group_id_and_title", :algorithm=>:concurrently})
   -> 0.0153s
-- execute("RESET ALL")
   -> 0.0006s
-- transaction_open?()
   -> 0.0000s
-- indexes(:labels)
   -> 0.0029s
-- remove_index(:labels, {:algorithm=>:concurrently, :name=>"index_on_labels_group_id_and_title_with_null_project_id"})
   -> 0.0015s
== 20201120125953 ReplaceUnusedLabelsIndex: reverted (0.0259s) ================

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Adam Hegyi

Merge request reports