Skip to content

Draft: Enable intarray extension and create new index for traversal_ids

What does this MR do and why?

Describe in detail what your merge request does and why.

Screenshots or screen recordings

== 20210908093746 EnableIntArrayExtension: migrating ==========================
-- execute("CREATE EXTENSION IF NOT EXISTS intarray")
   -> 0.0161s
== 20210908093746 EnableIntArrayExtension: migrated (0.0162s) =================

== 20210908094535 AddNamespaceTraversalIdsIntArrayIndex: migrating ============
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:namespaces, :traversal_ids, {:using=>:gin, :name=>"index_namespaces_on_traversal_ids_int_array", :opclass=>{:name=>:gin__int_ops}, :algorithm=>:concurrently})
   -> 0.0090s
-- execute("SET statement_timeout TO 0")
   -> 0.0006s
-- add_index(:namespaces, :traversal_ids, {:using=>:gin, :name=>"index_namespaces_on_traversal_ids_int_array", :opclass=>{:name=>:gin__int_ops}, :algorithm=>:concurrently})
   -> 0.0042s
-- execute("RESET statement_timeout")
   -> 0.0006s
== 20210908094535 AddNamespaceTraversalIdsIntArrayIndex: migrated (0.0162s) ===

== 20210908095019 RemoveNamespaceTraversalIdsIndex: migrating =================
-- transaction_open?()
   -> 0.0000s
-- indexes(:namespaces)
   -> 0.0065s
-- remove_index(:namespaces, {:algorithm=>:concurrently, :name=>"index_namespaces_on_traversal_ids"})
   -> 0.0035s
== 20210908095019 RemoveNamespaceTraversalIdsIndex: migrated (0.0112s) ========

In postgres.ai, the creation time of this index was 1.467 min.

This change can make requests that usually times out to perform with a great time. For example:

SELECT namespaces.*
FROM
  (SELECT DISTINCT on(namespaces.id) namespaces.*
   FROM namespaces,

     (SELECT namespaces.id
      FROM namespaces
      INNER JOIN members ON namespaces.id = members.source_id
      WHERE members.type = 'GroupMember'
        AND members.source_type = 'Namespace'
        AND namespaces.type = 'Group'
        AND members.user_id = 1614863
        AND members.requested_at IS NULL
        AND (access_level >= 10)
        AND members.access_level IN (40,
                                     50)) base
   WHERE (namespaces.traversal_ids @> ARRAY[base.id])) namespaces

This query went from timing out to 1ms (query plan)

How to set up and validate locally

The index and the extensions are visible in the file structure.sql.

MR acceptance checklist

These checklists encourage us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Quality

  • Confirmed
  1. I have self-reviewed this MR per code review guidelines.
  2. For the code that that this change impacts, I believe that the automated tests (Testing Guide) validate functionality that is highly important to users (including consideration of all test levels). If the existing automated tests do not cover this functionality, I have added the necessary additional tests or I have added an issue to describe the automation testing gap and linked it to this MR.
  3. I have considered the technical aspects of the impact of this change on both gitlab.com hosted customers and self-hosted customers.
  4. I have considered the impact of this change on the front-end, back-end, and database portions of the system where appropriate and applied frontend, backend and database labels accordingly.
  5. I have tested this MR in all supported browsers, or determiend that this testing is not needed.
  6. I have confirmed that this change is backwards compatible across updates, or I have decided that this does not apply.
  7. I have properly separated EE content from FOSS, or this MR is FOSS only. (Where should EE code go?)
  8. If I am introducing a new expectation for existing data, I have confirmed that existing data meets this expectation or I have made this expectation optional rather than required.

Performance, reliability and availability

  • Confirmed
  1. I am confident that this MR does not harm performance, or I have asked a reviewer to help assess the performance impact. (Merge request performance guidelines)
  2. I have added information for database reviewers in the MR description, or I have decided that it is unnecessary. (Does this MR have database-related changes?)
  3. I have considered the availability and reliability risks of this change. I have also considered the scalability risk based on future predicted growth
  4. I have considered the performance, reliability and availability impacts of this change on large customers who may have significantly more data than the average customer.

Documentation

  • Confirmed
  1. I have included changelog trailers, or I have decided that they are not needed. (Does this MR need a changelog?)
  2. I have added/updated documentation, or I have decided that documentation changes are not needed for this MR. (Is documentation required?)

Security

  • Confirmed
  1. I have confirmed that if this MR contains changes to processing or storing of credentials or tokens, authorization, and authentication methods, or other items described in the security review guidelines, I have added the label security and I have @-mentioned @gitlab-com/gl-security/appsec.

Deployment

  • Confirmed
  1. I have considered using a feature flag for this change because the change may be high risk. If I decided to use a feature flag, I plan to test the change in staging before I test it in production, and I have considered rolling it out to a subset of production customers before doing rolling it out to all customers. When to use a feature flag
  2. I have informed the Infrastructure department of a default setting or new setting change per definition of done, or decided that this is not needed.

Merge request reports