Skip to content
Snippets Groups Projects

Allow secondary emails in user search

Merged Catalin Irimie requested to merge cat-user-search-secondary-emails into master
All threads resolved!

What does this MR do?

Allows searching users by email using secondary emails as well - implements #26110 (closed).

User searching by email only happened with an equality check for the primary email. We already had a search_with_secondary_emails helper however inefficient because it was using an IN condition.

Since there is a unique constraint on secondary emails as well, that can be updated with a strict equality check and make the query perform similarly to the one without secondary emails check.

That allows us to default to searching through secondary emails as well (when using the API, project member list etc).

Uses the :user_search_secondary_email feature flag, rollout issue in #282137 (closed).

Query plans

Examples using my (primary) email address:

Before

Depesz and raw SQL

Expand for plan & raw SQL
SELECT
    "users".*
FROM
    "users"
WHERE (("users"."name" ILIKE '%cirimie@gitlab.com%'
        OR "users"."username" ILIKE '%cirimie@gitlab.com%')
    OR "users"."email" = 'cirimie@gitlab.com')
ORDER BY
    CASE WHEN users.name = 'cirimie@gitlab.com' THEN
        0
    WHEN users.username = 'cirimie@gitlab.com' THEN
        1
    WHEN users.email = 'cirimie@gitlab.com' THEN
        2
    ELSE
        3
    END,
    "users"."name" ASC
Limit  (cost=3236.63..3236.68 rows=20 width=1266) (actual time=83.767..83.767 rows=1 loops=1)
  Buffers: shared hit=2586 read=145 written=9
  I/O Timings: read=5.174 write=0.365
  ->  Sort  (cost=3236.63..3240.22 rows=1436 width=1266) (actual time=83.765..83.766 rows=1 loops=1)
        Sort Key: (CASE WHEN ((name)::text = 'cirimie@gitlab.com'::text) THEN 0 WHEN ((username)::text = 'cirimie@gitlab.com'::text) THEN 1 WHEN ((email)::text = 'cirimie@gitlab.com'::text) THEN 2 ELSE 3 END), name
        Sort Method: quicksort  Memory: 27kB
        Buffers: shared hit=2586 read=145 written=9
        I/O Timings: read=5.174 write=0.365
        ->  Bitmap Heap Scan on users  (cost=807.41..3198.41 rows=1436 width=1266) (actual time=83.735..83.735 rows=1 loops=1)
              Recheck Cond: (((name)::text ~~* '%cirimie@gitlab.com%'::text) OR ((username)::text ~~* '%cirimie@gitlab.com%'::text) OR ((email)::text = 'cirimie@gitlab.com'::text))
              Heap Blocks: exact=1
              Buffers: shared hit=2580 read=145 written=9
              I/O Timings: read=5.174 write=0.365
              ->  BitmapOr  (cost=807.41..807.41 rows=1436 width=0) (actual time=83.708..83.708 rows=0 loops=1)
                    Buffers: shared hit=2579 read=145 written=9
                    I/O Timings: read=5.174 write=0.365
                    ->  Bitmap Index Scan on index_users_on_name_trigram  (cost=0.00..426.75 rows=700 width=0) (actual time=45.048..45.048 rows=0 loops=1)
                          Index Cond: ((name)::text ~~* '%cirimie@gitlab.com%'::text)
                          Buffers: shared hit=1453 read=89 written=8
                          I/O Timings: read=1.598 write=0.328
                    ->  Bitmap Index Scan on index_users_on_username_trigram  (cost=0.00..377.52 rows=736 width=0) (actual time=38.499..38.499 rows=0 loops=1)
                          Index Cond: ((username)::text ~~* '%cirimie@gitlab.com%'::text)
                          Buffers: shared hit=1123 read=55 written=1
                          I/O Timings: read=3.560 write=0.037
                    ->  Bitmap Index Scan on users_email_key  (cost=0.00..2.07 rows=1 width=0) (actual time=0.159..0.159 rows=1 loops=1)
                          Index Cond: ((email)::text = 'cirimie@gitlab.com'::text)
                          Buffers: shared hit=3 read=1
                          I/O Timings: read=0.015
Planning Time: 15.935 ms
Execution Time: 84.005 ms

After

Depesz and raw SQL

Expand for plan & raw SQL
SELECT
    "users".*
FROM
    "users"
WHERE ((("users"."name" ILIKE '%cirimie@gitlab.com%'
            OR "users"."username" ILIKE '%cirimie@gitlab.com%')
        OR "users"."email" = 'cirimie@gitlab.com')
    OR "users"."id" = (
        SELECT
            "emails"."user_id"
        FROM
            "emails"
        WHERE
            "emails"."email" = 'cirimie@gitlab.com'))
ORDER BY
    CASE WHEN users.name = 'cirimie@gitlab.com' THEN
        0
    WHEN users.username = 'cirimie@gitlab.com' THEN
        1
    WHEN users.email = 'cirimie@gitlab.com' THEN
        2
    ELSE
        3
    END,
    "users"."name" ASC
Limit  (cost=3075.22..3075.27 rows=20 width=1266) (actual time=65.071..65.071 rows=1 loops=1)
  Buffers: shared hit=2543 read=76
  I/O Timings: read=7.962
  InitPlan 1 (returns $0)
    ->  Index Scan using emails_email_key on emails  (cost=0.42..3.44 rows=1 width=4) (actual time=0.061..0.061 rows=0 loops=1)
          Index Cond: ((email)::text = 'cirimie@gitlab.com'::text)
          Buffers: shared hit=3
  ->  Sort  (cost=3071.78..3075.38 rows=1437 width=1266) (actual time=65.069..65.070 rows=1 loops=1)
        Sort Key: (CASE WHEN ((users.name)::text = 'cirimie@gitlab.com'::text) THEN 0 WHEN ((users.username)::text = 'cirimie@gitlab.com'::text) THEN 1 WHEN ((users.email)::text = 'cirimie@gitlab.com'::text) THEN 2 ELSE 3 END), users.name
        Sort Method: quicksort  Memory: 27kB
        Buffers: shared hit=2543 read=76
        I/O Timings: read=7.962
        ->  Bitmap Heap Scan on users  (cost=637.21..3033.54 rows=1437 width=1266) (actual time=65.034..65.035 rows=1 loops=1)
              Recheck Cond: (((name)::text ~~* '%cirimie@gitlab.com%'::text) OR ((username)::text ~~* '%cirimie@gitlab.com%'::text) OR ((email)::text = 'cirimie@gitlab.com'::text) OR (id = $0))
              Heap Blocks: exact=1
              Buffers: shared hit=2537 read=76
              I/O Timings: read=7.962
              ->  BitmapOr  (cost=637.21..637.21 rows=1437 width=0) (actual time=64.996..64.996 rows=0 loops=1)
                    Buffers: shared hit=2536 read=76
                    I/O Timings: read=7.962
                    ->  Bitmap Index Scan on index_users_on_name_trigram  (cost=0.00..420.75 rows=700 width=0) (actual time=42.937..42.937 rows=0 loops=1)
                          Index Cond: ((name)::text ~~* '%cirimie@gitlab.com%'::text)
                          Buffers: shared hit=1494 read=44
                          I/O Timings: read=6.909
                    ->  Bitmap Index Scan on index_users_on_username_trigram  (cost=0.00..211.02 rows=736 width=0) (actual time=21.882..21.882 rows=0 loops=1)
                          Index Cond: ((username)::text ~~* '%cirimie@gitlab.com%'::text)
                          Buffers: shared hit=1036 read=31
                          I/O Timings: read=1.040
                    ->  Bitmap Index Scan on users_email_key  (cost=0.00..2.07 rows=1 width=0) (actual time=0.106..0.106 rows=1 loops=1)
                          Index Cond: ((email)::text = 'cirimie@gitlab.com'::text)
                          Buffers: shared hit=3 read=1
                          I/O Timings: read=0.012
                    ->  Bitmap Index Scan on users_pkey  (cost=0.00..1.94 rows=1 width=0) (actual time=0.067..0.067 rows=0 loops=1)
                          Index Cond: (id = $0)
                          Buffers: shared hit=3
Planning Time: 15.475 ms
Execution Time: 65.250 ms

Without the IN to EQ conversion

Depesz and raw SQL

Expand for plan & raw SQL
SELECT
    "users".*
FROM
    "users"
WHERE ((("users"."name" ILIKE '%cirimie@gitlab.com%'
            OR "users"."username" ILIKE '%cirimie@gitlab.com%')
        OR "users"."email" = 'cirimie@gitlab.com')
    OR "users"."id" IN (
        SELECT
            "emails"."user_id"
        FROM
            "emails"
        WHERE
            "emails"."email" = 'cirimie@gitlab.com'))
ORDER BY
    CASE WHEN users.name = 'cirimie@gitlab.com' THEN
        0
    WHEN users.username = 'cirimie@gitlab.com' THEN
        1
    WHEN users.email = 'cirimie@gitlab.com' THEN
        2
    ELSE
        3
    END,
    "users"."name" ASC
Limit  (cost=1884193.14..1884195.47 rows=20 width=1266) (actual time=6649.961..6649.963 rows=1 loops=1)
  Buffers: shared hit=144594 read=1694 written=58
  I/O Timings: read=129.816 write=2.880
  ->  Gather Merge  (cost=1884193.14..2241904.11 rows=3065880 width=1266) (actual time=6649.959..7213.782 rows=1 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=437345 read=5103 written=181
        I/O Timings: read=402.003 write=6.289
        ->  Sort  (cost=1883193.12..1887025.47 rows=1532940 width=1266) (actual time=6643.448..6643.449 rows=0 loops=3)
              Sort Key: (CASE WHEN ((users.name)::text = 'cirimie@gitlab.com'::text) THEN 0 WHEN ((users.username)::text = 'cirimie@gitlab.com'::text) THEN 1 WHEN ((users.email)::text = 'cirimie@gitlab.com'::text) THEN 2 ELSE 3 END), users.name
              Sort Method: quicksort  Memory: 27kB
              Worker 0:  Sort Method: quicksort  Memory: 25kB
              Worker 1:  Sort Method: quicksort  Memory: 25kB
              Buffers: shared hit=437345 read=5103 written=181
              I/O Timings: read=402.003 write=6.289
              ->  Parallel Seq Scan on users  (cost=3.44..1842402.13 rows=1532940 width=1266) (actual time=6643.258..6643.374 rows=0 loops=3)
                    Filter: (((name)::text ~~* '%cirimie@gitlab.com%'::text) OR ((username)::text ~~* '%cirimie@gitlab.com%'::text) OR ((email)::text = 'cirimie@gitlab.com'::text) OR (hashed SubPlan 1))
                    Rows Removed by Filter: 2452186
                    Buffers: shared hit=437307 read=5103 written=181
                    I/O Timings: read=402.003 write=6.289
                    SubPlan 1
                      ->  Index Scan using emails_email_key on emails  (cost=0.42..3.44 rows=1 width=4) (actual time=0.107..0.108 rows=0 loops=3)
                            Index Cond: ((email)::text = 'cirimie@gitlab.com'::text)
                            Buffers: shared hit=11
Planning Time: 3.718 ms
Execution Time: 7213.945 ms

7 seconds since it wouldn't be able to use Bitmap scans anymore

Screenshots (strongly suggested)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Catalin Irimie

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Catalin Irimie
  • Catalin Irimie
  • Catalin Irimie added 1 commit

    added 1 commit

    • 7e5a6b0a - Allow secondary emails in user search

    Compare with previous version

  • Catalin Irimie changed milestone to %13.6

    changed milestone to %13.6

  • Catalin Irimie marked the checklist item Changelog entry as completed

    marked the checklist item Changelog entry as completed

  • Catalin Irimie assigned to @manojmj and unassigned @cat

    assigned to @manojmj and unassigned @cat

  • Catalin Irimie marked this merge request as ready

    marked this merge request as ready

  • Catalin Irimie changed the description

    changed the description

  • Manoj M J
  • Manoj M J
  • unassigned @manojmj

  • added database label

  • Catalin Irimie added 1 commit

    added 1 commit

    Compare with previous version

  • assigned to @dstull

  • Catalin Irimie changed the description

    changed the description

  • Doug Stull
  • Doug Stull
  • Doug Stull
  • Doug Stull assigned to @cat and unassigned @dstull

    assigned to @cat and unassigned @dstull

  • Setting label(s) devopsmanage sectiondev based on ~"group::access".

  • Catalin Irimie added 343 commits

    added 343 commits

    • 9f69415d...10332fe7 - 339 commits from branch master
    • cf6c99e6 - Allow secondary emails in user search
    • 78d22d3e - Fix linting errors
    • f0cd7154 - Update user search tests to let_it_be, no delegate
    • edc6428e - Return just a secondary email when searching users

    Compare with previous version

  • Catalin Irimie assigned to @dstull and unassigned @cat

    assigned to @dstull and unassigned @cat

  • Doug Stull assigned to @toon and unassigned @dstull

    assigned to @toon and unassigned @dstull

  • added databasereviewed label and removed databasereview pending label

  • Doug Stull approved this merge request

    approved this merge request

  • Toon Claes
  • Toon Claes
  • Toon Claes
  • Toon Claes changed milestone to %13.7

    changed milestone to %13.7

  • Toon Claes
  • Toon Claes approved this merge request

    approved this merge request

    • Resolved by Sean McGivern

      @cat Awesome work. From database this looks great, so I'm gonna approve.

      But I've added a few comments on the specs. They wouldn't be really blocking to me, but the milestone in config/feature_flags/development/user_search_secondary_email.yml is (sorry for the delay on the review), so would you mind addressing the comments on the specs as well. FYI I'm OOO on Friday (tomorrow) so if you like you can ask another database or backend maintainer to merge.

  • added databaseapproved label and removed databasereviewed label

  • Toon Claes assigned to @cat and unassigned @toon

    assigned to @cat and unassigned @toon

  • Catalin Irimie added 1022 commits

    added 1022 commits

    • edc6428e...4cb7ff43 - 1017 commits from branch master
    • 440c5627 - Allow secondary emails in user search
    • 11e728c3 - Fix linting errors
    • dbb4caf5 - Update user search tests to let_it_be, no delegate
    • 240cf2e8 - Return just a secondary email when searching users
    • 863096ba - Update specs, change milestone

    Compare with previous version

  • Catalin Irimie assigned to @smcgivern and unassigned @cat

    assigned to @smcgivern and unassigned @cat

  • Sean McGivern resolved all threads

    resolved all threads

  • Sean McGivern enabled an automatic merge when the pipeline for f62026ab succeeds

    enabled an automatic merge when the pipeline for f62026ab succeeds

  • merged

  • Sean McGivern mentioned in commit 5fc1123b

    mentioned in commit 5fc1123b

  • added workflowcanary label and removed workflowstaging label

  • added workflowproduction label and removed workflowcanary label

  • mentioned in issue #26110 (closed)

  • Catalin Irimie mentioned in merge request !49312 (merged)

    mentioned in merge request !49312 (merged)

  • Please register or sign in to reply
    Loading