Recent issues search autocomplete

Dylan Griffith requested to merge 239166-recent-items-autocomplete into master

What does this MR do?

This adds support for autocompleting issues that the current user recently viewed when they use the search bar. The data is persisted in Redis sorted sets, which let us automatically expire the data for inactive users, keep it sorted with the most recently viewed items first, and easily cap the number of items per user by evicting the least recent ones first.
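
The sorted-set approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in this MR: the class names `RecentItemsSketch` and `FakeSortedSetStore`, the method names, and the cap of 100 items are assumptions made for the example. The 7-day expiry and the `recent_items:issue:<user_id>` key format come from this description. The in-memory fake only exists so the sketch runs without a Redis server; with redis-rb, the same four calls (`zadd`, `expire`, `zremrangebyrank`, `zrevrange`) would go to a real connection.

```ruby
# Hypothetical sketch of recent-item tracking on a Redis sorted set.
class RecentItemsSketch
  MAX_ITEMS = 100                 # assumed cap per user (not stated in the MR)
  TTL_SECONDS = 7 * 24 * 60 * 60  # keys expire after 7 days

  def initialize(redis, user_id, type: 'issue')
    @redis = redis
    # The data type is part of the key so other types can be added later.
    @key = "recent_items:#{type}:#{user_id}"
  end

  def log_view(item_id, now: Time.now)
    # Score is the view time; re-viewing an item just updates its score.
    @redis.zadd(@key, now.to_f, item_id)
    # Refresh the TTL so only inactive users' keys expire.
    @redis.expire(@key, TTL_SECONDS)
    # Ranks ascend by score, so this drops everything except the newest MAX_ITEMS.
    @redis.zremrangebyrank(@key, 0, -(MAX_ITEMS + 1))
  end

  def latest_ids(limit = 5)
    # Highest scores first, i.e. most recently viewed first.
    @redis.zrevrange(@key, 0, limit - 1).map(&:to_i)
  end
end

# Minimal in-memory stand-in for the four Redis commands used above.
# Expiry is accepted but not simulated.
class FakeSortedSetStore
  def initialize
    @sets = Hash.new { |hash, key| hash[key] = {} }
  end

  def zadd(key, score, member)
    @sets[key][member.to_s] = score.to_f
  end

  def expire(_key, _seconds)
    true
  end

  def zremrangebyrank(key, start, stop)
    doomed = ascending(key)[start..stop] || []
    doomed.each { |member| @sets[key].delete(member) }
    doomed.size
  end

  def zrevrange(key, start, stop)
    ascending(key).reverse[start..stop] || []
  end

  private

  # Members ordered by score, ties broken by member name.
  def ascending(key)
    @sets[key].sort_by { |member, score| [score, member] }.map(&:first)
  end
end
```

Calling `RecentItemsSketch.new(FakeSortedSetStore.new, 42).log_view(issue_id)` on each issue view and `latest_ids` from the search bar would be one way to wire this up. Trimming on every write keeps each key bounded without a separate cleanup job.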

The recent item tracking is partly implemented as a generic solution, as this will quickly be followed up with merge requests and perhaps other features people wish to navigate to quickly. The full generic refactoring will happen when we implement the second usage. For now, the most important thing is to include the data type in the Redis key so that we don't need to migrate data when we make this generic later.

This feature is behind a per-user feature flag for now so we can track the performance implications in production.

Redis impact

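I benchmarked this as follows:

# Populate one sorted set per user; the ranges below cover the ~20k-user run.
Gitlab::Redis::SharedState.with do |redis|
  (200100..220100).each do |user_id|
    (100100..100200).each do |issue_id|
      # The data type ("issue") is part of the key so other types can be added later.
      key = "recent_items:issue:#{user_id}"
      # Score is the view time, so the set stays ordered by recency.
      redis.zadd(key, Time.now.to_f, issue_id)
    end
  end
end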
  • Memory usage of a single user storing 100 ids: 2589 B ≈ 2.5 KB
  • Memory stats total.allocated after storing 100 users with 100 ids each: 1148216 B ≈ 1.1 MB
  • Memory stats total.allocated after storing ~20k users with 100 ids each: 54945008 B ≈ 54 MB => ~2.7 KB per user
MEMORY STATS - after adding 100 users with 100 items each
redis /home/dylan/workspace/gitlab-development-kit/redis/redis.socket> MEMORY STATS
 1) "peak.allocated"
 2) (integer) 1148168
 3) "total.allocated"
 4) (integer) 1148216
 5) "startup.allocated"
 6) (integer) 796296
 7) "replication.backlog"
 8) (integer) 0
 9) "clients.slaves"
10) (integer) 0
11) "clients.normal"
12) (integer) 66616
13) "aof.buffer"
14) (integer) 0
15) "lua.caches"
16) (integer) 216
17) "db.0"
18) 1) "overhead.hashtable.main"
    2) (integer) 5184
    3) "overhead.hashtable.expires"
    4) (integer) 104
19) "overhead.total"
20) (integer) 868416
21) "keys.count"
22) (integer) 104
23) "keys.bytes-per-key"
24) (integer) 3383
25) "dataset.bytes"
26) (integer) 279800
27) "dataset.percentage"
28) "79.506706237792969"
29) "peak.percentage"
30) "100.00418090820312"
31) "allocator.allocated"
32) (integer) 1716032
33) "allocator.active"
34) (integer) 2154496
35) "allocator.resident"
36) (integer) 10063872
37) "allocator-fragmentation.ratio"
38) "1.2555103302001953"
39) "allocator-fragmentation.bytes"
40) (integer) 438464
41) "allocator-rss.ratio"
42) "4.6711025238037109"
43) "allocator-rss.bytes"
44) (integer) 7909376
45) "rss-overhead.ratio"
46) "0.80301177501678467"
47) "rss-overhead.bytes"
48) (integer) -1982464
49) "fragmentation"
50) "7.298853874206543"
51) "fragmentation.bytes"
52) (integer) 6974192
MEMORY STATS - after adding ~20k users with 100 items each
redis /home/dylan/workspace/gitlab-development-kit/redis/redis.socket> MEMORY STATS
 1) "peak.allocated"
 2) (integer) 54903928
 3) "total.allocated"
 4) (integer) 54945008
 5) "startup.allocated"
 6) (integer) 796296
 7) "replication.backlog"
 8) (integer) 0
 9) "clients.slaves"
10) (integer) 0
11) "clients.normal"
12) (integer) 99388
13) "aof.buffer"
14) (integer) 0
15) "lua.caches"
16) (integer) 216
17) "db.0"
18) 1) "overhead.hashtable.main"
    2) (integer) 1076864
    3) "overhead.hashtable.expires"
    4) (integer) 104
19) "overhead.total"
20) (integer) 1972868
21) "keys.count"
22) (integer) 20368
23) "keys.bytes-per-key"
24) (integer) 2658
25) "dataset.bytes"
26) (integer) 52972140
27) "dataset.percentage"
28) "97.8271484375"
29) "peak.percentage"
30) "100.07482147216797"
31) "allocator.allocated"
32) (integer) 55096536
33) "allocator.active"
34) (integer) 55447552
35) "allocator.resident"
36) (integer) 59219968
37) "allocator-fragmentation.ratio"
38) "1.0063709020614624"
39) "allocator-fragmentation.bytes"
40) (integer) 351016
41) "allocator-rss.ratio"
42) "1.0680357217788696"
43) "allocator-rss.bytes"
44) (integer) 3772416
45) "rss-overhead.ratio"
46) "1.0364503860473633"
47) "rss-overhead.bytes"
48) (integer) 2158592
49) "fragmentation"
50) "1.1179265975952148"
51) "fragmentation.bytes"
52) (integer) 6474632

Redis keys expire after 7 days, so we need enough headroom for the number of users we expect to view issues within any 7-day period. Assuming 1 million such users (which seems very high), this would take about 2.7 GB of memory, which according to our Redis monitoring should still fit easily.
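
The arithmetic behind that estimate can be reproduced with a short sketch. The helper name `projected_bytes` is hypothetical. The inputs are the `total.allocated` and `keys.count` values from the ~20k-user MEMORY STATS dump above, and the 1 million user figure is the assumption stated in this description. Dividing total allocation by key count slightly overstates the per-user cost, since it includes the instance's startup allocation, so the result errs on the high side.

```ruby
# Hypothetical helper: scales the observed bytes-per-key up to an assumed user count.
def projected_bytes(total_allocated:, keys:, users:)
  bytes_per_key = total_allocated.to_f / keys
  (bytes_per_key * users).round
end

# Figures from the ~20k-user MEMORY STATS dump in this description.
estimate = projected_bytes(total_allocated: 54_945_008, keys: 20_368, users: 1_000_000)

# Print the projection in decimal gigabytes.
puts format('%.2f GB', estimate / 1e9)
```

This comes out at roughly 2.7 GB, in line with the figure quoted above.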

Screenshots

Screen_Shot_2020-08-31_at_9.45.03_am

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Closes #239166 (closed)

Edited by Dylan Griffith
