Optimize autocomplete of items in issues table

Summary

The issues autocomplete system we use is extremely naive and is only suitable for very small issue sets.

For the gitlab-org/gitlab project, it transfers nearly 4MB of data, and takes over ten seconds, because it sends information for all open issues (of which there are 35k) and does not make use of the user input at all to limit the response.

Steps to reproduce

In any GFM text field in gitlab-org/gitlab, type # and observe the network requests.

An example request would be:

curl 'https://gitlab.com/gitlab-org/gitlab/-/autocomplete_sources/issues'

Example Project

This project - https://gitlab.com/gitlab-org/gitlab

What is the current bug behavior?

Typing #123 transfers vast amounts of irrelevant data, wasting the user's bandwidth and memory.

What is the expected correct behavior?

Typing #123 should transfer data only for open issues whose IIDs start with 123, which is a radically smaller subset.

For me, the current request took 7.16 seconds to return a 304 response (meaning the data was not even transferred and my cached copy could be used), and the data is 3.73MB.
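To give a rough sense of how much smaller the prefix-filtered subset is, here is a small Python sketch that counts how many IIDs start with a given prefix. The upper bound of 350,000 is a made-up figure for illustration, not the real highest IID of gitlab-org/gitlab, and the count covers all IIDs in the range rather than only open issues, so it is an upper bound on what would be sent.

```python
def count_iids_with_prefix(prefix: str, max_iid: int) -> int:
    """Count integers in [1, max_iid] whose decimal form starts with `prefix`."""
    if not prefix.isdigit() or prefix[0] == "0":
        return 0
    count = 0
    low = high = int(prefix)
    # Each iteration covers the IIDs with one more digit than the previous one:
    # 123, then 1230..1239, then 12300..12399, and so on.
    while low <= max_iid:
        count += min(high, max_iid) - low + 1
        low, high = low * 10, high * 10 + 9
    return count


# Hypothetical project whose highest IID is 350,000 (assumed for illustration).
print(count_iids_with_prefix("123", 350_000))  # prints 1111
```

Under that assumption, at most 1,111 issues could match #123, compared with the 35k open issues sent today.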

Output of checks

This bug happens on GitLab.com

Possible fixes

Store a string version of issue.iid in a separate column, add an index on that column that supports prefix matching (a trie would work), and require at least a trigram of user input before querying (i.e. wait until the user has entered at least three digits).
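As a minimal sketch of the idea (not GitLab's implementation), the Python below imitates an ordered index over string IIDs with a sorted list and a binary search. Prefix matching then becomes a short range scan, which is the same property a trie would provide. The names `build_index`, `complete`, and `MIN_PREFIX_LENGTH` are hypothetical and exist only for this example.

```python
from bisect import bisect_left

MIN_PREFIX_LENGTH = 3  # the "trigram" minimum proposed above


def build_index(iids):
    """Return the IIDs as strings in lexicographic order, like an ordered text index."""
    return sorted(str(iid) for iid in iids)


def complete(index, user_input, limit=5):
    """Return up to `limit` IID strings that start with `user_input`."""
    if len(user_input) < MIN_PREFIX_LENGTH or not user_input.isdigit():
        return []  # not enough input yet: do not query at all
    # All strings sharing a prefix are contiguous in lexicographic order,
    # so we jump to the first candidate and scan forward until the prefix stops matching.
    start = bisect_left(index, user_input)
    matches = []
    for candidate in index[start:start + limit]:
        if not candidate.startswith(user_input):
            break
        matches.append(candidate)
    return matches


index = build_index([12, 123, 124, 1230, 1239, 12345, 2123, 99123])
print(complete(index, "123"))  # ['123', '1230', '12345', '1239']
print(complete(index, "12"))   # [] (fewer than three digits)
```

Note that results come back in lexicographic rather than numeric order; a real implementation would need to decide how to rank them.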

This strategy might need to be enabled only for projects with a lot of issues (e.g. more than 1k open issues), since it is more complex, and with fewer than 1k open issues one will never enter a trigram and still need completion.
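One way to picture the switch between the two behaviors is the sketch below. It assumes the 1k threshold and three-digit minimum mentioned above; the function name and parameters are hypothetical, and the linear filter stands in for what would be an indexed query in practice.

```python
def issues_for_autocomplete(open_iids, user_input, threshold=1000, min_digits=3):
    """Pick which open-issue IIDs to send for the given user input."""
    if len(open_iids) <= threshold:
        # Small project: keep the current behavior and send everything.
        return list(open_iids)
    if len(user_input) < min_digits or not user_input.isdigit():
        # Large project: wait until the user has typed enough digits.
        return []
    # Large project with enough input: send only prefix matches.
    return [iid for iid in open_iids if str(iid).startswith(user_input)]


small_project = [1, 2, 3]
large_project = list(range(1, 2001))

print(issues_for_autocomplete(small_project, ""))            # [1, 2, 3]
print(issues_for_autocomplete(large_project, "12"))          # []
print(len(issues_for_autocomplete(large_project, "123")))    # 11
```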

Edited Dec 06, 2021 by Alex Kalderimis