Optimize autocomplete of items in issues table
Summary
The issues autocomplete system we use is extremely naive and is only suitable for very small issue sets.
For the gitlab-org/gitlab project, it transfers nearly 4MB of data and takes over ten seconds, because it sends information for all open issues (of which there are 35k) and does not make use of the user input at all to limit the response.
Steps to reproduce
In any GFM text field on gitlab-org/gitlab, type # and observe the network requests.
An example request would be:
curl 'https://gitlab.com/gitlab-org/gitlab/-/autocomplete_sources/issues'
Example Project
This project - https://gitlab.com/gitlab-org/gitlab
What is the current bug behavior?
Typing #123 transfers vast amounts of irrelevant data, wasting the user's bandwidth and memory.
What is the expected correct behavior?
Typing #123 only transfers data for open issues with IIDs starting with 123, which is a radically smaller subset.
For me, the request took 7.16 seconds to return a 304 response (indicating that I didn't even end up transferring the data and could use my cache), and the data is 3.73MB.
Output of checks
This bug happens on GitLab.com
Possible fixes
Add a string version of issue.iid in a separate column, add an appropriate index that supports prefix matching (a trie would work), and pass at minimum a trigram of user input (i.e. wait until the user has entered at least three digits).
This strategy might need to be enabled only for projects with a lot of issues (e.g. more than 1k open issues), since it is more complex, and with fewer than 1k open issues a user will never enter a trigram and still need completion.
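As a rough illustration of the idea (not GitLab code), here is a minimal Python sketch of prefix matching over string IIDs. A sorted list stands in for the proposed string column and its prefix-capable index. The names build_index, complete and MIN_PREFIX_LEN are hypothetical, and the three-character threshold is taken from the trigram suggestion above.

```python
from bisect import bisect_left

# Assumption: mirrors the "at least three digits" threshold proposed above.
MIN_PREFIX_LEN = 3


def build_index(open_iids):
    """Return the IIDs as strings in lexicographic order.

    The sorted list is a stand-in for a string column with an index
    that supports prefix matching.
    """
    return sorted(str(iid) for iid in open_iids)


def complete(index, typed, limit=5):
    """Return up to `limit` IIDs whose string form starts with `typed`."""
    # Too little input (or non-numeric input): do not query at all.
    if len(typed) < MIN_PREFIX_LEN or not typed.isdigit():
        return []
    # All strings sharing a prefix are contiguous in sorted order,
    # so a binary search finds the first candidate.
    start = bisect_left(index, typed)
    matches = []
    for iid in index[start:start + limit]:
        if not iid.startswith(typed):
            break
        matches.append(int(iid))
    return matches


index = build_index([12, 123, 1230, 1234, 124, 2123, 51235])
print(complete(index, "123"))  # only IIDs starting with "123"
print(complete(index, "12"))   # below the threshold, nothing returned
```

The point of the sketch is that the response size depends on the number of matches (capped by limit), not on the total number of open issues in the project.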