Searching for users is incredibly inefficient

Searching for users relies on a LIKE condition that searches at any position in a value (using %VALUE%). This is incredibly inefficient as both PostgreSQL and MySQL are unable to use an index, resulting in a sequence scan over the entire users table. For example, take the following query:

explain analyze select * from users where lower(name) like '%yorickpeterse%' or lower(email) like '%yorickpeterse%' or lower(username) like '%yorickpeterse%';

This spits out the following plan:

                                                                                QUERY PLAN                                                                                 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on users  (cost=0.00..31684.90 rows=127 width=679) (actual time=406.114..543.142 rows=1 loops=1)
   Filter: ((lower((name)::text) ~~ '%yorickpeterse%'::text) OR (lower((email)::text) ~~ '%yorickpeterse%'::text) OR (lower((username)::text) ~~ '%yorickpeterse%'::text))
   Rows Removed by Filter: 414573
 Total runtime: 543.193 ms
(4 rows)

Because we also have to support MySQL (ugh) we can't use (for example) PostgreSQL's full text search capabilities. One alternative is to use VALUE% instead of %VALUE% as in this case at least PostgreSQL can use an index. The downside of this approach is that you will only get matches when your query matches the start of a value (e.g. "peterse" won't match "yorickpeterse" but "yorick" will).

cc @jacobvosmaer @jnijhof

Assignee Loading
Time tracking Loading