Allow ElasticSearch to be enabled for a subset of projects / groups
ElasticSearch is currently disabled on GitLab.com because we're having trouble getting the indexing to scale to that level, although many happy customers are using it on-premises.
A proposal that came up in discussion with @pcarranza was to allow ElasticSearch to be progressively enabled - we'd have a list of projects and groups, or perhaps a regular expression to select a subset, and only matching projects would be indexed/searched using ElasticSearch. The remainder would be searched in the current way.
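To make the selection rule concrete, here is a minimal sketch of how the check might look, written in Python for illustration only. The class name `ElasticsearchRollout`, its fields, and `enabled_for` are all hypothetical and do not correspond to anything in the GitLab codebase; it assumes projects are identified by their full path (`group/project`).

```python
import re
from dataclasses import dataclass, field
from typing import Optional, Pattern, Set


@dataclass
class ElasticsearchRollout:
    """Hypothetical rollout settings: explicit projects, groups, and an optional regex."""

    project_paths: Set[str] = field(default_factory=set)  # full paths, e.g. "group/project"
    group_paths: Set[str] = field(default_factory=set)    # group paths, e.g. "group"
    path_pattern: Optional[Pattern[str]] = None           # matched against the full project path

    def enabled_for(self, project_path: str) -> bool:
        """Return True if this project should be indexed/searched with ElasticSearch."""
        if project_path in self.project_paths:
            return True

        # Check every ancestor group of the project ("a/b/proj" -> "a", "a/b").
        ancestors = project_path.split("/")[:-1]
        for depth in range(1, len(ancestors) + 1):
            if "/".join(ancestors[:depth]) in self.group_paths:
                return True

        # Fall back to the regular expression, if one is configured.
        return bool(self.path_pattern and self.path_pattern.fullmatch(project_path))
```

Anything for which `enabled_for` returns false would keep using the current search path.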
Advantages:
- This would allow us to gather data on how our code is performing in production at the moment
  - Start with just a few projects enabled
  - Add instrumentation until we understand what's happening
  - Add projects and groups until we run into performance problems
  - FIX THE PROBLEMS
  - Watch performance improve
  - Enable more projects
- Shorter cycle times!
- We can get feedback at the threshold of performance problems, rather than it being an all-or-nothing proposition
- Won't have to disable ES completely while solving problems
Disadvantages:
- It's not a very useful feature in its own right, although being able to exclude certain projects from code search is a potentially valid use case
- We'd need to conduct two searches (in parallel, to minimize performance degradation) - database and ES - then merge the results
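The two-search-and-merge step could look roughly like the sketch below, again in Python purely for illustration. `combined_search` and its parameters are hypothetical names; the two backends are passed in as plain callables, and the merge assumes each hit carries a `score` that is comparable across the database and ES, which is a big assumption and may well not hold in practice.

```python
from concurrent.futures import ThreadPoolExecutor


def combined_search(query, es_project_ids, db_project_ids, db_search, es_search, limit=20):
    """Run the ES and database searches in parallel, then merge the hits.

    db_search and es_search are callables taking (query, project_ids) and
    returning a list of dicts that each contain a "score" key (assumed comparable).
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Submit both searches before waiting on either, so they overlap.
        es_future = pool.submit(es_search, query, es_project_ids)
        db_future = pool.submit(db_search, query, db_project_ids)
        es_hits = es_future.result()
        db_hits = db_future.result()

    # Merge by descending score; the sort is stable, so ties keep ES hits first.
    merged = sorted(es_hits + db_hits, key=lambda hit: hit["score"], reverse=True)
    return merged[:limit]
```

The total latency is then roughly that of the slower backend rather than the sum of both, but how to rank results from two different engines against each other remains an open question.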
I was quite enthusiastic about this idea when I first had it, but I'm having second thoughts now. Should I do it for 8.17? Can we think of a different way to progressively enable ElasticSearch on GitLab.com? /cc @maratkalibek @DouweM @vsizov