Implement a percentage-based rollout for Elasticsearch on GitLab.com
We believe rolling out Elasticsearch to all GitLab.com projects will mean a very large volume of data being indexed and searched. An all-or-nothing rollout may not be safe, since it would not give us enough time to react to problems, scale out our infrastructure, or make indexing and searching more efficient. It would also take a lot of manual effort to repeatedly enable and roll back as we learn about each new scaling challenge.
Roll out to a percentage of groups at a time starting with Gold groups.
Supporting this in a sensible way will require some changes to GitLab.
We can currently limit which groups are indexed and searched in Elasticsearch, but this feature has the following problems:
- It likely does not handle very large numbers of groups in the list (it was only designed for a few groups), so the admin UI will probably break or time out when there are hundreds or thousands of groups in the list. Checking this list may also hurt performance in other parts of the code.
- It was intended to be used by enabling indexing for a set of groups and then waiting for indexing to finish before enabling search for them.
- Apart from clicking through the UI or writing one-off scripts for the console, there is no controlled way to roll this out to large numbers of groups.
Extend this logic of rolling out to groups
In order to solve the above problems we'll want to:
- Adapt this feature so that the admin UI does not display all the groups in the rollout when their number exceeds some sensible limit (e.g. 20)
- Ensure that everywhere this logic is used, it scales sensibly when there are thousands of groups in the rollout
- Set an extra boolean `index_statuses.records_initially_indexed` indicating that we've finished `Elastic::IndexRecordService#initial_import_project` for the given project
- Update our logic in `use_elasticsearch?` to ensure that all projects within the current scope (i.e. all projects in the group, or just this project for project search) have the `records_initially_indexed` flag set
- Create a script that can be run from the Rails console to enable the rollout for large numbers of groups at a time
- Ensure that you can remove groups from the rollout without data loss or bugs so that they stop being indexed and searched in case any parts of the system start to become overloaded
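The batching and removal requirements above can be sketched in plain Ruby. This is only an illustration under stated assumptions: `RolloutList`, `enable_in_batches`, and the default batch size are hypothetical names and values, and the in-memory `Set` stands in for whatever database-backed storage GitLab actually uses.

```ruby
require 'set'

# Hypothetical in-memory stand-in for the rollout list; a real
# implementation would be backed by the database rather than a Set.
class RolloutList
  def initialize
    @group_ids = Set.new
  end

  # Enable groups in fixed-size batches so each step stays small and the
  # rollout can be paused between batches if the cluster starts to struggle.
  def enable_in_batches(group_ids, batch_size: 1000)
    group_ids.each_slice(batch_size) do |batch|
      @group_ids.merge(batch)
      yield batch if block_given? # e.g. enqueue indexing for this batch
    end
  end

  # Removing a group only takes it out of the list; nothing is deleted here.
  def disable(group_id)
    @group_ids.delete(group_id)
  end

  # Set lookup is constant time on average, regardless of list size.
  def enabled?(group_id)
    @group_ids.include?(group_id)
  end

  def size
    @group_ids.size
  end
end
```

From a console, this might be used as `list.enable_in_batches(ids, batch_size: 500) { |batch| puts "enabled #{batch.size} groups" }`, checking cluster health between batches.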
Validate how the different features behave when there are 100,000 namespaces and 100,000 projects enabled
- What queries are happening when loading the search page scoped to one of those groups?
- What queries are happening when loading the search page scoped to a different group that is not enabled?
- What queries are happening when loading the search page scoped to one of those projects?
- What queries are happening when loading the search page scoped to a different project that is not enabled?
- Hide projects/namespaces in the admin UI when there are more than 50
- When there are more than 50, show a message in the admin UI explaining that they must be managed via the API
- Add an API to trigger rollout to percentages at a time (admin only)
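One possible way to choose which groups a given percentage covers, with Gold groups first, is sketched below. The `Group` struct, its `plan` field, the `'gold'` value, and `groups_for_rollout` are all hypothetical names chosen for illustration; the issue does not specify how the selection should be made.

```ruby
# Hypothetical record type; `id` and `plan` are illustrative field names.
Group = Struct.new(:id, :plan)

# Pick the groups covered by a rollout of `percentage` percent.
# Gold groups are ordered first, then everything is ordered by id, so for a
# fixed set of groups raising the percentage only ever adds groups.
def groups_for_rollout(groups, percentage)
  unless (0..100).cover?(percentage)
    raise ArgumentError, 'percentage must be between 0 and 100'
  end

  ordered = groups.sort_by { |g| [g.plan == 'gold' ? 0 : 1, g.id] }
  count = (ordered.size * percentage / 100.0).ceil
  ordered.first(count)
end
```

Because the ordering is stable, moving from one percentage to a higher one keeps every previously enabled group in the rollout, which avoids re-indexing groups that were already covered.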
Store status as `index_statuses.records_initially_indexed` after indexing all DB records is completed for a project
Determine whether or not to use Elasticsearch based on non-empty SHA in `IndexStatus#records_initially_indexed? => true`
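A minimal sketch of that check follows, assuming it means that every project in scope needs both a non-empty indexed SHA and `records_initially_indexed` set to true. The struct below is a stand-in for a row in `index_statuses`; `last_commit_sha`, `ready?`, and the method signature are hypothetical and not taken from GitLab's code.

```ruby
# Hypothetical stand-in for a row in index_statuses.
IndexStatus = Struct.new(:project_id, :last_commit_sha, :records_initially_indexed) do
  # Ready means the repository has an indexed SHA and the initial import
  # of DB records has completed.
  def ready?
    !last_commit_sha.to_s.empty? && records_initially_indexed == true
  end
end

# Returns true only when every project in scope has a ready index status.
# A project with no status row at all counts as not ready, and an empty
# scope never uses Elasticsearch.
def use_elasticsearch?(project_ids, statuses_by_project_id)
  return false if project_ids.empty?

  project_ids.all? do |id|
    status = statuses_by_project_id[id]
    !status.nil? && status.ready?
  end
end
```

For a group search, `project_ids` would be all projects in the group; for a project search, it would be the single project.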