Elasticsearch indexing on web
What does this MR do?
Instead of the current horizontal indexing (index all database content first, then index all repositories), this MR offers a more vertical approach that indexes one project's database content and repository content as a single unit of work. This is done using background jobs.
Indexing at the project level is more scalable: each job takes less time, and the process can survive deployments. Failures are also easier to handle, because we don't have to start indexing again from the beginning.
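As a rough illustration of the per-project unit of work described above, here is a minimal sketch in plain Ruby. All names (`Project`, `ProjectIndexJob`, `FakeIndexer`, `index_all`) are hypothetical and are not the classes introduced by this MR; the real jobs run through the background job queue rather than in a loop.

```ruby
# Hypothetical sketch: these names are illustrative, not the MR's actual classes.
Project = Struct.new(:id, :name)

class ProjectIndexJob
  # indexer: any object responding to index_database(project)
  # and index_repository(project).
  def initialize(indexer)
    @indexer = indexer
  end

  # Index one project's database content and repository as a single unit.
  # A failure affects only this project, not the whole indexing run.
  def perform(project)
    @indexer.index_database(project)
    @indexer.index_repository(project)
    :ok
  rescue StandardError
    :failed
  end
end

# Run one job per project and return the ids of the projects that failed,
# so that only those need to be retried.
def index_all(projects, indexer)
  job = ProjectIndexJob.new(indexer)
  projects.reject { |project| job.perform(project) == :ok }.map(&:id)
end

# Stand-in indexer that records calls and fails for selected projects.
class FakeIndexer
  attr_reader :calls

  def initialize(broken_ids = [])
    @broken_ids = broken_ids
    @calls = []
  end

  def index_database(project)
    @calls << [:database, project.id]
  end

  def index_repository(project)
    raise "repository unavailable" if @broken_ids.include?(project.id)

    @calls << [:repository, project.id]
  end
end
```

In this sketch, a failure in one project's repository step leaves the other projects' results intact, which is the property that makes retries cheaper than restarting a horizontal run.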
Indexing is triggered through a button in the admin page. In the future this will be expanded to include more options. This is behind a feature flag, so we can test the performance first and give the UI time to be polished.
Closes #5299, as this can solve the indexing gap issue.
Screenshots
the button to trigger indexing
flash message to indicate successful scheduling and link to queue
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- Documentation created/updated or follow-up review issue created
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Performance and testing
Currently, the rake task takes 1 day to index the database portion. After switching to background jobs, indexing will take longer, because the jobs are executed 2 minutes apart.
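To make the cost of the 2-minute spacing concrete, the following sketch computes the delay assigned to each job and the time between the first and last job start. The helper names and the `INTERVAL_SECONDS` constant are assumptions for illustration; only the 2-minute interval comes from this description.

```ruby
# Hypothetical helpers; only the 2-minute spacing is taken from the MR text.
INTERVAL_SECONDS = 120

# Map each project id to its delay in seconds: the first job starts
# immediately, and each later job starts one interval after the previous one.
def staggered_delays(project_ids, interval: INTERVAL_SECONDS)
  project_ids.each_with_index.map { |id, i| [id, i * interval] }.to_h
end

# Hours between the start of the first job and the start of the last one.
def schedule_span_hours(project_count, interval: INTERVAL_SECONDS)
  return 0.0 if project_count.zero?

  (project_count - 1) * interval / 3600.0
end
```

Under these assumptions, the starts of 721 jobs span exactly 24 hours, so any instance with more projects than that needs more than a day just to start every job.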
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.