ElasticCommitIndexerWorker: break indexing into smaller incremental steps
Background
Related: gitlab-com/gl-infra/production#8391 (closed)
ElasticCommitIndexerWorker runtime depends heavily on the repository size and number of commits. However, having the timeout for ElasticCommitIndexerWorker set to 24 hours is problematic for a few reasons:
- The Sidekiq development guidelines recommend keeping jobs under 5 minutes: https://docs.gitlab.com/ee/development/sidekiq/worker_attributes.html#job-urgency
- This affects deployments: we can't replace the pod running the job until the job has finished
Proposal
Some ideas from the team during debugging of the related incident. We need to get a solid technical plan in place before scheduling this work:
- Break the indexing into smaller incremental steps, so that we can effectively paginate through the work
- Schedule the steps in the future with Sidekiq, so that a single job doesn't clog the entire queue (this gives fairness in case a single repo has a ton of files)
- I think we'd need to split the work by files, then: generate the diff and schedule batches of files for reindexing
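To make the batching idea concrete, here is a minimal sketch in plain Ruby of how the changed files from a diff could be split into batches, each with a staggered delay. Nothing here exists in the codebase: `plan_batches`, `BATCH_SIZE`, `STAGGER_SECONDS`, and the `IndexFileBatchWorker` named in the comment are hypothetical, and the values are illustrative only. Sidekiq's `perform_in` is the one real API referenced.

```ruby
# Hypothetical sketch: names, batch size, and interval are assumptions,
# not part of the existing worker.
BATCH_SIZE = 3        # files per job (illustrative; a real value would be larger)
STAGGER_SECONDS = 30  # delay added between consecutive batches

# Splits the changed file paths into fixed-size batches and pairs each batch
# with a delay, so the jobs are spread out instead of being enqueued all at once.
def plan_batches(changed_paths, batch_size: BATCH_SIZE, stagger: STAGGER_SECONDS)
  changed_paths.each_slice(batch_size).each_with_index.map do |paths, index|
    { delay: index * stagger, paths: paths }
  end
end

changed_paths = %w[a.rb b.rb c.rb d.rb e.rb f.rb g.rb]

plan_batches(changed_paths).each do |job|
  # With Sidekiq, a (hypothetical) batch worker could be enqueued here, e.g.
  #   IndexFileBatchWorker.perform_in(job[:delay], project_id, job[:paths])
  puts "in #{job[:delay]}s: #{job[:paths].join(', ')}"
end
```

Each resulting job would be short and independent, which should keep individual runtimes closer to the 5 minute recommendation and let jobs from other projects interleave between batches. Ordering between batches and handling of files that change mid-run are not addressed by this sketch and would need to be part of the technical plan.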