Asynchronously rewrite git history
Problem
Related to https://gitlab.com/gitlab-org/gitlab/-/issues/450701+.
In 17.2 we implemented a new feature, Remove blobs, which lets users reduce repository size by removing blobs or redacting text in repositories.
However, this feature doesn't work with large repositories. Here's why:
- History rewriting is computationally intensive, especially for large repos.
- The rewriting process can take several minutes to complete.
- Our application has a 60-second timeout for client requests.
As a result, the process often fails for large repositories because it takes too long and exceeds the timeout limit. For more details, see issue GRPC::DeadlineExceeded exception when removing ... (#475250 - closed).
Proposal
Use asynchronous processing instead of synchronous processing.
This approach will:
- Solve the timeout problem, because Sidekiq doesn't have the 60-second limit.
- Improve the customer experience. The customer doesn't have to keep the page open while the history rewrite runs.
To implement this, we can:
- Add support for asynchronous processing on the backend.
- Update the UX to indicate that the Remove blobs and Redact text processes won't complete immediately, and notify the user about the current status of the processing.
- Update the frontend to support the new changes.
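As a rough illustration of the backend side, the sketch below shows the enqueue-then-poll pattern in plain Ruby. It is not the proposed implementation: `RewriteHistoryJobs`, its methods, and the status names (`:queued`, `:running`, `:finished`, `:failed`) are hypothetical, and a background thread stands in for a Sidekiq worker so the example stays self-contained.

```ruby
require "securerandom"

# Hypothetical in-memory stand-in for a job queue such as Sidekiq.
# A real implementation would persist job state instead of keeping it in memory.
class RewriteHistoryJobs
  Job = Struct.new(:id, :status, :error)

  def initialize
    @jobs = {}
    @lock = Mutex.new
    @threads = []
  end

  # Returns a job id right away; the slow work runs in the background,
  # so the client request is not held open while the rewrite runs.
  def enqueue(&work)
    job = Job.new(SecureRandom.uuid, :queued, nil)
    @lock.synchronize { @jobs[job.id] = job }
    @threads << Thread.new { run(job, work) }
    job.id
  end

  # What a status endpoint could report back to the frontend.
  def status(id)
    @lock.synchronize { @jobs.fetch(id).status }
  end

  # Blocks until every enqueued job has ended (useful in scripts and tests).
  def wait_all
    @threads.each(&:join)
  end

  private

  def run(job, work)
    update(job, :running)
    work.call
    update(job, :finished)
  rescue StandardError => e
    # Record the failure so the UI can show it instead of timing out silently.
    @lock.synchronize do
      job.error = e.message
      job.status = :failed
    end
  end

  def update(job, status)
    @lock.synchronize { job.status = status }
  end
end
```

With this shape, the request that starts the rewrite only calls `enqueue` and returns the job id, and the frontend can call `status` with that id whenever it needs to refresh the progress indicator.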
We can also consider using GraphQL subscriptions to receive real-time updates from the rewrite history process. However, this requires extra backend and frontend work, so it can be done as a follow-up.