maintenance: Override deadline for RPC call

Pavlo Strokov requested to merge ps-fix-optimize-job-cancellation into master

The Gitaly service runs a maintenance job each day for a limited period of time. The logs show that the OptimizeRepository RPC fails with a DeadlineExceeded error. The root cause is that the RPC uses a context bounded by the duration of the maintenance window: when the job runs past the window, the context is cancelled and the in-flight RPC is cancelled with it. Instead, we should handle this gracefully. Before making the RPC call we therefore first suppress cancellation of the context and then apply a new default timeout to it. This protects the call from cancellation of the parent context while still cancelling long-running RPC calls. If we didn't cancel long-running RPCs, the other repositories could go unoptimized for a long time.

Closes: #3963 (closed)
