Import fails when using S3 for Merge Request Diffs

Summary

When using Object Storage for Merge Request Diffs on a multi-node installation, import Sidekiq jobs fail because a diff file written locally on one Sidekiq node cannot be found by a job running on another node.

Steps to reproduce

From a large prospect:

We have multiple servers behind a load balancer, and we're using gitlab_rails['external_diffs_enabled'] with S3 storage. Some of the import Sidekiq jobs are failing when attempting to find a merge request diff in /var/opt/gitlab/gitlab-rails/shared/external-diffs/. Sure enough, there are some files in this directory on several of our machines. It looks like one job creates a local file, then another job runs to upload it to S3, but that second job runs on a separate machine and can't find the local file! We can work around this by stopping Sidekiq on all but one machine so that local files are only written to a single server. I suppose we could also use an NFS mount for /var/opt/gitlab/gitlab-rails/shared ...
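For context, the setup described above would look roughly like the following in /etc/gitlab/gitlab.rb. This is a sketch, not the reporter's exact configuration; the bucket name, region, and credentials are placeholders:

```ruby
# /etc/gitlab/gitlab.rb -- sketch of external diffs backed by S3.
# Bucket name, region, and credentials below are placeholders.
gitlab_rails['external_diffs_enabled'] = true
gitlab_rails['external_diffs_object_store_enabled'] = true
gitlab_rails['external_diffs_object_store_remote_directory'] = 'mr-diffs'
gitlab_rails['external_diffs_object_store_connection'] = {
  'provider' => 'AWS',
  'region' => 'us-east-1',
  'aws_access_key_id' => 'AWS_ACCESS_KEY',
  'aws_secret_access_key' => 'AWS_SECRET_KEY'
}
```

With this configuration, any intermediate file written under /var/opt/gitlab/gitlab-rails/shared/external-diffs/ is node-local state, which is what breaks when a later job lands on a different Sidekiq node.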

A related issue is that all of the files are going into a single directory: we have >250,000 files in one directory. Most filesystems have scaling issues with that many entries, so it would be a good idea to shard the files across hundreds of subdirectories, putting an upper bound on the number of files in any one directory.
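The sharding idea could be sketched like this. This is a hypothetical helper, not GitLab's actual on-disk layout: hash the diff ID and use two hex-pair path levels, so each directory holds at most 256 subdirectories:

```ruby
require 'digest'

# Hypothetical sharding helper (not GitLab's actual layout):
# hash the diff ID and use the first two hex pairs as directory
# levels, bounding each directory to at most 256 subdirectories.
def sharded_path(base_dir, diff_id)
  digest = Digest::SHA256.hexdigest(diff_id.to_s)
  File.join(base_dir, digest[0, 2], digest[2, 2], diff_id.to_s)
end

puts sharded_path('/var/opt/gitlab/gitlab-rails/shared/external-diffs', 12345)
```

With 250,000 files, two levels of 256 buckets keeps the expected count per leaf directory in the single digits.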

/cc @MikeWalsh @tipyn