GitLab Runner cache: use AWS S3 multipart upload?
Status update (2023-07-17)
- R&D spike completed in 16.2 with a recommendation to develop a GitLab CI Cache Service.
- Solving the problem of supporting AWS or GCS multipart uploads is blocked pending work on the recommended solution to create an external caching service for GitLab CI.
Problem(s)
- Uploading a large cache to AWS S3 can at times be slow, or can fail with errors such as `context deadline exceeded (Client.Timeout exceeded while awaiting headers)`. In one customer escalation, the file to upload was ~600 MB (slack thread).
- AWS S3 has a single-request (PUT) object size limit of 5 GB. If the cache archive is larger than that limit, the upload will fail.
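For scale, S3's documented multipart limits (5 MiB minimum part size, 10,000 parts maximum) determine how a cache archive above the single-PUT limit would have to be split. A minimal sketch of the part-sizing arithmetic, assuming those published limits; the `partSize` function is illustrative, not existing runner code:

```go
package main

import "fmt"

const (
	minPartSize  = 5 * 1024 * 1024        // S3 minimum multipart part size: 5 MiB
	maxParts     = 10000                  // S3 maximum number of parts per upload
	singlePutMax = 5 * 1024 * 1024 * 1024 // S3 single PUT object size limit: 5 GiB
)

// partSize picks a part size for a multipart upload of objectSize bytes,
// doubling the part size until the upload fits within maxParts parts.
func partSize(objectSize int64) int64 {
	size := int64(minPartSize)
	for objectSize/size >= maxParts {
		size *= 2
	}
	return size
}

func main() {
	cache := int64(6) * 1024 * 1024 * 1024 // a hypothetical 6 GiB cache archive
	fmt.Println(cache > singlePutMax)      // true: a single PUT would be rejected
	fmt.Println(partSize(cache))           // 5242880 (5 MiB parts suffice, ~1229 parts)
}
```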
Other customer and user input
- Why does the gitlab-runner cache put the object using its own HTTP client? The library package `github.com/minio/minio-go/v6` already has a `PresignedPutObject` method, and supports multipart uploads.
- When the cache uploads a large file (tested with >1.5 GB), the runner generates an error:
Additional details
- Today the runner uploads the cache as one big blob to a pre-signed upload URL. Using a pre-signed URL means we don't need to share S3 credentials with the job environment, but it is less efficient than performing the upload with, for example, the AWS CLI.
- `context deadline exceeded (Client.Timeout exceeded while awaiting headers)` comes from the Runner. When starting a cache upload request, we attach a context with a defined timeout, which defaults to 10 minutes. If the request is not handled within that time, the context is cancelled and you see that error. The Runner will then, by default, retry the upload operation up to two more times.
- Users can override the timeout by defining `CACHE_REQUEST_TIMEOUT`. The default value is 10 minutes.
Proposal
{placeholder for solution proposal pending the work on the linked spike.}