EOFError: multipart data over retained size limit when committing large files via Repository Files API with multipart/form-data
Summary
Users encounter an `EOFError: multipart data over retained size limit` error when attempting to create or update files larger than 16MB via the Repository Files API (`POST`/`PUT /api/v4/projects/:id/repository/files/:file_path`) using the `multipart/form-data` content type.
Root Cause
The error originates from Rack's multipart parser, which has a hardcoded in-memory buffer limit of 16MB (`BUFFERED_UPLOAD_BYTESIZE_LIMIT`) for non-file form fields.
When processing multipart/form-data requests:
- Workhorse intercepts the request and saves the entire request body to a temporary file (up to 300MB)
- Workhorse forwards metadata about the saved file to Rails
- Rails' `file_params_from_body_upload` method re-parses the saved file using `Rack::Multipart.parse_multipart`
- The `content` field (containing the file data to be committed) is sent as a regular form field without a filename
- Rack's parser determines how to handle each field based on the presence of a filename:
  - With filename → `TempfilePart` → streams to disk (no memory limit)
  - Without filename → `BufferPart` → buffers entirely in memory
- Since `content` has no filename, Rack buffers it in memory
- When the `content` field exceeds 16MB, Rack's `update_retained_size` method raises `EOFError: multipart data over retained size limit`
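The filename-based dispatch described above can be sketched in plain Ruby. This is a simplified model for illustration, not Rack's actual implementation; only the constant value mirrors `BUFFERED_UPLOAD_BYTESIZE_LIMIT`:

```ruby
require "tempfile"

# Simplified model of how Rack's multipart parser picks a part handler.
# Rack's real TempfilePart/BufferPart classes are more involved; this only
# illustrates the filename-based dispatch and the retained-size check.
BUFFER_LIMIT = 16 * 1024 * 1024 # mirrors Rack's BUFFERED_UPLOAD_BYTESIZE_LIMIT

def handle_part(filename:, body_chunks:)
  if filename
    # With a filename, data streams to a tempfile: no in-memory limit applies.
    file = Tempfile.new("multipart")
    body_chunks.each { |chunk| file.write(chunk) }
    file.rewind
    { kind: :tempfile, size: file.size }
  else
    # Without a filename, data is buffered in memory and counted against the
    # retained-size limit, as in Rack's update_retained_size.
    retained = +""
    body_chunks.each do |chunk|
      retained << chunk
      if retained.bytesize > BUFFER_LIMIT
        raise EOFError, "multipart data over retained size limit"
      end
    end
    { kind: :buffer, size: retained.bytesize }
  end
end
```

A `content` field larger than 16MB takes the bufferless-filename branch and raises, while an equally large part *with* a filename streams to disk without issue.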
This creates a mismatch: `CommitsUploader` allows requests up to 300MB (`DEFAULT_MAX_REQUEST_SIZE`), but Rack's internal buffer limit is only 16MB.
Relevant code path:
- `lib/api/helpers/commits_body_uploader_helper.rb:38` calls `Rack::Multipart.parse_multipart(env)`
- Rack's `multipart/parser.rb:349-351` enforces the 16MB limit
Sentry Error
https://new-sentry.gitlab.net/organizations/gitlab/issues/3332849/
Backtrace:

```
EOFError: multipart data over retained size limit (EOFError)
from rack/multipart/parser.rb:350:in `update_retained_size'
from rack/multipart/parser.rb:336:in `handle_mime_body'
from rack/multipart/parser.rb:250:in `block in run_parser'
from <internal:kernel>:187:in `loop'
from rack/multipart/parser.rb:241:in `run_parser'
from rack/multipart/parser.rb:225:in `on_read'
from rack/multipart/parser.rb:101:in `block in parse'
from <internal:kernel>:187:in `loop'
from rack/multipart/parser.rb:99:in `parse'
from rack/multipart.rb:53:in `extract_multipart'
from config/initializers/rack_multipart_patch.rb:10:in `extract_multipart'
from rack/multipart.rb:41:in `parse_multipart'
from lib/api/helpers/commits_body_uploader_helper.rb:38:in `file_params_from_body_upload'
from lib/api/files.rb:336:in `block (2 levels) in <class:Files>'
from grape/endpoint.rb:58:in `call'
from grape/endpoint.rb:58:in `block (2 levels) in generate_api_method'
from active_support/notifications.rb:212:in `instrument'
from grape/endpoint.rb:57:in `block in generate_api_method'
from grape/endpoint.rb:328:in `execute'
from grape/endpoint.rb:260:in `block in run'
```
Possible Solutions
Option 1: Pre-process multipart in Workhorse (Recommended)
Extend Workhorse's `body_uploader.go` to detect the `multipart/form-data` content type and pre-process it similarly to how `rewrite.go` handles regular multipart uploads. Workhorse would:
- Parse the multipart body
- Extract large fields (like `content`) to separate temporary files
- Forward metadata about extracted fields to Rails (similar to how file uploads are handled)
- Rails would then read large fields from files instead of parsing them from the multipart body
This approach leverages Workhorse's existing multipart parsing capabilities and keeps memory usage low.
- Pros: Memory-efficient; consistent with existing Workhorse patterns; no Rack limitations
- Cons: Requires changes to both Workhorse and Rails; more complex implementation
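On the Rails side, Option 1 could look roughly like the sketch below. The metadata shape (`{"content" => {"path" => ...}}`) and the method name are hypothetical illustrations, not Workhorse's actual protocol:

```ruby
require "tempfile"

# Hypothetical sketch of the Rails side of Option 1. Assumes Workhorse has
# already extracted large form fields to temp files and forwarded metadata
# such as {"content" => {"path" => "/tmp/..."}}. The metadata format and
# method name here are illustrative only.
def file_params_from_extracted(extracted_metadata, inline_params)
  params = inline_params.dup
  extracted_metadata.each do |field, info|
    # Read each pre-extracted field from disk rather than re-parsing the
    # multipart body in memory with Rack::Multipart.parse_multipart.
    params[field] = File.binread(info["path"])
  end
  params
end
```

Because Rack never sees the large fields, the 16MB `BufferPart` limit is never hit, and memory usage stays bounded by however Rails chooses to consume the extracted files.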
Option 2: Implement streaming multipart parser in Rails
Replace `Rack::Multipart.parse_multipart` in `file_params_from_body_upload` with a custom streaming parser that writes large fields to temporary files instead of buffering them in memory.
- Pros: Memory-efficient; changes contained to Rails
- Cons: Requires implementing/maintaining a custom multipart parser; need to ensure tempfiles are properly cleaned up
Option 3: Recommend application/json for large payloads
The JSON content type path already uses `Oj.load_file(file_path)`, which streams from the file without loading everything into memory. We could:
- Document that `application/json` should be used for files >16MB
- Return a helpful error message suggesting JSON format when multipart fails due to size
- Pros: No code changes needed for the happy path; leverages existing efficient code path
- Cons: May break existing client integrations; requires client-side changes
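For reference, a client-side request using the JSON path might be built like this (the host, project ID, file path, branch, and token are placeholders):

```ruby
require "net/http"
require "json"
require "uri"

# Placeholder values; substitute a real project ID, file path, branch, and
# token before sending.
uri = URI("https://gitlab.example.com/api/v4/projects/42/repository/files/large.bin")

request = Net::HTTP::Post.new(uri)
request["PRIVATE-TOKEN"] = "<your_access_token>"
request["Content-Type"] = "application/json"
request.body = JSON.generate(
  branch: "main",
  commit_message: "Add large file",
  encoding: "base64",
  content: "<base64-encoded file contents>"
)

# To actually send the request:
# Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
```

Because the body is a single JSON document rather than multipart form fields, it is parsed via the streaming `Oj.load_file` path and never hits Rack's 16MB buffer.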
Option 4: Handle EOFError gracefully with informative error message
Catch the `EOFError` and return a 400 Bad Request with a message explaining the limitation and suggesting alternatives (e.g., use the `application/json` content type).
- Pros: Better user experience; guides users to working solutions; quick to implement
- Cons: Doesn't fix the underlying limitation for multipart users
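Option 4 could be sketched as a thin wrapper around the parse call (the helper name and message wording are hypothetical; in Grape the response would go through `error!` rather than a plain status/body pair):

```ruby
# Hypothetical sketch of Option 4: convert the opaque EOFError raised by
# Rack's multipart parser into an actionable 400 response.
MULTIPART_LIMIT_MESSAGE =
  "Form fields in multipart/form-data requests are limited to 16MB. " \
  "For larger files, send the request as application/json instead."

def parse_body_with_fallback
  yield
rescue EOFError => e
  # Only intercept the retained-size failure; re-raise anything else.
  raise unless e.message.include?("retained size limit")
  [400, { error: MULTIPART_LIMIT_MESSAGE }]
end
```

Callers that parse successfully are unaffected; only the oversized-multipart case is translated into a guided error.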
Recommendation
Short-term: Implement Option 4 to provide immediate relief with a helpful error message.
Long-term: Implement Option 1 (Workhorse pre-processing) for a proper fix that maintains memory efficiency and supports the full 300MB request size with multipart/form-data.