EOFError: multipart data over retained size limit when committing large files via Repository Files API with multipart/form-data

Summary

Users hit EOFError: multipart data over retained size limit when attempting to create or update files larger than 16MB via the Repository Files API (POST/PUT /api/v4/projects/:id/repository/files/:file_path) using the multipart/form-data content type.

Root Cause

The error originates from Rack's multipart parser, which enforces a hardcoded in-memory buffer limit of 16MB (BUFFERED_UPLOAD_BYTESIZE_LIMIT) for form fields that are not file uploads.

When processing multipart/form-data requests:

  1. Workhorse intercepts the request and saves the entire request body to a temporary file (up to 300MB)
  2. Workhorse forwards metadata about the saved file to Rails
  3. Rails' file_params_from_body_upload method re-parses the saved file using Rack::Multipart.parse_multipart
  4. The content field (containing the file data to be committed) is sent as a regular form field without a filename
  5. Rack's parser determines how to handle each field based on the presence of a filename:
    • With filename → TempfilePart → streams to disk (no memory limit)
    • Without filename → BufferPart → buffers entirely in memory
  6. Since content has no filename, Rack buffers it in memory
  7. When the content field exceeds 16MB, Rack's update_retained_size() method raises EOFError: multipart data over retained size limit

This creates a mismatch: CommitsUploader allows requests up to 300MB (DEFAULT_MAX_REQUEST_SIZE), but Rack's internal buffer limit is only 16MB.
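The filename-based branching in step 5 can be sketched in plain Ruby. This is a hypothetical simplification of Rack's behavior, not its actual code; only the 16MB constant matches Rack's BUFFERED_UPLOAD_BYTESIZE_LIMIT:

```ruby
require "tempfile"

# Rack's in-memory cap for parts without a filename (non-file form fields).
BUFFER_LIMIT = 16 * 1024 * 1024

# Simplified per-part decision: a part whose Content-Disposition carries a
# filename is streamed to a tempfile (no memory cap); a part without one is
# buffered in memory and rejected once it exceeds the limit.
def handle_part(disposition, data)
  if disposition =~ /filename="([^"]*)"/
    file = Tempfile.new("part")
    file.write(data)
    file.rewind
    { kind: :tempfile, bytes: file.size }
  else
    raise EOFError, "multipart data over retained size limit" if data.bytesize > BUFFER_LIMIT
    { kind: :buffered, bytes: data.bytesize }
  end
end
```

Because the Repository Files API sends content as a plain form field, it always takes the second branch, regardless of how large the committed file is.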

Relevant code path:

  • lib/api/helpers/commits_body_uploader_helper.rb:38 calls Rack::Multipart.parse_multipart(env)
  • Rack's multipart/parser.rb:349-351 enforces the 16MB limit

Sentry Error

https://new-sentry.gitlab.net/organizations/gitlab/issues/3332849/

Backtrace:

EOFError: multipart data over retained size limit (EOFError)
  from rack/multipart/parser.rb:350:in `update_retained_size'
  from rack/multipart/parser.rb:336:in `handle_mime_body'
  from rack/multipart/parser.rb:250:in `block in run_parser'
  from <internal:kernel>:187:in `loop'
  from rack/multipart/parser.rb:241:in `run_parser'
  from rack/multipart/parser.rb:225:in `on_read'
  from rack/multipart/parser.rb:101:in `block in parse'
  from <internal:kernel>:187:in `loop'
  from rack/multipart/parser.rb:99:in `parse'
  from rack/multipart.rb:53:in `extract_multipart'
  from config/initializers/rack_multipart_patch.rb:10:in `extract_multipart'
  from rack/multipart.rb:41:in `parse_multipart'
  from lib/api/helpers/commits_body_uploader_helper.rb:38:in `file_params_from_body_upload'
  from lib/api/files.rb:336:in `block (2 levels) in <class:Files>'
  from grape/endpoint.rb:58:in `call'
  from grape/endpoint.rb:58:in `block (2 levels) in generate_api_method'
  from active_support/notifications.rb:212:in `instrument'
  from grape/endpoint.rb:57:in `block in generate_api_method'
  from grape/endpoint.rb:328:in `execute'
  from grape/endpoint.rb:260:in `block in run'

Possible Solutions

Option 1: Pre-process multipart/form-data in Workhorse

Extend Workhorse's body_uploader.go to detect the multipart/form-data content type and pre-process it, similar to how rewrite.go handles regular multipart uploads. Workhorse would:

  1. Parse the multipart body
  2. Extract large fields (like content) to separate temporary files
  3. Forward metadata about extracted fields to Rails (similar to how file uploads are handled)
  4. Rails would then read large fields from files instead of parsing them from the multipart body

This approach leverages Workhorse's existing multipart parsing capabilities and keeps memory usage low.

  • Pros: Memory-efficient; consistent with existing Workhorse patterns; no Rack limitations
  • Cons: Requires changes to both Workhorse and Rails; more complex implementation
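On the Rails side, consuming a Workhorse-extracted field might look like the sketch below. The "content.path"/"content" metadata keys are assumptions for illustration, mirroring how Workhorse forwards file-upload metadata today; they are not an existing API:

```ruby
require "tempfile"

# Hypothetical Rails-side helper: if Workhorse extracted the large field to a
# tempfile, read it back from disk; a small field still arrives inline.
def read_content_field(metadata)
  if (path = metadata["content.path"])
    # Reading the whole file here is for brevity; the open IO could instead be
    # handed to the commit service to keep memory bounded.
    File.open(path, "rb", &:read)
  else
    metadata["content"]
  end
end
```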

Option 2: Implement streaming multipart parser in Rails

Replace Rack::Multipart.parse_multipart in file_params_from_body_upload with a custom streaming parser that writes large fields to temporary files instead of buffering in memory.

  • Pros: Memory-efficient; changes contained to Rails
  • Cons: Requires implementing/maintaining a custom multipart parser; need to ensure tempfiles are properly cleaned up
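A minimal sketch of the streaming idea (illustrative, not RFC 7578-complete: it reads line by line, skips part headers, and leaves the trailing CRLF in each body):

```ruby
require "tempfile"
require "stringio"

# Stream each multipart body straight to a tempfile so memory stays bounded
# regardless of field size. A production parser must also handle boundary
# CRLF pairs, quoted parameters, and chunked reads for header-less lines.
def stream_parts(io, boundary)
  delim = "--#{boundary}"
  parts = []
  current = nil
  in_headers = false

  io.each_line do |line|
    if line.start_with?(delim)
      current&.rewind
      break if line.start_with?("#{delim}--")  # closing boundary
      current = Tempfile.new("part")
      parts << current
      in_headers = true
    elsif in_headers
      in_headers = false if line.strip.empty?  # blank line ends part headers
    elsif current
      current.write(line)
    end
  end
  parts
end
```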

Option 3: Recommend application/json for large payloads

The JSON content type path already uses Oj.load_file(file_path), which streams from the file without loading everything into memory. We could:

  • Document that application/json should be used for files >16MB
  • Return a helpful error message suggesting JSON format when multipart fails due to size

  • Pros: No code changes needed for the happy path; leverages existing efficient code path
  • Cons: May break existing client integrations; requires client-side changes
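For clients, the equivalent JSON request body for POST /api/v4/projects/:id/repository/files/:file_path can be built as below; with encoding set to "base64", the content field can carry arbitrary binary data:

```ruby
require "json"
require "base64"

# Build the JSON body for the Repository Files API. The JSON path is parsed
# from disk (Oj.load_file), so it avoids Rack's 16MB multipart buffer limit.
def file_commit_payload(branch:, commit_message:, content:)
  {
    branch: branch,
    commit_message: commit_message,
    encoding: "base64",
    content: Base64.strict_encode64(content)
  }.to_json
end
```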

Option 4: Handle EOFError gracefully with informative error message

Catch the EOFError and return a 400 Bad Request with a message explaining the limitation and suggesting alternatives (e.g., use application/json content type).

  • Pros: Better user experience; guides users to working solutions; quick to implement
  • Cons: Doesn't fix the underlying limitation for multipart users
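A sketch of the rescue (the helper name and error body are illustrative; in the Grape endpoint the returned hash would instead become an error! call, and the block stands in for the Rack::Multipart.parse_multipart call):

```ruby
LIMIT_MESSAGE = "multipart data over retained size limit"

# Translate Rack's internal EOFError into a client-facing 400 with guidance;
# any other EOFError is re-raised untouched.
def with_multipart_limit_handling
  yield
rescue EOFError => e
  raise unless e.message.include?(LIMIT_MESSAGE)
  { status: 400,
    message: "Request body too large for multipart/form-data; " \
             "retry with application/json and base64-encoded content." }
end
```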

Recommendation

Short-term: Implement Option 4 to provide immediate relief with a helpful error message.

Long-term: Implement Option 1 (Workhorse pre-processing) for a proper fix that maintains memory efficiency and supports the full 300MB request size with multipart/form-data.

Edited by Vasilii Iakliushin