Encoding::UndefinedConversionError in RemoteService#update_remote_mirror with Unicode branch names
Summary
Follow-up to #566590 (closed). Repository mirror updates fail when a branch name contains non-ASCII characters, which cannot be transcoded from UTF-8 to ASCII-8BIT.
Problem Description
The RemoteService#update_remote_mirror method crashes with Encoding::UndefinedConversionError: U+00E9 from UTF-8 to ASCII-8BIT when it encounters a branch name containing a non-ASCII character such as é (U+00E9).
The error originates in lib/gitlab/gitaly_client/remote_service.rb:67 during Gitaly request initialization, preventing repositories from updating remote mirrors when they contain branches with Unicode names.
Encoding::UndefinedConversionError: U+00E9 from UTF-8 to ASCII-8BIT (Encoding::UndefinedConversionError)
from lib/gitlab/gitaly_client/remote_service.rb:67:in `initialize'
from lib/gitlab/gitaly_client/remote_service.rb:67:in `new'
from lib/gitlab/gitaly_client/remote_service.rb:67:in `block (2 levels) in update_remote_mirror'
from lib/gitlab/gitaly_client/remote_service.rb:66:in `<<'
from lib/gitlab/gitaly_client/remote_service.rb:66:in `each'
from lib/gitlab/gitaly_client/remote_service.rb:66:in `each'
from lib/gitlab/gitaly_client/remote_service.rb:66:in `block in update_remote_mirror'
Root Cause
Similar to the issue in CommitService, the problem occurs when branch names in the only_branches_matching array contain non-ASCII characters. Each slice holds a batch of these branch names as UTF-8 strings. When a slice is passed to Gitaly::UpdateRemoteMirrorRequest.new(only_branches_matching: slice), the Gitaly protobuf initialization tries to convert each string from UTF-8 to ASCII-8BIT, and that conversion raises for any character outside the ASCII range.
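The conversion failure can be reproduced in plain Ruby, without GitLab or Gitaly. The sketch below assumes the protobuf layer does something equivalent to String#encode; the helper name transcode_to_binary and the branch names are made up for illustration. It shows why ASCII-only branch names never trigger the error while a name containing é does.

```ruby
# Plain-Ruby illustration only; no GitLab or Gitaly code is involved.
# transcode_to_binary is a hypothetical helper that mimics a UTF-8 to
# ASCII-8BIT transcoding step.
def transcode_to_binary(name)
  name.encode(Encoding::ASCII_8BIT)
end

ascii_branch   = "feature/cafe"
unicode_branch = "feature/caf\u00E9" # ends with é (U+00E9)

# ASCII-only names transcode without error.
ascii_result = transcode_to_binary(ascii_branch)

# A non-ASCII character has no ASCII-8BIT equivalent, so transcoding raises.
error_message =
  begin
    transcode_to_binary(unicode_branch)
    nil
  rescue Encoding::UndefinedConversionError => e
    e.message
  end

puts ascii_result.encoding
puts error_message
```

This is consistent with mirrors that have only ASCII branch names continuing to work while a single non-ASCII branch name aborts the whole update.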
Proposed Fix
Update lib/gitlab/gitaly_client/remote_service.rb:67 to binary-encode each branch name before passing it to the Gitaly request:
slices.each do |slice|
  # Tag each branch name as binary so protobuf does not try to transcode it
  encoded_slice = slice.map { |branch_name| encode_binary(branch_name) }
  y.yield Gitaly::UpdateRemoteMirrorRequest.new(only_branches_matching: encoded_slice)
end
This follows the pattern used elsewhere in the codebase, where encode_binary handles UTF-8 strings that must be passed to Gitaly.
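As a rough illustration of why this avoids the error, the sketch below uses a hypothetical stand-in named to_binary. It assumes encode_binary relabels a copy of the string as binary with force_encoding instead of transcoding it; the real helper may differ in detail. The sample slice is invented.

```ruby
# Hypothetical stand-in for GitLab's encode_binary helper (assumed behavior):
# relabel a copy of the string as binary without touching its bytes.
def to_binary(str)
  return "" if str.nil?

  str.dup.force_encoding(Encoding::BINARY)
end

slice = ["main", "feature/caf\u00E9"]
encoded_slice = slice.map { |branch_name| to_binary(branch_name) }

# Every element is now tagged as binary, so no transcoding is needed later.
all_binary = encoded_slice.all? { |name| name.encoding == Encoding::BINARY }

# The bytes are identical to the UTF-8 originals, so the branch name
# is passed along byte for byte.
same_bytes = encoded_slice.last.bytes == slice.last.bytes

# The original strings are left untouched because the helper works on a copy.
original_encoding = slice.last.encoding

puts all_binary
puts same_bytes
puts original_encoding
```

Because force_encoding only changes the encoding label, it cannot raise a conversion error, and the receiving side still sees the original UTF-8 byte sequence.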
Impact
- Remote mirror updates fail for repositories with Unicode branch names
- Affects repositories with international branch naming conventions
- Related to the broader encoding issue identified in #566590 (closed)
Related Issues
- #566590 (closed) - Similar encoding issue in DetectRepositoryLanguagesWorker