CommitLanguages RPC fails with "linguist language_id not found" when .gitattributes uses a language alias

Summary

The CommitLanguages gRPC endpoint returns an Internal error when a repository's .gitattributes file specifies a linguist-language override using a language alias (e.g., omnetpp-msg) rather than the canonical language name (e.g., OMNeT++ MSG).

This surfaces in GitLab Rails as a GRPC::Internal exception in DetectRepositoryLanguagesWorker, causing repository language detection to fail entirely for the affected project.

Error

GRPC::Internal: 13:linguist language_id fetch: linguist language_id not found: omnetpp-msg

Backtrace

from grpc/generic/active_call.rb:29:in `check_status'
from grpc/generic/active_call.rb:189:in `attach_status_results_and_complete_call'
from grpc/generic/active_call.rb:383:in `request_response'
from grpc/generic/client_stub.rb:180:in `block in request_response'
from grpc/generic/interceptors.rb:170:in `intercept!'
from grpc/generic/client_stub.rb:179:in `request_response'
from grpc/generic/service.rb:171:in `block (3 levels) in rpc_stub_class'
from lib/gitlab/gitaly_client.rb:358:in `execute'
from lib/gitlab/gitaly_client/call.rb:28:in `block in call'
from lib/gitlab/gitaly_client/call.rb:88:in `recording_request'
from lib/gitlab/gitaly_client/call.rb:27:in `call'
from lib/gitlab/gitaly_client.rb:347:in `call'
from lib/gitlab/gitaly_client/with_feature_flag_actors.rb:31:in `block in gitaly_client_call'
from lib/gitlab/gitaly_client.rb:719:in `with_feature_flag_actors'
from lib/gitlab/gitaly_client/with_feature_flag_actors.rb:25:in `gitaly_client_call'
from lib/gitlab/gitaly_client/commit_service.rb:436:in `languages'
from lib/gitlab/git/repository.rb:844:in `block in languages'
from lib/gitlab/git/wraps_gitaly_errors.rb:7:in `wrapped_gitaly_errors'
from lib/gitlab/git/repository.rb:843:in `languages'
from app/models/repository.rb:386:in `languages'
from lib/gitlab/language_detection.rb:64:in `detection'
from lib/gitlab/language_detection.rb:13:in `languages'
from app/services/projects/detect_repository_languages_service.rb:40:in `ensure_programming_languages'
from app/services/projects/detect_repository_languages_service.rb:12:in `execute'
from ee/projects/detect_repository_languages_service.rb:7:in `execute'
from app/workers/detect_repository_languages_worker.rb:22:in `block in perform'
from app/services/concerns/exclusive_lease_guard.rb:32:in `try_obtain_lease'
from app/workers/detect_repository_languages_worker.rb:21:in `perform'

Root cause

The error originates in Gitaly's CommitLanguages RPC handler. The code path is:

  1. fileInstance.getLanguage() (internal/gitaly/linguist/file_instance.go:78-84) reads the linguist-language attribute from .gitattributes and returns its value verbatim — no alias resolution is performed. When a user sets linguist-language=omnetpp-msg, the string "omnetpp-msg" is returned as the language name.

  2. This raw alias value flows through DetermineStats() → languageStats.add() and becomes a key in the ByteCountPerLanguage map.

  3. CommitLanguages() (internal/gitaly/service/commit/languages.go:72-76) iterates over the stats and calls linguist.LanguageID(lang) for each language name.

  4. LanguageID() (internal/gitaly/linguist/linguist.go:54-62) delegates to enry.GetLanguageID(), which here receives "omnetpp-msg". That function only accepts canonical language names; the go-enry documentation explicitly states: "The input must be the canonical language name. Aliases are not supported." The canonical name "OMNeT++ MSG" has ID 664100008, but the alias "omnetpp-msg" is not present in the ID lookup map, so the lookup fails and the RPC returns the Internal error shown above.
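
The failing lookup, and the effect of resolving the alias first, can be sketched in a few lines of Go. This is a minimal illustration, not Gitaly's code: canonicalIDs and aliasToCanonical are hypothetical one-entry stand-ins for enry's tables, and languageID and resolveAlias are hypothetical helper names.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalIDs is a hypothetical one-entry stand-in for the
// canonical-name -> ID table consulted by enry.GetLanguageID.
var canonicalIDs = map[string]int{
	"OMNeT++ MSG": 664100008,
}

// aliasToCanonical is a hypothetical stand-in for an alias table.
var aliasToCanonical = map[string]string{
	"omnetpp-msg": "OMNeT++ MSG",
}

// languageID mimics the failing lookup: it accepts canonical names only.
func languageID(name string) (int, error) {
	id, ok := canonicalIDs[name]
	if !ok {
		return 0, fmt.Errorf("linguist language_id not found: %s", name)
	}
	return id, nil
}

// resolveAlias returns the canonical name for an alias. Names that are
// already canonical, or that are unknown, are returned unchanged.
func resolveAlias(name string) string {
	if _, ok := canonicalIDs[name]; ok {
		return name
	}
	if canonical, ok := aliasToCanonical[strings.ToLower(name)]; ok {
		return canonical
	}
	return name
}

func main() {
	raw := "omnetpp-msg" // value read verbatim from .gitattributes

	// Without alias resolution the lookup fails, as in the report.
	if _, err := languageID(raw); err != nil {
		fmt.Println("without resolution:", err)
	}

	// Resolving the alias first lets the canonical lookup succeed.
	id, err := languageID(resolveAlias(raw))
	fmt.Println("with resolution:", id, err)
}
```

In Gitaly itself, the stand-in alias table could plausibly be replaced by go-enry's GetLanguageByAlias, applied where the linguist-language attribute is read in step 1. Whether that is the right place for the fix is an assumption, not something this report confirms.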

Additionally, the unresolved alias causes a secondary silent bug: enry.GetLanguageType() also only recognizes canonical names, so isIgnoredLanguage() misclassifies aliases of programming/markup languages as Unknown type, potentially causing files to be silently excluded from language statistics even when no error is raised.
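
The secondary misclassification can be illustrated the same way. The sketch below assumes the ignore check keeps only programming and markup languages, as the paragraph above implies. The type constants, tables, and function names are hypothetical stand-ins and do not reproduce Gitaly's isIgnoredLanguage() or enry.GetLanguageType().

```go
package main

import "fmt"

// languageType is a hypothetical stand-in for enry's language type enum.
type languageType int

const (
	typeUnknown languageType = iota // zero value: returned for unrecognized names
	typeProgramming
	typeMarkup
)

// canonicalTypes is a hypothetical stand-in for a canonical-name -> type table.
var canonicalTypes = map[string]languageType{
	"Go": typeProgramming,
}

// aliasToCanonical is a hypothetical stand-in for an alias table.
var aliasToCanonical = map[string]string{
	"golang": "Go",
}

// languageTypeOf recognizes canonical names only; anything else is typeUnknown.
func languageTypeOf(name string) languageType {
	return canonicalTypes[name]
}

// isIgnored reports whether a language would be dropped from the statistics
// under the assumed rule: keep programming and markup languages only.
func isIgnored(name string) bool {
	t := languageTypeOf(name)
	return t != typeProgramming && t != typeMarkup
}

func main() {
	// The alias is unknown to the type table, so the file is silently ignored.
	fmt.Println("golang ignored:", isIgnored("golang"))

	// After alias resolution the language is classified as programming.
	fmt.Println("Go ignored:", isIgnored(aliasToCanonical["golang"]))
}
```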

The bug affects any language specified by alias in .gitattributes, not just omnetpp-msg. Other examples include cpp (alias for C++), golang (alias for Go), and rb (alias for Ruby).

Reproduction

Create a repository with a .gitattributes file containing:

*.msg linguist-language=omnetpp-msg
*.msg linguist-detectable

Also add a .msg file to the repository, then trigger DetectRepositoryLanguagesWorker for the project (e.g., by pushing a commit). The worker fails with the GRPC::Internal error shown above.