Language detection crashing
We are a Premium customer and have been experiencing a crash in the DetectRepositoryLanguagesWorker on every update to our default branch, which litters Sidekiq with dead jobs. I dug into this, and it appears to be a bug in how Gitaly uses the Linguist API. I matched the trace to an issue on Linguist's GitHub that was closed as 'expected behaviour', which suggests the crash is caused by incorrect usage of the API: https://github.com/github/linguist/issues/4995
Logs
Error Class: Gitlab::Git::CommandError
Error Message: 2:exit status 1.
Error Backtrace:
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/git/wraps_gitaly_errors.rb:13:in `rescue in wrapped_gitaly_errors'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/git/wraps_gitaly_errors.rb:6:in `wrapped_gitaly_errors'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/git/repository.rb:745:in `languages'
/opt/gitlab/embedded/service/gitlab-rails/app/models/repository.rb:293:in `languages'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/language_detection.rb:64:in `detection'
Gitaly Trace:
{"correlation_id":"GOXShPps3U6","grpc.meta.auth_version":"v2","grpc.meta.client_name":"gitlab-sidekiq","grpc.meta.deadline_type":"unknown","grpc.method":"CommitLanguages","grpc.request.deadline":"2020-10-28T19:48:49Z","grpc.request.fullMethod":"/gitaly.CommitService/CommitLanguages","grpc.request.glProjectPath":"REDACTED","grpc.request.glRepository":"project-61","grpc.request.repoPath":"@hashed/REDACTED.git","grpc.request.repoStorage":"default","grpc.request.topLevelGroup":"@hashed","grpc.service":"gitaly.CommitService","grpc.start_time":"2020-10-28T13:48:49Z","level":"error","msg":"PID 32695 BUNDLE_GEMFILE=/opt/gitlab/embedded/service/gitaly-ruby/Gemfile\
undefined method `[]' for nil:NilClass\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/classifier.rb:133:in `token_probability'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/classifier.rb:122:in `block in tokens_probability'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/classifier.rb:121:in `each'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/classifier.rb:121:in `inject'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/classifier.rb:121:in `tokens_probability'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/classifier.rb:107:in `block in classify'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/classifier.rb:106:in `each'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/classifier.rb:106:in `classify'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/classifier.rb:80:in `classify'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/classifier.rb:22:in `call'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist.rb:32:in `block (3 levels) in detect'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist.rb:100:in `instrument'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist.rb:31:in `block (2 levels) in detect'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist.rb:29:in `each'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist.rb:29:in `block in detect'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist.rb:100:in `instrument'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist.rb:24:in `detect'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/blob_helper.rb:368:in `language'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/lazy_blob.rb:70:in `language'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/blob_helper.rb:383:in `include_in_language_stats?'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/repository.rb:164:in `block in compute_stats'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/repository.rb:149:in `each_delta'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/repository.rb:149:in `compute_stats'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/repository.rb:116:in `cache'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/lib/linguist/repository.rb:68:in `languages'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/bin/git-linguist:123:in `block in git_linguist'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/bin/git-linguist:35:in `linguist'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/bin/git-linguist:122:in `git_linguist'\
/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/github-linguist-7.11.0/bin/git-linguist:147:in `\u003ctop (required)\u003e'\
/opt/gitlab/embedded/bin/git-linguist:23:in `load'\
/opt/gitlab/embedded/bin/git-linguist:23:in `\u003cmain\u003e'\
","peer.address":"@","pid":272,"span.kind":"server","system":"grpc","time":"2020-10-28T13:49:01.826Z"}
We are running GitLab 13.5.3-ee using the official Docker image.
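For anyone reading the trace above: the `undefined method '[]' for nil:NilClass` message is what Ruby raises when code indexes into a value that turned out to be nil. Below is a minimal sketch of that failure mode. The `token_counts` table and the `lookup` helper are hypothetical and made up for illustration; they are not Linguist's actual data structures or API, and I have not confirmed this is exactly what happens inside `token_probability`.

```ruby
# Hypothetical lookup table for illustration only; this is NOT Linguist's
# real classifier data.
token_counts = { "Ruby" => { "def" => 3 } }

# Hypothetical helper: a nested hash lookup with no nil check.
# If `language` is missing from the table, table[language] is nil and
# calling [] on nil raises NoMethodError.
def lookup(table, language, token)
  table[language][token]
end

# Capture the error instead of letting it crash the script.
error = begin
  lookup(token_counts, "Unknown", "def")
  nil
rescue NoMethodError => e
  e
end

puts error.class
puts error.message
```

The exact wording of the message varies between Ruby versions, but the error class is the same as in the Gitaly trace.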