Encoding::UndefinedConversionError when viewing Kubernetes pod job logs

Summary

  1. When a pod log contains bytes that are not valid UTF-8, a red banner appears with the message:

    Environments|An error occurred while fetching the logs

    Reported (Zendesk, internal use only) by a 36-seat premium customer.

  2. An error also happens with valid UTF-8 characters: https://log.gitlab.net/app/kibana#/doc/AW5F1e45qthdGjPJueGO/pubsub-rails-inf-gprd-2020.01.29-000015/doc?id=AW_xhIFNqojRxKGhkIsY&_g=h@44136fa

    String that seems to have caused this exception:

    ✔ Started logging errors to Sentry

    Stacktrace:

    app/controllers/application_controller.rb:121:in `render',
    ee/app/controllers/projects/logs_controller.rb:28:in `k8s',
    ee/lib/gitlab/ip_address_state.rb:10:in `with',
    ee/app/controllers/ee/application_controller.rb:43:in `set_current_ip_address',
    lib/gitlab/session.rb:11:in `with_session',
    app/controllers/application_controller.rb:468:in `set_session_storage',
    app/controllers/application_controller.rb:462:in `set_locale',
    lib/gitlab/application_context.rb:46:in `block in use',
    lib/gitlab/application_context.rb:46:in `use',
    lib/gitlab/application_context.rb:19:in `with_context',
    app/controllers/application_controller.rb:453:in `set_current_context',
    lib/gitlab/error_tracking.rb:34:in `with_context',
    app/controllers/application_controller.rb:546:in `sentry_context',
    ee/lib/omni_auth/strategies/group_saml.rb:41:in `other_phase',
    ee/lib/gitlab/jira/middleware.rb:19:in `call'

Steps to reproduce

I created a reproduction project you can use to deploy to a cluster: https://gitlab.com/weimeng/repro-pod-logs-utf8-error.

The relevant Docker image is docker.io/weimeng/utf8-test:latest.

Relevant logs and/or screenshots

The production_json.log contains this entry for the failing request:

{"method":"GET","path":"/svc/bapi/user/environments/205/logs.json","format":"json","controller":"Projects::EnvironmentsController","action":"logs","status":500,"error":"Encoding::UndefinedConversionError: \"\\xE2\" from ASCII-8BIT to UTF-8","duration":47.99,"view":0.0,"db":9.9,"time":"2019-10-24T21:37:13.849Z","params":[{"key":"pod_name","value":"user-84849c6fb5-985z4"},{"key":"namespace_id","value":"svc/bapi"},{"key":"project_id","value":"user"},{"key":"id","value":"205"}],"remote_ip":"107.130.181.82","user_id":29,"username":"jandrews","ua":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:69.0) Gecko/20100101 Firefox/69.0","queue_duration":13.65,"correlation_id":"70nCqqpbN33","cpu_s":0.07199061599999368}
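The exception in this entry can be reproduced in plain Ruby, outside GitLab. This is a minimal sketch, assuming the pod log reaches the controller as a string tagged ASCII-8BIT (which is what the error message suggests). `transcode_error` is a hypothetical helper used only for illustration.

```ruby
# The string from the Kibana entry, re-tagged as binary (ASCII-8BIT).
# Assumption: this mirrors how the raw pod log is tagged when it arrives.
raw_log = "✔ Started logging errors to Sentry".b

# Hypothetical helper: try to transcode to UTF-8 and return the exception
# (or nil if the conversion succeeds).
def transcode_error(str)
  str.encode(Encoding::UTF_8)
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

error = transcode_error(raw_log)
puts error.class           # the exception class raised by the conversion
puts error.error_char.dump # the byte that could not be converted
```

The check mark is encoded in UTF-8 as the bytes E2 9C 94, so the first byte that fails to convert is \xE2, the same byte named in the log entry. This would explain why valid UTF-8 text also triggers the error: the bytes are fine, but the string is tagged as binary.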

Output of checks

This happens on v12.4.0-ee.

Possible fixes

I created two patches to resolve this:

Against v12.4.0-ee: pod_logs_utf8-12_4.patch

Against master: pod_logs_utf8.patch
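The patch contents are not shown here. One possible approach, which may differ from what the patches do, is to re-tag the raw log as UTF-8 and replace any bytes that are still invalid before rendering. The sketch below assumes the log body arrives as an ASCII-8BIT string; `sanitize_pod_log` is a hypothetical helper name, not an existing GitLab method.

```ruby
# Hypothetical helper: make a raw pod log safe to render as UTF-8.
def sanitize_pod_log(raw)
  # Work on a copy so the caller's string is not mutated, then re-tag
  # the bytes as UTF-8 without transcoding them.
  text = raw.dup.force_encoding(Encoding::UTF_8)

  # Valid UTF-8 passes through unchanged; otherwise each invalid byte
  # sequence is replaced with U+FFFD (the replacement character).
  text.valid_encoding? ? text : text.scrub("\uFFFD")
end

valid   = sanitize_pod_log("✔ Started logging errors to Sentry".b)
invalid = sanitize_pod_log("bad \xFF byte".b)

puts valid
puts invalid
```

With this approach, valid UTF-8 logs such as the check mark example are preserved as-is, and genuinely invalid bytes are shown as replacement characters instead of causing a 500 response.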

Edited Feb 24, 2020 by Reuben Pereira