Invalid UTF-8 results in 500 error on pipeline and job page
Summary
If a job's trace isn't properly UTF-8 encoded, attempting to view the job's trace in the UI (/namespace/project/-/jobs/job_id
) will result in a "An error occurred while fetching the job log" message being displayed. Viewing the network panel in the browser's debug tools shows a 500 error being returned for the request to https://gitlab.example.com/namespace/project/-/jobs/job_id/trace.json?state=
.
If the job also happens to be marked as failed, a 500 error will be displayed when viewing the pipeline associated with the job in the UI (/namespace/project/-/pipelines/pipeline_id
). I suspect this is because we attempt to determine the failure reason to populate the dialog box when hovering over a failed pipeline in this view. If the job is marked as a success, only the job view (noted above) will result in an error.
Initially discovered via customer ticket: https://gitlab.zendesk.com/agent/tickets/224811 (internal use)
Steps to reproduce
This is a tough one to reproduce, and requires a bit of leg work on a self-managed instance to do so reliably. We believe the length of the job in the above support case caused the trace truncation service to inadvertently malform the encoding, but its hard to reproduce this.
Instead, it's easier to manually create a simple, failed job and manually replace the job log on the instance directly with a mock trace file that reproduces the issue.
- Create a project for testing purposes on a self-managed instance.
- Create a pipeline with a single job that will be marked as failed, like so:
job:
script:
- exit 1
-
Copy the attached
bad.log
file to the GitLab instance, or use the attachedbad_log.rb
Ruby script to create one. bad.log bad_log.rb -
Find the location of the above job's trace using the job ID (in this example its 4153), and change into the relevant directory:
$ find /var/opt/gitlab/gitlab-rails/shared/artifacts -name "4153"
/var/opt/gitlab/gitlab-rails/shared/artifacts/c2/35/c2356069e9d1e79ca924378153cfbbfb4d4416b1f99d41a2940bfdb66c5319db/2021_07_16/4153
$ cd cd /var/opt/gitlab/gitlab-rails/shared/artifacts/c2/35/c2356069e9d1e79ca924378153cfbbfb4d4416b1f99d41a2940bfdb66c5319db/2021_07_16/4153/2575/
$ ls -la
total 12
drwxr-xr-x 2 git git 4096 Jul 16 19:40 .
drwxr-xr-x 3 git git 4096 Jul 16 19:40 ..
-rw-r--r-- 1 git git 1790 Jul 16 19:40 job.log
- Move the
bad.log
file to the current directory, assign it proper permissions and rename the oldjob.log
file.
$ cp /root/bad.log . && chown git:git bad.log && mv job.log job_backup.log && mv bad.log job.log
$ ls -la
total 16
drwxr-xr-x 2 git git 4096 Jul 16 19:57 .
drwxr-xr-x 3 git git 4096 Jul 16 19:40 ..
-rw-r--r-- 1 git git 1790 Jul 16 19:40 job_backup.log
-rw-r--r-- 1 git git 1170 Jul 16 19:57 job.log
- Attempt to view the job and the pipeline pages in the UI, you should see the 500 errors.
What is the current bug behavior?
Invalid UTF-8 encoding of job trace results in 500 errors on the pipeline/job views.
What is the expected correct behavior?
Invalid UTF-8 job traces are properly handled and pipeline/job views in the UI load successfully.
Relevant logs and/or screenshots
Error when viewing the pipeline view for the relevant job:
ActionView::Template::Error invalid byte sequence in UTF-8 lib/gitlab/ci/ansi2html.rb:263:in `scan', lib/gitlab/ci/ansi2html.rb:263:in `block in convert', lib/gitlab/ci/ansi2html.rb:258:in `each_line', lib/gitlab/ci/ansi2html.rb:258:in `convert', lib/gitlab/ci/ansi2html.rb:30:in `convert', lib/gitlab/ci/trace/stream.rb:73:in `html', lib/gitlab/ci/trace.rb:33:in `block in html', lib/gitlab/ci/trace.rb:163:in `read_stream', lib/gitlab/ci/trace.rb:86:in `read', lib/gitlab/ci/trace.rb:32:in `html', app/helpers/ci/builds_helper.rb:10:in `build_summary', app/views/projects/pipelines/_with_tabs.html.haml:78, app/views/projects/pipelines/_with_tabs.html.haml:54:in `each_with_index', app/views/projects/pipelines/_with_tabs.html.haml:54, app/views/projects/pipelines/show.html.haml:31, app/controllers/application_controller.rb:124:in `render', app/controllers/projects/pipelines_controller.rb:224:in `block (2 levels) in render_show', app/controllers/projects/pipelines_controller.rb:222:in `render_show', app/controllers/projects/pipelines_controller.rb:103:in `block (2 levels) in show', app/controllers/projects/pipelines_controller.rb:102:in `show', app/controllers/application_controller.rb:534:in `block in allow_gitaly_ref_name_caching', lib/gitlab/gitaly_client.rb:341:in `allow_ref_name_caching', app/controllers/application_controller.rb:533:in `allow_gitaly_ref_name_caching', ee/lib/gitlab/ip_address_state.rb:10:in `with', ee/app/controllers/ee/application_controller.rb:44:in `set_current_ip_address', app/controllers/application_controller.rb:483:in `set_current_admin', lib/gitlab/session.rb:11:in `with_session', app/controllers/application_controller.rb:474:in `set_session_storage', lib/gitlab/i18n.rb:99:in `with_locale', lib/gitlab/i18n.rb:105:in `with_user_locale', app/controllers/application_controller.rb:468:in `set_locale', app/controllers/application_controller.rb:462:in `set_current_context', ee/lib/omni_auth/strategies/group_saml.rb:41:in `other_phase', lib/gitlab/metrics/elasticsearch_rack_middleware.rb:16:in `call', lib/gitlab/middleware/rails_queue_duration.rb:33:in `call', lib/gitlab/middleware/speedscope.rb:13:in `call', lib/gitlab/request_profiler/middleware.rb:17:in `call', lib/gitlab/database/load_balancing/rack_middleware.rb:40:in `call', lib/gitlab/metrics/rack_middleware.rb:16:in `block in call', lib/gitlab/metrics/web_transaction.rb:21:in `run', lib/gitlab/metrics/rack_middleware.rb:16:in `call', lib/gitlab/jira/middleware.rb:19:in `call', lib/gitlab/middleware/go.rb:20:in `call', lib/gitlab/etag_caching/middleware.rb:21:in `call', lib/gitlab/middleware/multipart.rb:172:in `call', lib/gitlab/middleware/read_only/controller.rb:50:in `call', lib/gitlab/middleware/read_only.rb:18:in `call', lib/gitlab/middleware/same_site_cookies.rb:27:in `call', lib/gitlab/middleware/handle_malformed_strings.rb:21:in `call', lib/gitlab/middleware/basic_health_check.rb:25:in `call', lib/gitlab/middleware/handle_ip_spoof_attack_error.rb:25:in `call', lib/gitlab/middleware/request_context.rb:21:in `call', config/initializers/fix_local_cache_middleware.rb:11:in `call', lib/gitlab/middleware/rack_multipart_tempfile_factory.rb:19:in `call', lib/gitlab/metrics/requests_rack_middleware.rb:74:in `call', lib/gitlab/middleware/release_env.rb:12:in `call'
Error when viewing the job view for the relevant job:
JSON::GeneratorError app/controllers/application_controller.rb:124:in `render', app/controllers/projects/jobs_controller.rb:65:in `block (3 levels) in trace', app/controllers/projects/jobs_controller.rb:58:in `block in trace', lib/gitlab/ci/trace.rb:163:in `read_stream', lib/gitlab/ci/trace.rb:86:in `read', app/controllers/projects/jobs_controller.rb:57:in `trace', ee/lib/gitlab/ip_address_state.rb:10:in `with', ee/app/controllers/ee/application_controller.rb:44:in `set_current_ip_address', app/controllers/application_controller.rb:483:in `set_current_admin', lib/gitlab/session.rb:11:in `with_session', app/controllers/application_controller.rb:474:in `set_session_storage', lib/gitlab/i18n.rb:99:in `with_locale', lib/gitlab/i18n.rb:105:in `with_user_locale', app/controllers/application_controller.rb:468:in `set_locale', app/controllers/application_controller.rb:462:in `set_current_context', ee/lib/omni_auth/strategies/group_saml.rb:41:in `other_phase', lib/gitlab/metrics/elasticsearch_rack_middleware.rb:16:in `call', lib/gitlab/middleware/rails_queue_duration.rb:33:in `call', lib/gitlab/middleware/speedscope.rb:13:in `call', lib/gitlab/request_profiler/middleware.rb:17:in `call', lib/gitlab/database/load_balancing/rack_middleware.rb:40:in `call', lib/gitlab/metrics/rack_middleware.rb:16:in `block in call', lib/gitlab/metrics/web_transaction.rb:21:in `run', lib/gitlab/metrics/rack_middleware.rb:16:in `call', lib/gitlab/jira/middleware.rb:19:in `call', lib/gitlab/middleware/go.rb:20:in `call', lib/gitlab/etag_caching/middleware.rb:21:in `call', lib/gitlab/middleware/multipart.rb:172:in `call', lib/gitlab/middleware/read_only/controller.rb:50:in `call', lib/gitlab/middleware/read_only.rb:18:in `call', lib/gitlab/middleware/same_site_cookies.rb:27:in `call', lib/gitlab/middleware/handle_malformed_strings.rb:21:in `call', lib/gitlab/middleware/basic_health_check.rb:25:in `call', lib/gitlab/middleware/handle_ip_spoof_attack_error.rb:25:in `call', lib/gitlab/middleware/request_context.rb:21:in `call', config/initializers/fix_local_cache_middleware.rb:11:in `call', lib/gitlab/middleware/rack_multipart_tempfile_factory.rb:19:in `call', lib/gitlab/metrics/requests_rack_middleware.rb:74:in `call', lib/gitlab/middleware/release_env.rb:12:in `call'
Output of checks
This bug happens on self-managed installations as well as GitLab.com.
Possible fixes
@stanhu and I were discussing this issue in Slack, and he mentioned the idea of automatically finding and discarding invalid bytes in the traces.