ScanIngestionError when a report contains unicode null characters
Summary
When a scan is run and the `gl-*-report.json` file contains a Unicode null character (`\u0000` or `\\\\\u0000`), ingestion of the report fails.
Seems like the warning is coming from here: https://gitlab.com/gitlab-org/gitlab/-/blob/2e61a808814c11810052a0bfbb07ea5a20822fa2/lib/gitlab/ci/parsers/security/common.rb#L49
Though only a warning is displayed, the report ingestion still fails.
This came up in a customer ticket: Zendesk Link - internal only
Example Project
What is the current bug behavior?
Report ingestion fails.
What is the expected correct behavior?
Security report should be parsed and ingested correctly.
Relevant logs and/or screenshots
Output of checks
Results of GitLab environment info
Expand for output related to GitLab environment info
(For installations with omnibus-gitlab package run and paste the output of: `sudo gitlab-rake gitlab:env:info`) (For installations from source run and paste the output of: `sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)
Results of GitLab application Check
Expand for output related to the GitLab application check
(For installations with omnibus-gitlab package run and paste the output of: `sudo gitlab-rake gitlab:check SANITIZE=true`) (For installations from source run and paste the output of: `sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true`) (we will only investigate if the tests are passing)
Workaround
In my case, the customer had the following entry for a `dast_api` job:
\"run timeout /T 10\\u0000\"
The workaround is:
dast_api:
  after_script:
    - sed -i 's/\\\\u0000//g' gl-dast-api-report.json
Or, in cases where the report contains both `\u0000` and `\\u0000`, you can modify the workaround:
    - sed -i -e 's/\\\\u0000//g' -e 's/\\u0000//g' gl-dast-api-report.json
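
To check the substitution locally before adding it to a job, it can be run against a small sample file. The following is only a sketch: the file name `sample-report.json` and its contents are made up for illustration and are not a real scanner report. It assumes a POSIX shell with `sed` and `grep`, and writes to a temporary file instead of using `-i` so that it does not depend on GNU sed.

```shell
#!/bin/sh
# Hypothetical sample file; not a real scanner report.
report=sample-report.json

# Write a one-line JSON file containing both escaped forms of the null character.
# The quoted heredoc delimiter ('EOF') keeps the backslashes literal.
cat > "$report" <<'EOF'
{"a": "run timeout /T 10\\u0000", "b": "value\u0000"}
EOF

# Count the lines that contain the u0000 escape before cleaning.
before=$(grep -c 'u0000' "$report")

# Same expressions as the workaround: strip the double-backslash form first,
# then the single-backslash form. The order matters, because running the
# single-backslash expression first would leave a stray backslash behind.
sed -e 's/\\\\u0000//g' -e 's/\\u0000//g' "$report" > "$report.tmp" && mv "$report.tmp" "$report"

# Count again after cleaning (grep exits non-zero when the count is 0).
after=$(grep -c 'u0000' "$report" || true)
cleaned=$(cat "$report")

echo "before=$before after=$after"
echo "$cleaned"
```

If the `after` count is not 0 for a real report, the file likely contains another escaped variant that needs its own `-e` expression.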