Feature proposal: truncate description (and solution) fields during vulnerability ingestion to avoid "Validation failed: Description is too long (maximum is 15000 characters)" // custom security scanner
Summary
A customer hit an ingestion error for a report generated by Checkmarx. The UI shows the error as "(IngestionError) Ingestion failed for some vulnerabilities".
Steps to reproduce
- Create a SAST report in which a vulnerability description is longer than 15,000 characters. Alternatively, use the one in the ticket here (internal access only).
- Run the pipeline and open the Security tab to see the failure and error.
- For self-managed instances, check the GitLab log /var/log/gitlab/gitlab-rails/exceptions_json.log for the error.
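If no oversized report is at hand, one way to produce one is to pad a description in an existing valid report. The following is a minimal sketch, not part of the original report: `pad_description` is a hypothetical helper, the limit is taken from the error message, and the in-memory hash stands in for a parsed `gl-sast-report.json`.

```ruby
require 'json'

# Limit taken from the error message in this issue.
DESCRIPTION_LIMIT = 15_000

# Hypothetical helper: returns a copy of the parsed report in which the first
# vulnerability's description is padded to one character over the limit.
def pad_description(report, limit = DESCRIPTION_LIMIT)
  padded = JSON.parse(JSON.generate(report)) # deep copy via a JSON round trip
  vuln = padded.fetch('vulnerabilities', []).first
  return padded if vuln.nil?

  # ljust leaves the string untouched if it is already longer than the target.
  vuln['description'] = vuln['description'].to_s.ljust(limit + 1, 'x')
  padded
end

# In-memory stand-in for a parsed report (illustrative data only).
report = { 'vulnerabilities' => [{ 'name' => 'Example', 'description' => 'short' }] }
oversized = pad_description(report)
puts oversized['vulnerabilities'].first['description'].length
```

Writing the result back with `JSON.pretty_generate` and running the pipeline should then surface the validation error.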
Example Project
What is the proposed new behavior?
When following the docs on Security scanner integration, customers may choose to integrate third-party tools. These tools are not aware of GitLab's 15,000-character limit for description.
Today, customers are responsible for validating these reports before they are ingested. They may need to write extra code to truncate the description field (and store the truncated information elsewhere).
The proposal is for the description field to be truncated automatically by GitLab. If we do this, we should include TRUNCATED so that the viewer is aware that the field has been truncated.
Either the description should be truncated before it reaches the database, or the limit in the report schema should match the limit in the database.
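The truncation with a visible marker could look roughly like the sketch below. It is illustrative only: `truncate_with_marker` and the exact marker text are assumptions, since the proposal only says the field should include TRUNCATED. The marker is counted against the limit so that the result still fits.

```ruby
# Hypothetical marker; the proposal only asks that the text include TRUNCATED.
TRUNCATION_MARKER = ' [TRUNCATED]'

# Returns the text unchanged when it fits. Otherwise cuts it so that the
# result, marker included, is exactly `limit` characters long.
def truncate_with_marker(text, limit, marker = TRUNCATION_MARKER)
  return text if text.nil? || text.length <= limit
  raise ArgumentError, 'limit is shorter than the marker' if limit < marker.length

  text[0, limit - marker.length] + marker
end

long_description = 'a' * 20_000
result = truncate_with_marker(long_description, 15_000)
puts result.length
puts result.end_with?('[TRUNCATED]')
```

`String#length` counts characters rather than bytes for UTF-8 strings, which is the same unit the `char_length` constraints use.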
Relevant logs and/or screenshots
The issue was replicated with the same report, and the error was caught in `exceptions_json.log`:
"exception.message":"Validation failed: Description is too long (maximum is 15000 characters)"
This error seems to come from the check constraint on the vulnerability_occurrences table (output of \d+ vulnerability_occurrences):
Check constraints:
"check_4a3a60f2ba" CHECK (char_length(solution) <= 7000)
"check_ade261da6b" CHECK (char_length(description) <= 15000)
"check_f602da68dd" CHECK (char_length(cve) <= 48400)
However, this section of the code defines the limit as 1048576:
"description": {
  "type": "string",
  "maxLength": 1048576,
  "description": "A long text section describing the vulnerability more fully."
},
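Because the schema allows far longer strings than the table does, a report can pass schema validation and still fail at ingestion. A pre-ingestion check along the following lines could flag the affected fields. This is a sketch under assumptions: `oversized_fields` is a hypothetical helper, and the limits are copied from the check constraints quoted above.

```ruby
# Limits copied from the check constraints quoted in this issue.
DB_LIMITS = { 'solution' => 7_000, 'description' => 15_000 }.freeze

# Hypothetical helper: lists every string field in the report that exceeds
# its database limit, with the vulnerability's position in the report.
def oversized_fields(report, limits = DB_LIMITS)
  findings = []
  report.fetch('vulnerabilities', []).each_with_index do |vuln, index|
    limits.each do |field, limit|
      value = vuln[field]
      next unless value.is_a?(String) && value.length > limit

      findings << { index: index, field: field, length: value.length, limit: limit }
    end
  end
  findings
end

# Illustrative in-memory report; real input would be the parsed JSON file.
report = {
  'vulnerabilities' => [
    { 'description' => 'd' * 15_001, 'solution' => 'ok' },
    { 'description' => 'fine', 'solution' => 's' * 7_001 }
  ]
}
oversized_fields(report).each do |f|
  puts "vulnerability #{f[:index]}: #{f[:field]} is #{f[:length]} characters (limit #{f[:limit]})"
end
```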
Workaround
Use an after-script in order to truncate descriptions in the report to 15,000 characters:
cx-scan:
  after_script:
    - apk add --update ruby
    - |
      ruby <<EOF
      require 'json'
      file_name = 'gl-sast-report.json'
      report = JSON.parse(File.read(file_name))
      report['vulnerabilities'] = report['vulnerabilities'].map do |vuln|
        next vuln unless vuln.key?('description')
        vuln['description'] = vuln['description'][0...15_000]
        vuln
      end
      raw_json = JSON.pretty_generate(report)
      File.open(file_name, 'w') do |f|
        f.write(raw_json)
      end
      EOF
Implementation Plan
- backend: Truncate the description and solution fields to match our database size check constraints during ingestion.
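As a rough illustration of the plan, the sketch below applies both limits to a single finding represented as a plain hash. It is not GitLab's ingestion code: `truncate_to_limits` and the hash shape are assumptions, and the limits are copied from the check constraints quoted in this issue.

```ruby
# Limits copied from the check constraints quoted in this issue.
FIELD_LIMITS = { 'description' => 15_000, 'solution' => 7_000 }.freeze

# Hypothetical helper: returns a copy of the finding with each limited field
# cut to its maximum character count. Other fields are left untouched.
def truncate_to_limits(finding, limits = FIELD_LIMITS)
  limits.each_with_object(finding.dup) do |(field, limit), result|
    value = result[field]
    result[field] = value[0, limit] if value.is_a?(String) && value.length > limit
  end
end

# Illustrative finding with both fields over their limits.
finding = { 'name' => 'Example', 'description' => 'd' * 20_000, 'solution' => 's' * 9_000 }
truncated = truncate_to_limits(finding)
puts truncated['description'].length
puts truncated['solution'].length
```

Keeping the limits in one table next to the truncation step makes it harder for them to drift away from the database constraints.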