Support non-ASCII characters in gzip artifact headers
What does this MR do?
- Adds support for non-ASCII characters in gzip artifact headers by URL-escaping the header fields
- Applies the fix to the GZIP compression used during artifact uploads to GitLab, while preserving the original non-ASCII characters in the final ZIP artifacts downloaded by users
Why was this MR needed?
- The gzip header format only allows Latin-1 (ISO 8859-1) characters in its string fields, so uploads failed whenever an artifact path contained characters outside that range (for example `测试`). The failure surfaced as the error `gzip.Write: non-Latin-1 header`.
- Although the final artifacts are served as ZIP files, the internal upload process uses GZIP compression, which needed to handle non-ASCII paths correctly.
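The failure and the escaping approach can be reproduced outside the runner with Go's standard `compress/gzip` package. The following is a minimal sketch, not the code from this MR: the helper names `escapeHeaderName`, `writeGzip`, and `readGzipName` are hypothetical, and the use of `url.URL.EscapedPath` is an assumption chosen because it keeps `/` unescaped, which matches the escaped name that appears in the job log further down.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"net/url"
)

// escapeHeaderName is a hypothetical helper: it percent-encodes a path so the
// result is plain ASCII and therefore valid in a gzip header. Path separators
// are left as-is.
func escapeHeaderName(name string) string {
	return (&url.URL{Path: name}).EscapedPath()
}

// writeGzip compresses payload and stores headerName in the gzip header.
// The standard library rejects header strings with runes above U+00FF.
func writeGzip(headerName string, payload []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Name = headerName
	if _, err := zw.Write(payload); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readGzipName reads the header name back and reverses the escaping.
func readGzipName(data []byte) (string, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return "", err
	}
	defer zr.Close()
	return url.PathUnescape(zr.Name)
}

func main() {
	name := "测试/gl-sbom-pypi-pip.cdx.json"

	// Unescaped name: the write fails with a non-Latin-1 header error.
	_, err := writeGzip(name, []byte("{}"))
	fmt.Println("raw name error:", err)

	// Escaped name: the write succeeds and the original name is recoverable.
	data, err := writeGzip(escapeHeaderName(name), []byte("{}"))
	if err != nil {
		panic(err)
	}
	restored, err := readGzipName(data)
	if err != nil {
		panic(err)
	}
	fmt.Println("restored name:", restored)
}
```

Because the escaping is reversible, the receiving side can recover the original path, which is consistent with the downloaded ZIP artifacts keeping their non-ASCII names.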
What's the best way to test this MR?
Manual Testing
Tested with a sample project containing non-ASCII paths on my GDK:
- Created a custom runner-helper with the fix.
- Ran a dependency scanning job that generates and uploads artifacts.
- Verified:
  - Upload fails with the default runner.
  - Upload succeeds with the new runner, without the `non-Latin-1 header` error.
  - Artifacts are accessible in the GitLab UI with correct non-ASCII paths.
  - Downloaded ZIP artifacts preserve the original non-ASCII characters.
Details
Successful job output:
Running with gitlab-runner development version (HEAD)
  on non-ascii-support-runner t1_jxS3j, system ID: s_96cb2444412f
Resolving secrets
section_start:1732701619:prepare_executor
Preparing the "docker" executor
Using Docker executor with image registry.gitlab.com/security-products/gemnasium-python:5 ...
Using helper image: custom-gitlab-runner-helper:latest (overridden, default would be registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper:arm64-latest )
Using locally found image version due to "if-not-present" pull policy
Using docker image sha256:8b9fd1432be422f088d4066b99f78bc1c92152e4244ad6d7f0f076e1eff7445c for custom-gitlab-runner-helper:latest ...
Using helper image: custom-gitlab-runner-helper:latest (overridden, default would be registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper:arm64-latest )
Using docker image sha256:8b9fd1432be422f088d4066b99f78bc1c92152e4244ad6d7f0f076e1eff7445c for custom-gitlab-runner-helper:latest ...
Using locally found image version due to "if-not-present" pull policy
Using docker image sha256:f47d0156012e9bb1770bc90c880108cc5346aeb6d622be2ba7124bf20b55706b for registry.gitlab.com/security-products/gemnasium-python:5 with digest registry.gitlab.com/security-products/gemnasium-python@sha256:5b7303e30345d210fc7e20d98c259d453d13aff175f0048038a2eccbcac41af7 ...
section_end:1732701619:prepare_executor
section_start:1732701619:prepare_script
Preparing environment
Using helper image: custom-gitlab-runner-helper:latest (overridden, default would be registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper:arm64-latest )
Using docker image sha256:8b9fd1432be422f088d4066b99f78bc1c92152e4244ad6d7f0f076e1eff7445c for custom-gitlab-runner-helper:latest ...
Running on runner-t1jxs3j-project-20-concurrent-0 via Orins-MacBook-Pro.local...
section_end:1732701619:prepare_script
section_start:1732701619:get_sources
Getting source from Git repository
Fetching changes with git depth set to 20...
Reinitialized existing Git repository in /builds/dependencyscanning/non-ascii-in-cyclonedx/.git/
Checking out caf8cb8e as detached HEAD (ref is main)...
Removing gl-dependency-scanning-report.json
Removing sbom-manifest.json
Removing "\346\265\213\350\257\225/dist/"
Removing "\346\265\213\350\257\225/gl-sbom-pypi-pip.cdx.json"
Removing "\346\265\213\350\257\225/pipdeptree.json"
Skipping Git submodules setup
section_end:1732701620:get_sources
section_start:1732701620:step_script
Executing "step_script" stage of the job script
Using docker image sha256:f47d0156012e9bb1770bc90c880108cc5346aeb6d622be2ba7124bf20b55706b for registry.gitlab.com/security-products/gemnasium-python:5 with digest registry.gitlab.com/security-products/gemnasium-python@sha256:5b7303e30345d210fc7e20d98c259d453d13aff175f0048038a2eccbcac41af7 ...
$ /analyzer run
[INFO] [gemnasium-python] [2024-11-27T10:00:21Z] [/go/src/app/cmd/gemnasium-python/main.go:51] ▶ GitLab gemnasium-python analyzer v5.8.0
[DEBU] [gemnasium-python] [2024-11-27T10:00:21Z] [/go/pkg/mod/gitlab.com/gitlab-org/security-products/analyzers/common/v3@v3.4.0/cacert/cacert.go:65] ▶ CA cert bundle not imported: empty bundle or empty target path
[DEBU] [gemnasium-python] [2024-11-27T10:00:21Z] [/go/src/app/finder/finder.go:64] ▶ inspect directory: .
[DEBU] [gemnasium-python] [2024-11-27T10:00:21Z] [/go/src/app/finder/finder.go:96] ▶ skip ignored directory: .git
[DEBU] [gemnasium-python] [2024-11-27T10:00:21Z] [/go/src/app/finder/finder.go:64] ▶ inspect directory: 日本語
[DEBU] [gemnasium-python] [2024-11-27T10:00:21Z] [/go/src/app/finder/finder.go:64] ▶ inspect directory: 测试
[DEBU] [gemnasium-python] [2024-11-27T10:00:21Z] [/go/src/app/finder/detect.go:84] ▶ Selecting pip for pypi because this is the first match
[INFO] [gemnasium-python] [2024-11-27T10:00:21Z] [/go/src/app/finder/finder.go:116] ▶ Detected supported dependency files in '测试'. Dependency files detected in this directory will be processed. Dependency files in other directories will be skipped.
[DEBU] [gemnasium-python] [2024-11-27T10:00:21Z] [/go/src/app/cmd/gemnasium-python/main.go:241] ▶ Exporting dependencies for /builds/dependencyscanning/non-ascii-in-cyclonedx/测试/requirements.txt
[DEBU] [gemnasium-python] [2024-11-27T10:00:26Z] [/go/src/app/builder/pip/pip.go:120] ▶ /usr/local/bin/pip3 download --disable-pip-version-check --dest ./dist -r requirements.txt
Collecting netaddr (from -r requirements.txt (line 1))
Downloading netaddr-1.3.0-py3-none-any.whl.metadata (5.0 kB)
Downloading netaddr-1.3.0-py3-none-any.whl (2.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 2.7 MB/s eta 0:00:00
Saved ./dist/netaddr-1.3.0-py3-none-any.whl
Successfully downloaded netaddr
[DEBU] [gemnasium-python] [2024-11-27T10:00:29Z] [/go/src/app/builder/pip/pip.go:155] ▶ /usr/local/bin/pip3 install --user --ignore-installed --no-warn-script-location --disable-pip-version-check --find-links ./dist --requirement requirements.txt
Looking in links: ./dist
Collecting netaddr (from -r requirements.txt (line 1))
Using cached netaddr-1.3.0-py3-none-any.whl.metadata (5.0 kB)
Using cached netaddr-1.3.0-py3-none-any.whl (2.3 MB)
Installing collected packages: netaddr
Successfully installed netaddr-1.3.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[DEBU] [gemnasium-python] [2024-11-27T10:00:30Z] [/go/src/app/advisory/repo.go:124] ▶ /usr/bin/git -C /gemnasium-db remote set-url origin https://gitlab.com/gitlab-org/security-products/gemnasium-db.git
[DEBU] [gemnasium-python] [2024-11-27T10:00:34Z] [/go/src/app/advisory/repo.go:124] ▶ /usr/bin/git -C /gemnasium-db fetch --force --tags origin master
From https://gitlab.com/gitlab-org/security-products/gemnasium-db
* branch master -> FETCH_HEAD
01eae3c295..9378916b5c master -> origin/master
[DEBU] [gemnasium-python] [2024-11-27T10:00:37Z] [/go/src/app/advisory/repo.go:124] ▶ /usr/bin/git -C /gemnasium-db checkout master
Already on 'master'
Your branch is behind 'origin/master' by 82 commits, and can be fast-forwarded.
(use "git pull" to update your local branch)
[DEBU] [gemnasium-python] [2024-11-27T10:00:37Z] [/go/src/app/advisory/repo.go:137] ▶ /usr/bin/git -C /gemnasium-db symbolic-ref -q HEAD
[DEBU] [gemnasium-python] [2024-11-27T10:00:38Z] [/go/src/app/advisory/repo.go:145] ▶ /usr/bin/git -C /gemnasium-db reset --hard origin/master
HEAD is now at 9378916b5c Merge branch 'advng/go/github.com/mattermost/mattermost/server/v8/CVE-2024-41144' into 'master'
[DEBU] [gemnasium-python] [2024-11-27T10:00:38Z] [/go/src/app/advisory/repo.go:153] ▶ /usr/bin/git -C /gemnasium-db rev-parse HEAD
9378916b5cc61bcacef70b55910decc596668579
[INFO] [gemnasium-python] [2024-11-27T10:00:38Z] [/go/src/app/advisory/repo.go:157] ▶ Using commit 9378916b5cc61bcacef70b55910decc596668579
of vulnerability database
[DEBU] [gemnasium-python] [2024-11-27T10:00:38Z] [/go/src/app/scanner/scanner.go:140] ▶ Location set to 测试/requirements.txt
[DEBU] [gemnasium-python] [2024-11-27T10:00:40Z] [/go/src/app/vrange/python/python.go:59] ▶ /usr/local/bin/pipenv run /vrange/python/rangecheck.py /tmp/vrange_queries396921751
[
]
[INFO] [gemnasium-python] [2024-11-27T10:00:40Z] [/go/src/app/convert/cli.go:55] ▶ using schema model 15
section_end:1732701640:step_script
section_start:1732701640:upload_artifacts_on_success
Uploading artifacts for successful job
Uploading artifacts...
**/gl-sbom-*.cdx.json: found 1 matching artifact files and directories
Uploading artifacts as "archive" to coordinator... 201 Created id=396 responseStatus=201 Created token=glcbt-64
Uploading artifacts...
测试/gl-sbom-pypi-pip.cdx.json
**/gl-sbom-*.cdx.json: found 1 matching artifact files and directories
gl-sbom-pypi-pip.cdx.json
gl-sbom-pypi-pip.cdx.json
%E6%B5%8B%E8%AF%95/gl-sbom-pypi-pip.cdx.json
2024-11-27 10:00:40.438277393 +0000 UTC
Uploading artifacts as "cyclonedx" to coordinator... 201 Created id=396 responseStatus=201 Created token=glcbt-64
Uploading artifacts...
gl-dependency-scanning-report.json: found 1 matching artifact files and directories
Uploading artifacts as "dependency_scanning" to coordinator... 201 Created id=396 responseStatus=201 Created token=glcbt-64
section_end:1732701641:upload_artifacts_on_success
section_start:1732701641:cleanup_file_variables
Cleaning up project directory and file based variables
section_end:1732701641:cleanup_file_variables
Job succeeded
Artifacts dir remains unchanged:
What are the relevant issue numbers?
Edited by Orin Naaman