Windows Runner can't upload dot-env artifact
Problem
Originally raised at #212629 (comment 360318180) by @sebastien.perin.
Some runners (especially Windows runners) do not write the dotenv artifact in UTF-8 but in another encoding, such as UCS-2. Parsing the artifact in GitLab then fails, because GitLab currently expects it to be UTF-8. For example, you see the following error:
#212629 (comment 510347429) (Thanks to @gidiar)
ArgumentError (invalid byte sequence in UTF-8):
/opt/gitlab/embedded/service/gitlab-rails/app/services/ci/parse_dotenv_artifact_service.rb:57:in `scan'
/opt/gitlab/embedded/service/gitlab-rails/app/services/ci/parse_dotenv_artifact_service.rb:57:in `scan_line!'
/opt/gitlab/embedded/service/gitlab-rails/app/services/ci/parse_dotenv_artifact_service.rb:41:in `block (2 levels) in parse!'
/opt/gitlab/embedded/service/gitlab-rails/app/services/ci/parse_dotenv_artifact_service.rb:40:in `each_line'
/opt/gitlab/embedded/service/gitlab-rails/app/services/ci/parse_dotenv_artifact_service.rb:40:in `block in parse!'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/ci/build/artifacts/adapters/gzip_stream.rb:24:in `block in each_blob'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/ci/build/artifacts/adapters/gzip_stream.rb:39:in `gzip'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/ci/build/artifacts/adapters/gzip_stream.rb:23:in `each_blob'
/opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/ci/artifactable.rb:31:in `block in each_blob'
/opt/gitlab/embedded/service/gitlab-rails/app/uploaders/gitlab_uploader.rb:104:in `open'
/opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/ci/artifactable.rb:30:in `each_blob'
/opt/gitlab/embedded/service/gitlab-rails/app/services/ci/parse_dotenv_artifact_service.rb:39:in `parse!'
.....
.....
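The failure can be reproduced outside GitLab with a few lines of Ruby. This is a minimal sketch, not the code in parse_dotenv_artifact_service.rb: the helper names and the regular expression are hypothetical stand-ins, and the input assumes a writer that emits UTF-16LE with a byte-order mark, as some Windows shells do by default.

```ruby
# Illustrative input: the bytes a UTF-16LE writer with a BOM would produce.
def utf16le_dotenv_bytes
  "\uFEFFMY_ENV_VAR=true\n".encode("UTF-16LE").b
end

# Hypothetical stand-in for the parser's line scan: label the raw bytes as
# UTF-8 and run a regexp over them, without checking the encoding first.
def scan_as_utf8(bytes)
  bytes.dup.force_encoding("UTF-8").scan(/^(\w+)=(.*)$/)
end

begin
  scan_as_utf8(utf16le_dotenv_bytes)
rescue ArgumentError => e
  # The BOM bytes FF FE can never occur in valid UTF-8, so the scan raises.
  puts "#{e.class}: #{e.message}"
end
```

The rescue branch should report the same kind of ArgumentError as the trace above, because Ruby refuses to match a regexp against a string whose bytes are invalid for its declared encoding.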
Proposals
Proposal 1: Improve documentation
There is a workaround: write the dotenv artifact in UTF-8 with PowerShell's Add-Content command.
#212629 (comment 430278657) (Thanks to @vdsbenoit)
test-job:
  stage: test
  tags:
    - seb-windows
  script:
    - echo test job!!!!!!!!!!!!!!
    - Add-Content -Path build.env -Value "MY_ENV_VAR=true"
  artifacts:
    reports:
      dotenv: build.env
We should document this workaround, as it is useful for many users.
Proposal 2: Detect the original encoding and re-encode to UTF-8 in GitLab
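If it's possible to detect the original encoding in GitLab, we can label the raw bytes with that encoding and transcode them: force_encoding("<original encoding>").encode("UTF-8"). The order matters, because force_encoding only relabels the bytes while encode actually converts them.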
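One possible way to do the detection is to look for a byte-order mark. The sketch below is an assumption about how Proposal 2 could work, not GitLab's implementation: `BOMS` and `dotenv_to_utf8` are hypothetical names, only three BOMs are recognised, and content without a BOM is assumed to be UTF-8 already.

```ruby
# Hypothetical lookup table: byte-order mark => encoding it announces.
BOMS = {
  "\xEF\xBB\xBF".b => Encoding::UTF_8,
  "\xFF\xFE".b     => Encoding::UTF_16LE,
  "\xFE\xFF".b     => Encoding::UTF_16BE
}.freeze

# Hypothetical helper: return the dotenv content as a UTF-8 string.
def dotenv_to_utf8(bytes)
  raw = bytes.b # work on a binary copy so the comparisons are byte-wise
  bom, source = BOMS.find { |mark, _| raw.start_with?(mark) }
  source ||= Encoding::UTF_8 # assumption: no BOM means already UTF-8
  offset = bom ? bom.bytesize : 0
  body = raw.byteslice(offset, raw.bytesize)

  # Relabel the bytes with the detected encoding, then convert to UTF-8.
  utf8 = body.force_encoding(source).encode(Encoding::UTF_8)
  # encode is a no-op for UTF-8 input, so validate explicitly.
  raise ArgumentError, "dotenv content is not valid #{source}" unless utf8.valid_encoding?
  utf8
end

windows_bytes = "\uFEFFMY_ENV_VAR=true\r\n".encode("UTF-16LE").b
puts dotenv_to_utf8(windows_bytes).inspect
```

BOM sniffing cannot identify UTF-16 files written without a BOM or legacy code pages, so a real implementation would still need a fallback, such as a clear error message pointing users to the workaround in Proposal 1.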