pwsh shell on windows - unicode characters are corrupted by 13.9 STDIN change
Summary
A customer raised a ticket about the pwsh variation of the shell executor.
Since 13.9, when the PowerShell script started being passed via STDIN, they found that a unicode character they had defined was being corrupted in transit.
Detail on their use case is below, but they use a specific character as a placeholder value, and have PowerShell module(s) with that character hard-coded into them.
They then need to be able to put the same value in their environment variables and CI.
However, when the value is passed via STDIN it is modified to another character or characters and so no longer matches. This breaks their code.
Steps to reproduce
See project: https://gitlab.com/bprescott-support/testing/zd206695-win_ps_chars
Or, the code is in this tarball:
Actual behavior
The PowerShell Format-Hex function prints the characters in ASCII as well as hex. I encountered misbehavior with GitLab when attempting to render these, so I've edited them out of the output here.
- .gitlab-ci.yml defines GL_Test1 with the unicode character. This value is fixed for all the jobs. Three different approaches are defined for providing the reference character to PowerShell.
- the value which powershell finds is output in hex and returned as an artifact.
- the correct return is:
Offset Bytes
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
------ -----------------------------------------------
0000000000000000 E2 88 85
- all three approaches are illustrated using:
- the 13.8 runner, which didn't use STDIN
- the 13.9 runner, which introduced STDIN, to demonstrate when the regression occurred
- the 13.12 runner, to confirm current state.
There are three illustrations of how this variable is handled.
[1] The inyaml jobs (27842-psm_in_yaml stage) have the PowerShell code inside the yaml, including the hard-coded reference value.
- as the test variable and the reference value are both defined in yaml, and both passed through STDIN, the comparison works
- the comparison works because both strings are corrupted, so powershell is comparing like with like
- compare the hex output for 13.8 and 13.9+ to observe the character being corrupted
[2] the inargs jobs (27842-unicode_argument stage) have the powershell code in a module, but pass the reference value in as a parameter
- similarly, as the test variable and the reference value are both coming from the CI, the comparison works
- the change in behaviour from 13.8 to 13.9 can be observed in the hex output.
[3] the inpsm jobs (27842-unicode_in_psm) have the powershell code in a module, and the reference value is in the module
- the module is imported to PowerShell directly, so the reference character is not modified, and the hex output shows it is correct
- the CI variables continue to be modified from 13.9 upwards. In 13.8, the comparison works, from 13.9 the comparison fails.
- these are the main jobs to look at. They illustrate most clearly that the same unicode string gets into PowerShell OK via the .psm1 file in the test repository, but the one in .gitlab-ci.yml does not. The 13.8 job is the 'control' and shows what used to happen; the 13.9 and 13.12 jobs show the corrupted output.
> unicode from .gitlab-ci.yml:
Label: String (System.String) <6FEC9241>
Offset Bytes
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
------ -----------------------------------------------
0000000000000000 C3 94 C3 AA C3 A0
> unicode being used for comparison:
Label: String (System.String) <6895EFAF>
Offset Bytes
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
------ -----------------------------------------------
0000000000000000 E2 88 85
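The two hex dumps above are consistent with the UTF-8 bytes of the character being decoded as a legacy single-byte OEM code page and then re-encoded as UTF-8. The Python sketch below reproduces both byte sequences under the assumption that the code page is 850. Which code page is actually applied between the runner and pwsh is a guess that happens to match the dumps; it is not confirmed by the job output. The helper names are hypothetical.

```python
# Sketch only: assumes the script text is misread as code page 850 somewhere
# between the runner and pwsh. That code page is an assumption that matches
# the hex dumps; it is not confirmed by the job logs.

def hex_bytes(text: str) -> str:
    """Return the UTF-8 bytes of text in the style of the Format-Hex dumps."""
    return " ".join(f"{b:02X}" for b in text.encode("utf-8"))

def corrupt(text: str, codepage: str = "cp850") -> str:
    """Misread the UTF-8 bytes of text as a single-byte code page."""
    return text.encode("utf-8").decode(codepage)

reference = "\u2205"           # the placeholder character, as stored in the .psm1 file
from_ci = corrupt(reference)   # the same character after the assumed mis-decoding

print(hex_bytes(reference))    # E2 88 85
print(hex_bytes(from_ci))      # C3 94 C3 AA C3 A0

# inyaml / inargs: both values take the same path, so they still match.
print(corrupt(reference) == from_ci)   # True
# inpsm: only the CI value is corrupted, so the comparison fails.
print(reference == from_ci)            # False
```

If this assumption holds, it would also explain why the inyaml and inargs jobs keep passing from 13.9 onwards: both sides of the comparison are altered in the same way.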
example pipeline
https://gitlab.com/bprescott-support/testing/zd206695-win_ps_chars/-/pipelines/298008377
- Hex output from Job #1239385175 (13.12_inpsm)
- Hex output from Job #1239385171 (13.8_inpsm)
The jobs in stages 27830-erroractionpreferenc and 27830-error2 relate to #27830 (closed)
Expected behavior
Characters set in the CI and in variables should be passed into powershell uncorrupted.
Environment description
I have a Windows 10 laptop with multiple runners set up as services. Reproduced using Powershell 7.0.6 (See job output for version)
All are configured with the defaults except for check_interval
concurrent = 1
check_interval = 13
[session_server]
session_timeout = 1800
[[runners]]
name = "foo"
url = "https://gitlab.com"
token = "bar"
executor = "shell"
shell = "pwsh"
customer use case (detail)
The customer used to use another CI solution that supported a hierarchy of variables, including variables defined in scopes which GitLab does not support, such as individual branches.
When migrating to GitLab, they maintained this way of working by:
- using parent/child pipelines
- constructing the equivalent variable hierarchy in the parent pipeline in powershell
- determining what values have "won" for that particular pipeline, and writing them out to their CI
- executing that CI as a dynamic child pipeline
They have in excess of 40 repositories working this way.
Their code requires a placeholder value to represent an empty set. To avoid a collision with real values, they use the unicode character "∅".
Used GitLab Runner version
- 13.8: does not display the issue
- 13.9: displays the issue
- 13.12 beta: still displays the issue
Possible fixes
Related to change to use STDIN for pwsh - !2715 (merged)