Function for Shell quoting of input parameters

#533802 (comment 2727781341) asked for feedback and there is an security critical issue with input parameters and how they are used in shell code.

Shell background

When shell code is executed, it goes through through eval, which does all kinds of transformations before the resulting code is executed:

  1. Brace Expansion
  2. Tilde Expansion
  3. Parameter and variable expansion
  4. Command substitution
  5. Arithmetic expansion
  6. Process substitution
  7. Word splitting
  8. Quote removal

Some of them can be disabled by quoting these parts:

  • single quotes ' prevent all transformations, but using a single quote itself becomes tricky as there is no escape mechanism.
  • double quotes " disables some transformation, so all remaining cases must still be disabled by using backslash \ escaping.

GitLab pipelines

The problem with GitLabs input parameters is, that their value is inserted as is:

  • This works well with variables:, as there GitLab does the proper serialization.

    Sample pipeline
    .evil:
      parallel:
        matrix:
          - evil:
              - Double quote " END
              - Single quote ' END
              - Backtick ` END
              - Code $(date) END
              - Variable ${SHELL} END
              - |
                Blanks  Newline
                END
              - Wildcard * END
              - Parenthesis ( END
              - "Comment # END"
    
    unquoted:
      extends: [.evil]
      script: |
        echo $evil
    
    double:
      extends: [.evil]
      script: |
        echo "$evil"
    
    single:
      rules:
        - when: never
      extends: [.evil]
      script: |
        echo '$evil'
    Here GitLab passes the values as variables, which are properly escaped. As such the jobs work:
    • unquoted just mangles the multiple blanks/spaces/newlines
    • double works correctly for all cases
    • single does not work as using single quotes prevent the required variable expansion
  • It does not work when inputs are directly used in script: as proper quoting is impossible:

    • Not quoting the input results is the worst as all expansions from above occur, including code execution and syntax errors for most other cases
    • Using single quotes works best, but breaks when single quotes are used themselves.
    • Using double quotes is in the middle and breaks when double quotes are used themselves.
    Sample pipeline

    .gitlab-ci.yml (abbreviated)

    variables:
      GIT_STRATEGY: none
    double:
      trigger:
        include:
          - local: child.yml
            inputs:
              evil: Double quote " END
        strategy: depend

    child.yml

    ---
    spec:
      inputs:
        evil:
          type: string
    ---
    variables:
      GIT_STRATEGY: none
    unquoted:
      script: |
        echo $[[ inputs.evil ]]
    double:
      script: |
        echo "$[[ inputs.evil ]]"
    single:
      script: |
        echo '$[[ inputs.evil ]]'

See https://gitlab.com/pmhahn/ci-playground/-/tree/shellescape?ref_type=heads for a full pipeline.

Summary

case unquoted double single
double syntax syntax
single syntax syntax
backtick syntax syntax
variable ⚠️expanded ⚠️expanded
code ⚠️executed
blanks syntax
wildcard expanded
parenthesis syntax
comment ⚠️dropped

As you can see there is not safe way to use $[[ inputs.… ]] within script: directly, which always works: Any input can break one or the other case.

Using a here document like the following also does not work as the input may contain the delimiter string 'EOF' itself:

cat <<'__EOF__'
$[[ inputs.var ]]
__EOF__

Mitigation

You can mitigate this issue by first storing the input in a variables: and then using that variable in your script::

example job:
  variables:
    var: $[[ input.var ]]
  script: echo "$var"

Proposals

Currently the documentation is full of examples, where $[[ inputs.… ]] is directly used in scipts:, for example https://docs.gitlab.com/ci/yaml/#triggerstrategy They are all vulnerable to this issue.

Proposal 1: go via variables

  1. Convert all cases to go via a variables:
  2. Add a "big fat warning` into the documentation.

Proposal 2: introduce shellescape (preferred)

  1. Introduce a function to manipulate input values like shellescape which calls Shellwords.shellescape() (which is already used by GitLab, so no new dependency).
  2. Update all documentation to use $[[ inputs.… | shellescape ]]

This is similar to bashs Parameter transformation: ${var@Q} quotes the variable value in such a way, that it can go though another round of eval unmodified, e.g. eval var2=${var1@Q}:

$ var1="foo \" bar ' baz \$(date) bam # buz"
$ echo "$var1"
foo " bar ' baz $(date) bam # buz
$ eval var2=${var1@Q}
$ [ "$var1" = "$var2" ]; echo $?
0

The important thing here is, that no quotes are put around the $[[ inputs.… | shellescape ]] itself as it is an implementation detail if Shellwords.shellescape()

  • uses no quotes as the value needs no quoting, e.g. a number or a string like "foo"
  • uses single quotes (if the value contains no single quotes itself)
  • uses double quotes
  • uses backslash-escaping of all shell meta characters
  • a mix of all the above.
Edited by 🤖 GitLab Bot 🤖