Vault integration for key/value secrets MVC

Problem to solve

Many of our customers use Vault to store secure tokens, and GitLab plans to eventually include a bundled or accessible Vault as part of the GitLab distribution (https://about.gitlab.com/direction/release/secrets_management/). Building a way to set up Vault-stored variables from within GitLab, for use as part of the CI platform, enables both use cases: customers who bring their own Vault can access their secrets there today, and once a GitLab-provided Vault instance is available, secrets stored in it will be accessible the same way.

Need statement: The user needs a secure way to fetch short-lived tokens from Vault inside GitLab so that jobs in their CI/CD pipelines can use those secrets at runtime.

Intended users

This feature will be used by automation developers who need to access secrets as part of their build and deploy process. It gives them a new and more secure way to do so.

Further details

This is blocked by gitlab-runner#3202 (closed), the Go 1.9 upgrade scheduled for the previous release, because the Vault API uses language features that require it.

Proposal

We will enable Vault group or project-level variables to be defined in the GitLab UI, of either type KV1 or KV2. The keys to be fetched are specified with the path and key name, along with the name of the environment variable to be filled with the value. At runtime, the Runner will fetch the value and set it in the environment variable.
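
As a rough illustration of the definition described above, the sketch below models a Vault-typed variable and resolves a list of them into environment variables. The `VaultVariable` class, its field names, and the `fetch` callable are hypothetical and only stand in for whatever the Rails and Runner implementations end up using.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable

# Only the two key/value engines are in scope for the MVC.
SUPPORTED_ENGINES = ("kv1", "kv2")


@dataclass(frozen=True)
class VaultVariable:
    """Hypothetical shape of a Vault-typed CI variable (names are assumptions)."""
    env_name: str  # environment variable to fill at runtime
    engine: str    # "kv1" or "kv2"
    path: str      # path to the secret in Vault
    key: str       # key name inside that secret

    def __post_init__(self) -> None:
        if self.engine not in SUPPORTED_ENGINES:
            raise ValueError(f"unsupported secrets engine: {self.engine!r}")


def resolve_variables(
    variables: Iterable[VaultVariable],
    fetch: Callable[[VaultVariable], str],
) -> Dict[str, str]:
    """Fetch each variable and map its value to the target environment name."""
    return {variable.env_name: fetch(variable) for variable in variables}
```

Passing the fetch step in as a callable keeps the sketch independent of how the Runner actually talks to Vault.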

Because the credentials exist outside of the main GitLab application, this feature requires a pre-configured Runner that has the credentials available to it. This can be achieved through Runner tags (https://docs.gitlab.com/ee/ci/runners/#using-tags), which let you configure your pipelines so that secrets are only available to the jobs that need them. If a job runs on a Runner that is not configured with the Vault TLS credentials, it will print a warning in the job log but proceed as normal.

Limitations

- There are many security considerations around how a private Vault instance could securely interact with a shared Runner. For the initial MVC, using the Vault integration will require a private Runner that has the credentials to your Vault instance. In the future this could potentially be opened up for configuration in Rails and to support shared Runners.
- Vault supports several secrets engines, but for the MVC we will support only the key/value stores (KV1 and KV2).
- Vault also supports more authentication methods that we could consider supporting in addition to TLS.
- Identity/requestor mapping will eventually need to be at a more granular level than that of a particular Runner. In the future we can add support for additional tokens, perhaps through https://gitlab.com/gitlab-org/gitlab-ee/issues/9983, which might enable authenticating and fetching values based on GitLab project or job tokens.
- In the future we plan to make it easy for you to install a Vault for use with your GitLab (through issues like https://gitlab.com/gitlab-org/gitlab-ee/issues/9982). For now, though, this will require you to bring your own instance. See https://about.gitlab.com/direction/release/secrets_management/ for the broader Secrets Management strategy.

GitLab UI Configuration

We will enable configuration in the GitLab UI where two new typed variables can be set: Vault KV1 and Vault KV2, representing the KV1 and KV2 secrets engines respectively. The variable name is the name of the environment variable that will be set on the system, the duration field sets the lifetime of the short-lived token, and the value field contains the (optionally namespaced) path to the key in Vault.

Vault variables should also be allowed to be set as scoped or protected environment variables. In this case the Runner only attempts to fetch the value for the protected branches, tags, or environments selected.
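
The scoping rule above could be sketched as a simple predicate that is evaluated before any request is sent to Vault. The function and parameter names are hypothetical, and `fnmatchcase` is used only as an approximation of environment scope wildcards such as `review/*`; it is not how GitLab itself matches scopes.

```python
from fnmatch import fnmatchcase
from typing import Optional


def should_fetch(
    protected: bool,
    environment_scope: str,
    ref_is_protected: bool,
    job_environment: Optional[str],
) -> bool:
    """Decide whether the Runner should attempt to fetch a Vault variable."""
    # Protected variables are only fetched for protected branches or tags.
    if protected and not ref_is_protected:
        return False
    # A scope of "*" applies to every job, with or without an environment.
    if environment_scope == "*":
        return True
    # A narrower scope never matches a job that has no environment.
    if job_environment is None:
        return False
    # Approximation: shell-style wildcard match against the scope.
    return fnmatchcase(job_environment, environment_scope)
```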

- We'll rely on all existing variable functionality (masking, protected, etc.) to also apply to this feature.
- The aim is to support this everywhere variables can be set throughout the application (project settings, group settings, manual jobs, pipeline schedules, manual pipelines).
- This will require additional variable types and a variable path field.
- It is recommended, due to usability concerns, to include https://gitlab.com/gitlab-org/gitlab-ce/issues/59318 in this issue's scope. If this is not possible, we can replace the variable value field with the path field to keep the scope small. If so, my recommendation is to fix https://gitlab.com/gitlab-org/gitlab-ce/issues/59318 in the next milestone.

Flow

graph LR
CMV[Create new gitlab variable]
SVT[Select variable type KV1 or KV2]
AVN[Align variable name with vault var name]
FAF[Fill in the additional path field]

CMV-->SVT
SVT-->AVN
AVN-->FAF

Runner Setup/Execution

Configuring the Runner to connect to Vault when needed requires a new configuration section, wherein the TLS certificate configuration is provided:

[runners]
  ...
  [runners.vault]
    [runners.vault.server]
      url = "https://127.0.0.1:8200/v1/auth/cert/login"
    [runners.vault.auth]
      tls_cacert = "/tmp/ca.pem"
      tls_cert = "/tmp/cert.pem"
      tls_key = "/tmp/key.pem"
      tls_name = "gitlab_runner"
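
A minimal sketch of how the Runner might sanity-check this section at startup follows. It assumes the `[runners.vault]` table has already been parsed into a dictionary (for example with a TOML parser); the function name and the messages are hypothetical, and the real Runner would do this in Go.

```python
from typing import List
from urllib.parse import urlparse

# Keys the proposed [runners.vault.auth] table is expected to carry.
REQUIRED_AUTH_KEYS = ("tls_cacert", "tls_cert", "tls_key", "tls_name")


def validate_vault_config(vault: dict) -> List[str]:
    """Return a list of problems found in a parsed [runners.vault] table."""
    problems = []
    url = vault.get("server", {}).get("url", "")
    # TLS certificate authentication only makes sense over HTTPS.
    if urlparse(url).scheme != "https":
        problems.append("server.url must use https")
    auth = vault.get("auth", {})
    for key in REQUIRED_AUTH_KEYS:
        if not auth.get(key):
            problems.append(f"auth.{key} is missing")
    return problems
```

An empty list means the section is complete enough to attempt a login; anything else could be surfaced as a startup warning.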

This would perform authentication on the backend that is functionally equivalent to:

$ curl \
    --request POST \
    --cacert /tmp/ca.pem \
    --cert /tmp/cert.pem \
    --key /tmp/key.pem \
    --data '{"name": "gitlab_runner"}' \
    https://127.0.0.1:8200/v1/auth/cert/login
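
For readers who prefer code to curl, the same login exchange can be sketched with the Python standard library. The helper names are hypothetical; the only Vault-specific detail relied on is that a successful login response carries the token under `auth.client_token`.

```python
import json
import ssl
import urllib.request


def build_login_request(url: str, tls_name: str) -> urllib.request.Request:
    """Build the POST request for Vault's TLS certificate login endpoint."""
    body = json.dumps({"name": tls_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def build_ssl_context(cacert: str, cert: str, key: str) -> ssl.SSLContext:
    """Trust the given CA and present the client certificate and key."""
    context = ssl.create_default_context(cafile=cacert)
    context.load_cert_chain(certfile=cert, keyfile=key)
    return context


def extract_client_token(response_body: bytes) -> str:
    """Pull the short-lived client token out of a login response."""
    return json.loads(response_body)["auth"]["client_token"]
```

With real certificate files in place, the request would be sent with `urllib.request.urlopen(request, context=context)` and the response body passed to `extract_client_token`.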

Once this is configured, when the Runner receives a job that requires a Vault KV1 or Vault KV2 typed variable, it performs an extra operation to request the value from Vault. For example, suppose the job needs a Vault KV2 typed value with the name MY_PASSWORD, a duration of 600 seconds, and a path of kv2/data/my-password. The Runner will request a temporary token for the kv2/data/my-password value with a duration of 600 seconds, and place the value in the MY_PASSWORD environment variable, which is available during the job run.
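
The two engines return slightly different response bodies, which the Runner would need to account for when reading the value. The sketch below assumes the configured path is already the full API path (as in the example above) and that the key name is known separately; the function names are hypothetical. It relies on KV1 returning the secret directly under `data`, while KV2 nests it under `data.data` next to a `metadata` object.

```python
def secret_url(base_url: str, path: str) -> str:
    """Build the read URL for a secret; Vault's HTTP API lives under /v1/."""
    return f"{base_url.rstrip('/')}/v1/{path.lstrip('/')}"


def extract_value(engine: str, response: dict, key: str) -> str:
    """Extract one key from a KV1 or KV2 read response."""
    data = response["data"]
    if engine == "kv2":
        # KV2 wraps the secret in an extra "data" level beside "metadata".
        data = data["data"]
    elif engine != "kv1":
        raise ValueError(f"unsupported secrets engine: {engine!r}")
    return data[key]
```

The read request itself would carry the client token from the login step in the `X-Vault-Token` header.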

If a job that requires a Vault KV1 or Vault KV2 typed variable runs on a Runner that is not configured to connect to Vault, it should provide a warning in the job log. In the case where a Runner is configured with credentials but fails to successfully get the variable, the job should fail.
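
The warn-versus-fail behaviour described above might look roughly like this. The exception class, the function name, and the log message are all hypothetical; the point is only that a missing Vault configuration is non-fatal, while a failed fetch on a configured Runner propagates and fails the job.

```python
from typing import Callable, Dict, Iterable, List


class VaultFetchError(Exception):
    """Raised when a configured Runner cannot retrieve a Vault value."""


def resolve_vault_variables(
    names: Iterable[str],
    vault_configured: bool,
    fetch: Callable[[str], str],
    job_log: List[str],
) -> Dict[str, str]:
    """Resolve Vault variables, warning or failing as the proposal describes."""
    env: Dict[str, str] = {}
    if not vault_configured:
        # No Vault section on this Runner: warn in the job log and carry on.
        for name in names:
            job_log.append(f"WARNING: Vault is not configured, skipping {name}")
        return env
    for name in names:
        # Any VaultFetchError raised here is left uncaught so the job fails.
        env[name] = fetch(name)
    return env
```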

Permissions and Security

Documentation

Testing

What does success look like, and how can we measure that?

Links / references