Add cache headers to JWKS
Related to issue: #374001 (closed)
What does this MR do and why?
When GitLab is used as an OIDC provider for AWS in CI/CD pipelines, jobs that call AssumeRoleWithWebIdentity in AWS STS in parallel fail with an InvalidIdentityToken error:
An error occurred (InvalidIdentityToken) when calling the AssumeRoleWithWebIdentity operation: Couldn't retrieve verification key from your identity provider, please reference AssumeRoleWithWebIdentity documentation for requirements
An example of this command is as follows:
$ STS=$(aws sts assume-role-with-web-identity --role-arn "$AWS_ROLE_ARN" --role-session-name "GitLabRunner-${CI_PROJECT_ID}-${ENV_NAME}-${CI_PIPELINE_ID}" --web-identity-token $CI_JOB_JWT_V2 --duration-seconds $STS_DURATION_SECONDS --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' --output text)
Answer from Amazon:
"The AWS Identity and Access Management team will be changing the behavior of how AssumeRoleWithWebIdentity calls to AWS Security Token Service (STS) endpoints are managed when the OIDC identity provider does not allow caching of its JWKS through setting either Pragma: no-cache, Cache-Control: no-cache, or Cache-Control: max-age=0 response headers. This change will reduce the frequency of “Couldn't retrieve verification key from your identity provider.” messages returned by AssumeRoleWithWebIdentity calls when the JWKS is not cached.
Your AWS account team will keep you updated as we move forward with delivering the update.
For lower latency handling of AssumeRoleWithWebIdentity calls, STS recommends that customers configure their OIDC identity providers to allow JWKS caching."
This is a constraint in AWS AssumeRoleWithWebIdentity that shows up when AWS frequently queries GitLab as an OIDC provider. The only available solution is to add caching headers to the JWKS response.
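To make the header rule from the AWS note concrete, here is a minimal sketch of a check that reads HTTP response headers and reports whether they allow caching. The function name `jwks_cacheable` is hypothetical, and treating a response with none of the three listed headers as cacheable is an assumption, not documented AWS behavior.

```shell
# Reads HTTP response headers on stdin and prints "cacheable" or "not-cacheable".
# Only the three headers named in the AWS note are checked:
# Pragma: no-cache, Cache-Control: no-cache, Cache-Control: max-age=0.
# Assumption: a response with none of them is treated as cacheable.
jwks_cacheable() {
  local headers
  # Lowercase everything and strip carriage returns so matching stays simple.
  headers=$(tr '[:upper:]' '[:lower:]' | tr -d '\r')
  if printf '%s\n' "$headers" | grep -Eq '^pragma:.*no-cache'; then
    echo "not-cacheable"
  elif printf '%s\n' "$headers" | grep -Eq '^cache-control:.*(no-cache|max-age=0([^0-9]|$))'; then
    echo "not-cacheable"
  else
    echo "cacheable"
  fi
}
```

It could be fed from a header-only request, for example `curl -sI https://gitlab.com/oauth/discovery/keys | jwks_cacheable`. The URL here is an assumption; the actual JWKS endpoint is the `jwks_uri` value in the instance's OpenID discovery document.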
How to set up and validate locally
The bug is tied to the AWS authentication flow that uses GitLab as an identity provider. To reproduce it locally, AWS must be able to send OpenID requests to the local instance, which requires SSL and domain configuration.
The simplest way to reproduce it is directly on GitLab.com:
- Set up a personal project with the following .gitlab-ci.yml:

      assume role_one:
        image: registry.gitlab.com/guided-explorations/aws/aws-cli-tools/aws-container-clis:latest
        stage: deploy
        id_tokens:
          GITLAB_OIDC_TOKEN:
            aud: https://gitlab.com
        script:
          - >
            export $(printf "AWS_ACCESS_KEY_ID=%s AWS_SECRET_ACCESS_KEY=%s AWS_SESSION_TOKEN=%s"
            $(aws sts assume-role-with-web-identity
            --role-arn ${ROLE_ARN}
            --role-session-name "GitLabRunner-${CI_PROJECT_ID}-${CI_PIPELINE_ID}"
            --web-identity-token ${GITLAB_OIDC_TOKEN}
            --duration-seconds 3600
            --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]'
            --output text))
          - aws sts get-caller-identity

- Following the documentation at https://docs.gitlab.com/ee/ci/cloud_services/aws/index.html#add-the-identity-provider, create a role and trust policy in your AWS IAM configuration that reference the GitLab project created above ("gitlab.example.com:sub": "project_path:mygroup/myproject:ref_type:branch:ref:main").
- Define a masked variable on the project named ROLE_ARN, with its value set to the role created in your AWS IAM configuration, for example arn:aws:iam::111111111111:role/GitLab.
- Duplicate the assume role_one job multiple times in .gitlab-ci.yml so that several parallel jobs authenticate against the same AWS account.
- Run the pipeline and check whether any of the jobs fails with InvalidIdentityToken because of AWS's limitation on frequent requests to the OpenID JWKS controller.
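The duplication step can also be scripted instead of copied by hand. This is a minimal sketch, assuming the copies can reuse the original job definition through the GitLab CI `extends` keyword; the function name `generate_job_copies` and the `_copy_N` suffix are hypothetical.

```shell
# Prints N extra job definitions that reuse an existing job via `extends`.
# $1: name of the job to duplicate, $2: number of copies to print.
# The naming scheme "<job>_copy_<n>" is illustrative only.
generate_job_copies() {
  local base_job="$1" count="$2" i=1
  while [ "$i" -le "$count" ]; do
    printf '%s_copy_%s:\n  extends: %s\n\n' "$base_job" "$i" "$base_job"
    i=$((i + 1))
  done
}
```

For example, `generate_job_copies "assume role_one" 4 >> .gitlab-ci.yml` would append four copies, giving five jobs in the deploy stage that all request credentials for the same role.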
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.