validate-plan-apply failing at validate step due to missing environment variable
Hi folks,
I feel like I might be going nuts, but I've reproduced this several times and still don't understand it, so I suspect it may be a bug.
Here's the CI file I'm using (.gitlab-ci.yml, in the project root):
include:
  - component: "$CI_SERVER_FQDN/components/opentofu/validate-plan@3.12.0"
    inputs:
      allow_developer_role_to_plan: true
      artifacts_access: 'maintainer'
      auto_define_backend: false
      base_os: 'alpine'
      enable_id_tokens: true
      environment_name: 'Dev'
      lockfile: 'readonly'
      opentofu_version: 1.10.7
      post_mr_plan_comment: true
      root_dir: 'route53'
      state_name: $GITLAB_TOFU_STATE_NAME
      use_rootless_image: false
      version: 3.12.0
    rules:
      - if: $CI_PIPELINE_SOURCE == 'schedule'
        when: never
      - if: $CI_COMMIT_BRANCH != $CI_DEFAULT_BRANCH && $CI_PIPELINE_SOURCE == 'merge_request_event'
  - component: "$CI_SERVER_FQDN/components/opentofu/validate-plan-apply@3.12.0"
    inputs:
      allow_developer_role_to_plan: true
      apply_artifacts_access: 'maintainer'
      auto_define_backend: false
      base_os: 'alpine'
      drift_detection_mode: 'none'
      enable_id_tokens: true
      environment_name: 'Dev'
      lockfile: 'readonly'
      opentofu_version: 1.10.7
      plan_artifacts_access: 'maintainer'
      root_dir: 'route53'
      state_name: $GITLAB_TOFU_STATE_NAME
      use_mr_plan: true
      use_rootless_image: false
      version: 3.12.0
    rules:
      - if: $CI_PIPELINE_SOURCE == 'schedule'
        when: never
      - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  - component: "$CI_SERVER_FQDN/components/opentofu/detect-drift@3.12.0"
    inputs:
      version: 3.12.0
      opentofu_version: 1.10.7
      base_os: 'alpine'
      use_rootless_image: false
      lockfile: 'readonly'
      mode: 'refresh-only'
      on_drift: 'fail'
      enable_id_tokens: true
      state_name: $GITLAB_TOFU_STATE_NAME
      environment_name: 'Prod'
      root_dir: 'route53'
      auto_define_backend: false
      allow_developer_role: true
    rules:
      - if: $DN_TOFU_DRIFT_DETECTION == 'true' && $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH && $CI_PIPELINE_SOURCE == 'schedule'
  - component: "$CI_SERVER_FQDN/components/opentofu/validate@3.12.0"
    inputs:
      version: 3.12.0
      opentofu_version: 1.10.7
      base_os: 'alpine'
      use_rootless_image: false
      lockfile: 'readonly'
      root_dir: 'route53'
      auto_define_backend: false
    rules:
      - if: $CI_COMMIT_BRANCH != $CI_DEFAULT_BRANCH

stages: [validate, build, deploy]

.gitlab-tofu:id_tokens:
  id_tokens:
    GITLAB_OIDC_TOKEN:
      aud: https://gitlab.com
Here's the .gitlab/ci/setup-id-tokens.sh script I'm using:
apk add --no-cache aws-cli
echo "${GITLAB_OIDC_TOKEN}" > /tmp/web_identity_token
echo "Environment_Name: $CI_ENVIRONMENT_NAME"
echo "Role_ARN: $AWS_ROLE_ARN"
echo "Session_Name: $AWS_ROLE_SESSION_NAME"
echo "Token_File_Path: $AWS_WEB_IDENTITY_TOKEN_FILE"
aws sts get-caller-identity
When I commit to a non-main branch, the validate job in the resulting pipeline succeeds and doesn't call the .gitlab/ci/setup-id-tokens.sh script, which makes sense because I didn't set the enable_id_tokens input on that component.
When I create a merge request, the validate-plan template runs. fmt and validate both succeed, and based on the logs, neither calls the .gitlab/ci/setup-id-tokens.sh script. The plan job calls the .gitlab/ci/setup-id-tokens.sh script and succeeds. The echo statements show me that it is appropriately executing in the "Dev" environment and fetching the correct environment variables defined in the UI.
When I merge the MR, the validate-plan-apply template runs in a pipeline on the default branch. The fmt job succeeds, but the validate job then fails, halting the pipeline.
When reviewing the job logs, I can tell it is executing the .gitlab/ci/setup-id-tokens.sh script, as it downloads the AWS CLI, and then fails to authenticate. In the echo statements, it outputs no value for the $CI_ENVIRONMENT_NAME and $AWS_ROLE_ARN variables (the $AWS_ROLE_ARN is only defined in the "Dev" environment).
Example log output (removed project and pipeline IDs from session name):
Environment_Name:
Role_ARN:
Session_Name: GitLabRunner-00000000-0000000000
Token_File_Path: /tmp/web_identity_token
The provided profile or the current environment is configured to assume role with web identity but has no role ARN configured. Ensure that the profile has the role_arnconfiguration set or the AWS_ROLE_ARN env var is set.
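To make the failure easier to read while debugging, a guard along these lines could go near the top of .gitlab/ci/setup-id-tokens.sh, before any AWS call. This is only a sketch: require_vars is a hypothetical helper name I made up, not something the OpenTofu component provides, and the list of variables to check is my assumption based on the echo statements above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the OpenTofu component): report every
# listed variable that is unset or empty, and return non-zero if any is.
require_vars() {
  missing=""
  for name in "$@"; do
    # Indirect lookup in POSIX sh; the names are trusted literals from this script.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing required variables:$missing" >&2
    return 1
  fi
  return 0
}

# Intended use before `aws sts get-caller-identity` (assumed variable list):
#   require_vars CI_ENVIRONMENT_NAME AWS_ROLE_ARN AWS_WEB_IDENTITY_TOKEN_FILE || exit 1
```

With this in place, the job would stop with a single line naming the empty variables instead of reaching the AWS CLI error. It does not explain why the validate job has no environment attached; it only surfaces the symptom earlier.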
What is different about the validate job in the validate-plan-apply pipeline in general or in my case? I don't see any reason why it would behave differently.
Thank you in advance for any assistance!