Design UX for handling partially failed operations in Secrets Manager
Problem statement
When creating or updating GitLab CI/CD secrets via the Secrets Manager, multiple API calls to OpenBao are required to complete the operation. These calls happen sequentially, and if any call fails, the subsequent operations are not attempted.
Each API call has a specific purpose and impact on the secret's functionality:
- Value API call (POST kv/data/{path}):
  - Creates/updates the actual secret value
  - If this fails, the entire operation fails
- Metadata API call (POST kv/metadata/{path}):
  - Updates metadata like description, environment, and branch
  - If this fails, the value will be updated but with no metadata (environment, branch, description)
  - The secret would exist but be completely unusable in pipelines
  - It may show up in the list of secrets with just its name
  - All subsequent operations (policy and JWT) will also be skipped
- Policy API calls (POST sys/policies/acl/{policy-name}):
  - Creates or updates the policies that grant access to the secret based on environment/branch
  - If this fails, the value and metadata will be updated, but pipelines won't have access
  - For updates involving environment/branch changes:
    - We first remove the secret from its old policy
    - Then add it to the new policy
    - If either fails, the access controls will be in an inconsistent state
- JWT Role API call (POST auth/{mount}/role/{role-name}):
  - Updates the JWT role with glob policies for wildcard patterns (e.g., staging-*, feature/*)
  - This is the final step, so if previous steps fail, this won't be attempted
  - If this fails, pipelines matching wildcard patterns won't work, but exact matches might still work
Both the create and update operations follow this same sequence of API calls, so they face the same potential failure scenarios.
Currently, there's no defined UX strategy for communicating these partial failures to users, which can lead to confusion when a secret appears to be created or updated but doesn't work as expected in pipelines.
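To make the failure shape concrete, here is a minimal sketch of the sequential, stop-on-first-failure flow described above. It is not the actual Secrets Manager code: the step names, `OperationResult`, and `run_steps` are hypothetical, and each step is stood in for by a plain callable rather than a real OpenBao request.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical step names, in the order the issue describes the API calls.
STEPS = ["value", "metadata", "policy", "jwt_role"]


@dataclass
class OperationResult:
    completed: List[str] = field(default_factory=list)  # steps that succeeded
    failed: Optional[str] = None                         # first step that raised
    skipped: List[str] = field(default_factory=list)     # steps never attempted

    @property
    def partial(self) -> bool:
        # Partial failure: something was written before a later step failed.
        return self.failed is not None and bool(self.completed)


def run_steps(calls: Dict[str, Callable[[], None]]) -> OperationResult:
    """Run each step in order and stop at the first failure."""
    result = OperationResult()
    for index, name in enumerate(STEPS):
        try:
            calls[name]()
        except Exception:
            result.failed = name
            result.skipped = STEPS[index + 1:]
            break
        result.completed.append(name)
    return result
```

A result with `failed == "policy"` would carry `completed == ["value", "metadata"]` and `skipped == ["jwt_role"]`, which is the information a UI needs in order to say what works and what does not.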
Goal
Design a clear, user-friendly approach to handle and communicate partial failures in secret operations. Users should understand:
- Which aspects of their secret are functional and which aren't
- How the specific failure impacts their CI/CD pipelines
- What actions they can take to resolve the issues
Proposal
Create a UX strategy for handling partial failures that:
- Translates technical failures into user-meaningful outcomes
- Provides clear status indicators for each critical aspect of a secret's functionality
- Offers actionable guidance specific to the type of failure
- Maintains consistency between create and update operations
User experience
Consider the following technical failure scenarios and their user impact:
Scenario 1: Value API succeeds, Metadata API fails
- Technical Impact: Secret value exists but with no environment, branch, or description metadata
- User Impact: Secret appears in the list but is completely unusable in pipelines
- User Need: Understand that the secret was only partially created and needs proper configuration
Scenario 2: Value & Metadata APIs succeed, Policy API fails
- Technical Impact: Secret and metadata exist, but access policy doesn't match environment/branch
- User Impact: Pipelines can't access the secret despite UI showing correct configuration
- User Need: Understand why pipelines can't access the secret and how to fix it
Scenario 3: Value, Metadata & Policy APIs succeed, JWT Role API fails
- Technical Impact: Secret works for exact environment/branch matches but not wildcard patterns
- User Impact: Inconsistent behavior, where some pipelines can access the secret and others can't
- User Need: Clear indication that wildcard functionality is impaired
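One possible way to translate the three scenarios into user-meaningful status is to map the first failed step onto the aspects of the secret that still work. The sketch below assumes a newly created secret (for updates, state from before the failed call may still be in place), and the aspect labels and `functional_state` helper are illustrative names, not an agreed design.

```python
from typing import Dict, Optional

# Hypothetical pairing of each API step with the user-facing aspect it enables.
STEP_TO_ASPECT = [
    ("value", "value_stored"),
    ("metadata", "configuration_saved"),
    ("policy", "pipeline_access"),
    ("jwt_role", "wildcard_access"),
]


def functional_state(failed_step: Optional[str]) -> Dict[str, bool]:
    """Return which aspects are functional, given the first step that failed.

    Steps run in order and stop at the first failure, so every aspect from
    the failed step onward is treated as not functional.
    """
    state: Dict[str, bool] = {}
    working = True
    for step, aspect in STEP_TO_ASPECT:
        if step == failed_step:
            working = False
        state[aspect] = working
    return state
```

For example, `functional_state("policy")` corresponds to Scenario 2: the value and configuration are saved, but neither exact-match nor wildcard pipeline access is in place.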
Implementation details
This issue requires collaboration between:
- Frontend developers
- Backend developers
- UX designers
We need to:
- Define user-friendly error messages mapped to specific API failures
- Design status indicators that clearly communicate the secret's functional state
- Implement retry mechanisms for specific failed components when possible
- Create consistent patterns for handling failures across create/update operations
- Design a UI that clearly shows functional state without exposing implementation details
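As a starting point for the retry discussion, a retry could resume from the failed step instead of repeating the whole operation. This is only a sketch: it assumes each call is safe to repeat (idempotent), which would need to be confirmed per endpoint, and `resume_from`, the step names, and `max_attempts` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical step names, in the order the issue describes the API calls.
STEP_ORDER = ["value", "metadata", "policy", "jwt_role"]


def resume_from(
    failed_step: str,
    calls: Dict[str, Callable[[], None]],
    max_attempts: int = 3,
) -> Tuple[List[str], Optional[str]]:
    """Re-run the failed step and every step after it.

    Returns (recovered_steps, still_failing_step). Steps before the failed
    one are not called again. Assumes each call is idempotent.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    recovered: List[str] = []
    start = STEP_ORDER.index(failed_step)
    for name in STEP_ORDER[start:]:
        for attempt in range(max_attempts):
            try:
                calls[name]()
            except Exception:
                if attempt == max_attempts - 1:
                    # Give up on this step; later steps stay unattempted.
                    return recovered, name
            else:
                recovered.append(name)
                break
    return recovered, None
```

If the second element of the returned tuple is not `None`, the UI would still need to surface the remaining failure, so retries complement the status messaging rather than replace it.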
Links
- Related to #470397 (closed) (Update project secrets)
- Related to #537066 (Move secret value updates to frontend)