Allow manual jobs to run and access artifacts when their dependencies fail

Summary

GitLab CI/CD currently does not support a common use case: a job that fails hard and blocks the pipeline, while downstream manual jobs can still access its artifacts and remain triggerable for remediation.

Problem Description

When a job fails without allow_failure: true, any downstream jobs that depend on it (via needs) are automatically skipped, even if they are manual jobs. This prevents a natural workflow where:

- A validation job fails and blocks the pipeline (preventing merges)
- A manual remediation job remains available to fix the issue
- The remediation job needs access to the validation job's artifacts
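
The behavior described above can be reproduced with a minimal pipeline. The following is a sketch, not taken from the original project: the job names validate and remediate, the stage names, and report.json are placeholders chosen for illustration.

```yaml
# Minimal reproduction sketch (all names are hypothetical placeholders).
stages:
  - validate
  - fix

validate:
  stage: validate
  script:
    - echo '{"missing": ["example.key"]}' > report.json
    - exit 1                # fails hard, so the pipeline is blocked
  artifacts:
    when: always            # upload report.json even though the job failed
    paths:
      - report.json

remediate:
  stage: fix
  when: manual
  needs:
    - job: validate
      artifacts: true
  script:
    - cat report.json       # not reached: the job is skipped when validate fails
```

With this configuration, validate is expected to finish as failed with report.json attached, while remediate is reported as skipped instead of being offered as a playable manual job.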

Real-World Use Case: Translation Validation & AI Remediation

We have a translations pipeline for a mobile app with 31+ locales:

Translation Linter Job:

- Validates all translations for missing/outdated keys
- Must fail hard to prevent merge requests with incomplete translations from being merged
- Produces artifacts containing:
  - List of missing translation keys per locale
  - List of outdated translation keys per locale
  - Validation results in JSON format

AI Translation Job (Manual):

- Uses AI services to automatically translate missing/outdated keys
- Should be manually triggered because it:
  - Takes 10-15 minutes to process all locales
  - Consumes external API credits
  - Commits the translated files back to the branch
- Requires the linter's artifacts to know what to translate

Current Workaround (Three Jobs)

We're forced to use a three-job pattern:

lint-translations:
  allow_failure: true # Don't block downstream jobs
  script:
    - run-translation-linter --noMissingTranslations=error
  artifacts:
    when: always
    paths:
      - .temp-translations/*.json

lint-translations-failure-checker:
  needs:
    - job: lint-translations
      artifacts: true
      optional: true
  script:
    # Read artifacts and fail hard if issues are found
    - exit 1

ai-translate:
  when: manual
  needs:
    - job: lint-translations
      artifacts: true
      optional: true
  script:
    # Use artifacts to translate missing keys
    - echo "Translate missing keys here" # placeholder for the real translation command

Why this is suboptimal:

- Adds complexity (a third job that just reads artifacts and fails)
- Less intuitive for developers
- Duplicates failure logic that already exists in the linter
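
To make the duplicated logic concrete, here is a sketch of what the failure checker ends up doing. It assumes that jq is available in the job image and that each report file is a JSON object with missing and outdated arrays. Both are assumptions for illustration, since the linter's actual report schema is not shown here.

```yaml
# Sketch only: the report schema (.missing / .outdated arrays) is hypothetical.
lint-translations-failure-checker:
  needs:
    - job: lint-translations
      artifacts: true
  script:
    - |
      # Slurp every report file and count the keys the linter already flagged.
      total=$(jq -s '[.[] | ((.missing // []) + (.outdated // [])) | length] | add // 0' .temp-translations/*.json)
      echo "Translation issues found: ${total}"
      # Re-derive the pass/fail decision the linter has already made.
      test "${total}" -eq 0
```

The job has to re-derive a pass/fail decision that the linter has already computed, which is the duplication the two-job design would remove.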

Desired Behavior (Two Jobs)

lint-translations:
  # Fails hard - no allow_failure
  script:
    - run-translation-linter --noMissingTranslations=error
  artifacts:
    when: always
    paths:
      - .temp-translations/*.json

ai-translate:
  when: manual
  needs:
    - job: lint-translations
      artifacts: true
      allow_failure: true # NEW: Allow dependency to fail
  script:
    # Use artifacts to translate missing keys
    - echo "Translate missing keys here" # placeholder for the real translation command

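Or by letting the existing dependencies keyword act as an artifacts-only dependency, even when the listed job has failed:

ai-translate:
  when: manual
  dependencies: # Artifacts-only dependency
    - lint-translations
  # No "needs" = no execution dependency
  script:
    # Use artifacts to translate missing keys
    - echo "Translate missing keys here" # placeholder for the real translation command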

Related Issues

#438183 (closed) - Allow dependencies without needs, pulling artifacts from depended-on jobs
#273762 (closed) - (if relevant based on discussion)

Benefits

This would enable common patterns like:

✅ Linting with automatic remediation options
✅ Test failures with manual retry/debug jobs
✅ Security scans with manual override workflows
✅ Any "fail-fast, fix-manually" pattern

Additional Context

This is a frequently requested pattern in the community, and teams currently work around it in one of three ways:

- Three-job patterns (our approach)
- Dynamic child pipelines (overly complex)
- Accepting allow_failure: true on the validation job (doesn't block merges)

None of these workarounds are ideal for the use case.

Visual Representation

Current Workaround (Three Jobs)

┌─────────────────────────────────────┐
│  Translation Linter                 │
│  - Detects missing/outdated         │
│  - allow_failure: true              │
│  - Produces artifacts ✓             │
└─────────────┬───────────────────────┘
              │
              ├──────────────────────────┐
              │                          │
              ▼                          ▼
┌─────────────────────────┐  ┌──────────────────────────┐
│  Failure Checker        │  │  AI Translation (manual) │
│  - Reads artifacts      │  │  - Reads artifacts       │
│  - Fails hard ❌        │  │  - Translates with AI    │
│  - Blocks pipeline      │  │  - Commits fixes         │
└─────────────────────────┘  └──────────────────────────┘

How it works:

- Linter fails but allows the pipeline to continue
- Failure checker fails hard to block merges
- AI job stays available for manual trigger

Desired Solution (Two Jobs)

┌─────────────────────────────────────┐
│  Translation Linter                 │
│  - Detects missing/outdated         │
│  - Fails hard ❌                    │
│  - Blocks pipeline                  │
│  - Produces artifacts ✓             │
└─────────────┬───────────────────────┘
              │
              │ (artifacts flow but no execution dependency)
              │
              ▼
┌──────────────────────────────────────┐
│  AI Translation (manual)             │
│  - Still runnable despite failure ⚠️ │
│  - Reads artifacts                   │
│  - Translates with AI                │
│  - Commits fixes                     │
└──────────────────────────────────────┘

Why it doesn't work currently:

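- dependencies only controls which artifacts are downloaded; the job's execution order still comes from needs or from stage ordering
- If the linter fails, the AI job gets skipped
- There is no way to declare an "artifacts only" dependency without also creating an execution dependency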

The gap: GitLab doesn't support accessing artifacts from failed jobs without creating an execution dependency that causes downstream jobs to be skipped.

Edited Oct 10, 2025 by 🤖 GitLab Bot 🤖