feat: container errors passed to CI

What

This MR modifies runwayctl job run to report container failures to CI: it now checks the Cloud Run Job execution result and returns a non-zero exit code when the execution did not succeed.

Changes:

  • Modified cmd/job/job.go to validate execution results after polling
  • Returns an error (exit code 1) when tasks fail, are cancelled, or do not execute
  • Error messages include task counts: Error: job execution failed: 3 task(s) failed, 2 succeeded
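
The validation described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in cmd/job/job.go: the executionResult type, its fields, and the validateExecution function are hypothetical names, and the wording for the cancelled and no-tasks cases is assumed. Only the "N task(s) failed, M succeeded" format comes from this MR.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// executionResult is a hypothetical summary of a finished Cloud Run Job
// execution; the real code reads these counts from the execution it polled.
type executionResult struct {
	Succeeded int
	Failed    int
	Cancelled int
}

// validateExecution returns nil only when at least one task ran and none
// failed or were cancelled. Any other outcome becomes an error that the
// caller can turn into a non-zero exit code.
func validateExecution(r executionResult) error {
	var problems []string
	if r.Failed > 0 {
		problems = append(problems, fmt.Sprintf("%d task(s) failed", r.Failed))
	}
	if r.Cancelled > 0 {
		// Assumed wording; the MR only shows the "failed" message format.
		problems = append(problems, fmt.Sprintf("%d task(s) cancelled", r.Cancelled))
	}
	if len(problems) == 0 {
		if r.Succeeded == 0 {
			// Assumed wording for the "don't execute" case.
			return errors.New("job execution failed: no tasks executed")
		}
		return nil
	}
	return fmt.Errorf("job execution failed: %s, %d succeeded",
		strings.Join(problems, ", "), r.Succeeded)
}

func main() {
	// A real CLI would return the error from its run function and exit 1;
	// here the outcomes are only printed.
	for _, r := range []executionResult{
		{Succeeded: 5},
		{Succeeded: 2, Failed: 3},
		{},
	} {
		if err := validateExecution(r); err != nil {
			fmt.Println("Error:", err)
			continue
		}
		fmt.Println("ok")
	}
}
```

Keeping the check in a single function that returns an error means the polling code stays unchanged and the exit code is decided in one place.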

⚠️ Breaking Change: runwayctl job run now returns non-zero exit codes on failure instead of always returning 0.

Mitigation: The CI template's Run job already sets allow_failure: true, so existing pipelines won't fail. Teams can opt in to strict checking by overriding this setting in their deployment project's runway.yml.

Why

Previously, runwayctl job run always returned exit code 0 even when container tasks failed. This meant CI pipelines showed success despite actual failures.

Root cause: The run() function only checked for API/polling errors, not execution results. While failure counts were displayed in logs, they were never used to determine the exit code.

Impact: This aligns the tool with standard CLI behavior where non-zero exit codes indicate failure, enabling proper CI pipeline failure detection for teams who want it.

Related Issue

  1. gitlab-com/gl-infra/tenant-scale/cells-infrastructure/team#463 (comment 2793281023)
