
Use of allow_failure:exit_codes results in final job log message claiming "ERROR", indistinguishable from allow_failure:true


Problem

For a job that specifies allow_failure:exit_codes, the job log ends with a message in red text: ERROR: Job failed: exit code N. This is indistinguishable from the message produced with allow_failure:true when the job is followed by a manual job.

The misleading message leads many new users of my pipeline to believe the job failed when it actually did what it was supposed to do. Anyone looking at the tail end of the job log immediately assumes there was a problem. There is no effective way to tell the user that all is well other than through outside means: a private conversation, a Confluence page, or a README that nobody ever reads.

Use case

For example, in my case I want to use allow_failure:exit_codes to indicate that an infrastructure plan found pending changes.
If there were no pending changes, the exit code is 0, and the job gets a Big Green Checkmark.
If there were pending changes, the exit code is 2, which is in the set of allow_failure:exit_codes, and the job gets a Brown Exclamation. Without opening the job log, one can look at the pipeline page to determine whether further action is necessary.
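
For reference, the setup described above can be sketched with the existing allow_failure:exit_codes keyword. This is only a minimal sketch: the job name is taken from the proposals in this issue, and the script line is an assumption (the issue does not say which tool produces the plan; terraform plan -detailed-exitcode is used here because it exits with 0 when there are no changes and 2 when changes are pending).

```yaml
# Sketch only: the script line is an assumption, not taken from the real pipeline.
infra:plan:
  script:
    # Assumed tool. With -detailed-exitcode, terraform plan exits with
    # 0 = no changes, 1 = error, 2 = changes pending.
    - terraform plan -detailed-exitcode
  allow_failure:
    exit_codes:
      - 2   # pending changes: the job is marked as passed with warnings
```

With this configuration, exit code 0 gives a passed job, exit code 2 gives a warning, and any other non-zero code fails the job. In the exit code 2 case the job log still ends with the red ERROR: Job failed line, which is the behavior this issue is about.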

Environment

GitLab Enterprise Edition 15.2.4-ee
Running with gitlab-runner 14.8.2 (c6e7e194)
on (runner-pool :project X :id Y :revision v2021-05-03) Z

Potential Solutions

The message should, preferably, be customizable, or at least should not say ERROR.

Good

A single custom message for all exit_codes on the existing "allow_failure" block would be sufficient, though not ideal.

infra:plan:
  allow_failure:
    exit_codes: [ 2, 3, 4 ]
    log_message: "The ${CI_JOB_NAME} job completed successfully, but found pending changes (ec: ${CI_JOB_EXIT_CODE}). Please run infra:apply to sync up cloud resources."

Better

A custom message per exit code would be much better. That might look something like:

infra:plan:
  allow_failure:
    descriptive_exit_codes:
      - exit_code: 2
        log_message: "The ${CI_JOB_NAME} job completed successfully, but found pending changes (ec: ${CI_JOB_EXIT_CODE}). Please run infra:apply to sync up cloud resources."
        status: brown
      - exit_code: 0
        log_message: "The ${CI_JOB_NAME} job completed successfully, and found no pending changes (ec: ${CI_JOB_EXIT_CODE}). Cloud infrastructure is good to go."
        status: green

where status could be specified to select the image to be shown in pipeline views.

Best

If some exit codes are considered acceptable, they arguably should not be treated as failures at all. Adding allow_success and allow_warning blocks would provide complete handling of job exit codes and pipeline visualization:

infra:plan:
  allow_success:
    - exit_codes: [ 0 ]
      log_message: "The ${CI_JOB_NAME} job completed successfully, and found no pending changes (ec: ${CI_JOB_EXIT_CODE}). Local infrastructure is good to go."
    - exit_codes: [ 200, 204, 304 ]
      log_message: "The ${CI_JOB_NAME} job completed successfully, and found no pending changes (ec: ${CI_JOB_EXIT_CODE}). Cloud infrastructure is good to go."
  allow_warning:
    - exit_codes: [ 2 ]
      log_message: "The ${CI_JOB_NAME} job completed successfully, but found pending changes (ec: ${CI_JOB_EXIT_CODE}). Please run infra:apply to sync up cloud resources."
  allow_failure:
    - exit_codes: [ 4 ]
      log_message: "The ${CI_JOB_NAME} job took too long (ec: ${CI_JOB_EXIT_CODE}). Check the log and run infra:apply to sync up cloud resources, if necessary."