Allow 'skipped -> created' state transitions in deployment jobs

Summary

Deployment objects do not allow a skipped -> created state transition. As a result, affected deployments never send the deployment running webhook notification.

Steps to reproduce

  1. Set up a webhook receiver URL. I used https://webhook.site/ as an example.
  2. Create a project with CI enabled and configure it to use the webhook under Project > Settings > Webhooks.
  3. Configure the webhook to only send Deployment events.
  4. Add the following as the pipeline configuration for the project via CI/CD > Editor:
image: bash:latest

build:
  stage: build
  when: manual
  script: echo

deploy:
  stage: deploy
  environment: production
  needs:
    - build
  script: echo
  5. Run the pipeline, play the manual build job, and check the webhook URL for received events.
  6. The following bug is observed:
  • Expectation: Two deployment events are received: deployment running and success states
  • Observed: Only a single deployment event is received: deployment success state
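
To compare expectation and observation without reading each request by hand, the captured request bodies can be tallied. This is a rough sketch, assuming each body is JSON with object_kind and status fields as in the documented deployment event payload. The helper name deployment_statuses and the trimmed sample payload are hypothetical.

```ruby
require "json"

# Returns the deployment statuses found in a list of raw webhook request bodies.
# Assumes each body is a JSON object with "object_kind" and "status" keys.
def deployment_statuses(raw_bodies)
  raw_bodies
    .map { |body| JSON.parse(body) }
    .select { |event| event["object_kind"] == "deployment" }
    .map { |event| event["status"] }
end

# Hypothetical, trimmed payload standing in for what the receiver captured.
received = ['{"object_kind":"deployment","status":"success"}']

# Statuses we expected to see but did not receive.
missing = %w[running success] - deployment_statuses(received)
puts "missing deployment events: #{missing.inspect}"
```

With only the success event captured, the running status shows up as missing.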

Example Project

This can be tested on any CI/CD enabled project. An example is difficult to provide as it requires measuring from an external receiver (webhook URL).

What is the current bug behavior?

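Deployment running webhook events are not sent for affected jobs, that is, deployment jobs that pass through the skipped state before they run (for example, jobs that need a when: manual job).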

What is the expected correct behavior?

Deployment running webhook events are sent for all deployment jobs

Relevant logs and/or screenshots

See customer support ticket: https://gitlab.zendesk.com/agent/tickets/254907 (internal link) or the instructions to use webhooks above.

Output of checks

This bug happens on GitLab.com

Possible fixes

Notes from implementation review

When the pipeline is created, the deploy job starts with an initial state of created, and so does its associated deployment.

The deploy job is then moved to skipped because it depends on the manual build job. This moves the Deployment object's state from created (initial) to skipped as well.

However, when the build job is played, the deploy job's state immediately changes from skipped back to created.

The deployment state sync treats this change as illegal because it assumes that created can never recur. We do not handle receiving a created state from the job: https://gitlab.com/gitlab-org/gitlab/blob/989928ec3dd95afd52db9f26d3a9d9210fbf2b69/app/models/deployment.rb#L378-393. When one arrives, the result is the error The status created is invalid, and the deployment's state remains skipped.

The pipeline carries on regardless, because the exception is not fatal: it is logged and control is returned. When the deploy job then moves to running, the deployment also fails to move to running, because the deployment state machine does not accept a transition from skipped to running. It registers this error:

"exception.class": "Deployment::StatusSyncError"
"exception.message": "Cannot transition status via :run from :skipped (Reason(s): Status cannot transition via \"run\")"

MR !72746 (merged) stopped the message The status created is invalid from being logged. Its discussion concluded that a deployment's state should never need to go 'backwards' to created, and that whatever attempted it was a bug. However, the way job transitions occur for needs + when: manual jobs suggests that this is a possible and normal state flow. Issue #341561 (closed) further causes skipped states to occur for jobs after they have been created.
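
The sequence in these notes can be replayed with a small, self-contained Ruby model. This is only a sketch and not the code in deployment.rb: the EVENTS table, the sync and replay helpers, and the allowed source states are assumptions chosen to match the behaviour described above.

```ruby
# Simplified, hypothetical model of the deployment status sync; not GitLab's code.
class StatusSyncError < StandardError; end

# Job status => [event name, deployment states the event may fire from, target state].
# There is deliberately no entry for "created". The source states are assumptions.
EVENTS = {
  "running" => [:run,     %w[created],                 "running"],
  "skipped" => [:skip,    %w[created running],         "skipped"],
  "success" => [:succeed, %w[created running skipped], "success"]
}.freeze

# Returns the new deployment status, or raises when the job status is unknown
# or the transition is not allowed from the current deployment status.
def sync(deployment_status, job_status)
  event, from, to = EVENTS.fetch(job_status) do
    raise StatusSyncError, "The status #{job_status} is invalid"
  end
  unless from.include?(deployment_status)
    raise StatusSyncError, "Cannot transition status via :#{event} from :#{deployment_status}"
  end
  to
end

# Feeds job statuses to the deployment one by one. Errors are recorded and
# the replay continues, mirroring the non-fatal handling described above.
def replay(job_statuses)
  deployment = "created"
  log = []
  job_statuses.each do |status|
    begin
      deployment = sync(deployment, status)
      log << "ok: #{deployment}"
    rescue StatusSyncError => e
      log << "error: #{e.message}"
    end
  end
  [deployment, log]
end

_final, log = replay(%w[skipped created running success])
puts log
```

In this model the deployment goes from skipped straight to success. It never enters running, which is consistent with the missing running webhook event.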

Fix suggestion

  1. The handling at https://gitlab.com/gitlab-org/gitlab/blob/989928ec3dd95afd52db9f26d3a9d9210fbf2b69/app/models/deployment.rb#L378-393 should also accept a created status coming from the job
  2. The event state machine should include a valid transition to :created from :skipped: https://gitlab.com/gitlab-org/gitlab/blob/989928ec3dd95afd52db9f26d3a9d9210fbf2b69/app/models/deployment.rb#L62-81
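
A minimal sketch of how the two suggestions fit together, written as a plain Ruby model and not as a patch to deployment.rb. The TRANSITIONS table, the next_status helper, the :create event name, and the allowed source states are all hypothetical.

```ruby
# Hypothetical model of the suggested fix; not GitLab's actual state machine.
# Keys are job statuses; :from lists the deployment states each event accepts.
TRANSITIONS = {
  "created" => { event: :create,  from: %w[skipped] }, # suggested addition
  "running" => { event: :run,     from: %w[created] },
  "skipped" => { event: :skip,    from: %w[created running] },
  "success" => { event: :succeed, from: %w[created running skipped] }
}.freeze

# Returns the new deployment status, or raises if the move is not allowed.
def next_status(current, job_status)
  rule = TRANSITIONS[job_status]
  raise ArgumentError, "The status #{job_status} is invalid" if rule.nil?
  unless rule[:from].include?(current)
    raise ArgumentError, "Cannot transition via :#{rule[:event]} from :#{current}"
  end
  job_status
end

# Replay the job statuses from the reproduction, starting at created.
visited = %w[skipped created running success].reduce(["created"]) do |path, job_status|
  path + [next_status(path.last, job_status)]
end
puts visited.join(" -> ")
# prints "created -> skipped -> created -> running -> success"
```

Because the deployment returns to created before the job runs, the existing created to running transition applies again and the running state is reached. A direct skipped to running move is still rejected in this model.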

/cc @mlockhart @tmike