Allow 'skipped -> created' state transitions in deployment jobs
Summary
Disallowing the skipped -> created state transition on Deployment objects causes the deployment running state webhook notification to be missed.
Steps to reproduce
- Set up a webhook receiver URL. I used https://webhook.site/ as an example.
- Create a project with CI enabled and configure it to use the webhook under Project > Settings > Webhooks
- Configure the webhook to only send Deployment events
- Add the following as the pipeline configuration for the project via CI/CD > Editor
    image: bash:latest

    build:
      stage: build
      when: manual
      script: echo

    deploy:
      stage: deploy
      environment: production
      needs:
        - build
      script: echo
- Run the pipeline, and check the webhook URL for received events
- The following bug is observed:
  - Expectation: Two deployment events are received: deployment running and success states
  - Observed: Only a single deployment event is received: deployment success state
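The expectation above can be checked mechanically against the request bodies collected by the webhook receiver. The following is a minimal sketch, assuming each body is JSON carrying an object_kind of "deployment" and a status field. Those field names are an assumption about the deployment event payload and should be verified against a real captured event. The helper names and the sample body are hypothetical.

```ruby
require "json"

# Hypothetical helper: deployment statuses seen, in arrival order.
# Assumes each body has "object_kind" and "status" keys (verify on a real payload).
def deployment_statuses(raw_bodies)
  raw_bodies
    .map { |body| JSON.parse(body) }
    .select { |event| event["object_kind"] == "deployment" }
    .map { |event| event["status"] }
end

# Hypothetical helper: expected statuses that never arrived.
def missing_statuses(raw_bodies, expected = %w[running success])
  expected - deployment_statuses(raw_bodies)
end

# Illustrative body trimmed to the two fields used here, not a real capture.
observed = ['{"object_kind":"deployment","status":"success"}']
puts missing_statuses(observed).inspect
```

With the single sample body above, the script reports running as the missing status, which matches the observed behavior in this report.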
Example Project
This can be tested on any CI/CD enabled project. An example is difficult to provide as it requires measuring from an external receiver (webhook URL).
What is the current bug behavior?
Deployment running webhook events are not sent for affected jobs
What is the expected correct behavior?
Deployment running webhook events are sent for all deployment jobs
Relevant logs and/or screenshots
See customer support ticket: https://gitlab.zendesk.com/agent/tickets/254907 (internal link) or the instructions to use webhooks above.
Output of checks
This bug happens on GitLab.com
Possible fixes
Notes from implementation review
When the pipeline is created, the test job (the job named deploy in the configuration above) starts with an initial state of created, and so does its associated deployment.
The test job is then moved to skipped because it depends on a manual action of the build job. This causes the Deployment object's state to go from created (initial) to skipped as well.
However, when the build job is played, the test job's state immediately updates from skipped to created.
The deployment state sync treats this update as illegal because it assumes that created is a state that can never recur. We do not handle receiving a created state from the job: https://gitlab.com/gitlab-org/gitlab/blob/989928ec3dd95afd52db9f26d3a9d9210fbf2b69/app/models/deployment.rb#L378-393 so when it arrives the result is the error 'The status created is invalid' and the deployment's state remains skipped.
The pipeline nevertheless moves on, because the exception is not fatal: it is logged and control is returned. When the test job then moves to running, the deployment fails to follow because the deployment state machine does not accept a skipped to running transition, and it registers an error:
"exception.class": "Deployment::StatusSyncError"
"exception.message": "Cannot transition status via :run from :skipped (Reason(s): Status cannot transition via \"run\")"
The MR !72746 (merged) suppressed the 'The status created is invalid' log message, because its discussion concluded that a deployment's state should never need to go 'backwards' to created and that whatever attempted it was a bug. However, the way job transitions occur for needs + when:manual jobs suggests this is a possible and normal state flow. Issue #341561 (closed) further causes skipped states to occur for jobs after they have been created.
Fix suggestion
- The handling at https://gitlab.com/gitlab-org/gitlab/blob/989928ec3dd95afd52db9f26d3a9d9210fbf2b69/app/models/deployment.rb#L378-393 should include accepting created events from the job state
- The event state machine should include a valid transition to :created from :skipped: https://gitlab.com/gitlab-org/gitlab/blob/989928ec3dd95afd52db9f26d3a9d9210fbf2b69/app/models/deployment.rb#L62-81
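The state flow from the notes and the effect of the two suggestions can be sketched in plain Ruby. This is a simplified model for illustration only, not a copy of app/models/deployment.rb. The transition table is an assumption inferred from this issue: run is only allowed from created, and succeed is allowed from any state because the success event is still received. The event name create and the with_fix flag are hypothetical. It is also assumed that only the running and success states emit a webhook.

```ruby
# Simplified model of the deployment status sync, for illustration only.
# The tables below are assumptions inferred from this issue.
BASE_TRANSITIONS = {
  run:     { from: [:created], to: :running },
  skip:    { from: :any,       to: :skipped },
  succeed: { from: :any,       to: :success }
}.freeze

# Job status => deployment event. "created" has no entry before the fix.
BASE_EVENTS = { "running" => :run, "skipped" => :skip, "success" => :succeed }.freeze

# Returns [transitions, events]; with_fix adds the two suggested changes.
def rules(with_fix:)
  return [BASE_TRANSITIONS, BASE_EVENTS] unless with_fix

  [
    BASE_TRANSITIONS.merge(create: { from: [:skipped], to: :created }), # hypothetical event name
    BASE_EVENTS.merge("created" => :create)
  ]
end

# Replays job status updates against a deployment that starts in :created.
def replay(job_statuses, with_fix:)
  transitions, events = rules(with_fix: with_fix)
  state = :created
  webhooks = []
  errors = []

  job_statuses.each do |job_status|
    event = events[job_status]
    if event.nil?
      errors << "The status #{job_status} is invalid"
      next
    end

    rule = transitions.fetch(event)
    unless rule[:from] == :any || rule[:from].include?(state)
      errors << "Cannot transition status via :#{event} from :#{state}"
      next
    end

    state = rule[:to]
    # Assumption: only running and success emit a webhook in this model.
    webhooks << state if %i[running success].include?(state)
  end

  { state: state, webhooks: webhooks, errors: errors }
end

flow = %w[skipped created running success]
before = replay(flow, with_fix: false)
after  = replay(flow, with_fix: true)
puts before[:webhooks].inspect
puts after[:webhooks].inspect
```

Without the fix, the replay records both errors quoted in the notes and emits only the success webhook. With both changes applied, the same job status sequence emits running followed by success and records no errors.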
/cc @mlockhart @tmike