Retried jobs are not included in Pipeline Hook for Datadog integration
Summary
Jobs that are retried before the pipeline finishes are not included in the Pipeline Hook for Datadog integration. This was changed for performance reasons for generic Pipeline Hooks, but !67031 (merged) was added specifically for the Datadog integration to revert the behavior.
I'm not sure if this is a full regression of #331239 (closed), but it has been reported to me and I was also able to reproduce it locally.
It's very important for us to have the full list of jobs, regardless of whether they end up passing or not. Otherwise the visibility we offer is not reliable.
Steps to reproduce
I reproduced this by creating the following pipeline and retrying jobs while a sleep job was running.
build1:
  stage: build
  script:
    - echo "Do your build here"

build2:
  stage: build
  script:
    - sleep 600

test1:
  stage: test
  script:
    - if (( RANDOM % 2 )); then echo "OK"; else fail; fi

test2:
  stage: test
  script:
    - sleep 120

deploy1:
  stage: deploy
  script:
    - echo "Do your deploy here"
What is the current bug behavior?
When the pipeline finishes, the Pipeline Hook includes only the final run of each job.
What is the expected correct behavior?
The pipeline hook includes all retries of each job, no matter if they passed or failed.
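To make the difference between the current and expected behavior easier to check, here is a minimal sketch that counts how many runs of each job appear in a received hook payload. It assumes the payload carries a `builds` array whose entries have `name` and `status` fields, as in the documented pipeline event format. The helper names (`runs_per_job`, `missing_retries`) and the sample payload are hypothetical and hand-written for illustration, not captured from a real instance.

```python
from collections import defaultdict


def runs_per_job(payload):
    """Group the status of every build in a Pipeline Hook payload by job name."""
    runs = defaultdict(list)
    for build in payload.get("builds", []):
        runs[build["name"]].append(build["status"])
    return dict(runs)


def missing_retries(payload, expected_runs):
    """Return the job names that appear fewer times in the payload than expected."""
    seen = runs_per_job(payload)
    return sorted(
        name
        for name, count in expected_runs.items()
        if len(seen.get(name, [])) < count
    )


# Hand-written sample (an assumption, not real output): test1 ran twice,
# once failing and once on retry, but only the final run is listed.
sample_payload = {
    "object_kind": "pipeline",
    "builds": [
        {"id": 1, "name": "build1", "stage": "build", "status": "success"},
        {"id": 5, "name": "test1", "stage": "test", "status": "success"},
    ],
}

print(missing_retries(sample_payload, {"build1": 1, "test1": 2}))  # ['test1']
```

With the expected behavior, the payload would list both runs of test1 and the function would return an empty list.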
Results of GitLab environment info
I reproduced this locally running master. We have also seen it internally running version 14.9.5.
Possible fixes
I'm not sure. The current code has a lazy callback that is supposed to include retried jobs in the hook when it's for the Datadog integration. Perhaps this is being evaluated too early, so the laziness doesn't do what we expect from it.
cc @arturoherrero, could you please assign this to someone who can help me? This has quite a big impact for us.
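To illustrate the kind of failure I have in mind, here is a small Python sketch of a memoized lazy value being forced too early. It is not the GitLab code: the `Lazy` class, the `include_retried` flag, and `build_hook_jobs` are all hypothetical names, and this only shows how the suspected mechanism could drop retried jobs, not that it is the actual cause.

```python
class Lazy:
    """Memoizing thunk: runs the callback once, on first access."""

    def __init__(self, callback):
        self._callback = callback
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._callback()
            self._done = True
        return self._value


# Two runs of the same job: the failed first attempt and its retry.
jobs = [
    {"name": "test1", "status": "failed", "retried": True},
    {"name": "test1", "status": "success", "retried": False},
]


def build_hook_jobs(force_early):
    # Hypothetical flag that the Datadog integration would switch on.
    context = {"include_retried": False}
    lazy_jobs = Lazy(
        lambda: [j for j in jobs if context["include_retried"] or not j["retried"]]
    )
    if force_early:
        # Something touches the value before the integration opts in,
        # so the result without retried jobs gets memoized.
        lazy_jobs.force()
    context["include_retried"] = True
    return lazy_jobs.force()


print(len(build_hook_jobs(force_early=True)))   # 1: retried job is lost
print(len(build_hook_jobs(force_early=False)))  # 2: retried job is included
```

If something similar happens in the real code path, checking where the lazy value is first accessed relative to where the integration is identified might be a good starting point.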