CI: anomalies in job ordering/times
I've noticed a pipeline in which jobs were triggered in a surprising order, with surprising consequences. This seems very weird.
The pipeline is: https://gitlab.com/sylva-projects/sylva-core/-/pipelines/1300911328
- deploy-management from 13:53 to 14:33 (all good)
- deploy-workload from 14:33 to 14:53 (all good)
- update-management from 14:53 to 14:56 (all good)
- deployment-failure:cleanup at 16:06
  - this is of course much too early
  - but since this is a delayed cleanup (the MR has delay-capo-ci-cleanup-on-failure), it did not clean up the cluster
- update-workload from 16:28 to 16:30
  - more than 1.5h after update-management finished!
  - it fails because: `unable to create new content in namespace rke2-capo because it is being terminated` --> 🤔
- delete-workload-cluster from 16:30 to 16:32
  - it fails because: `helmreleases.helm.toolkit.fluxcd.io "sylva-units" not found`
  - (most likely the same cause as the previous failure: the namespace is gone)
Looking at the workload cluster namespace, we see `deletionTimestamp: "2024-05-22T16:02:37Z"`.
I can't see any job that was running at that time.
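To double-check this by hand, here is a minimal sketch that replays the timeline above. It assumes the job times shown in the pipeline are in the same timezone as the namespace `deletionTimestamp` (UTC) and that the pipeline ran on 2024-05-22; neither is confirmed here. The helper names (`at`, `jobs_running_at`, `gap_minutes`) are made up for illustration.

```python
from datetime import datetime

# Job windows copied from the timeline above. Assumption: these wall-clock
# times are in the same timezone as the namespace deletionTimestamp (UTC).
JOBS = [
    ("deploy-management", "13:53", "14:33"),
    ("deploy-workload", "14:33", "14:53"),
    ("update-management", "14:53", "14:56"),
    ("deployment-failure:cleanup", "16:06", None),  # end time not recorded
    ("update-workload", "16:28", "16:30"),
    ("delete-workload-cluster", "16:30", "16:32"),
]


def at(hhmm):
    """Parse HH:MM into a datetime on the (assumed) pipeline day."""
    return datetime.strptime("2024-05-22 " + hhmm, "%Y-%m-%d %H:%M")


def jobs_running_at(moment, jobs=JOBS):
    """Return the names of jobs whose window contains `moment`."""
    running = []
    for name, start, end in jobs:
        if at(start) > moment:
            continue  # job had not started yet
        # A job with no recorded end is treated as possibly still running.
        if end is None or moment <= at(end):
            running.append(name)
    return running


def gap_minutes(earlier_job, later_job, jobs=JOBS):
    """Minutes between the end of one job and the start of another."""
    windows = {name: (start, end) for name, start, end in jobs}
    delta = at(windows[later_job][0]) - at(windows[earlier_job][1])
    return delta.total_seconds() / 60


deletion = datetime.strptime("2024-05-22T16:02:37Z", "%Y-%m-%dT%H:%M:%SZ")
print(jobs_running_at(deletion))                             # []
print(gap_minutes("update-management", "update-workload"))   # 92.0
```

Under those assumptions, no listed job covers 16:02:37, and update-workload started 92 minutes after update-management ended. If the pipeline times are not UTC, the conclusion may change.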
All this needs to be understood:
- why was deployment-failure triggered before the other jobs were done?
- what triggered the deletion of the `rke2-capo` namespace?