Prevent tests passing if they are testing a different version from the intended one
In production#6793 (closed) we had a repeat of the failing asset job problem, but in this case the problem was found on gprd-cny. This issue is to investigate how the gstg-cny pipeline managed to succeed.
Timeline in table form:

| Pipeline for **14.10.202204080235-5f55c18e9bd.3a269117307** | Pipeline for 14.10.202204080321-2c6c6e93026.44de3c8d361 |
|---|---|
| 04:23 - gstg-cny failed for package **14.10.202204080235-5f55c18e9bd.3a269117307** - https://gitlab.slack.com/archives/C8PKBH3M5/p1649391799275609 | |
| | 05:23 - gstg-cny started to deploy the next package 14.10.202204080321-2c6c6e93026.44de3c8d361, failed due to the ongoing deployment https://gitlab.slack.com/archives/C8PKBH3M5/p1649391815413389 |
| 05:44 - gstg-cny is unlocked | |
| 06:20 - gstg-cny QA pipeline of **14.10.202204080235-5f55c18e9bd.3a269117307** is run a 3rd time - https://ops.gitlab.net/gitlab-org/quality/staging-canary/-/pipelines/1136910. This happens 3 minutes before the 0321 package completed deployment to gstg-cny. So, this QA pipeline ran against the 0321 package, instead of the 0235 package. | |
| | 06:23 - gstg-cny finished deployment of 14.10.202204080321-2c6c6e93026.44de3c8d361 (The gstg-cny-kubernetes job completed at this time) - https://ops.gitlab.net/gitlab-com/gl-infra/deployer/-/pipelines/1136771 |
| 06:31 - gstg-cny finished deployment of **14.10.202204080235-5f55c18e9bd.3a269117307** - https://gitlab.slack.com/archives/C8PKBH3M5/p1649399831626699 | |
| 06:35 - 3rd QA pipeline of **14.10.202204080235-5f55c18e9bd.3a269117307** succeeds. | |
| 06:37 - gprd-cny starts to deploy **14.10.202204080235-5f55c18e9bd.3a269117307** - https://gitlab.slack.com/archives/C8PKBH3M5/p1649399854692499 | |

Linear timeline:

- 04:23 - gstg-cny failed for package **14.10.202204080235-5f55c18e9bd.3a269117307** - https://gitlab.slack.com/archives/C8PKBH3M5/p1649391799275609
- 05:23 - gstg-cny started to deploy the next package 14.10.202204080321-2c6c6e93026.44de3c8d361, failed due to the ongoing deployment https://gitlab.slack.com/archives/C8PKBH3M5/p1649391815413389
- 05:44 - gstg-cny is unlocked
- 06:20 - gstg-cny QA pipeline of **14.10.202204080235-5f55c18e9bd.3a269117307** is run a 3rd time - https://ops.gitlab.net/gitlab-org/quality/staging-canary/-/pipelines/1136910. This happens 3 minutes before the `0321` package completed deployment to gstg-cny. So, this QA pipeline ran against the `0321` package, instead of the `0235` package.
- 06:23 - gstg-cny finished deployment of 14.10.202204080321-2c6c6e93026.44de3c8d361 (The `gstg-cny-kubernetes` job completed at this time) - https://ops.gitlab.net/gitlab-com/gl-infra/deployer/-/pipelines/1136771
- 06:31 - gstg-cny finished deployment of **14.10.202204080235-5f55c18e9bd.3a269117307** - https://gitlab.slack.com/archives/C8PKBH3M5/p1649399831626699
- 06:35 - 3rd QA pipeline of **14.10.202204080235-5f55c18e9bd.3a269117307** succeeds.
- 06:37 - gprd-cny starts to deploy **14.10.202204080235-5f55c18e9bd.3a269117307** - https://gitlab.slack.com/archives/C8PKBH3M5/p1649399854692499
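
One way to guard against the mismatch in the timeline above (a QA pipeline for one package exercising an environment that is running another) might be a pre-flight check that compares the package the pipeline was started for with the revision the environment reports. The following is only a minimal sketch, not how release-tools or the QA pipelines work today. It assumes the package string has the shape `<major>.<minor>.<timestamp>-<sha>.<sha>`, that the first SHA identifies the application revision, and that the deployed revision has already been fetched by some other means. `parse_package`, `ensure_expected_version`, and `VersionMismatch` are hypothetical names.

```python
import re
from typing import NamedTuple


class AutoDeployPackage(NamedTuple):
    release: str        # e.g. "14.10"
    timestamp: str      # e.g. "202204080235"
    app_sha: str        # ASSUMPTION: first SHA is the application revision
    packaging_sha: str  # ASSUMPTION: second SHA is the packaging revision


# ASSUMPTION: package names follow the shape seen in this issue.
_PACKAGE_RE = re.compile(
    r"^(?P<release>\d+\.\d+)\.(?P<timestamp>\d{12})"
    r"-(?P<app_sha>[0-9a-f]+)\.(?P<packaging_sha>[0-9a-f]+)$"
)


class VersionMismatch(RuntimeError):
    """Raised when the environment is not running the expected package."""


def parse_package(package: str) -> AutoDeployPackage:
    """Split a package string into its components (hypothetical helper)."""
    match = _PACKAGE_RE.match(package.strip())
    if match is None:
        raise ValueError(f"unrecognised package format: {package!r}")
    return AutoDeployPackage(**match.groupdict())


def ensure_expected_version(expected_package: str, deployed_revision: str) -> None:
    """Fail fast if the deployed revision does not belong to the expected package.

    `deployed_revision` is whatever the environment reports; how it is
    obtained is outside the scope of this sketch.
    """
    expected = parse_package(expected_package)
    deployed = deployed_revision.strip().lower()
    if not deployed:
        raise VersionMismatch("environment reported an empty revision")
    # Compare on the shared prefix, since either side may be abbreviated.
    if not (expected.app_sha.startswith(deployed)
            or deployed.startswith(expected.app_sha)):
        raise VersionMismatch(
            f"QA was started for {expected_package} (revision {expected.app_sha}) "
            f"but the environment reports revision {deployed}"
        )
```

Run as the first job of the QA pipeline, a check like this would have stopped the 06:20 run with an error instead of letting results for the `0321` package be recorded against the `0235` package.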
Somehow it seems like the unlock command allows jobs to be skipped or to fail while the deployment is still marked as passing.
@a_mcdonald - From https://gitlab.slack.com/archives/C03A62X25K9/p1649396423115089 it sounds like the QA tests were failing. Was anything changed/re-run that could have led to these tests passing on this pipeline?
Edited by Amy Phillips