PDM pipeline sometimes thinks there are no migrations to be executed
Summary
In every gprd deployment, there is a job called gprd-fetch-post-migrations that fetches the pending post-deploy migrations and triggers a release/tools pipeline. The release/tools pipeline saves the list of migrations as an artifact. This artifact is then read by the post-deploy migration execution pipeline to determine if there are migrations to execute. Currently, the artifact is automatically deleted after 1 day. So if the post-deploy migration execution pipeline is run more than a day after the last gprd deployment, the artifact will already have been deleted and the pipeline will think there are no migrations to be executed.
A recent example: the post-deploy migration execution pipeline linked below didn't find any migrations because the artifact had already expired. The RM (@ahyield) retried the "build artifact" job to re-create the artifact.
- Post deploy pipeline: https://ops.gitlab.net/gitlab-org/release/tools/-/jobs/10429782.
- 1st build artifact job: https://ops.gitlab.net/gitlab-org/release/tools/-/jobs/10412271.
- Retried build artifact job: https://ops.gitlab.net/gitlab-org/release/tools/-/jobs/10429925.
Proposal
Run a db:migrate:status command in the post-deploy migration pipeline to determine if there are pending migrations. If there are pending migrations, continue the pipeline, else stop the pipeline.
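As a rough illustration of the check, the sketch below parses text in the layout that the standard Rails db:migrate:status task prints (a status column with "up" or "down", a migration ID, and a name) and collects the "down" rows. The function name pending_migrations and the sample output are hypothetical. The sketch makes no attempt to separate regular migrations from post-deploy ones, and it says nothing about how the real pipeline would obtain the output.

```python
import re

# Matches rows such as "  down    20230102000000  Add index to users".
# Assumption: the input follows the standard Rails db:migrate:status layout.
ROW = re.compile(r"^\s*(up|down)\s+(\d+)\s+(.*\S)\s*$")


def pending_migrations(status_output: str) -> list[tuple[str, str]]:
    """Return (migration_id, name) pairs for every row whose status is 'down'."""
    pending = []
    for line in status_output.splitlines():
        match = ROW.match(line)
        if match and match.group(1) == "down":
            pending.append((match.group(2), match.group(3)))
    return pending


# Hypothetical sample output, for illustration only.
SAMPLE = """\
database: example

 Status   Migration ID    Migration Name
--------------------------------------------------
   up     20230101000000  Create users
  down    20230102000000  Add index to users
"""

# The pipeline would continue only when this list is non-empty.
PENDING = pending_migrations(SAMPLE)
```

An empty list would mean "stop the pipeline"; a non-empty list would mean "continue".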
- Currently, we run the pending_migrations.yml Ansible playbook in the gprd-fetch-post-migrations job during a gprd deployment.
- Instead of running the playbook during the gprd deployment and then storing the result in an artifact to be read by the post-deploy migration pipeline, we can run the playbook during the post-deploy migration pipeline itself.
- The list of pending migrations will need to be passed back to release-tools (to post a comment on the monthly release issue). We can use an artifact to pass the list of migrations to release-tools. But in this case, the artifact will be read immediately by the upstream pipeline. We won't have to worry about the artifact expiring or having stale information. The artifact is only used because that is the only way to pass information upstream to the triggering pipeline.
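The artifact hand-off could look roughly like the sketch below. The file name, the JSON layout, and both function names are assumptions made for illustration, not the format release-tools actually uses. The one deliberate choice is that a missing file raises an error instead of being read as "no migrations", which is the failure mode described in the Summary.

```python
import json
import tempfile
from pathlib import Path


def write_artifact(path: Path, migrations: list[str]) -> None:
    """Downstream side: save the pending migration list as a JSON file."""
    path.write_text(json.dumps({"pending_migrations": migrations}, indent=2))


def read_artifact(path: Path) -> list[str]:
    """Upstream side: load the list, failing loudly if the file is missing."""
    if not path.exists():
        # A missing artifact is treated as an error, not as an empty list.
        raise FileNotFoundError(f"migration artifact not found: {path}")
    return json.loads(path.read_text())["pending_migrations"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        artifact = Path(tmp) / "pending_migrations.json"  # hypothetical name
        write_artifact(artifact, ["20230102000000"])
        print(read_artifact(artifact))
```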