Skip to content

fix: disable statement timeout for layers simplified usage migrations

João Pereira requested to merge statement-timeouts into master

The post-deployment migrations introduced in !1189 (merged) failed to be applied in production due to a statement timeout set on the postgres server side (15s): gitlab-com/gl-infra/production#8212 (comment 1237508756)

We need to disable statement timeouts around the execution of the statements for each of these post-deployment migrations. We still have to find a way to circumvent PgBouncer AND use a single connection, but we definitely need to disable timeouts for the sessions where these will be applied.

Usually, we should not change existing migrations but this change does not impact any of the environments where these were already applied (pre/gtsg), it will only make it work for gprd as well. Note that I don't disable statement timeouts for the last batch (6), as that only creates the index on the parent layers table, and that should only take a few milliseconds (unless there is something wrong, and we want it to fail so that we notice).

This is tricky to test, but what I did was:

  1. Set default statement timeout to 1000 (ms) in postgresql.conf of my local server;
  2. Replace the Up statements of 20221129145757_post_add_layers_simplified_usage_index_batch_2 with:
    "SELECT pg_sleep(.5)",
    "SELECT pg_sleep(1.1)",
    "SELECT pg_sleep(2)",
  3. Executed the migration. Looking at the server logs, I can see the 500ms sleep succeeds and the 1.1s one was aborted due to the statement timeout.
  4. Surround Up statements with:
    "SET statement_timeout TO 0",
    "SELECT pg_sleep(.5)",
    "SELECT pg_sleep(1.1)",
    "SELECT pg_sleep(2)",
    "RESET statement_timeout",
  5. Executed the migration once again, this time all statements were executed successfully.
Edited by João Pereira

Merge request reports