Slow query causing the dataloss subcommand to timeout.
I was playing with Gitaly cluster deployed via GET. It seems to work well as I also managed to run GPT against the instance.
However, I noticed that running the following command:
/opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dataloss
resulted in an error:
error checking: rpc error: code = Unknown desc = query: pq: canceling statement due to statement timeout
This is the statement where it timed out:
2021-08-19_10:06:13.12404 ERROR: canceling statement due to statement timeout
2021-08-19_10:06:13.12437 STATEMENT:
2021-08-19_10:06:13.12445 SELECT
2021-08-19_10:06:13.12452 json_build_object (
2021-08-19_10:06:13.12460 'RelativePath', relative_path,
2021-08-19_10:06:13.12467 'Primary', "primary",
2021-08-19_10:06:13.12474 'Storages', json_agg(
2021-08-19_10:06:13.12482 json_build_object(
2021-08-19_10:06:13.12498 'Name', storage,
2021-08-19_10:06:13.12507 'BehindBy', behind_by,
2021-08-19_10:06:13.12515 'Assigned', assigned,
2021-08-19_10:06:13.12522 'Healthy', healthy,
2021-08-19_10:06:13.12529 'ValidPrimary', valid_primary
2021-08-19_10:06:13.12536 )
2021-08-19_10:06:13.12541 )
2021-08-19_10:06:13.12547 )
2021-08-19_10:06:13.12552 FROM (
2021-08-19_10:06:13.12558 SELECT
2021-08-19_10:06:13.12564 relative_path,
2021-08-19_10:06:13.12570 repositories.primary,
2021-08-19_10:06:13.12576 storage,
2021-08-19_10:06:13.12581 repository_generations.generation - COALESCE(storage_repositories.generation, -1) AS behind_by,
2021-08-19_10:06:13.12591 repository_assignments.storage IS NOT NULL AS assigned,
2021-08-19_10:06:13.12597 healthy_storages.storage IS NOT NULL AS healthy,
2021-08-19_10:06:13.12603 valid_primaries.storage IS NOT NULL AS valid_primary
2021-08-19_10:06:13.12609 FROM storage_repositories
2021-08-19_10:06:13.12615 FULL JOIN (
2021-08-19_10:06:13.12620 SELECT virtual_storage, relative_path, storage
2021-08-19_10:06:13.12628 FROM repositories
2021-08-19_10:06:13.12634 CROSS JOIN (SELECT unnest($2::text[]) AS storage) AS configured_storages
2021-08-19_10:06:13.12641 WHERE (
2021-08-19_10:06:13.12646 SELECT COUNT(*) = 0 OR COUNT(*) FILTER (WHERE storage = configured_storages.storage) = 1
2021-08-19_10:06:13.12653 FROM repository_assignments
2021-08-19_10:06:13.12658 WHERE virtual_storage = repositories.virtual_storage
2021-08-19_10:06:13.12666 AND relative_path = repositories.relative_path
2021-08-19_10:06:13.12672 AND storage = ANY($2::text[])
2021-08-19_10:06:13.12673 )
2021-08-19_10:06:13.12674 ) AS repository_assignments USING (virtual_storage, relative_path, storage)
2021-08-19_10:06:13.12674 JOIN repositories USING (virtual_storage, relative_path)
2021-08-19_10:06:13.12674 JOIN repository_generations USING (virtual_storage, relative_path)
2021-08-19_10:06:13.12675 LEFT JOIN healthy_storages USING (virtual_storage, storage)
2021-08-19_10:06:13.12675 LEFT JOIN valid_primaries USING (virtual_storage, relative_path, storage)
2021-08-19_10:06:13.12675 WHERE virtual_storage = $1
2021-08-19_10:06:13.12676 ORDER BY relative_path, "primary", storage
2021-08-19_10:06:13.12676 ) AS outdated_repositories
2021-08-19_10:06:13.12676 GROUP BY relative_path, "primary"
2021-08-19_10:06:13.12677 HAVING bool_or(NOT valid_primary) FILTER(WHERE assigned)
2021-08-19_10:06:13.12677 ORDER BY relative_path, "primary"
Here is the EXPLAIN ANALYZE
version of the query: https://explain.depesz.com/s/JOUP#html
Created the issue as requested by @samihiltunen