Update existing namespace statistics with wiki_size
What does this MR do?
In previous MRs (!54969 (merged) and !54967 (merged)) we introduced the logic needed to update NamespaceStatistics
when a group wiki repository changes.
However, we still need to backfill the existing NamespaceStatistics records
with the size of existing group wiki repositories. This background migration does that. It applies to EE only.
Queries
Migration query:
SELECT group_wiki_repositories.group_id
FROM group_wiki_repositories
The times for this query are:
Time: 1.704 ms
- planning: 0.060 ms
- execution: 1.644 ms
- I/O read: N/A
- I/O write: N/A
Shared buffers:
- hits: 545 (~4.30 MiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
And the plan is https://explain.depesz.com/s/as5m
Then, in the post-migration, each batch runs the select query (preloading several relations to avoid N+1 queries), followed by either an insert or an update for each group:
SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" IN (183, 184)
SELECT "routes".* FROM "routes" WHERE "routes"."source_type" = 'Namespace' AND "routes"."source_id" IN (183, 184)
SELECT "namespace_statistics".* FROM "namespace_statistics" WHERE "namespace_statistics"."namespace_id" IN (183, 184)
SELECT "group_wiki_repositories".* FROM "group_wiki_repositories" WHERE "group_wiki_repositories"."group_id" IN (183, 184)
SELECT "shards".* FROM "shards" WHERE "shards"."id" = 1
INSERT INTO "namespace_statistics" ("namespace_id", "storage_size", "wiki_size") VALUES (234, 41943, 41943)
UPDATE "namespace_statistics" SET "wiki_size" = 41943, "storage_size" = 41943 WHERE "namespace_statistics"."id" = 441
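The insert-or-update choice in the last two queries can be sketched in plain Ruby as follows. This is a minimal illustration, not the migration's actual code: `StatRow`, `backfill_wiki_size`, and the in-memory hash standing in for the `namespace_statistics` table are hypothetical names, and setting `storage_size` equal to `wiki_size` is an assumption based on the sample values in the queries above.

```ruby
# Hypothetical stand-in for one row of namespace_statistics.
StatRow = Struct.new(:namespace_id, :storage_size, :wiki_size, keyword_init: true)

# rows: Hash mapping namespace_id => StatRow (stands in for the table).
# Returns :inserted when no row existed for the group, :updated otherwise.
def backfill_wiki_size(rows, namespace_id, wiki_size)
  row = rows[namespace_id]

  if row.nil?
    # Corresponds to the INSERT query: the group has no statistics row yet.
    rows[namespace_id] = StatRow.new(namespace_id: namespace_id,
                                     storage_size: wiki_size,
                                     wiki_size: wiki_size)
    :inserted
  else
    # Corresponds to the UPDATE query: overwrite the sizes on the existing row.
    # Assumption: wiki_size is the only component of storage_size.
    row.wiki_size = wiki_size
    row.storage_size = wiki_size
    :updated
  end
end
```

Because the function overwrites rather than increments, running it twice for the same group leaves the same values, which is a useful property if a job is retried.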
Locally, processing 100 groups takes around 1.5s. Each batch contains 500 groups, so each job will take about 7.5s. There are around 6250 records in production, so there will be 13 iterations (6250 / 500 = 12.5, rounded up). With jobs spaced 120s apart, the total time of this post-migration will be about 13 * 120s = 26 minutes.
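The estimate can be reproduced with a short calculation. This is a sketch using only the figures quoted in this description; `estimate_backfill` and the constant names are hypothetical, and the 120s spacing between jobs is read off the `13*120s` term rather than taken from the migration code.

```ruby
# Figures quoted in the description.
TOTAL_RECORDS = 6250
BATCH_SIZE = 500
SECONDS_PER_100_GROUPS = 1.5
# Assumption: jobs are spaced 120s apart, as implied by the "13*120s" term.
DELAY_INTERVAL_SECONDS = 120

def estimate_backfill(total_records:, batch_size:, seconds_per_100:, delay_interval:)
  # Number of jobs: a partial last batch still needs its own job.
  iterations = (total_records.to_f / batch_size).ceil
  # Runtime of one full batch, scaled from the per-100-groups measurement.
  job_seconds = batch_size / 100.0 * seconds_per_100
  # Jobs are spaced by the delay interval, so the interval dominates the total.
  total_minutes = iterations * delay_interval / 60.0

  { iterations: iterations, job_seconds: job_seconds, total_minutes: total_minutes }
end

puts estimate_backfill(total_records: TOTAL_RECORDS,
                       batch_size: BATCH_SIZE,
                       seconds_per_100: SECONDS_PER_100_GROUPS,
                       delay_interval: DELAY_INTERVAL_SECONDS)
```

Since each job runs for roughly 7.5s and the spacing is 120s, the total duration is governed by the interval rather than by the per-job runtime.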
Does this MR meet the acceptance criteria?
Conformity
- 📋 Does this MR need a changelog?
  - I have included a changelog entry.
  - I have not included a changelog entry because _____.
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Related to #230465 (closed)