Add background migration to fix zero packages_sizes project statistics
🚀 Context
In the project statistics object, we have a metric called packages_size
that is the aggregation (sum) of all the package files size
.
In #363010 (closed), we discovered that some packages_size
seemed to be "out of sync". For example, a package_size
of 10MB
but the sum of all package files would be 5KB
. The presumed root cause of this desync is concurrent updates: the packages_size
metric could be updated in a burst. During that burst race conditions could appear and make some updates getting lost.
To solve this, we switched the packages_size
metric updates to a "buffered" counter. In very short words, instead of updating the database column directly we use a redis entry where we incrby
. Then, once in a while (each 10 minutes max), a job is responsible to move (or "flush") the updates from redis to the database. This way, redis acts as a "protection layer" for the database.
That's all nice but what about the existing out of sync data? Well, we don't have a lot of choices here, we need background migrations to detect incoherent packages_size
and fix them. That's Background migration to fix incorrect statistic... (#378433 - closed).
Given the amount/scale in play here, we are going to split the background migrations in two:
- Project statistics
packages_size
different than0
. That's !107799 (closed). - Project statistics
packages_size
set to0
and some package files exist.👈 This is this MR
🔬 What does this MR do and why?
- Introduce a background migration to deal with incoherent and zero
packages_size
.- Only for project that have packages (obviously).
- Introduce a temporary indexes to support that migration.
- Add the related specs.
📺 Screenshots or screen recordings
None
⚗ How to set up and validate locally
Let's create some packages for the first project. In a rails console:
def fixture_file_upload(*args, **kwargs)
Rack::Test::UploadedFile.new(*args, **kwargs)
end
10.times { FactoryBot.create(:npm_package, project: Project.first) }
Wait at least 10
minutes, so that updates to packages_size
are flushed.
Now, check the original packages_size
and update it to make it incoherent.
Project.first.statistics.packages_size # 4096000 note this.
Project.first.statistics.update!(packages_size: 0)
Let's run the migration:
rails db:migrate
or
$ rake db:migrate:down VERSION=20230105132223
$ rake db:migrate:up VERSION=20230105132223
Wait for at least 10
minutes and check the packages_size
again. It should be fixed.
Project.first.statistics.packages_size # 4096000 as it was originally
The background migration is working as expected
🏎 MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
💾 Database review
⤴ Migration up
main: == 20230110125926 AddTemporaryZeroPackagesSizeIndexToProjectStatistics: migrating
main: -- transaction_open?()
main: -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main: -> 0.0070s
main: -- index_exists?(:project_statistics, [:id, :project_id], {:where=>"packages_size = 0", :name=>"tmp_idx_project_statistics_on_zero_packages_size", :algorithm=>:concurrently})
main: -> 0.0075s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0007s
main: -- add_index(:project_statistics, [:id, :project_id], {:where=>"packages_size = 0", :name=>"tmp_idx_project_statistics_on_zero_packages_size", :algorithm=>:concurrently})
main: -> 0.0048s
main: -- execute("RESET statement_timeout")
main: -> 0.0007s
main: == 20230110125926 AddTemporaryZeroPackagesSizeIndexToProjectStatistics: migrated (0.0397s)
main: == 20230111125926 QueueFixIncorrectZeroPackagesSizeOnProjectStatistics: migrating
main: == 20230111125926 QueueFixIncorrectZeroPackagesSizeOnProjectStatistics: migrated (0.0593s)
⤵ Migration down
main: == 20230111125926 QueueFixIncorrectZeroPackagesSizeOnProjectStatistics: reverting
main: == 20230111125926 QueueFixIncorrectZeroPackagesSizeOnProjectStatistics: reverted (0.0350s)
main: == 20230110125926 AddTemporaryZeroPackagesSizeIndexToProjectStatistics: reverting
main: -- transaction_open?()
main: -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main: -> 0.1257s
main: -- indexes(:project_statistics)
main: -> 0.0095s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0006s
main: -- remove_index(:project_statistics, {:algorithm=>:concurrently, :name=>"tmp_idx_project_statistics_on_zero_packages_size"})
main: -> 0.0041s
main: -- execute("RESET statement_timeout")
main: -> 0.0007s
main: == 20230110125926 AddTemporaryZeroPackagesSizeIndexToProjectStatistics: reverted (0.2118s)
📊 Queries analysis
🚀 Background migration execution time analysis for gitlab.com
We have 28252538 packages_size = 0
statistics.
We are using a batch size of 200
. We will need 141263
loops.
From the query analysis, we have around 150ms
of execution time in queries.
That gives us:
141263
* 150ms
+ 141262
* 2000ms
= 303713450ms
= 303713.45s
= 5061.89m
= 84.36h
= ~ 3.5 days