
Manage packages_size statistic with a counter attribute

Context

On the usage quota page, one of the metrics taken into account is the size of the Package Registry, that is, how much storage the packages are using.

This statistic is linked to the size attribute of ::Packages::PackageFile.

On master, the statistic is updated in the following way:

  1. Rails callbacks are registered so that we take action on after_save or after_destroy.
  2. Those callbacks call ProjectStatistics.increment_statistic with the proper amount (which can be negative).
  3. After chaining several functions, we end up using .update_counters from Rails.
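
The flow in the steps above can be sketched in plain Ruby. This is only an illustration: SketchProjectStatistics and SketchPackageFile are hypothetical stand-ins, there is no Rails or database involved, and the real callbacks and .update_counters call are not reproduced here.

```ruby
# Plain-Ruby stand-ins; none of these classes are GitLab's real implementation.
class SketchProjectStatistics
  attr_reader :packages_size

  def initialize
    @packages_size = 0
  end

  # Mirrors the idea of increment_statistic: apply a signed amount.
  def increment_statistic(attribute, amount)
    raise ArgumentError, "unknown statistic: #{attribute}" unless attribute == :packages_size

    @packages_size += amount
  end
end

class SketchPackageFile
  def initialize(statistics, size)
    @statistics = statistics
    @size = size
    @reported_size = 0 # size already counted in the statistics
  end

  # Stand-in for the after_save callback: report only the size difference.
  def save
    delta = @size - @reported_size
    @reported_size = @size
    @statistics.increment_statistic(:packages_size, delta) unless delta.zero?
  end

  # Stand-in for the after_destroy callback: report a negative amount.
  def destroy
    @statistics.increment_statistic(:packages_size, -@reported_size)
    @reported_size = 0
  end
end

stats = SketchProjectStatistics.new
files = Array.new(5) { SketchPackageFile.new(stats, 8) }

files.each(&:save)
size_after_upload = stats.packages_size   # five files of 8 bytes each

files.first.destroy
size_after_destroy = stats.packages_size  # one file removed
```

Each save or destroy produces one signed increment, which is why many package file changes translate into many separate statistics updates.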

Step 3 is done outside of a database transaction. The reason is the following:

@fabiopitino said:

The reason why we moved the increment in a separate transaction is because project_statistics is a highly contended table and before it was causing a lot of statement timeout errors on many concurrent updates. By moving it to a different transaction (after_commit) we separated the main transaction (add/remove model) from the side-effect (update statistics) and made the main transaction more resilient. See !20852 (merged) for context.

The problem is that, by running outside a transaction, we introduced the risk of a race condition.

Add lease to update project statistics row and ... (!97912 - merged) improved the monitoring around those statistics updates. Among other things, we now detect concurrent updates. Guess who is the main culprit here? Yes, packages_size 😢

Because of those concurrent updates, we are noticing a loss of accuracy in the usage quota page, where the packages_size metric is no longer the sum of all package file sizes. This is issue #363010 (closed).

🌬 The solution

Basically, the solution used in this MR is to avoid, or at least reduce, those concurrent updates. We already have a tool in place for this: CounterAttribute.

In short, a counter attribute "stacks" the counter updates in Redis and enqueues a job that will run in 10 minutes (10.minutes). That job "simply" flushes the accumulated counter updates from Redis to the database.

This works because, in Redis, we have the means to guarantee that we never have concurrent updates.
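
The buffering idea can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not the CounterAttribute implementation: SketchBufferedCounter is a hypothetical class, a Hash plays the role of Redis, an Array records the writes that would reach the database, and the delayed job is reduced to a flag plus a flush method called by hand.

```ruby
# Hypothetical stand-in for the buffering idea; not GitLab's CounterAttribute.
class SketchBufferedCounter
  attr_reader :database_value, :database_writes

  def initialize(initial_value)
    @database_value = initial_value
    @database_writes = []    # one entry per write that reaches the database
    @buffer = Hash.new(0)    # stands in for Redis keys
    @flush_scheduled = false
  end

  # Stack the amount in the buffer instead of touching the database.
  def increment(key, amount)
    @buffer[key] += amount
    @flush_scheduled = true  # the real mechanism would enqueue a delayed job
  end

  # Pending amount for a key, or nil when nothing is buffered.
  def pending(key)
    @buffer.fetch(key, nil)
  end

  def flush_scheduled?
    @flush_scheduled
  end

  # What the delayed job would do: apply the accumulated amount in one write.
  def flush(key)
    amount = @buffer.delete(key)
    @flush_scheduled = false
    return if amount.nil? || amount.zero?

    @database_value += amount
    @database_writes << amount
  end
end

counter = SketchBufferedCounter.new(1_000)
[300, -50, 120].each { |amount| counter.increment(:packages_size, amount) }

pending_before_flush = counter.pending(:packages_size)  # sum of the three amounts
value_before_flush = counter.database_value             # database not touched yet

counter.flush(:packages_size)
```

After the flush, the three increments have reached the database as a single write and the buffered key is gone.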

This MR is thus as simple as moving packages_size updates to a CounterAttribute.

🤔 What does this MR do and why?

  • Declare packages_size in ProjectStatistics as a counter_attribute.
  • Add feature flag support when updating packages_size, so we can still decide whether the update is sync (old approach) or async/delayed (new approach).
  • Update/Create related specs.
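
The feature flag switch in the list above can be pictured with a small plain-Ruby sketch. Everything here is hypothetical: update_packages_size is an illustrative helper, flag_enabled is a plain boolean standing in for the feature flag check, and the two lambdas stand in for the immediate and the delayed update paths.

```ruby
# Hypothetical sketch of the sync/async decision; not the code from this MR.
def update_packages_size(amount, flag_enabled:, sync_update:, delayed_update:)
  if flag_enabled
    delayed_update.call(amount) # new approach: buffer now, flush later
  else
    sync_update.call(amount)    # old approach: update the row right away
  end
end

sync_calls = []
delayed_calls = []
sync_update = ->(amount) { sync_calls << amount }
delayed_update = ->(amount) { delayed_calls << amount }

# Flag disabled: the update takes the synchronous path.
update_packages_size(-8, flag_enabled: false,
                         sync_update: sync_update, delayed_update: delayed_update)

# Flag enabled: the update takes the delayed path.
update_packages_size(-8, flag_enabled: true,
                         sync_update: sync_update, delayed_update: delayed_update)
```

Keeping both paths behind one switch means the old behavior stays available while the new one is being rolled out.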

📺 Screenshots or screen recordings

None

How to set up and validate locally

  1. Have GDK ready with one project and a Personal Access Token.
  2. To keep things simple, we're going to use the generic package registry. From a terminal, let's create 5 packages:
    $ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic1/1.1.2/file.txt"
    $ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic2/1.1.2/file.txt"
    $ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic3/1.1.2/file.txt"
    $ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic4/1.1.2/file.txt"
    $ curl --upload-file <dummy text file path> "http://<username>:<PAT>@gdk.test:8000/api/v4/projects/310/packages/generic/generic5/1.1.2/file.txt"
  3. Now check the project usage quota page http://gdk.test:8000/<project full path>/-/usage_quotas:
    • Screenshot_2022-11-04_at_10.16.03
    • We have some storage being used.

Everything is set up properly.

Let's have a run without the feature flag enabled.

  1. Delete all packages in a Rails console:
    Project.last.packages.destroy_all
  2. In the Rails console, you should see these SQL queries:
    ProjectStatistics Update All (1.1ms)  UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
    ProjectStatistics Update All (0.8ms)  UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
    ProjectStatistics Update All (0.5ms)  UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
    ProjectStatistics Update All (0.5ms)  UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
    ProjectStatistics Update All (0.5ms)  UPDATE "project_statistics" SET "packages_size" = COALESCE("packages_size", 0) - 8, "storage_size" = COALESCE("storage_size", 0) - 8 WHERE "project_statistics"."id" = 307 /*application:console,db_config_name:main,console_hostname:worky.local,console_username:david,line:/app/models/concerns/counter_attribute.rb:135:in `block in update_counters_with_lease'*/
  3. Check the usage quota page again: it's down to 0 bytes.

OK, those are the "synchronous" packages_size updates.

Let's re-upload 5 packages (step 2 from our setup above). Let's enable the feature flag now:

Feature.enable(:packages_size_counter_attribute)
  1. (Make sure that you have background jobs running!)
  2. Delete all packages in a Rails console:
    Project.last.packages.destroy_all
  3. This time around, no SQL updates on project_statistics.
  4. While we wait for the flush job to kick in (10 minutes), we can check that the updates are in Redis:
    ps = Project.last.statistics
    key = ps.counter_key(:packages_size)
    Gitlab::Redis::SharedState.with { |r| r.get(key) }
    => "-40"
    • This means that our -40 update to packages_size is waiting for the job. Note that while in this waiting state, any update to packages_size will affect this Redis key (e.g. if we upload a file of 100 bytes, that key will be updated to 60, which is -40 + 100).
  5. Also while waiting, check the usage quota page. It still shows 40 bytes.
  6. (After waiting 10 minutes) The job runs and packages_size is updated accordingly. The Redis key is gone (nil) and the usage quota page shows the new value.
    • Even if this is a very small example, we just combined 5 UPDATE statements into a single one, which means fewer chances of concurrent updates.
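
The numbers from this walkthrough can be replayed with plain integers. This is only worked arithmetic under the walkthrough's assumptions (5 package files of 8 bytes each, 40 bytes recorded in the database before the deletions); the variable names are illustrative and no Redis or database is involved.

```ruby
database_value = 40                    # packages_size before the deletions
deltas = Array.new(5, -8)              # one -8 update per destroyed package file

pending = deltas.sum                   # value stacked under the Redis key: -40
shown_while_waiting = database_value   # the page still reads the database: 40

# If a 100-byte file were uploaded before the flush, the key would move to 60.
pending_with_upload = pending + 100

# The flush applies the stacked amount in a single update.
value_after_flush = database_value + pending   # 0
updates_sync = deltas.size                     # 5 UPDATE statements (flag disabled)
updates_async = 1                              # 1 UPDATE statement (flag enabled)
```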

Async packages_size updates are working properly! 🎉

🚦 MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by David Fernandez
