Efficient counters (#24469) · Issues · GitLab.org / GitLab

Efficient counters

We have various places where we want to count things.

* For example, we keep track of the number of repositories and wikis in `site_statistics`. However, the implementation there is not efficient because all requests that change the count are serialized on this one record, sometimes resulting in timeout errors.
* The usage ping requires a lot of counting.
* We're adding more counting every now and then, e.g. in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/22007.

In short, concurrent updates to counters in statistics tables (e.g. `project_statistics`) often fail with query timeouts due to resource contention. We need to make updates more performant and non-blocking while still allowing the values to be read correctly.

The proposal here is to provide an efficient implementation of exact counters. By exact, we mean consistent with the PostgreSQL database (MVCC compatible).

## Proposal (implemented in https://gitlab.com/gitlab-org/gitlab/-/merge_requests/35878)

Let's use the `ProjectStatistics` model as an example. We want to efficiently update the `build_artifacts_size` counter without incurring query timeouts.

In this MR we introduce a new module, `CounterAttribute`, that brings counter attribute functionality. It provides a method, `increment_counter`, that increments the counter in Redis and schedules a worker to flush the increments to the database after some time.

This way:

* writes to the primary columns are only performed by the background worker
* reads can be performed against the primary columns (with some delay in accuracy) or including pending increments

## Blocker

This issue is currently blocking:

* **All groups from adding useful usage ping statistics**
* Issues listed under **Blocks** in the [Linked issues](https://gitlab.com/gitlab-org/gitlab/-/issues/24469#related-issues) section of this issue
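
To illustrate the buffering scheme from the proposal (increments accumulate in a fast store, and a background job later flushes them to the primary column), here is a minimal, self-contained Ruby sketch. It is not the code from the merge request: the class `BufferedCounter` and all of its method names are hypothetical, an in-memory integer guarded by a `Mutex` stands in for the Redis key, and a plain instance variable stands in for the database column.

```ruby
# Hypothetical sketch of a buffered counter. Not GitLab's implementation.
class BufferedCounter
  attr_reader :flushes

  def initialize(initial = 0)
    @stored  = initial    # stands in for the database column
    @pending = 0          # stands in for the Redis key with unflushed increments
    @flushes = 0          # number of writes made to the "database"
    @lock    = Mutex.new  # stands in for the atomicity Redis would provide
  end

  # Cheap, non-blocking write path: only the pending buffer is touched.
  def increment(amount)
    @lock.synchronize { @pending += amount }
    nil
  end

  # Read of the primary column only (may lag behind recent increments).
  def stored_value
    @lock.synchronize { @stored }
  end

  # Read that includes increments not yet flushed.
  def value_with_pending
    @lock.synchronize { @stored + @pending }
  end

  # What the background worker would do: apply all pending increments
  # in a single write, then reset the buffer. Skips the write if the
  # buffer is empty.
  def flush
    @lock.synchronize do
      unless @pending.zero?
        @stored += @pending
        @pending = 0
        @flushes += 1
      end
    end
    nil
  end
end

counter = BufferedCounter.new(100)
counter.increment(25)
counter.increment(-5)
counter.stored_value        # => 100 (primary column not yet updated)
counter.value_with_pending  # => 120
counter.flush
counter.stored_value        # => 120
```

The point of the design is that many concurrent `increment` calls collapse into one write per flush, so the statistics row is no longer a point of contention. In a real deployment the buffer would have to survive process restarts and be shared across processes, which is presumably why the proposal uses Redis.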
