  • #62214

Closed
Created May 23, 2019 by Alessio Caiazza (@nolith), Maintainer. 8 of 8 tasks completed.

Store root-namespace storage statistics on database

Problem to solve

Today we compute storage statistics with a GROUP BY query on ProjectStatistics, and it is one of the longest-running transactions in production (https://gitlab.com/gitlab-org/gitlab-ce/issues/62488).

We use this information in a public API that exposes storage counters at the group level. Once we start enforcing storage limits, we will need to rely on this query more often.

Also, our billing schema is based on root-namespace aggregation, and this query does not aggregate to the root namespace.
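
To make the gap concrete, here is a minimal in-memory sketch (not the actual query) contrasting a per-namespace GROUP BY with a roll-up to the root namespace. The structs, the sample data, and the helpers `root_namespace_id` and `sum_sizes_by` are hypothetical stand-ins; `repository_size` is an assumed column name, and only `packages_size` is mentioned in this issue.

```ruby
# Hypothetical stand-ins for the namespaces and project_statistics tables.
Namespace = Struct.new(:id, :parent_id)
ProjectStat = Struct.new(:namespace_id, :repository_size, :packages_size)

NAMESPACES = [
  Namespace.new(1, nil), # root group
  Namespace.new(2, 1),   # subgroup of 1
  Namespace.new(3, 2),   # subgroup of 2
  Namespace.new(4, nil)  # a second root group
].to_h { |ns| [ns.id, ns] }

STATS = [
  ProjectStat.new(1, 100, 10),
  ProjectStat.new(2, 200, 20),
  ProjectStat.new(3, 300, 30),
  ProjectStat.new(4, 50, 5)
]

# Follow parent_id links until a namespace without a parent is reached.
def root_namespace_id(namespaces, id)
  loop do
    parent = namespaces.fetch(id).parent_id
    return id if parent.nil?

    id = parent
  end
end

# Group the rows by the key the block returns and sum each size column.
def sum_sizes_by(stats, &key)
  stats.group_by(&key).transform_values do |rows|
    {
      repository_size: rows.sum(&:repository_size),
      packages_size: rows.sum(&:packages_size)
    }
  end
end

# What a GROUP BY namespace_id gives: one row per direct namespace.
per_namespace = sum_sizes_by(STATS) { |s| s.namespace_id }

# What billing needs: one row per root namespace.
per_root = sum_sizes_by(STATS) { |s| root_namespace_id(NAMESPACES, s.namespace_id) }
```

In this sketch `per_namespace` has four entries while `per_root` has two, with the sizes of namespaces 2 and 3 folded into root namespace 1.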

Technical bits

  • On gitlab.com we have namespaces with ~15k projects, where this query takes 1.2 seconds to run.
  • If we try to analyze it with Chatops, it times out: https://ops.gitlab.net/gitlab-com/chatops/-/jobs/528372
  • In EE we have an EE::NamespaceStatistics table that keeps the root-namespace aggregation, but it's only used for tracking pipeline minutes.

Proposal

  1. Create a new model with the same attributes as ProjectStatistics.*_size. The purpose of this model will be to hold the information in an aggregated form.
  2. Update the statistics in this model in an async way, to avoid large database transactions. (See backend section for the technical details)
  3. Rework !28277 (merged) to make use of this new query - https://gitlab.com/gitlab-org/gitlab-ce/issues/62796

Development log

Decisions

  • There is some prework that needs to be done before starting work on this issue.
  • Since it was reported (here and on https://gitlab.com/gitlab-org/gitlab-ce/issues/62488) that the pattern we currently use for updating project_statistics doesn't scale properly for GitLab.com, we've decided to go with a different approach for updating the namespace statistics: a CTE refresh strategy based on the namespace routes (https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/28996#note_178132519).
    • All the technical details are outlined under Backend implications.
  • While working through the CTE approach, we noticed that it might not be easy to implement and would not be compatible with MySQL (https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/28996#note_181094357, https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/28996#note_180759005).
    • There's another possible approach: adding a new column to the namespaces table that tracks the root namespace, and calculating the statistics based on this column (https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/28996#note_178311781)
      • This will involve migrating the namespaces table (one of the largest database tables on GitLab.com)
      • Because the background migration can take some time (hours or even days), we've decided to ship this first in %12.1, and then continue the backend work in %12.2
  • We agreed that the query in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/29837?commit_id=110478466ab85ac7a7ff69cd6dee300169b05128#note_182994031 is fast enough for an async processing job in Sidekiq, and that it will allow us to avoid running a migration on namespaces. The meeting was recorded.
    • The regular query was implemented and merged in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/28996
  • After https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/28996, we discovered an edge case that needs to be solved: https://gitlab.com/gitlab-org/gitlab-ce/issues/62214#note_187584895
  • A bug was detected before https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/28996 reached production: gitlab-org/gitlab-ce#64079
    • https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/30305
  • We decided to measure the group storage statistics on staging and production. Details here https://gitlab.com/gitlab-org/gitlab-ce/issues/64092
    • Performance was measured on staging and production. No problems or errors were found. All details are in the issue.

Backend implications

Prework

  • %12.0 ~backstage remove nils from project_statistics.packages_size https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/28400 https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/13163 (@nolith)
  • %12.0 gitlab-ee#11675 affects root-namespace aggregation on NamespaceStatistics and should be fixed before doing this. (@nolith)

Technical details (%12.1)

  1. Create root_namespace_storage_statistics with all the ProjectStatistics.*_size attributes
  2. Create a second table (namespace_aggregation_schedules) with two columns id and namespace_id.
  3. Whenever the statistics of a project change, we insert a row into namespace_aggregation_schedules
    • We don't insert a new row if there's already one related to the namespace.
    • Insertion is done through a callback and with a Sidekiq job. We can't do it in the same transaction as ProjectStatistics is already involved in a large one (https://gitlab.com/gitlab-org/gitlab-ce/issues/62488)
  4. After inserting the row, we schedule a new worker X hours into the future.
  5. This job will:
    • Update the root namespace storage statistics by querying all the namespaces through a service.
    • Delete the related namespace_aggregation_schedules after the update
  6. We also need to create another Sidekiq job that will traverse any remaining rows on namespace_aggregation_schedules and schedule jobs for every pending row.
  7. Hide all these changes behind a feature flag (FF).
  8. We will read the caching interval from Redis, defaulting to once every 3 hours.
  9. We will experiment with tweaking the interval, aiming for a smaller value.
  10. When we remove the feature flag, the interval must be hardcoded or converted to an application setting (to be decided).
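
The flow in steps 3 to 8 can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not the merged implementation: the class `AggregationScheduler`, its method names, and the `:aggregation_interval` key are hypothetical, a plain Hash stands in for Redis, an Array stands in for delayed Sidekiq jobs, times are plain integers in seconds, and every `namespace_id` is assumed to already be a root namespace.

```ruby
require "set"

DEFAULT_INTERVAL_SECONDS = 3 * 60 * 60 # "once every 3 hours" from step 8

# Hypothetical stand-in for the scheduling flow; no database, Sidekiq or Redis involved.
class AggregationScheduler
  attr_reader :pending, :queue, :root_totals

  # project_sizes maps a root namespace id to the sizes of its projects.
  # settings plays the role of Redis for the interval lookup.
  def initialize(project_sizes, settings = {})
    @project_sizes = project_sizes
    @settings = settings
    @pending = Set.new # plays the role of namespace_aggregation_schedules
    @queue = []        # delayed jobs, stored as [run_at, namespace_id]
    @root_totals = {}  # plays the role of root_namespace_storage_statistics
  end

  # Step 8: read the interval, falling back to the 3 hour default.
  def interval
    @settings.fetch(:aggregation_interval, DEFAULT_INTERVAL_SECONDS)
  end

  # Steps 3 and 4: keep at most one row per namespace, then schedule a job.
  def project_statistics_changed(namespace_id, now)
    return false unless @pending.add?(namespace_id) # a row already exists

    @queue << [now + interval, namespace_id]
    true
  end

  # Step 5: refresh the totals for every due job, then delete its schedule row.
  def run_due_jobs(now)
    due, @queue = @queue.partition { |run_at, _| run_at <= now }
    due.each do |_, namespace_id|
      @root_totals[namespace_id] = @project_sizes.fetch(namespace_id, []).sum
      @pending.delete(namespace_id)
    end
    due.size
  end

  # Step 6: schedule a job for every pending row that has none queued.
  def schedule_pending(now)
    queued = @queue.map { |_, id| id }
    (@pending.to_a - queued).each { |id| @queue << [now + interval, id] }
  end
end

scheduler = AggregationScheduler.new({ 1 => [100, 200, 300] })
first  = scheduler.project_statistics_changed(1, 0)  # inserts a row and schedules a job
second = scheduler.project_statistics_changed(1, 60) # row already exists, nothing is added
early  = scheduler.run_due_jobs(60 * 60)             # interval has not elapsed yet
late   = scheduler.run_due_jobs(3 * 60 * 60)         # job is due: totals refreshed, row deleted
```

The point of the schedule row is deduplication: many statistics changes inside one interval collapse into a single aggregation run per namespace, which is what keeps the expensive query off the hot path.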

Merge Requests

  • Steps 1 and 2 are implemented in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/29570
  • Steps 3 to 8 are implemented in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/28996
  • Never release the redis lease gitlab-org/gitlab-ce#64079 - https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/30305
  • Schedule a Namespace::AggregationSchedule worker when some columns are refreshed on ProjectStatistics.refresh! - https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/30329
  • Hardcode the lease time based on the analysis in https://gitlab.com/gitlab-org/gitlab-ce/issues/64092 - https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/31341
  • Remove the feature flag - https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/31392
Edited Aug 06, 2019 by Mayra Cabrera