GitLab.org / gitaly · Issue #3286 · Closed
Issue created Nov 11, 2020 by Craig Furman (@craigf)

Praefect metrics queries should not hit the database

Praefect's Prometheus metrics endpoint produces some metrics that are derived from database queries. These queries can be expensive, especially when many Praefect servers are scraped in parallel and all of them hit the same single database primary at once. This was a contributing factor to the gitlab.com incident gitlab-com/gl-infra/production#2918 (closed).

This is a Prometheus anti-pattern. Praefect is a stateless server process, so a GitLab operator may well scale it out horizontally, and every additional Praefect node scraped by Prometheus puts more pressure on the database.

A possible solution is to ensure that no metrics scraped by Prometheus require DB access: separate out those that do into their own process, which can run as a singleton Prometheus exporter. Alternatively, we could use the SQL exporter to yield metrics from the relevant database queries.

We do something similar with gitlab-rails: DB-derived metrics are scraped by the GitLab exporter, which runs as a singleton against a replica. If every application server instance hit the DB on every Prometheus scrape, we would grind to a halt.

cc @albertoramos @bjk-gitlab @nnelson @samihiltunen
