Resolve "SaaS Service Ping Metrics Failures causing null UMAU value"

What does this MR do and why?

Follow-up to !73273 (merged), which scheduled async index creation.

Every Service Ping metric that is calculated from database data actually runs 3 types of queries under the hood:

  1. A query to locate the starting point, where counting of the metric should begin
  2. A query to locate the finish point, where counting of the metric should stop
  3. A query to calculate the metric within the boundaries discovered in 1 and 2
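The three query types above can be sketched roughly as follows. This is a minimal illustration against SQLite rather than the actual Service Ping implementation; the function name `batch_count`, the `batch_size` default, and the assumption of an integer `id` column are all hypothetical.

```python
import sqlite3


def batch_count(conn: sqlite3.Connection, table: str, batch_size: int = 1000) -> int:
    """Count rows of `table` in id-bounded batches (illustrative sketch only).

    `table` is interpolated into the SQL text, so it must be a trusted name.
    """
    # 1. Query to locate the starting point.
    start = conn.execute(f"SELECT MIN(id) FROM {table}").fetchone()[0]
    # 2. Query to locate the finish point.
    finish = conn.execute(f"SELECT MAX(id) FROM {table}").fetchone()[0]
    if start is None or finish is None:
        return 0  # empty table: nothing to count

    # 3. Queries that calculate the metric within the discovered boundaries,
    #    one id range at a time so that each statement stays cheap.
    total = 0
    lower = start
    while lower <= finish:
        upper = lower + batch_size
        total += conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE id >= ? AND id < ?",
            (lower, upper),
        ).fetchone()[0]
        lower = upper
    return total
```

If either boundary query is slow, the metric fails before any counting happens, which is why the start query matters so much here.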

The first query for the metric usage_activity_by_stage_monthly.manage.events started to time out because the planner used the pkey index to look for the starting point (see the old query). Experimenting on postgres.ai showed that an index on created_at and id, combined with a dedicated start query, should vastly improve the metric's performance.
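One possible shape of the composite index and the dedicated boundary queries is sketched below, again against SQLite for illustration. The table name `events`, the index name, and the exact query text are assumptions, not copied from this MR; the real queries are the ones linked in the Database section.

```python
import sqlite3


def create_candidate_index(conn: sqlite3.Connection) -> None:
    # Hypothetical index name; the columns follow the description above.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS index_events_on_created_at_and_id "
        "ON events (created_at, id)"
    )


def dedicated_start(conn: sqlite3.Connection, since: str):
    # Start query filtered by created_at, so a planner can use the
    # (created_at, id) index instead of walking the primary key.
    row = conn.execute(
        "SELECT MIN(id) FROM events WHERE created_at >= ?", (since,)
    ).fetchone()
    return row[0]


def dedicated_finish(conn: sqlite3.Connection, until: str):
    # Matching finish query over the same time window.
    row = conn.execute(
        "SELECT MAX(id) FROM events WHERE created_at <= ?", (until,)
    ).fetchone()
    return row[0]
```

Timestamps are passed as ISO 8601 strings in this sketch, which compare correctly as text in SQLite; a PostgreSQL version would use timestamp parameters instead.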

Database

  1. Old min query that timed out: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7064/commands/24990
  2. Index candidate: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7145/commands/25312
  3. New min query: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7145/commands/25315
  4. New max query: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/7145/commands/25316

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #343679 (closed)

Edited by Mikołaj Wawrzyniak
