Optionally use aggregated VSA backend

Adam Hegyi requested to merge 335391-new-vsa-queries into master

What does this MR do and why?

This MR optionally enables the aggregated value stream analytics (VSA) backend. The change is behind the use_vsa_aggregated_tables feature flag, which is disabled by default. We plan to try it on gitlab-org first, since the data for that group has already been aggregated.

VSA runs a few different queries:

  • Median (implemented in this MR)
  • Count (implemented in this MR)
  • Average (will be implemented as a follow-up)
  • Related records (will be implemented as a follow-up)
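
As a rough illustration of what the two implemented queries report, the sketch below computes a median and a count over in-memory stage durations. It is plain Ruby, not the SQL used in this MR; the `median` helper and the sample durations are assumptions made up for the example.

```ruby
# Minimal sketch: median and count over stage durations (in seconds).
# The helper and the sample data are hypothetical, for illustration only.
def median(values)
  return nil if values.empty?

  sorted = values.sort
  mid = sorted.length / 2
  # Odd length: take the middle element. Even length: average the two middle ones.
  sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

durations = [3600, 7200, 1800, 5400]

stage_median = median(durations)
stage_count = durations.length
```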

The new queries read from dedicated database tables that store eventually consistent, de-normalized VSA data. A high-level table diagram can be seen here: !71279 (merged)

VSA supports filtering the data by several parameters. The following filters have been implemented for the new DB tables:

  • date range filter on the start or end event timestamps (&6046 (closed))
  • labels
  • assignee
  • author
  • milestone
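
To make the filter semantics concrete, here is a minimal in-memory sketch in which each filter is optional and all supplied filters must match. The `StageEventRow` struct, its field names, and `filter_rows` are hypothetical; they do not mirror the real table columns or the ActiveRecord scopes added in this MR.

```ruby
# Hypothetical row shape; field names are assumptions, not the real columns.
StageEventRow = Struct.new(
  :author_id, :milestone_id, :label_ids, :assignee_ids, :end_event_timestamp,
  keyword_init: true
)

# Each filter is skipped when nil; supplied filters are combined with AND.
def filter_rows(rows, author_id: nil, milestone_id: nil, label_id: nil,
                assignee_id: nil, end_event_range: nil)
  rows.select do |row|
    (author_id.nil? || row.author_id == author_id) &&
      (milestone_id.nil? || row.milestone_id == milestone_id) &&
      (label_id.nil? || row.label_ids.include?(label_id)) &&
      (assignee_id.nil? || row.assignee_ids.include?(assignee_id)) &&
      (end_event_range.nil? || end_event_range.cover?(row.end_event_timestamp))
  end
end

rows = [
  StageEventRow.new(author_id: 1, milestone_id: 10, label_ids: [5],
                    assignee_ids: [2], end_event_timestamp: Time.utc(2021, 10, 1)),
  StageEventRow.new(author_id: 2, milestone_id: 10, label_ids: [5, 6],
                    assignee_ids: [], end_event_timestamp: Time.utc(2021, 11, 1))
]

# Rows with label 5 whose end event falls inside the given date range.
filtered = filter_rows(rows, label_id: 5,
                       end_event_range: Time.utc(2021, 10, 15)..Time.utc(2021, 12, 1))
```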

Implementation

We have a central class that builds VSA queries: Gitlab::Analytics::CycleAnalytics::DataCollector (this will go away at some point). Within this class, we optionally call the new queries by invoking Gitlab::Analytics::CycleAnalytics::Aggregated::DataCollector.
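
The delegation described above could look roughly like the following sketch. All class names, the `flag_enabled` argument, and the returned symbols are hypothetical stand-ins; the real collectors build database queries and check the use_vsa_aggregated_tables flag through GitLab's feature flag API.

```ruby
# Hypothetical stand-in for the current query path.
class LegacyCollectorSketch
  def median
    :median_from_current_queries
  end

  def count
    :count_from_current_queries
  end
end

# Hypothetical stand-in for the aggregated (new) query path.
class AggregatedCollectorSketch
  def median
    :median_from_aggregated_tables
  end

  def count
    :count_from_aggregated_tables
  end
end

# Central collector: picks a backend once, then delegates to it.
class DataCollectorSketch
  # flag_enabled stands in for the feature flag check (an assumption here).
  def initialize(flag_enabled:)
    @backend = flag_enabled ? AggregatedCollectorSketch.new : LegacyCollectorSketch.new
  end

  def median
    @backend.median
  end

  def count
    @backend.count
  end
end
```

Keeping the switch inside the central class means callers do not need to know which backend served the request.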

The scopes and the base query builder are tested within the MR. The ee/spec/lib/gitlab/analytics/cycle_analytics/data_collector_spec.rb test file has been modified to test both cases (current and new). This test file runs various high-level tests related to VSA.

How to set up and validate locally

  1. Enable the feature
    Feature.enable(:use_vsa_aggregated_tables)
  2. Seed a new VSA project
    SEED_CYCLE_ANALYTICS=true SEED_VSA=true FILTER=cycle_analytics rake db:seed_fu
  3. The seed script prints the project path. Copy it and navigate to the project.
  4. Go to the group.
  5. Go to Analytics > Value Stream
  6. Open the top-right dropdown and select Create new Value Stream.
  7. Add a name and save.
  8. Start the Rails console and aggregate the data:
    # Replace x with the ID of the group that contains the seeded project.
    group = Group.find(x)
    Analytics::CycleAnalytics::DataLoaderService.new(group: group, model: Issue).execute
    Analytics::CycleAnalytics::DataLoaderService.new(group: group, model: MergeRequest).execute
  9. Load the VSA page again.
  10. Inspect the median and count endpoint requests: the database queries they run should use the _stage_events tables.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #335391 (closed)

Edited by Adam Hegyi