Optionally use aggregated VSA backend
What does this MR do and why?
This MR optionally enables the aggregated value stream analytics backend. The change is behind a feature flag (use_vsa_aggregated_tables) that is disabled by default. We plan to try it on gitlab-org since we already aggregated the data for that group.
VSA runs a few different queries:
- Median (implemented in this MR)
- Count (implemented in this MR)
- Average (will be implemented as a follow-up)
- Related records (will be implemented as a follow-up)
The new queries read from dedicated database tables that store eventually consistent, denormalized VSA data. A high-level table diagram can be seen here: !71279 (merged)
VSA supports filtering the data by several parameters. These filters have been implemented for the new queries so that the new DB tables can be used:
- date range filter on the start or end event timestamps (&6046 (closed))
- labels
- assignee
- author
- milestone
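To make the filter semantics concrete, here is a minimal in-memory sketch in plain Ruby. It is not the code in this MR: the real queries run in the database against the aggregated tables. The StageEvent struct, its field names, and the filter_events and median_duration helpers are hypothetical and only illustrate how the filters narrow the rows before the count and median are computed.

```ruby
# Hypothetical, simplified stand-in for one row of a denormalized stage event table.
StageEvent = Struct.new(:author, :assignee, :milestone, :labels, :start_at, :end_at, keyword_init: true) do
  # Stage duration in seconds; nil while the end event has not happened yet.
  def duration
    end_at && (end_at - start_at)
  end
end

# Apply the optional filters. A nil (or empty) filter means "do not filter on this field".
# Only finished rows (with an end event) are kept. The date range here is applied to the
# end event timestamp; this is an assumption made for the sketch.
def filter_events(events, author: nil, assignee: nil, milestone: nil, labels: [], end_after: nil, end_before: nil)
  events.select do |e|
    next false if e.end_at.nil?
    next false if author && e.author != author
    next false if assignee && e.assignee != assignee
    next false if milestone && e.milestone != milestone
    next false unless (labels - e.labels).empty? # every requested label must be present
    next false if end_after && e.end_at < end_after
    next false if end_before && e.end_at > end_before
    true
  end
end

# Median of the durations; averages the two middle values for an even-sized set.
def median_duration(events)
  durations = events.map(&:duration).sort
  return nil if durations.empty?

  mid = durations.size / 2
  durations.size.odd? ? durations[mid] : (durations[mid - 1] + durations[mid]) / 2.0
end

# Made-up sample rows, for illustration only.
EVENTS = [
  StageEvent.new(author: "alice", assignee: "bob", milestone: "m1", labels: ["bug"],
                 start_at: Time.utc(2021, 10, 1), end_at: Time.utc(2021, 10, 3)),
  StageEvent.new(author: "alice", assignee: "bob", milestone: "m1", labels: ["bug", "backend"],
                 start_at: Time.utc(2021, 10, 2), end_at: Time.utc(2021, 10, 6)),
  StageEvent.new(author: "carol", assignee: "bob", milestone: "m1", labels: ["bug"],
                 start_at: Time.utc(2021, 10, 1), end_at: Time.utc(2021, 10, 2)),
  StageEvent.new(author: "alice", assignee: "bob", milestone: "m1", labels: ["bug"],
                 start_at: Time.utc(2021, 10, 5), end_at: nil)
].freeze

selected = filter_events(EVENTS, author: "alice", labels: ["bug"])
puts "count:  #{selected.size}"
puts "median: #{median_duration(selected)} seconds"
```

Filtering first and aggregating afterwards mirrors the idea that every filter has to be answerable from the denormalized row itself.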
Implementation
We have a central class that builds VSA queries: Gitlab::Analytics::CycleAnalytics::DataCollector (this will go away at some point). Within this class, we optionally call the new queries by invoking Gitlab::Analytics::CycleAnalytics::Aggregated::DataCollector.
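The delegation can be pictured with the following self-contained sketch. It is an illustration under assumptions, not the actual classes: FakeFeature stands in for GitLab's Feature class, and SketchDataCollector and SketchAggregatedDataCollector are hypothetical names that only mimic the shape of the two collectors.

```ruby
# Hypothetical stand-in for GitLab's Feature class, limited to what the sketch needs.
module FakeFeature
  @enabled = {}

  def self.enable(name)
    @enabled[name] = true
  end

  def self.disable(name)
    @enabled[name] = false
  end

  def self.enabled?(name)
    @enabled.fetch(name, false) # flags are disabled by default
  end
end

# Hypothetical collector that would read from the aggregated tables.
class SketchAggregatedDataCollector
  def median
    [:aggregated, :median]
  end

  def count
    [:aggregated, :count]
  end
end

# Hypothetical central collector: keeps the current queries and
# delegates to the aggregated collector only when the flag is enabled.
class SketchDataCollector
  def median
    return aggregated_collector.median if use_aggregated_backend?

    [:legacy, :median]
  end

  def count
    return aggregated_collector.count if use_aggregated_backend?

    [:legacy, :count]
  end

  private

  def use_aggregated_backend?
    FakeFeature.enabled?(:use_vsa_aggregated_tables)
  end

  def aggregated_collector
    @aggregated_collector ||= SketchAggregatedDataCollector.new
  end
end

collector = SketchDataCollector.new
p collector.median # flag off: legacy path
FakeFeature.enable(:use_vsa_aggregated_tables)
p collector.median # flag on: aggregated path
```

Checking the flag on every call keeps the switch reversible at runtime, so disabling the flag restores the current queries without a deploy.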
The scopes and the base query builder are tested within the MR. The ee/spec/lib/gitlab/analytics/cycle_analytics/data_collector_spec.rb test file has been modified to test both cases (current and new). This test file runs various high-level tests related to VSA.
How to set up and validate locally
- Enable the feature
Feature.enable(:use_vsa_aggregated_tables)
- Seed a new VSA project
SEED_CYCLE_ANALYTICS=true SEED_VSA=true FILTER=cycle_analytics rake db:seed_fu
- The seed script prints the project path; copy it and navigate to the project.
- Go to the group.
- Go to Analytics > Value Stream
- Open the top right dropdown and choose Create new Value Stream.
- Add a name and save.
- Start the Rails console and aggregate the data (replace x with the ID of the group):
group = Group.find(x)
Analytics::CycleAnalytics::DataLoaderService.new(group: group, model: Issue).execute
Analytics::CycleAnalytics::DataLoaderService.new(group: group, model: MergeRequest).execute
- Load the VSA page again.
- Inspecting the median and count endpoint requests, we should see that the _stage_events tables are being used.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Related to #335391 (closed)