Skip to content

Optionally show in progress items in VSA

Adam Hegyi requested to merge 299265-vsa-support-for-in-progress-items into master

What does this MR do?

This MR adds an extra parameter to the VSA backend which changes the underlying query to filter for in progress items.

How it works today?

VSA currently works on both on project and group levels, on the group level there is extra functionality where the users can customize the filters.

Structure:

  • Group can have many value streams
  • A value stream can have many stages
  • A stage defines a time range based on timestamp expressions
    • Start event: MR created (merge_requests.created_at column)
    • End event: MR closed (merge_request.closed_at column)
    • These timestamps are used to calculate duration (end_event - start_event) used in various metrics: median, average, etc.

Base query example for getting duration:

SELECT (issues.closed_at - issues.created_at) as duration
FROM issues
WHERE
project_id = x AND
created_at > '2010-01-01' AND created_at < '2010-02-01'

The query only works on finished items where both columns are NOT NULL. This MR adds an option to control the end event parameter and make it open-ended.

How to show in progress items?

Showing in progress items requires an extra condition to the query where we express that the end_event timestamp expression must be NULL. To calculate the duration we take the current timestamp.

SELECT (CURRENT_TIMESTAMP - issues.created_at) as duration
FROM issues
WHERE
project_id = x AND
created_at > '2010-01-01' AND created_at < '2010-02-01' AND
closed_at IS NULL

Implementation

Since VSA stages are customizable (user can select the start and end event pairs), a generic solution was needed.

Each event is expressed as a class, the full list of events can be found here: https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/gitlab/analytics/cycle_analytics/stage_events

The class defines the following:

  • column_list: what columns are part of the event. This is usually 1 column, for example: merge_requests.created_at
    • Some events have two columns (legacy)
  • timestamp_projection: this method defines the Arel SQL expression for the event, examples:
    • merge_requests.created_at
    • COELASCE(merge_requests.updated_at, merge_requests.closed_at), special two column definition
  • apply_query_customization: some columns are defined on other tables, so sometimes JOIN is required. An event can override this method to add the join or other filters if needed.
  • apply_negated_query_customization: new method, that adds the timestamp_projection IS NULL condition to the query when we filter for in-progress items.

To toggle between in-progress and finished items a new option was introduced: end_event_filter.

  • If end_event_filter == :finished, executes the original behavior.
  • If end_event_filter == :in_progress, instead of apply_query_customization, apply_negated_query_customization method will be called for the end event.
    • The duration calculation also changes.
  • I also changed the tests to use let_it_be so it runs faster.

Test changes

Since VSA is customizable, we have a large test file that tests various event combinations: ee/spec/lib/gitlab/analytics/cycle_analytics/data_collector_spec.rb

How it works:

  • Create 3 resources (issues or MRs) with different durations.
  • Assert the calculated durations in each test case.
  • This MR adds 2 extra resources where the end event is missing (open ended).

How to test it

  1. Make sure you have ultimate license
  2. Go to the generated group: Analytics / Value Stream
  3. Create a new value stream (dropdown top right)
  4. Tick Create from no template and select start event: issue created, end event: issue closed
  5. The page should show no items
  6. Create a new issue within the group (any project)
  7. Open the network tab
  8. Find the records API call, copy the URL and add `&end_event_filter=in_progress
  9. The created issue should show up

If you close the issue, the issue should disappear and it should show up on the UI where the end_event_filter is not applied.

Database

There are many query combinations, so I only picked a few. The query base is always the same: filtering the issues.created_at or merge_requests.created_at (1 month range) and then apply additional filters.

There is no significant difference between old and new queries, the performance is generally not great due to the recursive namespace queries.

Cached timings:

NEW | Issue stage, in progress query (130ms)

SELECT
    EXTRACT(EPOCH FROM percentile_cont(0.5) WITHIN GROUP (ORDER BY TO_TIMESTAMP(1619431013) - "issues"."created_at")) AS median
FROM
    "issues"
    INNER JOIN "projects" ON "projects"."id" = "issues"."project_id"
    INNER JOIN "issue_metrics" ON "issue_metrics"."issue_id" = "issues"."id"
    LEFT JOIN project_features ON projects.id = project_features.project_id
WHERE (issues.confidential IS NOT TRUE
    OR (issues.confidential = TRUE
        AND (issues.author_id = 4156052
            OR EXISTS (
                SELECT
                    TRUE
                FROM
                    issue_assignees
                WHERE
                    user_id = 4156052
                    AND issue_id = issues.id)
                OR EXISTS (
                    SELECT
                        1
                    FROM
                        "project_authorizations"
                    WHERE
                        "project_authorizations"."user_id" = 4156052
                        AND (project_authorizations.project_id = issues.project_id)
                        AND (project_authorizations.access_level >= 20)))))
    AND "projects"."namespace_id" IN (
        SELECT
            "namespaces"."id"
        FROM
            "namespaces"
        WHERE (traversal_ids @> ('{9970}')))
AND (EXISTS (
        SELECT
            1
        FROM
            "project_authorizations"
        WHERE
            "project_authorizations"."user_id" = 4156052
            AND (project_authorizations.project_id = projects.id)
            AND (project_authorizations.access_level >= 10))
        OR projects.visibility_level IN (10, 20))
AND ("project_features"."issues_access_level" IS NULL
    OR "project_features"."issues_access_level" IN (20, 30)
    OR ("project_features"."issues_access_level" = 10
        AND EXISTS (
            SELECT
                1
            FROM
                "project_authorizations"
            WHERE
                "project_authorizations"."user_id" = 4156052
                AND (project_authorizations.project_id = projects.id)
                AND (project_authorizations.access_level >= 10))))
AND "issues"."created_at" <= '2021-04-26'
AND "issues"."created_at" >= '2021-03-26 09:56:51.286255'
AND COALESCE("issue_metrics"."first_associated_with_milestone_at", "issue_metrics"."first_added_to_board_at") IS NULL

Plan: https://explain.depesz.com/s/ZCAv

Issue stage, original query, just for comparison (150ms): https://explain.depesz.com/s/JUlo


NEW | Plan stage, in progress query (70ms)

SELECT
    EXTRACT(EPOCH FROM percentile_cont(0.5) WITHIN GROUP (ORDER BY TO_TIMESTAMP(1619431018) - COALESCE("issue_metrics"."first_associated_with_milestone_at", "issue_metrics"."first_added_to_board_at"))) AS median
FROM
    "issues"
    INNER JOIN "projects" ON "projects"."id" = "issues"."project_id"
    INNER JOIN "issue_metrics" ON "issue_metrics"."issue_id" = "issues"."id"
    LEFT JOIN project_features ON projects.id = project_features.project_id
WHERE (issues.confidential IS NOT TRUE
    OR (issues.confidential = TRUE
        AND (issues.author_id = 4156052
            OR EXISTS (
                SELECT
                    TRUE
                FROM
                    issue_assignees
                WHERE
                    user_id = 4156052
                    AND issue_id = issues.id)
                OR EXISTS (
                    SELECT
                        1
                    FROM
                        "project_authorizations"
                    WHERE
                        "project_authorizations"."user_id" = 4156052
                        AND (project_authorizations.project_id = issues.project_id)
                        AND (project_authorizations.access_level >= 20)))))
    AND "projects"."namespace_id" IN (
        SELECT
            "namespaces"."id"
        FROM
            "namespaces"
        WHERE (traversal_ids @> ('{9970}')))
AND (EXISTS (
        SELECT
            1
        FROM
            "project_authorizations"
        WHERE
            "project_authorizations"."user_id" = 4156052
            AND (project_authorizations.project_id = projects.id)
            AND (project_authorizations.access_level >= 10))
        OR projects.visibility_level IN (10, 20))
AND ("project_features"."issues_access_level" IS NULL
    OR "project_features"."issues_access_level" IN (20, 30)
    OR ("project_features"."issues_access_level" = 10
        AND EXISTS (
            SELECT
                1
            FROM
                "project_authorizations"
            WHERE
                "project_authorizations"."user_id" = 4156052
                AND (project_authorizations.project_id = projects.id)
                AND (project_authorizations.access_level >= 10))))
AND "issues"."created_at" <= '2021-04-26'
AND "issues"."created_at" >= '2021-03-26 09:56:57.367377'
AND ("issue_metrics"."first_added_to_board_at" IS NOT NULL
    OR "issue_metrics"."first_associated_with_milestone_at" IS NOT NULL)
AND "issue_metrics"."first_mentioned_in_commit_at" IS NOT NULL
AND "issue_metrics"."first_mentioned_in_commit_at" IS NULL

Plan: https://explain.depesz.com/s/tvl6

Plan stage, original query, just for comparison (130ms): https://explain.depesz.com/s/y1vU

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Related to #299265 (closed)

Edited by Adam Hegyi

Merge request reports