Update Cube schemas to exclude anonymous users

What does this MR do and why?

This MR adds a new segment that filters out any user types other than cookie or identify. This restricts the resulting query to non-anonymous users, because anonymous users can distort the resulting data set.
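As a rough illustration of what the segment does, the filter can be sketched in plain Python. This is not the Cube schema from this MR: the `user_id_type` field name, the `"anonymous"` value, and the sample rows are hypothetical and only stand in for whatever the real table uses.

```python
# Sketch only: mirrors the segment's intent, not its actual Cube definition.
# "user_id_type" and the "anonymous" value are hypothetical names.
KNOWN_USER_ID_TYPES = {"cookie", "identify"}


def known_users(events):
    """Keep only events whose user ID type marks a non-anonymous user."""
    return [e for e in events if e.get("user_id_type") in KNOWN_USER_ID_TYPES]


sample_events = [
    {"user_id": "a", "user_id_type": "cookie"},
    {"user_id": "b", "user_id_type": "identify"},
    {"user_id": "c", "user_id_type": "anonymous"},
]

filtered = known_users(sample_events)
```

With the sample rows above, only the cookie and identify events remain after filtering.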

At the same time, it updates our specs, which were woefully out of date with the rest of the stack:

  • Updates docker-compose.ci.yml to extend docker-compose.yml
    • It doesn't need to be different right now, but it gives us room to deviate if needed.
  • Updates the created table to use the latest configurator table structure.
  • Adds user ID types to the test data.
  • Adds a new test to validate the segment works as expected.

This segment is going to be used in the UI with a toggle to add or remove the segment.

Analytics stack MR copying the schema changes: https://gitlab.com/gitlab-org/analytics-section/product-analytics/analytics-stack/-/merge_requests/142+

Screenshots or screen recordings

Before: Screenshot_2023-11-01_at_16.50.07
After: Screenshot_2023-11-01_at_16.50.13

How to set up and validate locally

  1. Pull this branch and restart your Cube container.
  2. Add known and anonymous users to your dataset if you don't already have them. An easy way to do this is to use the Browser SDK test files.
  3. Visit http://localhost:4000/#/build and test with and without the new segment to see the user count go down. E.g. http://localhost:4000/#/build?query={%22measures%22:[%22TrackedEvents.count%22],%22order%22:{%22TrackedEvents.derivedTstamp%22:%22asc%22},%22limit%22:100,%22filters%22:[{%22member%22:%22TrackedEvents.derivedTstamp%22,%22operator%22:%22inDateRange%22,%22values%22:[%222023-09-29%22,%222023-10-09%22]}],%22timeDimensions%22:[{%22dimension%22:%22TrackedEvents.derivedTstamp%22,%22granularity%22:%22day%22}],%22segments%22:[%22TrackedEvents.knownUsers%22]}
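The encoded URL in step 3 is hard to read, so here is a minimal sketch that builds the same query with and without the segment and encodes it for the playground. The query members are copied from the URL above; the helper names `build_query` and `playground_url` are made up for this example.

```python
import json
from urllib.parse import quote

PLAYGROUND = "http://localhost:4000/#/build"


def build_query(known_users_only: bool) -> dict:
    """Return the playground query from step 3, optionally with the segment."""
    query = {
        "measures": ["TrackedEvents.count"],
        "order": {"TrackedEvents.derivedTstamp": "asc"},
        "limit": 100,
        "filters": [
            {
                "member": "TrackedEvents.derivedTstamp",
                "operator": "inDateRange",
                "values": ["2023-09-29", "2023-10-09"],
            }
        ],
        "timeDimensions": [
            {"dimension": "TrackedEvents.derivedTstamp", "granularity": "day"}
        ],
    }
    if known_users_only:
        query["segments"] = ["TrackedEvents.knownUsers"]
    return query


def playground_url(query: dict) -> str:
    """Encode the query as compact JSON in the playground's query parameter."""
    compact = json.dumps(query, separators=(",", ":"))
    # Leave JSON punctuation readable; double quotes become %22.
    return f"{PLAYGROUND}?query={quote(compact, safe='{}[]:,.-')}"


url_with_segment = playground_url(build_query(True))
url_without_segment = playground_url(build_query(False))
```

Opening both URLs side by side should make the effect of `TrackedEvents.knownUsers` easy to compare.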

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Relates to #23 (closed)

Edited by Robert Hunt
