Skip to content

Adds migration for sessions partition table

About:

This aims to partition an existing sessions table for projects that've been onboarded to product analytics (partition by YYYYMM). This change aims to achieve better query performance for the PA dashboards. This also adds a migration rollback command to rollback the last executed migration on projects.

Strategy:

  1. List all the databases from activeApps table. For every project onboarded to product analytics, there is a corresponding database.
  2. Create a partitioned sessions table for all those databases, if the table doesn't already exist. (some of the newly onboarded projects might already have this table, ref !9 (merged))
  3. Create an additional materialized view from snowplow_events to sessions_partition table. This will ensure any new writes to sessions table is also propagated to sessions_partition without incurring a dependency on sessions table and view so that we can deprecate the sessions table and view in future without the need to change source for sessions_partition .
  4. A migration to copy existing data from sessions to sessions_partition - How and when to run this query is TBD

Steps to setup and validate locally

  • Select any project database (project should be gitlab project already onboarded to product analytics), preferably the one having some entries in sessions and snowplow_events table)

  • Run the migration manually by following steps in README.md.

  • Validate that a partition table for sessions named sessions_partition is created for the project. SHOW TABLES from gitlab_project_#{project_id} .

  • Validate that a new materialized view to sessions_partition is also created. Above command foe showing tables also lists materialized views.

  • Try generating an event for that project database or enter some data into snowplow_events. Goal is to create some entries in snowplow_events table. E.g.

    INSERT INTO gitlab_project_#{id}.snowplow_events 
    SELECT * FROM gitlab_project_#{id}.snowplow_events limit 10
  • Validate that the entry is being saved to the correct partition of sessions_partition table. To show all the partitions run the below command.

    select count() from system.parts where table in ('sessions_partition') and active
  • Validate that the entry is also being saved to original sessions table.

Closes #15 (closed)

Edited by Surabhi Suman

Merge request reports