Skip to content

Fix add_on_name inconsistency in Clickhouse user_add_on_assignments_history table

Problem

The user_add_on_assignments_history table in Clickhouse has inconsistent add_on_name values:

  • Some records store addon names as strings (e.g., "code_suggestions", "duo_enterprise")
  • Other records store addon names as numbers (e.g., 1, 2, 3, 5)

This causes filtering issues when querying by addon name string, resulting in missing or inaccurate data.

Investigation Details

From the investigation in #568860:

  • Postgres subscription_add_ons table stores addon IDs as numbers (1, 2, 3, 5) with string names
  • Postgres subscription_user_add_on_assignment_versions stores add_on_name as strings
  • Clickhouse has a mix of both formats, causing data inconsistency
  • When filtering by "code_suggestions" string, records stored as "3" (the ID) are missed

Current Enum Values (GitlabSubscriptions::AddOn)

  • 1 = code_suggestions
  • 2 = product_analytics
  • 3 = duo_enterprise
  • 5 = duo_nano

Solution

  1. Data Migration: Update all existing records in Clickhouse user_add_on_assignments_history table to use string addon names instead of numeric IDs

    • Map numeric IDs to their string equivalents based on the GitlabSubscriptions::AddOn enum
    • Ensure consistency with Postgres data
  2. Validation: Verify the migration results match expected counts as postgres:

Related Issues

Edited by Amr Taha