
Create aggregated materialized views for events

What does this MR do and why?

As preparatory work for #428260 (closed), this MR adds aggregated materialized views for the events table to ensure eventual consistency. It also includes a utility class for batching over ClickHouse tables.

  • event_authors: collect distinct events.author_id values.
  • event_namespace_paths: collect distinct events.path values.
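
As a rough mental model of what the two views collect, the sketch below does the same aggregation in plain Ruby. It is only an illustration: the real views are defined in ClickHouse, and the sample rows (including the path values) are made up for this example.

```ruby
require 'set'

# Hypothetical sample rows standing in for the events table.
events = [
  { author_id: 1, path: '1/' },
  { author_id: 2, path: '1/2/' },
  { author_id: 1, path: '1/2/' },
  { author_id: 3, path: '1/' }
]

# Each set plays the role of one aggregated view: it keeps one entry per
# distinct value, no matter how many events share that value.
event_authors = Set.new
event_namespace_paths = Set.new

events.each do |event|
  event_authors << event[:author_id]
  event_namespace_paths << event[:path]
end

puts "authors: #{event_authors.to_a.sort.inspect}"
puts "paths:   #{event_namespace_paths.to_a.sort.inspect}"
```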

How to set up and validate locally

Testing the iterator:

Enable the feature flag:

Feature.enable(:event_sync_worker_for_click_house)
  1. Ensure that you're on an Ultimate license.
  2. Ensure that ClickHouse is configured: https://docs.gitlab.com/ee/development/database/clickhouse/clickhouse_within_gitlab.html
  3. To prepare the DB schema, run: bundle exec rake gitlab:clickhouse:migrate
  4. If your GDK is seeded, you probably have some events records. You can sync them to ClickHouse with: ClickHouse::EventsSyncWorker.new.perform

Run the iterator in the console:

ClickHouse::Iterator.new(query_builder: ClickHouse::QueryBuilder.new('events'), connection: ClickHouse::Connection.new(:main)).each_batch(of: 10) do |scope|
  puts ClickHouse::Client.select(scope.to_sql, :main)
end
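
For readers unfamiliar with range-based batching, here is a minimal, self-contained sketch of the general idea, using an in-memory array in place of a ClickHouse table. It is not the implementation in this MR: RangeBatcher, its arguments, and the sample rows are hypothetical, and the real ClickHouse::Iterator yields query scopes rather than arrays of rows.

```ruby
# Hypothetical in-memory stand-in for a table: an array of row hashes.
class RangeBatcher
  def initialize(rows, column: :id)
    @rows = rows
    @column = column
  end

  # Walks the column from its minimum to its maximum value in steps of `of`
  # and yields the rows that fall into each half-open range
  # [lower, lower + of). Empty ranges are skipped.
  def each_batch(of: 10)
    raise ArgumentError, 'batch size must be positive' unless of.positive?

    values = @rows.map { |row| row.fetch(@column) }
    return if values.empty?

    lower = values.min
    max = values.max

    while lower <= max
      upper = lower + of
      batch = @rows.select { |row| (lower...upper).cover?(row.fetch(@column)) }
      yield batch unless batch.empty?
      lower = upper
    end
  end
end

rows = [1, 2, 5, 11, 12, 25].map { |id| { id: id } }

batches = []
RangeBatcher.new(rows).each_batch(of: 10) do |batch|
  batches << batch.map { |row| row[:id] }
end

puts batches.inspect
```

Because the batches are defined by value ranges rather than row counts, a batch can contain fewer than `of` rows when the column has gaps.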

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #428260 (closed)

Edited by Adam Hegyi
