Skip to content
Snippets Groups Projects

ClickHouse data collector

Merged charlie ablett requested to merge 414610-cablett-clickhouse-data-collector into master

What does this MR do and why?

ClickHouse data collector

Introduces feature flag clickhouse_data_collector #420257 (closed)

Screenshots or screen recordings

No user-facing changes

How to set up and validate locally

Make sure you have GitLab Ultimate license on your local.

  1. Install and run the ClickHouse server as per these directions :point_right: !124295 (merged)
  • If you do the one-line install rather than installing via apt, you can remove the password from the yml file because the default user it comes with doesn't have a password. There is a way to set up other users or set a password, but for the purposes of testing I don't think it's necessary.
  1. Run the ClickHouse client, connecting to the gitlab_clickhouse_test database.
  2. Create the events table into the gitlab_clickhouse_test database (run the code as in the db/click_house/main/20230705124511_create_events.sql file in clickhouse client)
  3. Create the contribution_analytics_events table - run db/click_house/main/20230724064832_create_contribution_analytics_events.sql file in this MR in clickhouse client
  4. Create the materialised view - run db/click_house/main/20230724064918_contribution_analytics_events_materialized_view.sql in this MR in clickhouse client

OR as @ahegyi says: just do #414938 (closed) :sweat_smile:

  1. Find a project with some MR activity (code pushes, etc) or seed some.
  2. Fill out the table:
def format_row(event)
  namespace = event.project.try(:project_namespace) || event.group
  path = namespace.traversal_ids.join('/')

  action = Event.actions[event.action]
  [
    event.id,
    "'#{path}/'",
    event.author_id,
    event.target_id,
    "'#{event.target_type}'",
    action,
    event.created_at.to_f,
    event.updated_at.to_f
  ].join(',')
end


values = []
Event.all.each do |event|
  values << "(#{format_row(event)})"
end

insert_query = <<~SQL
INSERT INTO events
(id, path, author_id, target_id, target_type, action, created_at, updated_at)
VALUES
#{values.join(',')}
SQL

ClickHouse::Client.execute(insert_query, :main)
  1. Tail logs tail -f log/development.log
  2. To get the data:
  • PG: make sure Feature.enabled?(:clickhouse_data_collection) is false
  • CH: make sure Feature.enabled?(:clickhouse_data_collection) is true

Using the GDK group flightjs:

{
  group(fullPath: "flightjs") {
    contributions(from: "2023-06-01", to: "2023-08-01") {
      nodes {
        user {
          id
        }
        totalEvents
        repoPushed
      }
    }
  }
}

Try with the FF off first, note the result, and then with FF on. The resulting data should be the same. If the FF is off, you'll see a MATERIALIZED VIEW query hit the Postgres DB. If the FF is on, it will be missing.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #414610 (closed)

Edited by Pedro Pombeiro

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • charlie ablett
  • charlie ablett added 1 commit

    added 1 commit

    • efc748d9 - ClickHouse data collector for contribution analytics

    Compare with previous version

  • charlie ablett added 1 commit

    added 1 commit

    • db307ee1 - ClickHouse data collector for contribution analytics

    Compare with previous version

  • charlie ablett added 1 commit

    added 1 commit

    • 52760dac - ClickHouse data collector for contribution analytics

    Compare with previous version

  • charlie ablett added 1 commit

    added 1 commit

    • f0ba8a97 - ClickHouse data collector for contribution analytics

    Compare with previous version

  • charlie ablett added 1 commit

    added 1 commit

    • 3a117beb - ClickHouse data collector for contribution analytics

    Compare with previous version

  • charlie ablett added 1 commit

    added 1 commit

    • 16d1becd - ClickHouse data collector for contribution analytics

    Compare with previous version

  • A deleted user added frontend label

    added frontend label

  • charlie ablett
  • charlie ablett
  • charlie ablett added 1 commit

    added 1 commit

    • c924945d - ClickHouse data collector for contribution analytics

    Compare with previous version

  • charlie ablett added 1 commit

    added 1 commit

    • e8575adb - ClickHouse data collector for contribution analytics

    Compare with previous version

  • charlie ablett added 1 commit

    added 1 commit

    • 013cdbc7 - ClickHouse data collector for contribution analytics

    Compare with previous version

  • charlie ablett
  • charlie ablett added 1 commit

    added 1 commit

    • 9ec21de9 - ClickHouse data collector for contribution analytics

    Compare with previous version

  • Adam Hegyi
  • Adam Hegyi
  • Adam Hegyi
  • Adam Hegyi
  • Thanks, @cablett! Looks great! I left a few minor comments.

  • Adam Hegyi
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Please register or sign in to reply
    Loading