Introduce worker to import finished pipelines to ClickHouse

Sequence MR
1 Create p_ci_finished_pipeline_ch_sync_events table (!158060 - merged)
2 Implement service to sync pipelines to ClickHouse (!158362 - merged)
3 you are here Introduce worker to import finished pipelines t... (!159083 - merged)
4 omnibus-gitlab!7783 (merged)
5 gitlab-org/charts/gitlab!3839 (merged)

What does this MR do and why?

This MR introduces a worker that leverages the service being introduced in !158362 (merged) to import finished pipelines to ClickHouse (it is very similar to the existing !132436 (merged) which does the same for builds). This worker is only meant for GitLab.com for now (since customers won't have a ClickHouse installation).

It doesn't include a changelog entry as this feature is still behind a feature flag.
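As a rough illustration of the arrangement described above (a thin worker that delegates to a sync service, with the whole feature gated by a flag), here is a minimal plain-Ruby sketch. Everything in it is a hypothetical stand-in: `FLAGS`, `SketchSyncService`, and `SketchSyncWorker` are not GitLab classes, and checking the flag inside the service is an assumption made for the sketch, not a statement about the real code.

```ruby
# Hypothetical stand-in for the feature flag store (not GitLab's Feature API).
FLAGS = { ci_pipelines_data_ingestion_to_click_house: false }

# Hypothetical stand-in for the sync service: moves pending events into a sink
# that plays the role of the ClickHouse table.
class SketchSyncService
  Result = Struct.new(:status, :records_inserted)

  def initialize(pending_events, sink)
    @pending_events = pending_events
    @sink = sink
  end

  def execute
    # Assumption: the flag is checked here; the real check may live elsewhere.
    return Result.new(:disabled, 0) unless FLAGS[:ci_pipelines_data_ingestion_to_click_house]

    # Drain every pending event and append it to the sink.
    imported = @pending_events.shift(@pending_events.size)
    @sink.concat(imported)
    Result.new(:success, imported.size)
  end
end

# Hypothetical stand-in for the worker: it only calls the service and
# reports the outcome as a plain hash.
class SketchSyncWorker
  def initialize(service)
    @service = service
  end

  def perform
    result = @service.execute
    { status: result.status, records_inserted: result.records_inserted }
  end
end
```

Keeping the worker this thin means the import logic can be exercised directly through the service, while the worker only adds scheduling.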

Part of #470079 (closed)

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

image http://gdk.test:3000/admin/background_jobs


How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

NOTE: These steps were already tested in the upstream MR !132010 (merged). The only thing this MR adds is the worker that calls the service.

  1. Ensure you have ClickHouse installed.

  2. Go to the shell in your GDK gitlab directory and run bundle exec rake "gitlab:seed:runner_fleet". This will seed your GDK with some runners and pipelines required for testing this MR.

  3. Enable the feature flag:

    Feature.enable(:ci_pipelines_data_ingestion_to_click_house)
  4. Create a new Ci::FinishedPipelineChSyncEvent record for each finished pipeline on the GDK console:

    Ci::Pipeline.finished.where.not(finished_at: nil).order(finished_at: :asc).each_batch(of: 40000) do |batch|
      # Map each project in the batch to its project namespace ID
      project_namespace_ids = ::Project.id_in(batch.pluck(:project_id).uniq).pluck(:id, :project_namespace_id).to_h

      Ci::FinishedPipelineChSyncEvent.transaction do
        Ci::FinishedPipelineChSyncEvent.insert_all(
          batch.map { |pipeline| { pipeline_id: pipeline.id, pipeline_finished_at: pipeline.finished_at, project_namespace_id: project_namespace_ids[pipeline.project_id] } },
          unique_by: [:pipeline_id, :partition]
        )
      end
    end
  5. Invoke the worker to import all the finished pipelines (or go to http://gdk.test:3000/admin/background_jobs and wait at most 4 minutes for the job to run):

    Ci::ClickHouse::FinishedPipelinesSyncWorker.new.perform
  6. The worker should import all the finished pipelines into the ClickHouse ci_finished_pipelines table.
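The console command in step 4 turns each finished pipeline into one sync-event row, resolving the project namespace through a per-batch lookup. The plain-Ruby sketch below mirrors only that mapping, without Rails or a database. `SketchPipeline` and `build_sync_event_rows` are hypothetical names introduced for illustration, and the sample values in the test data are made up.

```ruby
# Hypothetical stand-in for a Ci::Pipeline record (not the real model).
SketchPipeline = Struct.new(:id, :project_id, :finished_at)

# Builds one row per finished pipeline, oldest first.
# project_namespace_ids is a Hash of project_id => project_namespace_id,
# playing the role of the per-batch lookup in step 4.
def build_sync_event_rows(pipelines, project_namespace_ids)
  pipelines
    .reject { |pipeline| pipeline.finished_at.nil? } # skip pipelines without a finish time
    .sort_by(&:finished_at)                          # same ordering as finished_at: :asc
    .map do |pipeline|
      {
        pipeline_id: pipeline.id,
        pipeline_finished_at: pipeline.finished_at,
        project_namespace_id: project_namespace_ids[pipeline.project_id]
      }
    end
end
```

In the real command, the resulting rows are passed to `insert_all` inside a transaction; the sketch stops at building them.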

Edited by Pedro Pombeiro
