
Export product analytics ClickHouse tables and upload to object storage

Overview

As a project maintainer (or higher), I should be able to export all of my product analytics event data as a SQL dump or CSV file, and download that file for local use or for import into another analytics system.

Implementation plan

  • Implement chproxy to allow GitLab to make direct calls to the ClickHouse database containing the product analytics event data.
  • Create a background worker that exports the correct table (authenticated by GitLab) to a CSV file: SELECT * FROM gitlab_project_xxxx INTO OUTFILE 'file' FORMAT CSV
  • Upload this file to object storage, if configured, with a lifecycle policy that deletes the file after 24 hours. Where object storage is not configured, store the file on the local file system.
  • Email the requester information on how to retrieve the file.
  • Create a GraphQL mutation to trigger the export worker, with request deduplication.
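
The query-building and deduplication steps above could be sketched roughly as follows. This is an illustrative sketch only, not the actual GitLab implementation. It assumes that the `xxxx` suffix in the table name is the numeric project ID and that requests are deduplicated per project and requester. All function and class names are hypothetical. As far as I know, ClickHouse's INTO OUTFILE clause is handled by the command-line client rather than the HTTP interface, so the sketch assumes the worker sends a plain FORMAT CSV query through chproxy and writes the response body to a file itself.

```python
import hashlib

# Table prefix taken from the example query in the plan above.
TABLE_PREFIX = "gitlab_project_"


def export_table_name(project_id: int) -> str:
    """Return the per-project table name (assumes the suffix is the project ID)."""
    # Accept only positive integers so nothing else can be interpolated into SQL.
    if isinstance(project_id, bool) or not isinstance(project_id, int) or project_id <= 0:
        raise ValueError("project_id must be a positive integer")
    return f"{TABLE_PREFIX}{project_id}"


def build_export_query(project_id: int) -> str:
    """Build the CSV export query; the caller streams the result to a file."""
    return f"SELECT * FROM {export_table_name(project_id)} FORMAT CSV"


def dedup_key(project_id: int, user_id: int) -> str:
    """Stable key identifying one export request (hypothetical key scheme)."""
    raw = f"product_analytics_export:{project_id}:{user_id}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class ExportScheduler:
    """In-memory stand-in for whatever deduplication the job queue provides."""

    def __init__(self) -> None:
        self._in_flight: set[str] = set()

    def request_export(self, project_id: int, user_id: int) -> bool:
        """Return True if a new export was scheduled, False if one is already pending."""
        key = dedup_key(project_id, user_id)
        if key in self._in_flight:
            return False
        self._in_flight.add(key)
        return True

    def finish_export(self, project_id: int, user_id: int) -> None:
        """Mark the export as done so a later request can be scheduled again."""
        self._in_flight.discard(dedup_key(project_id, user_id))
```

In a real deployment the in-flight set would live in shared storage or be replaced by the job queue's own deduplication, since multiple application processes would otherwise each keep their own copy.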
Edited by Max Woolf