Export product analytics ClickHouse tables and upload to object storage
Overview
As a project maintainer (or higher), I should be able to export all of my product analytics event data as a SQL dump or CSV file, and download that file for local use or for import into another analytics system.
Implementation plan
- Implement chproxy to allow GitLab to make direct calls to the ClickHouse database containing the product analytics event data.
- Create a background worker that exports the correct table (authenticated by GitLab) to a CSV file:
SELECT * FROM gitlab_project_xxxx INTO OUTFILE 'file' FORMAT CSV
- Upload this file to object storage, if configured, with a lifecycle policy that deletes the file after 24 hours. Where object storage is not configured, store the file on the local file system.
- Email the requester information on how to retrieve the file.
- Create a GraphQL mutation to trigger the export worker, with request deduplication.
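The export and deduplication steps above could look roughly like the sketch below. It is an illustration in Python, not the planned implementation, and rests on several assumptions: the table naming scheme is inferred from the gitlab_project_xxxx placeholder, the proxy URL is made up, authentication is left out, and the in-memory deduplicator stands in for whatever the job queue provides. It sends the query over ClickHouse's HTTP interface with FORMAT CSV and streams the response to a file, because INTO OUTFILE is handled by clickhouse-client and clickhouse-local rather than by the HTTP interface that chproxy fronts.

```python
import re
import time
import urllib.parse
import urllib.request

# Hypothetical naming scheme, inferred from the gitlab_project_xxxx placeholder.
TABLE_PATTERN = re.compile(r"^gitlab_project_\d+$")


def export_query(project_id):
    """Build the CSV export query, accepting only a plain non-negative project ID."""
    table = f"gitlab_project_{int(project_id)}"
    if not TABLE_PATTERN.match(table):
        raise ValueError(f"unexpected table name: {table}")
    return f"SELECT * FROM {table} FORMAT CSV"


def stream_export(base_url, project_id, dest_path,
                  opener=urllib.request.urlopen, chunk_size=65536):
    """Stream the query result from the proxy into dest_path; return bytes written.

    base_url is an assumed chproxy endpoint. Credentials are omitted here.
    """
    params = urllib.parse.urlencode({"query": export_query(project_id)})
    url = base_url.rstrip("/") + "/?" + params
    written = 0
    with opener(url) as response, open(dest_path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written


class ExportDeduplicator:
    """In-memory stand-in for request deduplication (one export per key per TTL)."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._pending = {}

    def try_acquire(self, project_id, user_id):
        """Return True if a new export may start, False if one is still pending."""
        key = (project_id, user_id)
        now = self._clock()
        expires_at = self._pending.get(key)
        if expires_at is not None and expires_at > now:
            return False
        self._pending[key] = now + self._ttl
        return True
```

A mutation resolver would call try_acquire first and enqueue the worker only when it returns True. The worker would then call stream_export and hand the resulting file to the upload step. Validating the table name before interpolating it keeps the query limited to the requesting project's table.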
Edited by Max Woolf