Draft: Spike Research: ClickHouse as the Datastore for VSA API
Problem to solve
- For GL VSM to be the SSOT for DevOps Analytics, users need to aggregate multiple data records into one VSA.
- The VSA API needs to ingest raw event data from almost any DevOps tool and normalize it into one stream.
- PostgreSQL is not set up for analytical workloads.
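To make the "normalize into one stream" requirement concrete, here is a minimal sketch of what normalization could look like. Everything in it is an assumption for illustration: the `VsaEvent` record, the `from_jira` / `from_gitlab` / `normalize` helpers, and the simplified payload shapes are hypothetical. Real webhook payloads carry many more fields and may format timestamps differently.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class VsaEvent:
    """Hypothetical common shape for an event, whatever tool produced it."""
    source: str
    event_type: str
    item_id: str
    occurred_at: datetime


def from_jira(payload: dict) -> VsaEvent:
    # Assumes a simplified Jira-style payload with a millisecond epoch timestamp.
    return VsaEvent(
        source="jira",
        event_type=payload["webhookEvent"],
        item_id=payload["issue"]["key"],
        occurred_at=datetime.fromtimestamp(payload["timestamp"] / 1000, tz=timezone.utc),
    )


def from_gitlab(payload: dict) -> VsaEvent:
    # Assumes a simplified GitLab-style payload with an ISO 8601 timestamp.
    attrs = payload["object_attributes"]
    return VsaEvent(
        source="gitlab",
        event_type=f'{payload["object_kind"]}:{attrs["action"]}',
        item_id=f'{payload["project"]["path_with_namespace"]}#{attrs["iid"]}',
        occurred_at=datetime.fromisoformat(attrs["updated_at"]),
    )


def normalize(jira_payloads: list, gitlab_payloads: list) -> list:
    """Merge events from both tools into one stream ordered by time."""
    events = [from_jira(p) for p in jira_payloads]
    events += [from_gitlab(p) for p in gitlab_payloads]
    return sorted(events, key=lambda e: e.occurred_at)


jira_sample = {
    "webhookEvent": "jira:issue_updated",
    "timestamp": 1700000000000,
    "issue": {"key": "VSA-1"},
}
gitlab_sample = {
    "object_kind": "issue",
    "object_attributes": {
        "iid": 7,
        "action": "update",
        "updated_at": "2023-11-14T22:00:00+00:00",
    },
    "project": {"path_with_namespace": "group/project"},
}

stream = normalize([jira_sample], [gitlab_sample])
```

The point of the sketch is only that each tool gets its own small adapter while downstream aggregation sees a single record type. Which fields that record actually needs is one of the open questions of this spike.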
Reference use cases
(in that priority order)
- Custom VSA stages based on Jira events.
- Custom VSA stages based on GitLab webhooks - add Start/End events for Issue Assigned
- Expose aggregated VSA metrics for external BI tools
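As a rough illustration of the custom-stage use cases above, a stage can be thought of as the time between a configured start event and a configured end event for the same work item. The sketch below assumes events arrive as `(item_id, event_type, occurred_at)` tuples. The `stage_durations` helper and the event names `issue_assigned` and `issue_closed` are hypothetical and not part of any existing API.

```python
from datetime import datetime, timedelta


def stage_durations(events, start_type, end_type):
    """Return {item_id: duration} from the first start event to the first
    end event that follows it. Items without both events are omitted."""
    started = {}
    durations = {}
    # Process in time order so that an end event seen before any start is ignored.
    for item_id, event_type, occurred_at in sorted(events, key=lambda e: e[2]):
        if event_type == start_type and item_id not in started:
            started[item_id] = occurred_at
        elif (
            event_type == end_type
            and item_id in started
            and item_id not in durations
        ):
            durations[item_id] = occurred_at - started[item_id]
    return durations


sample_events = [
    ("VSA-1", "issue_assigned", datetime(2024, 1, 1, 9, 0)),
    ("VSA-2", "issue_assigned", datetime(2024, 1, 1, 10, 0)),
    ("VSA-1", "issue_closed", datetime(2024, 1, 2, 9, 0)),
]

result = stage_durations(sample_events, "issue_assigned", "issue_closed")
```

Here `VSA-1` yields a duration of one day, while `VSA-2` is left out because it has no end event yet. Metrics exposed to BI tools would be aggregates (for example, a median) over such per-item durations.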
Investigation and clarification questions:
- Are there any consistency problems we might encounter if we store VSA API events in ClickHouse?
- Can we move the VSA table to ClickHouse?
- What enhancements does the VSA stage event schema need?
- What should be the authentication and authorization approach? Assuming SaaS first.
- Are there any other use cases for reference?
Expected Outcome
- Outline (at a high level) the major steps that we need to take
- Technical proposal for a POC
Edited by Haim Snir