[CA PoC] Audit the click_house gem / implement our own

Why are we doing this work

To get results from the PoC quickly, we need a simple way to talk to a ClickHouse database. Based on the proposal "Draft out ClickHouse read layer interface", there are two Ruby gems that could be used: click_house and clickhouse-activerecord.

Let's rule out clickhouse-activerecord for now, reasoning:

clickhouse-activerecord implements an AR adapter, which could cause issues in the application because there is an assumption that an AR base class always uses PostgreSQL. See the experiment for more details: !104716 (closed)

To accelerate the PoC, audit the click_house gem which uses the HTTP interface to talk to ClickHouse.

Ideally, we should be able to read/write/update data in the ClickHouse database. DB schema management (migrations) is not part of the PoC.
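
For reference, a rough sketch of what talking to the ClickHouse HTTP interface could look like with only the Ruby standard library. The class name `MinimalClickHouseClient` and its methods are hypothetical and are not the click_house gem's API. The sketch assumes authentication via the `X-ClickHouse-User` / `X-ClickHouse-Key` headers and database selection via the `database` query parameter, and it uses plain `Net::HTTP` rather than any GitLab wrapper.

```ruby
require 'net/http'
require 'uri'
require 'json'

# Hypothetical minimal client, for illustration only.
class MinimalClickHouseClient
  def initialize(url:, database:, username:, password:)
    @uri = URI.parse(url)
    @database = database
    @username = username
    @password = password
  end

  # Builds the POST request without sending it, so it can be
  # inspected (or tested) without a running ClickHouse server.
  def build_request(sql)
    uri = @uri.dup
    uri.query = URI.encode_www_form(database: @database)

    request = Net::HTTP::Post.new(uri)
    request['X-ClickHouse-User'] = @username
    request['X-ClickHouse-Key'] = @password
    request.body = sql
    request
  end

  # Sends any statement (INSERT, ALTER TABLE ... UPDATE, ...) and
  # raises on a non-2xx response.
  def execute(sql)
    request = build_request(sql)
    response = Net::HTTP.start(@uri.host, @uri.port, use_ssl: @uri.scheme == 'https') do |http|
      http.request(request)
    end

    unless response.is_a?(Net::HTTPSuccess)
      raise "ClickHouse error (#{response.code}): #{response.body}"
    end

    response
  end

  # Runs a SELECT and parses the JSONEachRow output
  # (one JSON object per line) into an array of hashes.
  def select(sql)
    execute("#{sql} FORMAT JSONEachRow").body.each_line.map { |line| JSON.parse(line) }
  end
end

# Usage (needs a running ClickHouse instance, e.g. from GDK):
#   client = MinimalClickHouseClient.new(url: 'http://localhost:8123/',
#                                        database: 'default',
#                                        username: 'default', password: '')
#   client.select('SELECT 1 AS one')
```

Keeping request construction separate from sending is one possible way to make the HTTP layer swappable later; whether that fits the final design is part of the audit.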

Check the following:

  • Is there a CI workflow/pipeline for the gem?
    • Does it run against Ruby 3.0, 3.1, and 3.2?
  • Does it use a C extension?
  • Can the HTTP library be swapped? Can we use Gitlab::HTTP?
  • How well-tested is the gem? Is there coverage info?
  • Would it be feasible to fork the gem and set up a test pipeline?
  • How can we configure the database credentials? (YAML, env variables?)
  • Would it be feasible to implement a new client from scratch (assuming that the gem is not suitable for us)?
  • Check how the gem logs statements. Are there logger calls or puts invocations in the codebase?
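
To make the credentials question above more concrete, here is a small sketch of one way the configuration could be resolved: defaults, then a YAML document, then environment variables. The module name `ClickHouseConfig`, the `CLICKHOUSE_*` variable names, and the default values are all assumptions for illustration; neither the gem nor the application is known to use them.

```ruby
require 'yaml'

# Hypothetical configuration loader, for illustration only.
module ClickHouseConfig
  # Assumed defaults for a local development setup.
  DEFAULTS = {
    'url' => 'http://localhost:8123',
    'database' => 'default',
    'username' => 'default',
    'password' => ''
  }.freeze

  # yaml: an optional YAML string (e.g. the contents of a config file).
  # env:  anything that responds to #fetch(key, default), such as ENV or a Hash.
  # Precedence: environment variable > YAML value > default.
  def self.load(yaml: nil, env: ENV)
    from_yaml = yaml ? (YAML.safe_load(yaml) || {}) : {}

    DEFAULTS.merge(from_yaml).to_h do |key, value|
      [key, env.fetch("CLICKHOUSE_#{key.upcase}", value)]
    end
  end
end
```

Passing `env:` explicitly keeps the loader easy to test; in the application it would default to `ENV`.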

How to use ClickHouse

ClickHouse is available via GDK: gitlab-development-kit!2535 (merged)

Edited by Adam Hegyi