Add experimental ClickHouse HTTP client
What does this MR do and why?
This MR adds a lightweight gem for interacting with ClickHouse databases via HTTP interface. We're following the gem guidelines here: https://docs.gitlab.com/ee/development/gems.html
The client was created so groupoptimize can move forward with the PoC where we plan to rebuild a feature using ClickHouse.
The client should support executing an aggregation query and return counts. At a later step it should also support bulk inserts. Models and validations are out of the scope.
Why not use an already existing gem?
The original idea was to add an external gem (#414386 (closed)) however, the gem does much more than we need for this PoC and adding an external gem would require careful review of the codebase. This would eventually mean:
- Review the code and assess its quality.
- Forking the gem and making sure we have a running pipeline for the supported ruby versions.
How does it work?
The gem supports multiple databases which can be specified via a YAML file. The Gitlab::ClickHouse::Client::Configuration
object contains the databases and other options so we can use our utilities (Gitlab::HTTP
, Gitlab::Json
).
Without the YAML file the gem will not do anything, configuration is optional.
Query execution flow:
- Invoke the
execute
method using the configured database and by passing a raw SQL string. - The
Client
invokes thehttp_post_proc
callable object which invokes a HTTP request. - The response will be turned into a
Response
object which can tell whether the response is successful or not. - The body of the response contains a raw JSON string, this will be parsed with the configured
json_parser
. - The JSON response includes the returned columns with their respective data types. Using this data, the
Formatter
casts the values if necessary. - The raw data (array of hashes) are returned.
How to test it
- Set up a ClickHouse server and a user with password: https://clickhouse.com/docs/en/install
- Start a console:
clickhouse-client --password
- Create a new database:
create database gitlab_clickhouse_test;
- Connect to the new database:
clickhouse-client -d gitlab_clickhouse_test --password
- Create a new table:
CREATE TABLE issues (
`id` UInt64,
`title` String DEFAULT '',
`description` Nullable(String),
`created_at` DateTime64(6, 'UTC') DEFAULT now(),
`updated_at` DateTime64(6, 'UTC') DEFAULT now()
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY (id)
- Insert some data:
insert into issues (id, title, description) values (1, 'Title 1', 'description');
insert into issues (id, title, description) values (2, 'Title 2', 'description');
insert into issues (id, title, description) values (3, 'Title 3', null);
- Create a config file in your gitlab dir:
config/click_house.yml
development:
main:
database: gitlab_clickhouse_test
url: 'http://localhost:8123'
username: default
password: clickhouse
- Start rails console and execute:
Gitlab::ClickHouse::Client.execute("select * from issues", :main)
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.