# Manage ClickHouse migrations with a dedicated CRD
(Mostly copied from !10 which has an implementation proposal)
This is a proposal for making database changes more robust by introducing a new CRD: `ClickhouseMigration`.
## Background
Running migration queries directly from the EventTaxonomy reconciliation is not robust enough.
These queries can take a long time, and nothing stops the EventTaxonomy controller from executing the same ALTER while a previous reconcile is still running it.
There is also no way to observe the progress of each DDL query.
## Solution
Having a queue for DDL queries sounds like a good start. A separate reconciliation loop for a new CRD would provide this, as well as a mechanism for adding the observability we need through status subresources.
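As a rough illustration of the semantics such a loop would provide, here is a minimal Python sketch (names like `MigrationQueue` are made up for illustration, not part of the proposal) of a queue that runs one DDL statement at a time and tracks per-migration status the way a status subresource would:

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Migration:
    name: str
    query: str
    status: str = "Pending"  # Pending -> Running -> Complete


class MigrationQueue:
    """Serializes DDL queries: only one migration runs at a time."""

    def __init__(self):
        self._jobs = deque()

    def enqueue(self, migration: Migration) -> None:
        self._jobs.append(migration)

    def run_all(self, execute) -> None:
        # Drain the queue strictly one at a time, so two ALTERs can
        # never run concurrently the way they could when fired straight
        # from EventTaxonomy reconciles.
        while self._jobs:
            m = self._jobs.popleft()
            m.status = "Running"
            execute(m.query)
            m.status = "Complete"
```

A real controller would of course persist status on the custom resource instead of in memory, but the ordering guarantee is the point here.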
## Example
Here's an example of a new `ClickhouseMigration` custom resource for handling this:
```yaml
apiVersion: dwh.zedge.io/v1beta1
kind: ClickhouseMigration
metadata:
  name: migration1
spec:
  dsn: tcp://clickhouse-server:9000?username=user
  passwordRef: user-secret
  subject:
    database: mydb
    table: mytable
    column: newcolumn
    cluster: v2
  query: "ALTER TABLE mydb.mytable ON CLUSTER v2 ADD COLUMN newcolumn Int32"
status:
  queryStatus: Pending
  # should add useful timestamps here too
```
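For illustration, if the controller ever wanted to derive `spec.query` from `spec.subject` rather than store both independently, it could look roughly like this Python sketch (`build_add_column_query` and its `column_type` parameter are hypothetical, not part of the proposal):

```python
def build_add_column_query(subject: dict, column_type: str) -> str:
    """Compose an ADD COLUMN DDL statement from a spec.subject mapping.

    Hypothetical helper: assumes subject carries database, table,
    column, and cluster keys as in the example manifest.
    """
    return (
        f"ALTER TABLE {subject['database']}.{subject['table']} "
        f"ON CLUSTER {subject['cluster']} "
        f"ADD COLUMN {subject['column']} {column_type}"
    )
```

With the `subject` from the example above and `"Int32"`, this produces exactly the `query` shown in the manifest.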
## Ordering
There are cases where you want to run queries in a certain order. I'm proposing that we handle this by putting a list of names of other ClickhouseMigration objects in `.spec.dependencies`:
```yaml
apiVersion: dwh.zedge.io/v1beta1
kind: ClickhouseMigration
metadata:
  name: migration2
spec:
  dependencies:
    - migration1
```
I haven't been able to come up with real-world examples of queries that need more than one dependency yet, but having a list doesn't hurt.
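The dependency check itself is simple: a migration becomes runnable once everything in its `.spec.dependencies` has completed. A Python sketch of that check (the `runnable` helper is hypothetical, just to make the rule concrete):

```python
def runnable(migrations: dict[str, list[str]], completed: set[str]) -> list[str]:
    """Return names of migrations whose dependencies have all completed.

    migrations maps a migration name to its list of dependency names;
    completed is the set of migrations that have already finished.
    """
    return [
        name
        for name, deps in migrations.items()
        if name not in completed and all(d in completed for d in deps)
    ]
```

The reconcile loop would simply skip any migration whose name is not in the runnable set and requeue it for a later pass.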
## Naming
With a ClickhouseMigration CRD, the EventTaxonomy controller would start creating ClickhouseMigration objects and will need a naming strategy for them. I think appending a content-based base36-encoded hash is a good approach, for example `taxonomyname-wz5adbrrm680`.
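The proposal doesn't pin down the hash function or suffix length; as one possible sketch (SHA-256 truncated to 64 bits and base36-encoded, both of which are assumptions here), in Python:

```python
import hashlib

BASE36_ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def base36(n: int) -> str:
    """Encode a non-negative integer in base36 (lowercase)."""
    if n == 0:
        return "0"
    digits = []
    while n:
        n, r = divmod(n, 36)
        digits.append(BASE36_ALPHABET[r])
    return "".join(reversed(digits))


def migration_name(taxonomy: str, content: str, hash_len: int = 12) -> str:
    # Content-based suffix: the same spec content always yields the same
    # name, so re-creating the same migration object is idempotent.
    digest = hashlib.sha256(content.encode()).digest()
    suffix = base36(int.from_bytes(digest[:8], "big"))[:hash_len]
    return f"{taxonomy}-{suffix}"
```

Since the name is a function of the content, the controller never creates duplicate ClickhouseMigration objects for the same DDL, and a changed query naturally gets a fresh name.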