# Manage ClickHouse migrations with a dedicated CRD
(Mostly copied from !10 which has an implementation proposal)
This is a proposal for making database changes more robust by introducing a new CRD: `ClickhouseMigration`.
## Background
Running migration queries directly from the EventTaxonomy reconciliation is not robust enough.
These queries can take a long time, and nothing stops the EventTaxonomy controller from executing the same ALTER while a previous reconcile is still running it.
There is also no way to observe the progress of each DDL query.
## Solution
Having a queue for DDL queries sounds like a good start. A separate reconciliation loop for a new CRD would provide this, as well as a mechanism for adding the observability we need through status subresources.
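As a rough illustration of the semantics such a loop would provide, here is a minimal Python sketch (names like `MigrationQueue` are made up for illustration, not part of the proposal) of a queue that runs one DDL statement at a time and tracks per-migration status the way a status subresource would:

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Migration:
    name: str
    query: str
    status: str = "Pending"  # Pending -> Running -> Complete


class MigrationQueue:
    """Serializes DDL queries: only one migration runs at a time."""

    def __init__(self):
        self._jobs = deque()

    def enqueue(self, migration: Migration) -> None:
        self._jobs.append(migration)

    def run_all(self, execute) -> None:
        # Drain the queue strictly one at a time, so two ALTERs can
        # never run concurrently the way they could when fired straight
        # from EventTaxonomy reconciles.
        while self._jobs:
            m = self._jobs.popleft()
            m.status = "Running"
            execute(m.query)
            m.status = "Complete"
```

A real controller would of course persist status on the custom resource instead of in memory, but the ordering guarantee is the point here.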
## Example
Here's an example of a new `ClickhouseMigration` custom resource for handling this:
```yaml
apiVersion: dwh.zedge.io/v1beta1
kind: ClickhouseMigration
metadata:
  name: migration1
spec:
  dsn: tcp://clickhouse-server:9000?username=user
  passwordRef: user-secret
  subject:
    database: mydb
    table: mytable
    column: newcolumn
    cluster: v2
  query: "ALTER TABLE mydb.mytable ON CLUSTER v2 ADD COLUMN newcolumn Int32"
status:
  queryStatus: Pending
  # should add useful timestamps here too
```
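For illustration, if the controller ever wanted to derive `spec.query` from `spec.subject` rather than store both independently, it could look roughly like this Python sketch (`build_add_column_query` and its `column_type` parameter are hypothetical, not part of the proposal):

```python
def build_add_column_query(subject: dict, column_type: str) -> str:
    """Compose an ADD COLUMN DDL statement from a spec.subject mapping.

    Hypothetical helper: assumes subject carries database, table,
    column, and cluster keys as in the example manifest.
    """
    return (
        f"ALTER TABLE {subject['database']}.{subject['table']} "
        f"ON CLUSTER {subject['cluster']} "
        f"ADD COLUMN {subject['column']} {column_type}"
    )
```

With the `subject` from the example above and `"Int32"`, this produces exactly the `query` shown in the manifest.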
## Ordering
There are cases where you want to run queries in a certain order. I'm proposing that we handle this by putting a list of names of other ClickhouseMigration objects in `.spec.dependencies`:
```yaml
apiVersion: dwh.zedge.io/v1beta1
kind: ClickhouseMigration
metadata:
  name: migration2
spec:
  dependencies:
    - migration1
```
I haven't been able to come up with real-world examples of queries that need more than one dependency yet, but having a list doesn't hurt.
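The dependency check itself is simple: a migration becomes runnable once everything in its `.spec.dependencies` has completed. A Python sketch of that check (the `runnable` helper is hypothetical, just to make the rule concrete):

```python
def runnable(migrations: dict[str, list[str]], completed: set[str]) -> list[str]:
    """Return names of migrations whose dependencies have all completed.

    migrations maps a migration name to its list of dependency names;
    completed is the set of migrations that have already finished.
    """
    return [
        name
        for name, deps in migrations.items()
        if name not in completed and all(d in completed for d in deps)
    ]
```

The reconcile loop would simply skip any migration whose name is not in the runnable set and requeue it for a later pass.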
## Naming
With a ClickhouseMigration CRD, the EventTaxonomy controller would start creating ClickhouseMigration objects and will need a naming strategy for them. I think appending a content-based base36-encoded hash is a good approach, for example `taxonomyname-wz5adbrrm680`.
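The proposal doesn't pin down the hash function or suffix length; as one possible sketch (SHA-256 truncated to 64 bits and base36-encoded, both of which are assumptions here), in Python:

```python
import hashlib

BASE36_ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def base36(n: int) -> str:
    """Encode a non-negative integer in base36 (lowercase)."""
    if n == 0:
        return "0"
    digits = []
    while n:
        n, r = divmod(n, 36)
        digits.append(BASE36_ALPHABET[r])
    return "".join(reversed(digits))


def migration_name(taxonomy: str, content: str, hash_len: int = 12) -> str:
    # Content-based suffix: the same spec content always yields the same
    # name, so re-creating the same migration object is idempotent.
    digest = hashlib.sha256(content.encode()).digest()
    suffix = base36(int.from_bytes(digest[:8], "big"))[:hash_len]
    return f"{taxonomy}-{suffix}"
```

Since the name is a function of the content, the controller never creates duplicate ClickhouseMigration objects for the same DDL, and a changed query naturally gets a fresh name.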