[sdlc] Create an ETL simulator for ClickHouse cloud datalake at scale
Problem To Solve
There's currently no way to validate the SDLC module's behavior at production-like scale, with multiple namespaces and gigabytes of data in the ClickHouse Cloud data lake. Without this, we can't be confident the module will hold up under real-world conditions or produce the evidence needed for PREP.
Proposed Solution
Build an ETL simulator for the SDLC module that populates the ClickHouse Cloud data lake at varying scales, up to multiple gigabytes, across multiple namespaces. It should be runnable under different conditions (load, failure modes, data shapes) and produce artifacts usable for PREP reviews.
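To make the proposal concrete, here is a minimal sketch of what the simulator's data-generation core could look like. This is an illustration only: the `ScaleProfile` parameters, the `ns_<i>` namespace naming, and the row schema are all hypothetical, and the actual insert path (e.g. `clickhouse-connect`'s `client.insert`) is stubbed out as a comment since the real connection details are not decided here.

```python
import random
import string
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple

@dataclass
class ScaleProfile:
    """Hypothetical knobs controlling simulated load."""
    namespaces: int          # how many namespaces to populate
    rows_per_namespace: int  # total rows generated per namespace
    batch_size: int          # rows per insert batch

def gen_rows(namespace: str, n: int, seed: int = 0) -> Iterator[Dict]:
    """Yield deterministic synthetic rows for one namespace.

    A fixed seed keeps runs reproducible, which matters when
    comparing results across simulator runs for PREP evidence.
    """
    rng = random.Random(seed)
    for i in range(n):
        yield {
            "namespace": namespace,
            "event_id": i,
            "payload": "".join(rng.choices(string.ascii_lowercase, k=64)),
        }

def simulate(profile: ScaleProfile) -> Iterator[Tuple[str, List[Dict]]]:
    """Yield (namespace, batch) pairs ready for insertion.

    In a real run, each batch would be written to the data lake,
    e.g. via clickhouse-connect:
        client.insert(table, batch_as_columns, ...)
    Yielding batches instead keeps memory flat even at
    multi-gigabyte scale.
    """
    for ns_idx in range(profile.namespaces):
        ns = f"ns_{ns_idx}"  # hypothetical namespace naming scheme
        batch: List[Dict] = []
        for row in gen_rows(ns, profile.rows_per_namespace, seed=ns_idx):
            batch.append(row)
            if len(batch) == profile.batch_size:
                yield ns, batch
                batch = []
        if batch:  # flush the final partial batch
            yield ns, batch
```

Failure modes and data shapes could then be layered on top, e.g. by wrapping `gen_rows` to inject malformed payloads or by making the insert step drop batches at a configurable rate.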