feat(xtask): wire up e2e harness to tilt + hardening

What does this MR do and why?

Hardens the cargo xtask E2E runner for reliable cleanroom runs: adds the Tilt/GKG stack lifecycle (--gkg-only iteration loop, post-Tilt error handling, GKG teardown), replaces sequential siphon polling with concurrent table gates, fixes four Helm chart bugs discovered during live testing, and adds siphon_namespace_details to the pre-dispatch poll to prevent an intermittent gl_group: 0 race condition. A full teardown-then-setup now passes 38/38 redaction tests on the first attempt.
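A minimal sketch of the concurrent table gate described above, assuming a single loop that probes every still-pending table in each cycle rather than waiting on one table at a time. The name wait_for_tables and the injected is_ready probe are hypothetical; the real harness presumably queries ClickHouse and uses its own constants for the interval and timeout.

```rust
use std::collections::BTreeSet;
use std::thread;
use std::time::{Duration, Instant};

/// Polls all tables in `tables` once per cycle until each one has reported
/// ready, or until `timeout` elapses. On timeout, returns the tables that
/// never became ready.
///
/// `is_ready` is a hypothetical probe (for example, a row-count query). It is
/// injected so the gate logic can be exercised without a database.
pub fn wait_for_tables<F>(
    tables: &[&str],
    interval: Duration,
    timeout: Duration,
    mut is_ready: F,
) -> Result<(), Vec<String>>
where
    F: FnMut(&str) -> bool,
{
    let deadline = Instant::now() + timeout;
    let mut pending: BTreeSet<&str> = tables.iter().copied().collect();

    loop {
        // Probe every still-pending table in the same cycle and drop the
        // ones that are ready.
        pending.retain(|table| !is_ready(*table));

        if pending.is_empty() {
            return Ok(());
        }
        if Instant::now() >= deadline {
            return Err(pending.iter().map(|t| String::from(*t)).collect());
        }
        thread::sleep(interval);
    }
}
```

Under this sketch, including siphon_namespace_details in the table list means dispatch is held back until that table has data, instead of proceeding as soon as the other tables are populated.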

Continues from !319 (merged). Ports from !304 (closed).

What changed

  • main.rs: --gkg-only flag for both setup and teardown. On setup it runs only phase 3 (steps 15–25); on teardown it tears down only Tilt/GKG and keeps GitLab + Colima running

  • pipeline/gkg.rs — post-Tilt error wrapper that kills Tilt on failure; ensure_tilt_secrets() regenerates .secrets from running GitLab if missing; concurrent siphon poll (step 21) checks all SIPHON_POLL_TABLES each cycle; concurrent graph table poll (step 22) tracks all 9 GL_TABLES in a single loop; datalake migration output captured to .dev/clickhouse-migrate.log; datalake diagnostics dump between steps 24 and 25; kill_tilt() public helper shared by teardown and error handler

  • constants.rs: SIPHON_POLL_TABLES (3 tables including siphon_namespace_details) replaces SIPHON_MR_TABLE/SIPHON_KG_NS_TABLE; unified SIPHON_POLL_INTERVAL; new constants for GKG teardown resources; removed dead per-table constants; INDEXER_POLL_TIMEOUT 300s → 600s

  • teardown.rs — new step 1: teardown_gkg_stack (kill Tilt, tilt down, delete ClickHouse/NATS/siphon/secrets/PVCs/Helm release); all steps renumbered (1–6); --gkg-only early exit preserving .secrets

  • e2e/tilt/Tiltfile: allow_k8s_contexts includes colima-cng; caproni Postgres overrides inlined via helm() set=

  • e2e/tilt/values.yaml — E2E Helm values with nativePort: 9000 for siphon consumer, 11 table mappings, local GKG image
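The post-Tilt error wrapper listed for pipeline/gkg.rs can be sketched roughly as follows. This is an illustration under assumptions, not the code in this MR: run_with_cleanup is a hypothetical name, and the cleanup closure stands in for whatever kill_tilt() does in the real harness.

```rust
/// Runs `steps`. If they fail, runs `cleanup` and then returns the original
/// error unchanged. On success, `cleanup` is not called.
///
/// In the harness, `steps` would be the post-Tilt pipeline steps and
/// `cleanup` would stop the Tilt process (both are assumptions here).
pub fn run_with_cleanup<T, E>(
    steps: impl FnOnce() -> Result<T, E>,
    cleanup: impl FnOnce(),
) -> Result<T, E> {
    let result = steps();
    if result.is_err() {
        // Clean up only on failure so a successful run keeps the stack up.
        cleanup();
    }
    result
}
```

Keeping the cleanup in one wrapper, rather than repeating it at each fallible step, matches the description of a single helper shared by teardown and the error handler.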

How to validate locally

# Full run (all phases including redaction tests)
GITLAB_SRC=~/Desktop/Code/gdk/gitlab cargo xtask e2e setup --gkg

# Iterate on just GKG stack (phases 1+2 already done)
GITLAB_SRC=~/Desktop/Code/gdk/gitlab cargo xtask e2e setup --gkg-only

# Tear down just GKG stack, keep GitLab + Colima
GITLAB_SRC=~/Desktop/Code/gdk/gitlab cargo xtask e2e teardown --gkg-only

# Full teardown
GITLAB_SRC=~/Desktop/Code/gdk/gitlab cargo xtask e2e teardown

References

  • Continues: !319 (merged)
  • Derived from: !304 (closed) (shell-based E2E harness)
  • Completes the full cargo xtask e2e setup --gkg pipeline (steps 1–25)

Testing

The E2E run now passes all 38 redaction tests (38/38).

Performance Analysis

  • This merge request does not introduce any performance regression. If a performance regression is expected, explain why.
Edited by Michael Usachenko
