perf(compiler): close variable-length traversal cliff
A 3-rel multi-hop count had two compiler issues: it silently undercounted paths by ~8% and ran 22 to 34% slower than necessary. Both surface in the optimizer's handling of variable-length UNION arms.
The cliff
Query: User -> AUTHORED -> MR -> IN_PROJECT -> Project <- CONTAINS x{1..3} <- Group, user pinned, aggregation count. The compiler emitted SQL with two compounding issues:
- Wrong SIP placement. The to-side cascade IN-subquery was injected on
e1.target_idfor every UNION arm regardless of depth. At depth>1e1.target_idis an intermediate hop, not the to-side, so depth>1 arms returned no rows and the count silently undercounted. - Static kind constraints stopped at the OUTER alias.
inject_entity_kind_filtersconstrainedhop_e<i>.source_kindandtarget_kind, but ClickHouse will not propagate those into per-arm scans. Each arm scanned every CONTAINS edge in the namespace instead of using the kind-led PK projections (by_rel_source_kind,by_rel_target_kind).
Numbers
3 runs each, this MR rebased on top of !1067 (merged). ClickHouse-reported metrics (query_duration_ms, read_rows, read_bytes, memory_usage) against a representative dataset.
| Metric | Before (main HEAD) | After (this MR) | Change |
|---|---|---|---|
| Count | 2043 (under) | 2207 (correct) | +164 paths recovered |
| Wall | 3.64-4.78 s | 2.84-3.64 s | -22 to -34% |
query_duration_ms |
3175-4323 ms | 2389-3207 ms | -25 to -36% |
read_rows |
6.27 M (under-scan) | 13.20 M (correct work) | +110% (real paths) |
read_bytes |
214 MB | 657 MB | +207% (correct work) |
memory_usage |
426-553 MB | 682-722 MB | higher (driven by correct depth-2/3 inclusion) |
What changed
All in optimize.rs.
Anchor parameter on inject_sip_for_aliases. Adds ArmAnchor::{First,Last}. The to-side cascade IN-subquery now lands on the chain's rightmost edge per arm (e<depth>.target_id outgoing, e<depth>.source_id incoming), matching the column the to-side node exposes. Restores correct counts.
New pass push_kind_literals_into_variable_length_arms. For each variable-length relationship whose endpoints have entity types, walks the corresponding hop_e<i> UNION ALL and injects static e1.<source_kind> = '<from>' and e<depth>.<target_kind> = '<to>' predicates into every arm. Static literals enable kind-led PK granule pruning; dynamic IN-subqueries cannot.
Suppress the per-arm to-side IN-subquery when both endpoints are kind-pinned. The arm-internal probe duplicates the outer node-table cascade SIP and was the dominant cost on top of an already kind-pruned scan. Pass order matters: the new pass runs immediately after inject_entity_kind_filters, so SIP injection knows whether arms already carry the kind constraint and can skip the redundant probe.
M1 input DSL
{
"query_type": "aggregation",
"nodes": [
{"id": "u", "entity": "User", "node_ids": [116]},
{"id": "mr", "entity": "MergeRequest"},
{"id": "p", "entity": "Project"},
{"id": "g", "entity": "Group"}
],
"relationships": [
{"type": "AUTHORED", "from": "u", "to": "mr"},
{"type": "IN_PROJECT", "from": "mr", "to": "p"},
{"type": "CONTAINS", "from": "g", "to": "p", "min_hops": 1, "max_hops": 3}
],
"aggregations": [{"function": "count", "target": "u", "alias": "n"}],
"limit": 3
}Before: depth>1 arms get filtered to zero by `e1.target_id IN _cascade_p`
INNER JOIN (
SELECT e1.target_id AS end_id, 1 AS depth FROM gl_edge AS e1
WHERE rel_kind = 'CONTAINS' AND e1.target_id IN _cascade_p
UNION ALL
SELECT e2.target_id AS end_id, 2 AS depth FROM gl_edge AS e1
INNER JOIN gl_edge AS e2 ON e1.target_id = e2.source_id AND e2.rel_kind = 'CONTAINS'
WHERE rel_kind = 'CONTAINS'
AND e1.target_id IN _cascade_p -- intermediate, never matches Project IDs
UNION ALL
SELECT e3.target_id AS end_id, 3 AS depth ...
WHERE rel_kind = 'CONTAINS'
AND e1.target_id IN _cascade_p -- intermediate, never matches Project IDs
) AS hop_e2 ...After: per-arm kind literals; outer p.id IN _cascade_p JOIN provides the to-side filter
INNER JOIN (
SELECT e1.target_id AS end_id, 1 AS depth FROM gl_edge AS e1
WHERE e1.rel_kind = 'CONTAINS'
AND e1.source_kind = 'Group' AND e1.target_kind = 'Project'
UNION ALL
SELECT e2.target_id AS end_id, 2 AS depth FROM gl_edge AS e1
INNER JOIN gl_edge AS e2 ON e1.target_id = e2.source_id AND e2.rel_kind = 'CONTAINS'
WHERE e1.rel_kind = 'CONTAINS'
AND e1.source_kind = 'Group' AND e2.target_kind = 'Project'
UNION ALL
SELECT e3.target_id AS end_id, 3 AS depth ...
WHERE e1.rel_kind = 'CONTAINS'
AND e1.source_kind = 'Group' AND e3.target_kind = 'Project'
) AS hop_e2 ON hop_e2.end_id = p.id
INNER JOIN (SELECT * FROM gl_project AS p WHERE p.id IN _cascade_p ...) AS p ...Tests
320 compiler unit tests pass (4 new): leftmost vs rightmost anchor walking, per-arm injection at all depths, Incoming direction column swap, skip-when-no-entity. An end-to-end test in lib.rs renders the M1 query through compile() and asserts each arm carries the expected static kind literals and that the redundant to-side IN-subquery is suppressed.
Related
!1067 (merged) adds traversal hop frontier CTEs that narrow the from-side of intermediate edges in multi-hop traversals (Traversal queries only). This MR addresses the to-side correctness gap and the kind-pruning gap, and applies to Traversal and Aggregation. The two are complementary; no expected merge conflict.