Orbit: Ruby DSL declarations and `prepend_mod_with` are invisible to the source_code indexer (DeclarativePolicy policies, ActiveRecord scopes, CE/EE composition)
## Summary
The Orbit `source_code` domain indexer captures conventional Ruby/JS code structure well (`def` methods, `class A < B`, `include Module`) but does not walk Ruby DSL-shaped declarations. For a Rails codebase the size and shape of `gitlab-org/gitlab`, the dominant dependency-shaped relationships are expressed through DSL patterns, none of which the indexer captures today.
Three concrete coverage gaps surfaced during UC-10 customer-zero testing (gitlab-org/orbit/knowledge-graph#606), all sharing the same root cause:
1. **Call edges through DSL block bodies are not extracted.** `condition(:guest) { team_member? }` makes `team_member?` a logical dependency, but the indexer does not register a CALLS edge from the enclosing class to it. The DeclarativePolicy DSL that GitLab policies use (`condition`, `rule`, `policy`, `enable`, `prevent`), ActiveRecord scopes (`scope :X, -> { ... }`), validations (`validates`), callbacks (`before_save`), and route helpers all fall in this bucket.
2. **Ruby `include Module` is not captured as `ImportedSymbol`.** ImportedSymbol coverage is Go (`workhorse/`) and JavaScript (`spec/frontend/`) only. Querying for `ImportedSymbol.identifier_name = "Gitlab::InternalEventsTracking"` returns zero results despite real `include Gitlab::InternalEventsTracking` statements in at least 8 source files.
3. **The `prepend_mod_with(...)` macro is invisible to the `EXTENDS` edge.** This is the GitLab-specific dynamic-prepend mechanism used across the entire `gitlab-org/gitlab` codebase to mix EE modules into CE classes. CE `Project` has 0 incoming EXTENDS edges from `EE::Project`, and `EE::Project` has 0 outgoing EXTENDS edges to CE `Project`, even though `EE::Project` is prepended to CE `Project` and overrides it at runtime.
These are filed as one consolidated issue because they share a root cause (Ruby DSL-shaped declarations and GitLab-specific macros not walked by the indexer), and because addressing them together is likely a single workstream on the indexer side.
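To illustrate why a `def`-oriented walk misses these calls, here is a minimal sketch of a policy-style DSL. `MiniPolicy` and `condition_met?` are hypothetical names invented for this example, not the real policy library; the only point is that the macro stores a block, so `team_member?` is never called from inside a `def` body.

```ruby
# Hypothetical miniature of a policy DSL, for illustration only.
class MiniPolicy
  # Per-class registry of condition name => block.
  def self.conditions
    @conditions ||= {}
  end

  # DSL macro: stores the block instead of defining a method.
  def self.condition(name, &block)
    conditions[name] = block
  end

  # Evaluates a stored block with the policy instance as self.
  def condition_met?(name)
    instance_exec(&self.class.conditions.fetch(name))
  end
end

class ProjectPolicy < MiniPolicy
  # The call to team_member? lives in a block body, not in a def.
  condition(:guest) { team_member? }

  def team_member?
    true
  end
end

policy = ProjectPolicy.new
puts policy.condition_met?(:guest)                  # runs the block, which calls team_member?
puts ProjectPolicy.instance_methods(false).inspect  # only the def-declared method is listed
```

An indexer that records only `def` bodies sees `team_member?` as a definition with no callers, which matches the reproducer for gap 1.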
## Positive partial finding (worth keeping)
The `EXTENDS` edge captures more than the schema describes. The schema says *"inheritance, interface implementation, struct embedding"*, but the edge also captures Ruby `include Module` mixin relationships. CE `Project` has 48 outgoing EXTENDS edges, including dozens of `include`-d concerns (`Routable`, `Sortable`, `EachBatch`, etc.). This is meaningfully more coverage than the schema description implies, and it partially mitigates gap #2 above.
The schema docs should be updated to reflect this — agents reading the schema literally would not expect `include` relationships in `EXTENDS`.
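The overlap is plausible from Ruby's own object model, where a superclass and included modules sit in the same ancestor chain. A minimal sketch with stand-in definitions (the names mirror the examples above, but the bodies are empty placeholders, not GitLab code):

```ruby
module Routable; end
module Sortable; end
class ApplicationRecord; end

class Project < ApplicationRecord
  include Routable
  include Sortable
end

# Ancestors contributed by Project's own declarations, excluding Ruby's built-in chain.
own_chain = Project.ancestors - [Project] - Object.ancestors
puts own_chain.inspect
```

Both the superclass and the mixins appear in `own_chain`, so an edge type that models "ancestor of" naturally covers `include` as well as `<`.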
## Reproducers
### Gap 1: DSL block body call extraction
```bash
# CE app/policies/project_policy.rb is 1049 lines with extensive DSL:
# 66 condition declarations
# 98 rule blocks
# 56 enable :X grants
# 251 prevent :X denials
# 11 def methods
# Orbit captures only the 5 `def`-declared helpers + 1 class:
glab orbit remote query (File DEFINES Definition for app/policies/project_policy.rb)
→ 6 Definitions total. 471 DSL-shaped declarations invisible.
# And calls through DSL bodies don't register either:
glab orbit remote query (neighbors, incoming CALLS edges to team_member?)
→ 1 trivial self-edge (ProjectPolicy → team_member?)
# REST ground truth: team_member? is referenced from at least 6 condition blocks
# across CE + EE policies. None captured.
```
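For comparison, a static walk of the block body does see the call. The sketch below uses Ruby's standard-library Ripper parser purely for illustration; Orbit's indexer may use a different parser, and `DSL_MACROS`, `first_ident`, `call_names`, and `dsl_block_calls` are hypothetical helpers. It handles only brace or `do` blocks attached to a macro call, not lambdas passed as arguments (as in `scope`).

```ruby
require "ripper"

# Illustrative subset of macros whose blocks should be walked (assumption).
DSL_MACROS = %w[condition rule].freeze

# Returns the first identifier found depth-first in a Ripper S-expression.
def first_ident(node)
  return nil unless node.is_a?(Array)
  return node[1] if node[0] == :@ident

  node.each do |child|
    name = first_ident(child)
    return name if name
  end
  nil
end

# Collects names of receiver-less calls (vcall, fcall, command) under a node.
def call_names(node, found = [])
  return found unless node.is_a?(Array)

  if %i[vcall fcall command].include?(node[0]) && node[1].is_a?(Array) && node[1][0] == :@ident
    found << node[1][1]
  end
  node.each { |child| call_names(child, found) }
  found
end

# Returns [macro, callee] pairs for calls made inside DSL macro blocks.
def dsl_block_calls(node, pairs = [])
  return pairs unless node.is_a?(Array)

  if node[0] == :method_add_block
    macro = first_ident(node[1])
    call_names(node[2]).each { |callee| pairs << [macro, callee] } if DSL_MACROS.include?(macro)
  end
  node.each { |child| dsl_block_calls(child, pairs) }
  pairs
end

source = <<~RUBY
  class ProjectPolicy
    condition(:guest) { team_member? }

    def team_member?
      true
    end
  end
RUBY

pairs = dsl_block_calls(Ripper.sexp(source))
pairs.each { |macro, callee| puts "#{macro} block calls #{callee}" }
```

Each pair is a candidate CALLS edge from the enclosing class to the callee, which is the relationship the reproducer above shows as missing.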
### Gap 2: include Module as ImportedSymbol
```bash
glab orbit remote query (ImportedSymbol lookup, identifier_name eq "InternalEventsTracking")
→ 0 results
glab orbit remote query (ImportedSymbol lookup, any, project_id = 278964)
→ returns only Go imports (workhorse/) and JS imports (spec/frontend/)
# REST ground truth: ≥ 8 files include Gitlab::InternalEventsTracking
# (app/models/ci/pipeline.rb, ee/app/graphql/resolvers/..., lib/gitlab/auth.rb, etc.)
# Substitute path that works: CALLS edges to the module's primary method
glab orbit remote query (neighbors, incoming CALLS to Gitlab::InternalEventsTracking::track_internal_event)
→ 100 CALLS edges, 101 caller Definitions across real service/controller/worker files
```
### Gap 3: prepend_mod_with invisible to EXTENDS
```bash
# CE Project gets EE Project prepended via:
# app/models/project.rb:4151 → Project.prepend_mod_with('Project')
# Not a literal `prepend EE::Project`; the macro resolves dynamically in EE builds.
glab orbit remote query (neighbors, incoming EXTENDS to CE Project)
→ 2 edges only (QA::Resource::Fork, API::Entities::ProjectWithAccess)
Not present: EE::Project
glab orbit remote query (neighbors, outgoing EXTENDS from EE::Project)
→ 0 edges
# Both Definitions exist in the graph; their relationship does not.
```
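The reason a literal-`prepend` scan finds nothing can be shown with a simplified stand-in for the macro. This is a sketch, not GitLab's actual `prepend_mod_with` implementation: `EditionInjection` and `visibility` are hypothetical names, and only an `EE::` namespace is considered.

```ruby
# Simplified, hypothetical stand-in for the macro; the real helper handles more cases.
module EditionInjection
  def prepend_mod_with(constant_name)
    return unless Object.const_defined?(:EE)

    ee = Object.const_get(:EE)
    # The EE module is looked up by name at runtime, so the source never
    # contains a literal `prepend EE::Project`.
    prepend ee.const_get(constant_name, false) if ee.const_defined?(constant_name, false)
  end
end

module EE
  module Project
    def visibility
      "ee:" + super
    end
  end
end

class Project
  extend EditionInjection

  def visibility
    "ce"
  end
end

Project.prepend_mod_with('Project')

puts Project.ancestors.first(2).inspect  # EE::Project sits ahead of Project
puts Project.new.visibility              # the EE override wraps the CE method
```

The only static evidence of the relationship is the string `'Project'` passed to the macro, so an indexer has to know the naming convention to recover the edge.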
## Impact
- **UC-10 (Dependency Analysis / Full Stack)** is the most directly affected. Three of its four scenarios surface these gaps. The "1-2 min full-stack dependency map" impact claim is not achievable on Rails codebases without addressing these.
- **UC-4 (Faster Code Review via Dependency Mapping)** is likely affected — review-time dependency analysis on Rails code hits the same gaps.
- **UC-2 (Blast Radius Analysis)** for code-level dependencies hits gaps #1 and #3.
- **UC-7 (Team Expertise / Bus Factor)** when expertise is concentrated in policy/routing/scope authorship hits gap #1.
For `gitlab-org/gitlab` specifically — the project the public-beta UAT is explicitly testing — these gaps are the bulk of the codebase's dependency structure. Most Ruby files in CE + EE are heavily DSL-driven.
## Suggested fixes (in order of impact)
1. **Extend the indexer to walk Ruby DSL block bodies for CALLS extraction.** Specifically: `condition`, `rule`, `policy`, `enable`, `prevent` for DeclarativePolicy; `scope` for ActiveRecord; `validates`, `before_*`, `after_*` for ActiveRecord validations and callbacks; route DSL helpers. Where these take a block or lambda, its body is ordinary Ruby that the indexer can walk the same way it walks a method body.
2. **Capture Ruby `include`, `extend`, `prepend` as ImportedSymbols (or document the EXTENDS-as-include mitigation).** Either approach works; the agent needs SOME path to "which files compose this module." The EXTENDS-edge mitigation already exists for `include`; documenting it would be a quick win.
3. **Recognize GitLab's `prepend_mod_with` macro.** This is project-specific code but it's the dominant CE/EE composition mechanism in `gitlab-org/gitlab`. Either the indexer special-cases `prepend_mod_with('X')` → resolves to `prepend EE::X`, or the GitLab codebase exposes the relationship in a more graph-friendly form. Both are workable; without one of them, CE/EE dependency analysis is structurally blocked.
4. **Update the EXTENDS edge schema description** to call out that Ruby `include Module` is captured. Schema currently reads "inheritance, interface implementation, struct embedding" — agents reading literally would not query EXTENDS for mixin relationships.
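For fix 3, the special-casing option could look roughly like the sketch below. It assumes the macro is called with a literal string argument and that the EE module is always `EE::<argument>`; `Edge`, `PREPEND_MOD_WITH`, and `edges_for_prepend_mod_with` are hypothetical names, and the edge direction follows the expectation stated in gap 3 (from `EE::Project` to CE `Project`), not a confirmed Orbit schema.

```ruby
# Hypothetical edge record; the real Orbit schema may differ.
Edge = Struct.new(:type, :from, :to)

# Matches `prepend_mod_with('X')` with an optional constant receiver.
PREPEND_MOD_WITH = /^\s*(?:[A-Z][\w:]*\.)?prepend_mod_with\(\s*['"](?<name>[A-Z][\w:]*)['"]\s*\)/

def edges_for_prepend_mod_with(source)
  source.each_line.filter_map do |line|
    match = PREPEND_MOD_WITH.match(line)
    next unless match

    # Assumption: the macro argument names the CE constant and the EE module is EE::<name>.
    Edge.new("EXTENDS", "EE::#{match[:name]}", match[:name])
  end
end

source = <<~RUBY
  class Project < ApplicationRecord
  end

  Project.prepend_mod_with('Project')
RUBY

edges = edges_for_prepend_mod_with(source)
edges.each { |edge| puts "#{edge.from} -#{edge.type}-> #{edge.to}" }
```

A production version would work on the parsed syntax tree instead of a regular expression and would need to confirm that the target module exists in the graph before emitting the edge.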
## Environment
- `glab` version: `1.94.0 (aa456f48)`
- Endpoint: production Orbit (`POST /api/v4/orbit/query` on gitlab.com)
- Tested 2026-05-14 against `gitlab-org/gitlab` (project ID 278964)
## Suggested severity
`severity::2` — these gaps materially block UC-10 testing for the public-beta UAT scope. The blast radius of "you can't ask Orbit what depends on a Ruby ability or what overrides a CE class" on `gitlab-org/gitlab` is large enough that fixing this is a beta-readiness concern, not a polish concern.
## References
- Parent customer-zero issue: gitlab-org/orbit/knowledge-graph#602
- Surfaced during UC-10 testing under gitlab-org/orbit/knowledge-graph#606 (S1, S2, S3)
- Customer Zero bug-reporting epic: gitlab-org&21852
- Related (different root cause, same issue family): gitlab-org/orbit/knowledge-graph#577 (Definition `definition_type` case-sensitivity), gitlab-org/orbit/knowledge-graph#582 (queries silently returning empty), gitlab-org/gitlab#600140 (`source_code` nodes lack `IN_PROJECT` edge)