Orbit: `Definition.commit_sha` and `start_line`/`end_line` are null on all returned records despite being documented properties
## Summary
`Definition.commit_sha` and `Definition.start_line` (along with `end_line`, `start_byte`, `end_byte`, `start_char`, `end_char`) are documented properties on `Definition` nodes — exposed in the schema, accepted as columns by the query DSL, and accepted as filter targets — but are **null on every returned record** when querying `gitlab-org/gitlab` (project ID 278964).
This blocks any UC-11 (Code & Data Lineage) scenario that needs to identify the commit in which a symbol was last touched. It would also block UC-2 (Blast Radius: historical deps), UC-9 (Incident Root Cause, when symbol-level granularity is needed), and any agent workflow that needs to link a Definition back to its source commit or file location.
This issue is adjacent to, but distinct from, gitlab-org/orbit/knowledge-graph#582 ("queries that should return results silently return empty"). That issue is about **multi-node aggregations returning zero rows**. This one is about **scalar properties on returned rows being null** when they should carry source-control / source-location data.
## Reproducer
```json
{
  "query": {
    "query_type": "traversal",
    "node": {
      "id": "d",
      "entity": "Definition",
      "filters": {"project_id": {"op": "eq", "value": 278964}}
    },
    "limit": 50
  }
}
```
```console
$ glab orbit remote query /tmp/q.json --format raw | jq '{
    total: ([.result.nodes[] | select(.type == "Definition")] | length),
    with_commit_sha: ([.result.nodes[] | select(.type == "Definition" and .commit_sha != null and .commit_sha != "")] | length),
    with_start_line: ([.result.nodes[] | select(.type == "Definition" and .start_line != null)] | length)
  }'
{
  "total": 50,
  "with_commit_sha": 0,
  "with_start_line": 0
}
```
Every one of the 50 sampled Definitions has `commit_sha: null` and `start_line: null`. The same result holds when filtering to specific files (e.g. the Definitions in `app/models/user.rb` all return null for these properties).
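The jq check above covers only two of the seven properties. Below is a minimal Python sketch of the same count extended to all seven. It assumes the raw response keeps the `result.nodes[]` shape and the flat per-node property layout that the jq filter relies on. `populated_counts` and `LOCATION_PROPS` are illustrative names, not part of `glab` or Orbit, and the sample payload is synthetic.

```python
import json

# Property names as listed in the Definition schema; the constant name is hypothetical.
LOCATION_PROPS = [
    "commit_sha", "start_line", "end_line",
    "start_byte", "end_byte", "start_char", "end_char",
]


def populated_counts(response, props=LOCATION_PROPS):
    """Count Definition nodes and, per property, how many hold a non-empty value."""
    # Assumption: nodes live at result.nodes and carry a "type" field,
    # mirroring the jq filter in the reproducer.
    defs = [n for n in response["result"]["nodes"] if n.get("type") == "Definition"]
    counts = {"total": len(defs)}
    for prop in props:
        # Treat both JSON null and the empty string as "not populated".
        counts[prop] = sum(1 for n in defs if n.get(prop) not in (None, ""))
    return counts


# Synthetic payload for illustration only; not real Orbit output.
sample = {"result": {"nodes": [
    {"type": "Definition", "name": "foo", "commit_sha": None, "start_line": None},
    {"type": "Definition", "name": "bar", "commit_sha": "", "start_line": None},
]}}
print(json.dumps(populated_counts(sample), indent=2))
```

Against a real response, a count of zero for every property alongside a non-zero `total` would correspond to the behavior reported here.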
## Schema documents these properties
From `glab orbit remote schema Definition`:
```json
{
  "name": "Definition",
  "properties": [
    {"name": "id", ...},
    {"name": "project_id", ...},
    {"name": "branch", ...},
    {"name": "commit_sha", "data_type": "String", "nullable": true, ...},
    {"name": "file_path", ...},
    {"name": "fqn", ...},
    {"name": "name", ...},
    {"name": "definition_type", ...},
    {"name": "start_line", "data_type": "Int", "nullable": true, ...},
    {"name": "end_line", "data_type": "Int", "nullable": true, ...},
    {"name": "start_byte", "data_type": "Int", "nullable": true, ...},
    {"name": "end_byte", "data_type": "Int", "nullable": true, ...},
    {"name": "start_char", ...},
    {"name": "end_char", ...},
    {"name": "content", ...}
  ]
}
```
All seven location/source-control properties (`commit_sha`, `start_line`, `end_line`, `start_byte`, `end_byte`, `start_char`, `end_char`) are declared with `nullable: true`, which technically permits the observed behavior. In practice, though, an agent reading the schema would expect these fields to be populated in the typical case (the whole point of a code graph is anchoring symbols to source) and instead gets uniform null.
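The gap between what the schema declares and what the data carries can be checked mechanically. Below is a rough sketch, assuming the schema is an object with a `properties` list of `{"name": ...}` entries as in the excerpt above, and that each node exposes its properties as flat keys. The function name `never_populated` is hypothetical and the data is synthetic.

```python
def never_populated(schema, nodes):
    """Return schema-declared property names that are empty on every sampled node."""
    # Assumption: the schema is an object with a "properties" list of
    # {"name": ...} entries, as in the excerpt above.
    declared = [prop["name"] for prop in schema["properties"]]
    if not nodes:
        return []  # an empty sample proves nothing either way
    return [
        name for name in declared
        if all(node.get(name) in (None, "") for node in nodes)
    ]


# Synthetic data for illustration only.
schema = {"name": "Definition", "properties": [
    {"name": "name"}, {"name": "commit_sha"}, {"name": "start_line"},
]}
nodes = [
    {"name": "foo", "commit_sha": None, "start_line": None},
    {"name": "bar", "commit_sha": None, "start_line": None},
]
print(never_populated(schema, nodes))  # ['commit_sha', 'start_line']
```

A nullable field that is null on some rows is expected. A field that is null on every row of a reasonably sized sample is the signal worth flagging.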
## Why this matters
For UC-11 specifically, **the entire "where did this code come from?" question depends on `commit_sha`**. Without it the workflow degrades to:
- Find the Definition (works)
- Read its `commit_sha` → **null**
- Fall back to grep'ing the file externally, using git blame outside Orbit
That's not "Orbit + REST as a two-tool dance" (which UC-9 and UC-10 demonstrated as a working pattern) — it's "Orbit can't contribute at all to this question." The graph identifies the symbol but anchors it to nothing in time or space.
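To make the fallback concrete, here is a minimal sketch of the external git step, assuming a local clone of the project is available at `repo_dir`. The helper names are hypothetical. Only standard `git log` and `git blame` options are used. Without `start_line`/`end_line`, the fallback can resolve provenance only at file level. The ranged blame shows what the populated properties would enable.

```python
import subprocess


def file_level_log_argv(repo_dir, file_path):
    """Build a git command for the last commit touching a file (file-level only)."""
    return ["git", "-C", repo_dir, "log", "-1", "--format=%H", "--", file_path]


def line_range_blame_argv(repo_dir, file_path, start_line, end_line):
    """Build a ranged git blame; needs the line values Orbit currently returns as null."""
    if start_line is None or end_line is None:
        raise ValueError("start_line/end_line are required for a ranged blame")
    return ["git", "-C", repo_dir, "blame", "--porcelain",
            "-L", f"{start_line},{end_line}", "--", file_path]


def run_git(argv):
    """Run a git command against a local clone and return its stdout."""
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout.strip()
```

For example, `run_git(file_level_log_argv("/path/to/clone", "app/models/user.rb"))` would return the SHA of the last commit that touched the file, which is coarser than the symbol-level answer UC-11 needs.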
For UC-2 (Blast Radius) and any blame-style or evolution-style query, the same gap blocks the natural workflow.
## Adjacent but distinct findings
- gitlab-org/orbit/knowledge-graph#582: silent empty results in aggregations. Different shape (multi-node aggregation returning zero rows). Both surface a pattern of "schema looks right but data isn't there."
- gitlab-org/gitlab#600162: Ruby DSL declarations invisible. Different layer (indexer-level missing relationships). Same UAT-readiness concern.
- gitlab-org/gitlab#600140: `source_code` domain lacks `IN_PROJECT` edge. Different shape (schema-design gap). Taken together, the four findings (these three plus this issue) describe a `source_code` domain that is structurally incomplete for UC-2 / UC-10 / UC-11.
## Suggested fix paths
1. **Populate `commit_sha` and `start_line`/`end_line` from the indexer's existing data.** The indexer must already know which commit a Definition was extracted from (it walks a specific tree-sha snapshot). Surfacing that data into the property fields is the cheapest fix.
2. **If the data genuinely isn't available, mark the properties as deprecated or remove them from the schema.** A documented-but-null field is worse than a missing field because the agent will repeatedly request it expecting a value.
3. **Document the limitation explicitly.** If the data won't be populated soon, the schema description should call out that these properties are reserved for future use.
## Environment
- `glab` version: `1.94.0 (aa456f48)`
- Endpoint: production Orbit (`POST /api/v4/orbit/query` on gitlab.com)
- Tested 2026-05-14 against `gitlab-org/gitlab` (project ID 278964)
- Sample sizes: 50 Definitions sampled project-wide, 8+ Method Definitions in `app/models/user.rb`; all return null on the listed properties
## Suggested severity
`severity::3` — does not block use entirely, but materially blocks UC-11's symbol-level provenance workflow and any blame-style / evolution-style query on the source_code domain.
## References
- Parent customer-zero issue: gitlab-org/orbit/knowledge-graph#602
- Surfaced during UC-11 S2 testing under gitlab-org/orbit/knowledge-graph#607
- Customer Zero bug-reporting epic: gitlab-org&21852
- Related: gitlab-org/orbit/knowledge-graph#582, gitlab-org/gitlab#600162, gitlab-org/gitlab#600140