feat(indexer): split benign code-pipeline file skips out of errors_total
Splits benign per-file skips out of gkg_indexer_code_errors_total into a new gkg_indexer_code_files_skipped_total{reason} counter so policy outcomes stop dominating the error rate.
In orbit-prd over 24h, errors_total{stage="parse"} showed 1111 events while real failures (repository_fetch) showed only 7. The parse bucket is filled almost entirely by the JS pipeline, which calls record_error after logging the warning "js: skipped file" for files larger than 2 MiB or with lines longer than 5000 bytes. The per-file watchdog (Internal { context: "sentinel_timeout" }) also flowed through the same counter. None of these are failures; they are policy outcomes.
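For readers unfamiliar with the skip policy, the two size limits described above can be sketched as follows. This is a minimal illustration, not the analyzer's actual code: the constant names and the `skip_reason` function are hypothetical, and only the thresholds (2 MiB, 5000 bytes) and the reason labels come from this description.

```rust
// Hypothetical constants; only the numeric limits come from the description.
const MAX_FILE_BYTES: usize = 2 * 1024 * 1024; // 2 MiB
const MAX_LINE_BYTES: usize = 5000;

/// Returns a skip reason when the file exceeds a policy limit,
/// or `None` when the file should be analyzed normally.
fn skip_reason(source: &[u8]) -> Option<&'static str> {
    if source.len() > MAX_FILE_BYTES {
        return Some("oversize");
    }
    // Line length is measured in bytes, excluding the newline itself.
    let has_long_line = source
        .split(|&byte| byte == b'\n')
        .any(|line| line.len() > MAX_LINE_BYTES);
    if has_long_line {
        return Some("line_too_long");
    }
    None
}
```

Both outcomes are deliberate policy decisions rather than parse failures, which is why counting them as errors inflated the parse bucket.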
What changed
- New metric `gkg.indexer.code.files.skipped` (Prom: `gkg_indexer_code_files_skipped_total`), label `reason`, declared in `crates/gkg-observability/src/indexer/code.rs`.
- Pipeline context grows `record_file_skipped(path, reason)` next to `record_error`. Skips drain into `PipelineResult.files_skipped` (`crates/code-graph/src/v2/pipeline.rs`).
- JS pipeline classifies per-file analyzer messages: `"refusing oversize file"` -> `oversize`, `"line too long"` -> `line_too_long`, anything else stays on `record_error` (`crates/code-graph/src/v2/langs/custom/js/pipeline.rs#L62-75`).
- Rust pipeline uses the same classifier symmetrically (`crates/code-graph/src/v2/langs/custom/rust/mod.rs#L146-157`).
- Per-file watchdog reroutes from `Internal { sentinel_timeout }` to `files.skipped{reason="timeout_sentinel"}` (`crates/code-graph/src/v2/pipeline.rs#L982-994`).
- Indexer dispatch increments the new counter from the drained list (`crates/indexer/src/modules/code/indexing_pipeline.rs#L294-301`).
- `CodeGraphError::stage()` is unchanged; only the recording side moves. Genuine grammar failures still surface as `errors_total{stage="parse"}`.
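The recording flow in the list above can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the code in this MR: `SkipReason`, `PipelineContext`, `record_analyzer_message`, `drain_skips`, and `count_by_reason` are hypothetical names, the real `classify_skip_message` signature may differ, and substring matching on the analyzer message is an assumption. Only the message strings, the reason labels, and the names `classify_skip_message`, `record_file_skipped`, and `record_error` come from this description.

```rust
use std::collections::BTreeMap;

/// Benign per-file outcomes that should not count as errors (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SkipReason {
    Oversize,
    LineTooLong,
    TimeoutSentinel,
}

impl SkipReason {
    /// Value used for the `reason` label on the skipped-files counter.
    pub fn as_label(self) -> &'static str {
        match self {
            SkipReason::Oversize => "oversize",
            SkipReason::LineTooLong => "line_too_long",
            SkipReason::TimeoutSentinel => "timeout_sentinel",
        }
    }
}

/// Maps an analyzer message to a skip reason; `None` means a real error.
/// Substring matching is an assumption of this sketch.
pub fn classify_skip_message(message: &str) -> Option<SkipReason> {
    if message.contains("refusing oversize file") {
        Some(SkipReason::Oversize)
    } else if message.contains("line too long") {
        Some(SkipReason::LineTooLong)
    } else {
        None
    }
}

/// Simplified stand-in for the pipeline context.
#[derive(Default)]
pub struct PipelineContext {
    pub files_skipped: Vec<(String, SkipReason)>,
    pub errors: Vec<String>,
}

impl PipelineContext {
    pub fn record_file_skipped(&mut self, path: &str, reason: SkipReason) {
        self.files_skipped.push((path.to_string(), reason));
    }

    pub fn record_error(&mut self, message: &str) {
        self.errors.push(message.to_string());
    }

    /// Routes a per-file analyzer message to the skip list or the error list.
    pub fn record_analyzer_message(&mut self, path: &str, message: &str) {
        match classify_skip_message(message) {
            Some(reason) => self.record_file_skipped(path, reason),
            None => self.record_error(message),
        }
    }

    /// Hands the collected skips to the caller and leaves the list empty.
    pub fn drain_skips(&mut self) -> Vec<(String, SkipReason)> {
        std::mem::take(&mut self.files_skipped)
    }
}

/// Tallies drained skips per `reason` label, as the dispatch side might
/// before incrementing the counter.
pub fn count_by_reason(skips: &[(String, SkipReason)]) -> BTreeMap<&'static str, u64> {
    let mut counts = BTreeMap::new();
    for (_, reason) in skips {
        *counts.entry(reason.as_label()).or_insert(0) += 1;
    }
    counts
}
```

In this sketch the watchdog would call `record_file_skipped(path, SkipReason::TimeoutSentinel)` directly, while the JS and Rust pipelines go through `record_analyzer_message`, so unclassified messages keep landing on the error side.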
Tests and checks
- New unit tests in `js/pipeline.rs` write an oversize file and a line-too-long file and assert `files_skipped` increments while `graph_errors` stays empty.
- Pipeline-level tests cover `classify_skip_message` and the `record_file_skipped` collection path used by the watchdog.
- `mise test:fast` passes (1900/1900).
- `mise lint:code` clean.
- `mise run metrics:catalog` and `mise run dashboards` regenerated; pre-commit `metrics-catalog-check` and `dashboards-check` both green. The auto-generated reference panel for the new metric appears in `orbit-gkg-indexer.dashboard.json` and `orbit-all-metrics.dashboard.json`; no story-shaped panels were added.
- `docs/design-documents/observability.md` lists the new metric and notes the rationale.
Closes #529
Refs &20992