[ActiveContext Code] Token count exceeding limit
From incident https://gitlab.enterprise.slack.com/archives/C0AEXA6A3LP:
We saw an elevated rate of `Unable to submit request because the input token count is 21483 but the model supports up to 20000` errors on the `text-embedding-005` endpoint.
[source](https://log.gprd.gitlab.net/app/r/s/fxOey)
[source](https://log.gprd.gitlab.net/app/r/s/HUp9M)
This repository was AdHoc indexed at roughly the same time as the elevated error rate:
[source](https://dashboards.gitlab.net/explore?schemaVersion=1&panes=%7B%22jbh%22%3A%7B%22datasource%22%3A%22mimir-runway%22%2C%22queries%22%3A%5B%7B%22refId%22%3A%22A%22%2C%22expr%22%3A%22sum+by+%28model_name%29+%28rate%28model_inferences_total%7Btype%3D%27ai-gateway%27%2C+error%3D%5C%22yes%5C%22%2C+model_engine%3D%27vertex-ai%27%7D%5B5m%5D%29%29%22%2C%22range%22%3Atrue%2C%22instant%22%3Atrue%2C%22datasource%22%3A%7B%22type%22%3A%22prometheus%22%2C%22uid%22%3A%22mimir-runway%22%7D%2C%22editorMode%22%3A%22code%22%2C%22legendFormat%22%3A%22__auto%22%7D%5D%2C%22range%22%3A%7B%22from%22%3A%22now-24h%22%2C%22to%22%3A%22now%22%7D%2C%22compact%22%3Afalse%7D%7D&orgId=1)
Most likely one or a few documents produced chunks that exceeded the model's token limit even after chunking. Each failing chunk would have been retried once, then added to the dead queue.
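The retry-then-dead-queue flow described above, as a hypothetical sketch (the real indexer code is not shown in this issue; all names here are invented for illustration):

```python
import logging

MODEL_TOKEN_LIMIT = 20_000  # text-embedding-005 input limit, per the error message
MAX_RETRIES = 1             # per the issue: one retry before the dead queue

def embed_with_retry(chunk, embed, count_tokens, dead_queue):
    """Hypothetical sketch: an oversized chunk fails the token-limit
    check, is retried once, then lands in the dead queue."""
    for attempt in range(MAX_RETRIES + 1):
        if count_tokens(chunk) <= MODEL_TOKEN_LIMIT:
            return embed(chunk)
        logging.warning("input token count %d exceeds %d (attempt %d)",
                        count_tokens(chunk), MODEL_TOKEN_LIMIT, attempt + 1)
    dead_queue.append(chunk)  # exhausted retries; park for manual replay
    return None
```

Because the check depends only on the chunk, the retry can never succeed here, which matches the observed behaviour: the same document fails repeatedly until it is dead-lettered.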
We need to:
1. Fix this on the chunker
2. Force re-index the project(s) with the offending document(s)
3. Clear the dead queue
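Step 1 (the chunker fix) could look something like the sketch below: validate each chunk's token count before submission and re-split anything over the limit. All names are hypothetical; the real chunker and tokenizer are not shown in this issue.

```python
def split_by_token_limit(chunk: str, count_tokens, limit: int = 20_000) -> list[str]:
    """Recursively split a chunk until every piece fits under the
    model's input token limit. `count_tokens` stands in for the real
    tokenizer (hypothetical)."""
    if count_tokens(chunk) <= limit or len(chunk) <= 1:
        return [chunk]
    mid = len(chunk) // 2
    # Prefer a line boundary near the midpoint so chunks stay readable.
    nl = chunk.rfind("\n", 0, mid)
    if nl > 0:
        mid = nl + 1
    return (split_by_token_limit(chunk[:mid], count_tokens, limit)
            + split_by_token_limit(chunk[mid:], count_tokens, limit))
```

Counting tokens with the model's own tokenizer matters here: character-based chunk caps can still overshoot the token limit on dense content such as minified or generated code.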
---
**Impact**
Repositories whose embeddings kept failing may never be marked as ready, which means they are not searchable.
epic