Create:Code Creation - Investigate Error Budget

We recently set up error budget reporting for group::code creation. Unfortunately, we are already well over our budget.

More information about error budgets can be found in the handbook.

Dashboards

Here is the Grafana Error Budget Dashboard for Create:Code Creation:

Screenshot_2023-09-13_at_1.20.56_PM

It looks like the biggest contributing factor is our Apdex:

Screenshot_2023-09-13_at_1.22.53_PM

Initial Investigation

Using the Rails Request Apdex search from the dashboard, the endpoint that most needs attention appears to be POST /api/:version/code_suggestions/completions. It is currently measured against the default threshold of 1s. Over the past week, more than 10,000 requests per day exceeded that threshold, which is about 1% of all requests to that endpoint.
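To put those numbers in context, here is a rough back-of-the-envelope sketch. It uses only the two figures quoted above (about 10,000 slow requests per day, about 1% of the endpoint's traffic). The 99.95% target is an assumption for illustration and should be checked against the handbook. The real budget is calculated across all of the group's endpoints, so this treats the endpoint in isolation.

```python
# Figures quoted in the investigation above (approximate, per day).
SLOW_REQUESTS_PER_DAY = 10_000  # requests slower than the 1s threshold
SLOW_PERCENT = 1.0              # their share of all requests to the endpoint

# Assumption: illustrative success-ratio target, not taken from this issue.
ASSUMED_TARGET_PERCENT = 99.95


def implied_total_requests(slow: int, slow_percent: float) -> float:
    """Total daily requests implied by a slow count and its percentage share."""
    return slow * 100 / slow_percent


def budget_overspend(slow_percent: float, target_percent: float) -> float:
    """Ratio of the observed failure share to the share the target allows."""
    allowed_percent = 100 - target_percent
    return slow_percent / allowed_percent


total = implied_total_requests(SLOW_REQUESTS_PER_DAY, SLOW_PERCENT)
overspend = budget_overspend(SLOW_PERCENT, ASSUMED_TARGET_PERCENT)

print(f"implied traffic: {total:,.0f} requests/day")
print(f"failure share vs. assumed allowance: about {overspend:.0f}x")
```

Under these assumptions, a 1% slow-request share on a single high-traffic endpoint is far above what the target allows, which would be consistent with Apdex dominating the budget spend.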

Next Steps

We should verify that the error budget violations are coming from POST /api/:version/code_suggestions/completions. The error budget dashboard also shows that runway_ingress could be a source of the failures.

Since this endpoint is currently measured against the default target, we should consider whether a custom target would make sense for it.
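One way to evaluate a custom target is to take a sample of request durations for the endpoint and check what share would miss each candidate threshold. The sketch below assumes the urgency thresholds as I understand them from the GitLab development documentation (0.25s, 0.5s, 1s, 5s), so please verify them before relying on this. The sample durations are hypothetical placeholders, not real data from the dashboard.

```python
from typing import Dict, List

# Assumption: thresholds in seconds per urgency level; verify against the docs.
URGENCY_THRESHOLDS: Dict[str, float] = {
    "high": 0.25,
    "medium": 0.5,
    "default": 1.0,
    "low": 5.0,
}


def share_over_threshold(durations: List[float], threshold: float) -> float:
    """Fraction of requests whose duration exceeds the threshold."""
    if not durations:
        return 0.0
    slow = sum(1 for d in durations if d > threshold)
    return slow / len(durations)


def compare_targets(durations: List[float]) -> Dict[str, float]:
    """Share of requests missing the target, for each candidate urgency."""
    return {
        name: share_over_threshold(durations, threshold)
        for name, threshold in URGENCY_THRESHOLDS.items()
    }


# Hypothetical durations in seconds; replace with real samples from the logs.
sample_durations = [0.2, 0.4, 0.6, 0.9, 1.2, 1.8, 2.5, 4.0, 6.0, 0.3]

results = compare_targets(sample_durations)
for name, share in results.items():
    print(f"{name:>8}: {share:.0%} of sampled requests over threshold")
```

If most of the slow requests sit only slightly above 1s, a looser target would remove most violations. If they are spread well beyond it, the latency itself needs investigation and a custom target would only hide the problem.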