Update thresholds for code suggestions
Related to gitlab-org/gitlab#425095 (comment 1739978289)
$ ./test-dashboard.sh ./ai-gateway/main.dashboard.jsonnet
Installed https://dashboards.gitlab.net/dashboard/snapshot/T6c67Xc0LNIIIvgSkTmzf0o4qaQkWJ0Z - ai-gateway: Overview
Update thresholds for code suggestions
- increase the default threshold for code suggestions (as code generations take longer)
- use separate apdex for completions and generations
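To illustrate why one shared threshold fits completions and generations poorly, here is a minimal Python sketch of a threshold-based apdex computed per handler. The latency samples, the 10-second generation threshold, and the helper name `apdex` are assumptions for illustration. Only the 2.5-second value appears in the diff quoted in this MR, and the real dashboards derive the ratio from `http_request_duration_seconds_bucket` histogram buckets rather than from raw samples.

```python
def apdex(durations_s, satisfied_threshold_s):
    """Fraction of requests that finished within the satisfied threshold."""
    if not durations_s:
        return None  # no traffic, so no meaningful score
    satisfied = sum(1 for d in durations_s if d <= satisfied_threshold_s)
    return satisfied / len(durations_s)


# Hypothetical latency samples in seconds, keyed by handler.
samples = {
    "/v2/code/completions": [0.3, 0.5, 0.8, 1.2, 3.0],
    "/v2/code/generations": [2.0, 4.0, 6.0, 8.0, 12.0],
}

# Assumed per-handler thresholds; 10.0 is illustrative only.
thresholds = {
    "/v2/code/completions": 2.5,
    "/v2/code/generations": 10.0,
}

# One shared 2.5 s threshold versus a threshold per handler.
shared = {h: apdex(d, 2.5) for h, d in samples.items()}
separate = {h: apdex(d, thresholds[h]) for h, d in samples.items()}
```

With these made-up samples, the shared threshold scores generations far lower than completions even though both handlers behave normally for their workload. The per-handler thresholds score them the same.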
Activity
added type::maintenance label
assigned to @jprovaznik
2 Warnings
⚠ This merge request is definitely too big (3035 lines changed), please split it into multiple merge requests.
⚠ This merge request does not refer to an existing milestone.
If needed, you can retry the 🔁 danger-review job that generated this comment.
Generated by 🚫 Danger
- Resolved by Chance Feick
- Resolved by Chance Feick
- Resolved by Jan Provaznik
- Resolved by Chance Feick
@cfeick could you please take a quick look at this attempt? I'm not very familiar with customizing dashboards and I might need some help finishing this: apparently I'm doing something wrong here, because there is no data in the generated dashboard for the completion- and generation-specific sections.
requested review from @cfeick
- Resolved by Chance Feick
Thanks @cfeick, I've updated the MR, could you please take a look?
requested review from @cfeick
- Resolved by Bob Van Landuyt
- Resolved by Bob Van Landuyt
Thanks, @jprovaznik! Left one more comment about traffic cessation, otherwise LGTM.
added 1 commit
- e53435c5 - Add trafficCessationAlertConfig based on feedback
requested review from @cfeick
- Resolved by Bob Van Landuyt
Thanks, Jan! LGTM
requested review from @reprazent
  apdex: histogramApdex(
    histogram='http_request_duration_seconds_bucket',
-   selector=baseSelector { status: { noneOf: ['4xx', '5xx'] } },
-   satisfiedThreshold=2.5,
+   selector=baseSelector { status: { noneOf: ['4xx', '5xx'] }, handler: { noneOf: ['/v2/code/completions', '/v2/completions', '/v2/code/generations'] } },

Extracting this stuff out per route works okay in the short term, but ideally we'd move this into the application in the form of Application SLIs, like we have for the Rails application. That way, the application specifies what is "fast enough" and we don't need to create separate SLIs for different types of routes.
Ideally, this stuff would move to labkit, and we'd have a labkit-python version to support the AI-gateway.
But that's out of scope of this, obviously... I'll create issues to discuss this further in Scalability. I think there's some overlap between Observability and Practices for this (cc @cfeick, @abrandl as we've discussed this in the past).
Created gitlab-com/gl-infra/scalability#2793 for this
> That way, the application specifies what is "fast enough" and we don't need to create separate SLIs for different types of routes.
@reprazent good point, it would be better to do it that way
👍
Thanks @cfeick, @jprovaznik, I had a thought. But nothing that should block here. So I'll merge.
Please keep in mind that for the error budget for stage groups, it will take 28d for the metrics with the old SLIs to roll out of the budget.
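The 28-day remark follows from the error budget being computed over a rolling window. A minimal Python sketch, assuming a 28-day window and made-up daily counts (the function name `budget_ratio` and all numbers are hypothetical, not taken from the runbooks):

```python
WINDOW_DAYS = 28


def budget_ratio(daily, day):
    """Good/total ratio over the WINDOW_DAYS days ending at `day` (inclusive)."""
    window = daily[max(0, day - WINDOW_DAYS + 1): day + 1]
    good = sum(g for g, _ in window)
    total = sum(t for _, t in window)
    return good / total


# Hypothetical data: 28 days measured with the old SLI (80% good),
# followed by 28 days measured with the new SLI (99% good).
daily = [(80, 100)] * 28 + [(99, 100)] * 28

# Rolling ratio for every day; the SLI change takes effect on day 28.
ratios = [budget_ratio(daily, d) for d in range(len(daily))]
```

In this sketch the ratio climbs gradually after the change and only reaches 0.99 on day 55, the first day whose window no longer contains any old-SLI measurements.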
mentioned in commit 5ef2da26
A pipeline is running on a mirror related to this merge request.
Status: starting
https://ops.gitlab.net/gitlab-com/runbooks/-/pipelines/2815157
mentioned in issue gitlab-com/gl-infra/scalability#2793
mentioned in incident gitlab-com/gl-infra/production#17518 (closed)
🎉 This MR is included in version 2.358.2 🎉
The release is available on GitLab release.
Your semantic-release bot 📦 🚀
mentioned in issue gitlab-org/gitlab#425095 (closed)