
Update AI blueprint

Jan Provaznik requested to merge jp-ai-bp into master

What does this MR do and why?

Updates the AI blueprint to also cover direct client<->AI gateway connections.

Why?

Code completion requests are very latency-sensitive. Using a direct client<->AI gateway connection (and skipping Rails) seems to be the only effective way to noticeably reduce latency (by 145ms according to this breakdown). Latency can be decreased further once we support a multi-region AI gateway, which should reduce client<->AI gateway latency.

More details in &12224

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Jan Provaznik
