Add ability to stream chat responses
What does this MR do and why?
This adds the capability to stream messages for GitLab Duo chat.
The streamed chunks get sent over `aiCompletionResponse`, but the order in which they arrive is not guaranteed on the client. Each chunk has a `chunkId` which has a guaranteed order, starting with 1.
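As a rough illustration of how a client could cope with out-of-order delivery, the sketch below buffers chunks and only appends them once every earlier `chunkId` has arrived. The `ChunkAssembler` class and its method names are hypothetical and not part of this MR; only the 1-based `chunkId` ordering is taken from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkAssembler:
    """Reassembles streamed chunks that may arrive out of order.

    Hypothetical helper: names are illustrative; only the 1-based
    chunk ID ordering comes from the MR description.
    """

    next_id: int = 1                             # chunk IDs start at 1
    pending: dict = field(default_factory=dict)  # chunk_id -> content, held until its turn
    text: str = ""                               # content assembled so far, in order

    def add(self, chunk_id: int, content: str) -> str:
        """Store a chunk, then flush every chunk that is now contiguous."""
        if chunk_id < self.next_id or chunk_id in self.pending:
            return self.text  # duplicate or already-rendered chunk: ignore
        self.pending[chunk_id] = content
        while self.next_id in self.pending:
            self.text += self.pending.pop(self.next_id)
            self.next_id += 1
        return self.text


assembler = ChunkAssembler()
assembler.add(2, "world")   # arrives early, so it is buffered
assembler.add(1, "Hello ")  # unblocks chunk 2 as well
assembler.add(3, "!")
print(assembler.text)       # Hello world!
```

Buffering by ID rather than sorting on every update keeps each `add` call cheap and lets the UI render partial text as soon as the next expected chunk is available.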
The `client_subscription_id` is meant for the client to identify which
mutation belongs to which subscription response. However, chat is a
special feature where we want to sync responses between multiple chat
instances of the same user.
This means we want to have two different subscriptions for the same request:
1. User-specific, to sync between multiple clients
2. Request-specific, for streaming
By not streaming the response when a `client_subscription_id` is not
present, we also make sure that older clients still work when the
`stream_gitlab_duo` feature flag is enabled. Otherwise, they would receive a streamed
response on a subscription on `user_id` and `resource_id`.
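The decision described above can be sketched roughly as follows. This is an assumption-laden illustration, not the code in this MR: the `delivery_plan` function, the dictionary layout, and the idea that the request-specific subscription is keyed by `user_id`, `resource_id`, and `client_subscription_id` together are all hypothetical.

```python
from typing import Optional


def delivery_plan(user_id: int, resource_id: int,
                  client_subscription_id: Optional[str],
                  flag_enabled: bool) -> dict:
    """Decide which subscriptions a response is delivered on.

    Hypothetical sketch: the key layout is an assumption for illustration.
    """
    # User-specific subscription: syncs the full response across all of
    # the user's chat instances, including older clients.
    user_key = {"user_id": user_id, "resource_id": resource_id}
    plan = {"full_response": user_key, "chunks": None}

    # Stream only when the flag is on AND the client identified itself,
    # so older clients never see partial chunks on the user-specific key.
    if flag_enabled and client_subscription_id:
        plan["chunks"] = {**user_key,
                          "client_subscription_id": client_subscription_id}
    return plan


old_client = delivery_plan(1, 42, None, flag_enabled=True)
new_client = delivery_plan(1, 42, "abc-123", flag_enabled=True)
print(old_client["chunks"])  # None
print(new_client["chunks"]["client_subscription_id"])  # abc-123
```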
NOTE: Related FE MR by @dmishunov !130347 (merged)
Screenshots or screen recordings
The following only works with the FE and GraphQL changes. There is an MR that combines both: !130415 (closed)
Without FF | With FF |
---|---|
streaming-example-no-ff | streaming-example |
How to set up and validate locally
- Have GitLab Duo and all AI features set up
- Enable the `stream_gitlab_duo` FF
- Check out !130415 (closed), as it combines this MR with the FE changes and updates the GraphQL subscriptions
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.