Streaming/chunked GraphQL API for Duo Chat returns inconsistent first chunk
Bug
As discussed in Slack (internal), there is an inconsistency in how chunks are streamed back: trailing parts of the leading `Final Answer:` prefix from the LLM response are not always stripped out.
This also shows up in the UI: the chat response initially contains extra leading characters, which are only stripped out once the streaming output has finished.
| While streaming | When finished |
|---|---|
| ![]() | ![]() |
This is intermittent and presumably depends on how the response happens to be split into chunks.
How to reproduce
- Send the message `hello` to Duo Chat
- Watch the streamed response closely
- This is intermittent, so you may need to try a few times before it happens
Technical notes
This seems to be related to https://gitlab.com/gitlab-org/gitlab/-/blob/baddce2e7ceaa2fc05b016fc2a2cf48791034d81/ee/lib/gitlab/llm/chain/streamed_zero_shot_answer.rb#L14. We need to investigate which kinds of chunks trigger this behaviour and update the spec: https://gitlab.com/gitlab-org/gitlab/-/blob/baddce2e7ceaa2fc05b016fc2a2cf48791034d81/ee/spec/lib/gitlab/llm/chain/streamed_zero_shot_answer_spec.rb
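To illustrate the suspected failure mode: if the `Final Answer:` prefix is split across two streamed chunks (e.g. `"Final "` then `"Answer: ..."`), a stripper that only looks at a single chunk will emit part of the prefix. A minimal sketch of a buffering approach is below — this is a hypothetical `PrefixStrippingStreamer`, not the actual `StreamedZeroShotAnswer` implementation, and it assumes the prefix always arrives at the start of the stream:

```ruby
# Hypothetical sketch: buffer streamed chunks until we know whether the
# leading text is the "Final Answer:" prefix, so the prefix can be stripped
# even when it is split across chunk boundaries.
class PrefixStrippingStreamer
  PREFIX = 'Final Answer:'

  def initialize
    @buffer = ''
    @prefix_handled = false
  end

  # Returns the displayable text for this chunk, or '' while we are still
  # waiting to see whether the buffered text is the prefix.
  def next_chunk(chunk)
    return chunk if @prefix_handled

    @buffer += chunk
    stripped = @buffer.lstrip

    if stripped.start_with?(PREFIX)
      # Full prefix has arrived: strip it and emit the remainder.
      @prefix_handled = true
      stripped.delete_prefix(PREFIX).lstrip
    elsif PREFIX.start_with?(stripped)
      # Buffer is still a partial match for the prefix; keep waiting.
      ''
    else
      # Buffer diverges from the prefix: emit everything buffered so far.
      @prefix_handled = true
      @buffer
    end
  end
end

streamer = PrefixStrippingStreamer.new
['Fin', 'al Answer: Hel', 'lo'].map { |c| streamer.next_chunk(c) }.join
# => "Hello", regardless of where the chunk boundaries fall
```

The trade-off is that the UI shows nothing until the prefix (or a divergence from it) is confirmed, instead of briefly rendering fragments like `Answer:` that are later removed.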