Fix incomplete unknown event stream from V2 Chat Agent

Shinya Maeda requested to merge fix-v2-chat-unknown-event-incomplete into master

This is a high-priority MR for Switch to Chat Agent V2 (gitlab-org#13533). Please prioritize the review and merge.

What does this MR do and why?

This MR fixes [V2 Chat Agent Bug] A1002 Gitlab::Llm::Chain::... (#490668) and Chunked encoding streaming from AI Gateway is n... (#490376).

Gitlab::HTTP seems to have a bug: it does not iterate streamed events as they are sent by AI Gateway, but splits the data further at BUFSIZE = 1024 * 16. This effectively causes the "Unknown event" bug: if an event's data exceeds that buffer size, the event cannot be parsed/recognized in the step executor.
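For context, a minimal sketch (hypothetical, not the actual implementation in this MR) of how complete events can be reassembled when the HTTP client delivers the stream in arbitrary-sized chunks. The `EventBuffer` class and the `\n\n` event delimiter are illustrative assumptions, simplified from the SSE-style framing:

```ruby
# Hypothetical sketch: buffer raw chunks and only emit complete events,
# so a 16 KiB read buffer cannot split one event into unparseable pieces.
class EventBuffer
  EVENT_DELIMITER = "\n\n" # simplified event boundary for illustration

  def initialize
    @buffer = +""
  end

  # Feed a raw chunk; yields each complete event accumulated so far.
  # Anything after the last delimiter stays buffered for the next chunk.
  def feed(chunk)
    @buffer << chunk
    events = @buffer.split(EVENT_DELIMITER, -1)
    @buffer = events.pop || +"" # keep the (possibly empty) incomplete tail
    events.each { |event| yield event }
  end
end

buffer = EventBuffer.new
received = []
# Simulate one logical event arriving split across two raw chunks:
buffer.feed("event: final_answer\ndata: {\"text\": \"Hel") { |e| received << e }
buffer.feed("lo\"}\n\n") { |e| received << e }
received # => ["event: final_answer\ndata: {\"text\": \"Hello\"}"]
```

With this approach, the parser downstream only ever sees whole events, regardless of how the transport chunked the bytes.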

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to reproduce

  1. Apply the following patch in AI Gateway:

```diff
diff --git a/ai_gateway/chat/agents/react.py b/ai_gateway/chat/agents/react.py
index b274e666..6c93ebba 100644
--- a/ai_gateway/chat/agents/react.py
+++ b/ai_gateway/chat/agents/react.py
@@ -234,22 +234,24 @@ class ReActAgent(Prompt[ReActAgentInputs, TypeAgentEvent]):
         astream = super().astream(input, config=config, **kwargs)
         len_final_answer = 0

-        async for event in astream:
-            if is_feature_enabled(FeatureFlag.EXPANDED_AI_LOGGING):
-                log.info("Response streaming", source=__name__, streamed_event=event)
-
-            if isinstance(event, AgentFinalAnswer) and len(event.text) > 0:
-                yield AgentFinalAnswer(
-                    text=event.text[len_final_answer:],
-                )
-
-                len_final_answer = len(event.text)
-
-            events.append(event)
-
-        if any(isinstance(e, AgentFinalAnswer) for e in events):
-            pass  # no-op
-        elif any(isinstance(e, AgentToolAction) for e in events):
-            yield events[-1]
-        elif isinstance(events[-1], AgentUnknownAction):
-            yield events[-1]
+        yield AgentUnknownAction(text="a" * 20000)
+        # 16338
+        # async for event in astream:
+        #     if is_feature_enabled(FeatureFlag.EXPANDED_AI_LOGGING):
+        #         log.info("Response streaming", source=__name__, streamed_event=event)
+
+        #     if isinstance(event, AgentFinalAnswer) and len(event.text) > 0:
+        #         yield AgentFinalAnswer(
+        #             text=event.text[len_final_answer:],
+        #         )
+
+        #         len_final_answer = len(event.text)
+
+        #     events.append(event)
+
+        # if any(isinstance(e, AgentFinalAnswer) for e in events):
+        #     pass  # no-op
+        # elif any(isinstance(e, AgentToolAction) for e in events):
+        #     yield events[-1]
+        # elif isinstance(events[-1], AgentUnknownAction):
+        #     yield events[-1]
```
  2. Run GDK.
  3. Execute the chat command in a Rails console:

```ruby
prompt_message = Gitlab::Llm::ChatMessage.new(
  content: "Hello",
  role: "user",
  user: User.first,
  ai_action: "chat",
  context: Gitlab::Llm::AiMessageContext.new(resource: User.first)
)
ai_prompt_class = nil
options = { content: "Hello", extra_resource: {}, action: :chat }
Gitlab::Llm::Completions::Chat.new(prompt_message, ai_prompt_class, options).execute
```
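With the patch above, the agent emits a single event whose serialized size (roughly 20 KB) exceeds the 16 KiB read buffer, so the client receives it split across two raw chunks, neither of which is a complete event on its own. A rough illustration of the split (the framing below is illustrative, not the exact wire format):

```ruby
# Illustrative only: approximate how a 16 KiB read buffer reframes one
# large streamed event into chunks that are not individually parseable.
BUFSIZE = 1024 * 16
payload = "event: unknown\ndata: #{"a" * 20_000}\n\n"

chunks = payload.scan(/.{1,#{BUFSIZE}}/m) # read in BUFSIZE-sized pieces
chunks.size                    # => 2
chunks.first.end_with?("\n\n") # => false: first chunk is an incomplete event
chunks.join == payload         # => true: no data lost, only reframed
```

This is why the step executor reports an "Unknown event": it tries to parse each chunk in isolation instead of the reassembled event.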