Compute an overall confidence score for results from semantic_code_search
What does this MR do and why?
Issue: #579851 (closed). Related to: !219467 (merged), #581762.
Compute a confidence bucket based on score distribution:
- high: strong top candidate + steep drop-off
- medium: reasonable cluster / partial relevance
- low: flat distribution / ambiguous meaning
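As a minimal sketch of the bucketing idea (not the MR's actual implementation), the three buckets could be derived from the top score and the gap to the runner-up. The constant names, the threshold values, and the `confidence_bucket` helper are all assumptions made for illustration.

```ruby
# Hypothetical thresholds, chosen only for illustration; the real values may differ.
STRONG_TOP_SCORE = 0.85 # minimum score for a "strong" top candidate
STEEP_DROP_OFF   = 0.10 # minimum gap between the top two scores
RELEVANT_SCORE   = 0.70 # minimum top score for a "reasonable" result set

# Returns :high, :medium, or :low for a list of similarity scores.
def confidence_bucket(scores)
  sorted = scores.sort.reverse
  return :low if sorted.empty?

  top = sorted.first
  # With a single result there is no runner-up, so the gap is the top score itself.
  drop_off = sorted.size > 1 ? top - sorted[1] : top

  if top >= STRONG_TOP_SCORE && drop_off >= STEEP_DROP_OFF
    :high   # strong top candidate + steep drop-off
  elsif top >= RELEVANT_SCORE
    :medium # reasonable cluster / partial relevance
  else
    :low    # weak top score, treated here as flat / ambiguous
  end
end

puts confidence_bucket([0.88655543, 0.87950397]) # prints "medium"
```

With these assumed thresholds, two close scores such as 0.8866 and 0.8795 land in the medium bucket because the top candidate is strong but the drop-off is small.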
This helps the LLM decide:
- When to answer directly (high)
- When to hedge / propose alternatives (medium)
- When to ask clarifying questions (low)
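On the consuming side, the mapping above could be expressed as a small lookup. This is only a sketch: `RESPONSE_STRATEGIES`, `response_strategy`, and the strategy symbols are hypothetical names, and the fallback to the most cautious behaviour is an assumption.

```ruby
# Hypothetical mapping from confidence bucket to response strategy.
RESPONSE_STRATEGIES = {
  'high'   => :answer_directly,
  'medium' => :hedge_and_propose_alternatives,
  'low'    => :ask_clarifying_questions
}.freeze

# Accepts "MEDIUM", "medium", :medium, or nil.
def response_strategy(confidence)
  # Unknown or missing buckets fall back to the most cautious behaviour (assumption).
  RESPONSE_STRATEGIES.fetch(confidence.to_s.downcase, :ask_clarifying_questions)
end

puts response_strategy('MEDIUM') # prints "hedge_and_propose_alternatives"
```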
The response now looks like:
Confidence: MEDIUM
1. docs/flow_registry/contribution_guidelines.md (score: 0.8866)
**For `experimental` Version:**
- Breaking changes are allowed
- Suitable for prototyping and experimentation
- NOT suitable for external feature development
2. ........
.......
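A rough sketch of how a text body in this shape might be assembled: a confidence header, a blank line, then numbered results with scores rounded to four decimals. The `format_results` helper and the `:path`, `:score`, `:content` keys are assumptions for illustration, and the exact whitespace of the real output may differ.

```ruby
# Builds the text body: a confidence header followed by numbered results.
# `items` is assumed to be an array of hashes with :path, :score and :content keys.
def format_results(confidence, items)
  lines = ["Confidence: #{confidence.to_s.upcase}", '']
  items.each_with_index do |item, index|
    # Scores are rounded to four decimals, e.g. 0.88655543 becomes 0.8866.
    lines << format('%d. %s (score: %.4f)', index + 1, item[:path], item[:score])
    lines << item[:content]
  end
  lines.join("\n")
end

items = [
  { path: 'docs/flow_registry/contribution_guidelines.md', score: 0.88655543, content: '...' },
  { path: 'duo_workflow_service/agents/base.py', score: 0.87950397, content: '...' }
]
puts format_results(:medium, items)
```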
How to set up and validate locally
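- Set up the MCP client.
- Enable the feature flag:
::Feature.enable(:post_process_semantic_code_search_overall_confidence)
- Call semantic_code_search and check that the response includes the confidence: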
{
  "content": [
    {
      "type": "text",
      "text": "Confidence: MEDIUM\n\n1. docs/flow_registry/contribution_guidelines.md (score: 0.8866)\n - Removing or renaming a parameter.\n - Non backwards compatible change of the type of the parameter value.\n - Adding a new parameter if it is required.\n- New optional parameters can be added\n- Deprecation warnings should precede removal of functionality (follow GitLab deprecation process)\n- Documentation must be updated with all changes\n\n**For `experimental` Version:**\n\n- Breaking changes are allowed\n- Suitable for prototyping and experimentation\n- NOT suitable for external feature development\n- Can be used to validate designs before implementing in stable version\n\n## Core Architecture\n\n### State Management\n\nFlow Registry uses a fixed LangGraph state structure to ensure consistency and compatibility across all components and routers. The state structure is defined in [`duo_workflow_service/agent_platform/v1/state/base.py`](/duo_workflow_service/agent_platform/v1/state/base.py).\n\n#### State Structure\n\nThe state includes the following top-level attributes:\n\n\n2. duo_workflow_service/agents/base.py (score: 0.8795)\n from datetime import datetime, timezone\nfrom typing import Any\n\nfrom langchain_core.messages import BaseMessage\nfrom langchain_core.runnables import RunnableBinding\n\nfrom ai_gateway.prompts import Prompt\nfrom duo_workflow_service.entities.state import (\n MessageTypeEnum,\n SlashCommandStatus,\n ToolInfo,\n ToolStatus,\n UiChatLog,\n)\n\n\nclass BaseAgent(RunnableBinding[Any, BaseMessage]):\n name: str\n prompt: Prompt\n workflow_id: str\n\n def __init__(self, prompt: Prompt, **kwargs) -> None:\n super().__init__(\n prompt=prompt, bound=prompt, **kwargs\n ) # type: ignore[call-arg] # seems that mypy checks only against the immediate parent's init arguments\n\n "
    }
  ],
  "structuredContent": {
    "items": [
      {
        "project_id": 1000000,
        "path": "docs/flow_registry/contribution_guidelines.md",
        "content": " - Removing or renaming a parameter.\n - Non backwards compatible change of the type of the parameter value.\n - Adding a new parameter if it is required.\n- New optional parameters can be added\n- Deprecation warnings should precede removal of functionality (follow GitLab deprecation process)\n- Documentation must be updated with all changes\n\n**For `experimental` Version:**\n\n- Breaking changes are allowed\n- Suitable for prototyping and experimentation\n- NOT suitable for external feature development\n- Can be used to validate designs before implementing in stable version\n\n## Core Architecture\n\n### State Management\n\nFlow Registry uses a fixed LangGraph state structure to ensure consistency and compatibility across all components and routers. The state structure is defined in [`duo_workflow_service/agent_platform/v1/state/base.py`](/duo_workflow_service/agent_platform/v1/state/base.py).\n\n#### State Structure\n\nThe state includes the following top-level attributes:\n\n",
        "name": "contribution_guidelines.md",
        "language": "",
        "_score": 0.88655543,
        "blob_id": "e5d768db3405cc3ff8ef7d70e3affcc63ca62d86",
        "start_byte": 1874,
        "length": 974,
        "start_line": 39
      },
      {
        "project_id": 1000000,
        "path": "duo_workflow_service/agents/base.py",
        "content": "from datetime import datetime, timezone\nfrom typing import Any\n\nfrom langchain_core.messages import BaseMessage\nfrom langchain_core.runnables import RunnableBinding\n\nfrom ai_gateway.prompts import Prompt\nfrom duo_workflow_service.entities.state import (\n MessageTypeEnum,\n SlashCommandStatus,\n ToolInfo,\n ToolStatus,\n UiChatLog,\n)\n\n\nclass BaseAgent(RunnableBinding[Any, BaseMessage]):\n name: str\n prompt: Prompt\n workflow_id: str\n\n def __init__(self, prompt: Prompt, **kwargs) -> None:\n super().__init__(\n prompt=prompt, bound=prompt, **kwargs\n ) # type: ignore[call-arg] # seems that mypy checks only against the immediate parent's init arguments\n\n ",
        "name": "base.py",
        "language": "python",
        "_score": 0.87950397,
        "blob_id": "25b07e48d4f261b5c0a306c001e2d5ac1ddb21d8",
        "start_byte": 0,
        "length": 706,
        "start_line": 1
      }
    ],
    "metadata": {
      "count": 2,
      "has_more": false,
      "confidence": "medium"
    }
  },
  "isError": false
}
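When validating locally, a client can read the bucket from `structuredContent.metadata.confidence` rather than parsing the text body. The sketch below uses a trimmed copy of the response above (the `content` array and `items` entries are omitted for brevity); `extract_confidence` is a hypothetical helper, and returning `nil` for error responses is an assumption.

```ruby
require 'json'

# Trimmed copy of the response shown above; items are omitted for brevity.
RESPONSE_JSON = <<~'JSON'
  {
    "structuredContent": {
      "items": [],
      "metadata": { "count": 2, "has_more": false, "confidence": "medium" }
    },
    "isError": false
  }
JSON

# Returns the confidence bucket as a string, or nil if the call failed
# or the metadata does not carry a confidence value.
def extract_confidence(response)
  return nil if response['isError']

  response.dig('structuredContent', 'metadata', 'confidence')
end

response = JSON.parse(RESPONSE_JSON)
puts extract_confidence(response) # prints "medium"
```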
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #579851 (closed)
Edited by Tian Gao
