Post-process strategy: group snippets by file

What does this MR do and why?

Issue: #579852 (closed)

Reference: !219658 (merged), gitlab-org#19745

Feature flag rollout issue: #587194

Multiple snippets that belong to the same logical group (the same file) can be hard for the LLM to reason about. For example, 4 snippets from server.rb: the first spans lines 1-10, the second lines 11-20, the third lines 100-121, and the fourth lines 120-130.

These snippets share the same project_id, path, file_name, language, and blob_id. Repeating this information for every snippet might negatively affect the LLM, so we can group the snippets by file and merge them.
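As a rough sketch of the grouping step (not the actual implementation; the snippet hash shape and the group_snippets_by_file helper are made up for illustration), snippets can be grouped on their shared file identity so the file-level fields appear once and the per-snippet data becomes nested ranges:

# Hypothetical sketch: `snippets` is assumed to be an array of hashes shaped
# like the pre-grouping search results (one hash per snippet).
def group_snippets_by_file(snippets)
  snippets
    .group_by { |s| s.values_at(:project_id, :path, :blob_id) }
    .map do |_file_key, group|
      first = group.first
      {
        path: first[:path],
        project_id: first[:project_id],
        language: first[:language],
        blob_id: first[:blob_id],
        # Each original snippet becomes one range; the file-level metadata
        # above is no longer repeated per snippet.
        ranges: group.map { |s| s.slice(:start_line, :end_line, :content, :score) },
        # The file keeps the best score among its ranges.
        score: group.map { |s| s[:score] }.max
      }
    end
    .sort_by { |item| -item[:score] }
end

The structuredContent example further down has exactly this shape: one item per file, with the per-snippet lines, content, and score nested in a ranges array.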

After this change, the content looks like:


Confidence: MEDIUM

1. tests/api/v2/test_v2_code.py (score: 0.7998)
   Lines 75-77:
    @pytest.mark.parametrize("prompt_version", [1])
    def test_request_latency(
        self,
        prompt_version: int,
        mock_client: TestClient,
        mock_completions: Mock,
    ):
   Lines 141-142:
        if prompt_version == 2:
            data.update(
                {
                    "prompt": current_file["content_above_cursor"],
                }
            )
            
   Lines 170-170:
        def get_request_duration(cap_logs):
            event = 'testclient:50000 - "POST /completions HTTP/1.1" 200'
            entry = next(entry for entry in cap_logs if entry["event"] == event)

            return entry["duration_request"]


2.......
........
........

References

#587194

Screenshots or screen recordings


How to set up and validate locally

  1. Set up the MCP client: !205297 (comment 2756113040)

  2. Enable the feature flag (a console sketch for checking and reverting it follows the sample response below):

::Feature.enable(:post_process_semantic_code_search_group_by_file)

  3. Run a semantic code search through the MCP client and verify the grouped response (Screenshot_2026-01-21_at_15.59.38):
{
  "content": [
    {
      "type": "text",
      "text": "Confidence: MEDIUM\n\n1. tests/api/v2/test_v2_code.py (score: 0.7998)\n   Lines 141-142:\n                  ),\n            \n\n2. tests/duo_workflow_service/components/human_approval/test_tools_approval.py (score: 0.7981)\n   Lines 390-390:\n                  ]"
    }
  ],
  "structuredContent": {
    "items": [
      {
        "path": "tests/api/v2/test_v2_code.py",
        "project_id": 1000000,
        "language": "python",
        "blob_id": "f8be989993d07f0e7d472e32db1f6f7b0bc00981",
        "ranges": [
          {
            "start_line": 141,
            "end_line": 142,
            "content": "            ),\n            ",
            "score": 0.7998023
          }
        ],
        "score": 0.7998023
      },
      {
        "path": "tests/duo_workflow_service/components/human_approval/test_tools_approval.py",
        "project_id": 1000000,
        "language": "python",
        "blob_id": "96985e997aacef1469d89769173b55222fa10507",
        "ranges": [
          {
            "start_line": 390,
            "end_line": 390,
            "content": "            ]",
            "score": 0.7981448
          }
        ],
        "score": 0.7981448
      }
    ],
    "metadata": {
      "count": 2,
      "has_more": false,
      "confidence": "medium"
    }
  },
  "isError": false
}
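For local testing, the flag can be checked and reverted from a Rails console. This assumes the standard GitLab Feature helpers; if the flag is actor-scoped, pass the actor as a second argument:

Feature.enabled?(:post_process_semantic_code_search_group_by_file) # check current state
Feature.enable(:post_process_semantic_code_search_group_by_file)   # grouped output
Feature.disable(:post_process_semantic_code_search_group_by_file)  # back to ungrouped output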

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #579852 (closed)

[skip feature-flag]
