feat(mcp): add read_repository_files tool POC

A simple (and ugly) POC outlining proper tool schema design and potential implementation paths.

See !204566 (comment 2746436731) and !203763 (closed) for more details.

In the simple case, read_repository_file is a 1:1 mapping onto the API endpoint. That makes it a granular tool, and we run the risk of saturating the context window.

Take a look at this API https://docs.gitlab.com/api/repository_files/#get-file-from-repository

I can't see any parameter for requesting a line range, or any way to "expand" a file that was truncated because it exceeds a reasonable line limit.

So the current tool params look like:

Current 1:1 Mapped Tool Schema:

{
  "tool": "read_repository_file",
  "parameters": {
    "project_id": "string",
    "file_path": "string",
    "ref": "string"
  }
}

This returns the entire file content, base64-encoded, which could be massive and immediately blow through context limits.
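To make the size concern concrete, here is a minimal Python sketch of what a 1:1 tool ends up handing back. It assumes the endpoint's JSON body carries the file in a base64 `content` field next to an `encoding` field, as the linked API page describes. The `sample_payload` is built locally for illustration, and `decode_file_payload` is a hypothetical helper, not part of this POC.

```python
import base64


def decode_file_payload(payload: dict) -> str:
    """Decode the `content` field of a repository-files style response."""
    if payload.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {payload.get('encoding')!r}")
    return base64.b64decode(payload["content"]).decode("utf-8")


# Locally built stand-in for an API response (not real data): a 450-line file.
source = "\n".join(f"line {i}" for i in range(1, 451))
sample_payload = {
    "file_path": "app/models/user.rb",
    "encoding": "base64",
    "content": base64.b64encode(source.encode("utf-8")).decode("ascii"),
}

text = decode_file_payload(sample_payload)
# The encoded payload is larger than the decoded text, and every line comes
# back whether or not the model needs it.
print(len(sample_payload["content"]), len(text), len(text.splitlines()))
```

Any line-range support therefore has to live in the tool itself: fetch, decode, then slice before anything reaches the model.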

Better Tool Design with Context Engineering:

{
  "tool": "read_repository_files",
  "parameters": {
    "project_path": "string",
    "files": [
      {
        "path": "string",
        "ref": "string",
        "line_start": "integer (optional)",
        "line_end": "integer (optional)",
        "max_lines": "integer (optional, default: 100)"
      }
    ],
    "include_metadata": "boolean (default: true)"
  }
}
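One way the per-file parameters could be interpreted is sketched below in Python. The semantics are assumptions, not something the schema fixes: line numbers are 1-based and inclusive, `max_lines` caps the window even when an explicit `line_end` asks for more, and `truncated` means the returned window does not cover the whole file. `FileSlice` and `select_lines` are hypothetical names (Python 3.9+).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileSlice:
    total_lines: int
    start: int        # 1-based, inclusive
    end: int          # 1-based, inclusive
    truncated: bool   # True when the window does not cover the whole file
    lines: list[str]


def select_lines(
    content: str,
    line_start: Optional[int] = None,
    line_end: Optional[int] = None,
    max_lines: int = 100,
) -> FileSlice:
    """Return at most `max_lines` lines of `content`, starting at `line_start`."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")

    all_lines = content.splitlines()
    total = len(all_lines)

    start = max(1, line_start or 1)
    requested_end = line_end if line_end is not None else total
    # Assumed rule: never return more than max_lines, and never read past EOF.
    end = min(requested_end, start + max_lines - 1, total)

    lines = all_lines[start - 1:end]
    return FileSlice(
        total_lines=total,
        start=start,
        end=end,
        truncated=len(lines) < total,
        lines=lines,
    )
```

With the defaults, a 450-line file yields lines 1 to 100 with `truncated` set, and a follow-up call with `line_start=101, line_end=200` returns the next window.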

Example Response with System Instructions:

<repository_files>
  <file path="app/models/user.rb" ref="main">
    <metadata>
      <total_lines>450</total_lines>
      <returned_lines start="1" end="100"/>
      <truncated>true</truncated>
      <size_bytes>15234</size_bytes>
    </metadata>
    <content>
      <!-- First 100 lines of content here -->
    </content>
    <system_instruction>
      File truncated. To view more content, use:
      - Lines 101-200: {"line_start": 101, "line_end": 200}
      - Lines around specific area: {"line_start": 250, "line_end": 300}
      - Remaining lines: 350 lines available
    </system_instruction>
  </file>
  <file path="config/routes.rb" ref="main">
    <metadata>
      <total_lines>75</total_lines>
      <returned_lines start="1" end="75"/>
      <truncated>false</truncated>
    </metadata>
    <content>
      <!-- Complete file content -->
    </content>
  </file>
</repository_files>
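A response in this shape could be assembled with Python's standard `xml.etree.ElementTree`, which also escapes characters such as `<` and `&` inside file content so the envelope stays well-formed. This is a sketch under assumptions: `render_file` and `render_response` are hypothetical helpers, `size_bytes` and the `include_metadata` switch are left out for brevity, and the hint only proposes the next window of the same size.

```python
import json
import xml.etree.ElementTree as ET


def render_file(path, ref, total_lines, start, end, lines):
    """Build one <file> element with metadata, content and an optional hint."""
    file_el = ET.Element("file", {"path": path, "ref": ref})

    meta = ET.SubElement(file_el, "metadata")
    ET.SubElement(meta, "total_lines").text = str(total_lines)
    ET.SubElement(meta, "returned_lines", {"start": str(start), "end": str(end)})
    truncated = len(lines) < total_lines
    ET.SubElement(meta, "truncated").text = "true" if truncated else "false"

    ET.SubElement(file_el, "content").text = "\n".join(lines)

    # Only emit a navigation hint when lines remain after the returned window.
    if end < total_lines:
        window = end - start + 1
        next_range = {
            "line_start": end + 1,
            "line_end": min(end + window, total_lines),
        }
        hint = ET.SubElement(file_el, "system_instruction")
        hint.text = (
            "File truncated. To view more content, use: "
            f"{json.dumps(next_range)}. "
            f"Remaining lines: {total_lines - end} lines available"
        )
    return file_el


def render_response(files):
    """Wrap every rendered file in a single <repository_files> envelope."""
    root = ET.Element("repository_files")
    for spec in files:
        root.append(render_file(**spec))
    return ET.tostring(root, encoding="unicode")


print(render_response([
    {"path": "config/routes.rb", "ref": "main", "total_lines": 2,
     "start": 1, "end": 2, "lines": ["a < b", "done"]},
]))
```

Generating the hint from the same numbers as the metadata keeps the two from drifting apart, which matters if the model is expected to act on them.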

This approach:

  1. Prevents context saturation by defaulting to reasonable chunks
  2. Allows batch operations with read_repository_files (plural), reducing round trips
  3. Provides navigation hints so the model knows how to request more
  4. Uses XML structure for better model parsing (as per context engineering best practices)
  5. Includes metadata to inform decision-making without requiring full content

This is precisely why we need proper tool design patterns rather than just exposing raw API endpoints through MCP.

Edited by Michael Angelo Rivera
