Proposal: Re-Evaluate AI Agent Tools Design and Architecture
Problem to Solve
High-Level Summary
The current tool architecture for our AI Features may face several challenges that impact performance, cost, model accuracy, and user experience in the near future. As the number of available tools increases, the context provided to the language model in each request expands, which can lead to higher latency and a greater likelihood that the model will select the incorrect tool.
Furthermore, the design of some tool parameters, such as the use of internal GitLab numeric or global IDs, does not align with the public-facing data formats (like project paths and IIDs) on which LLMs are predominantly trained. This misalignment can lead to less reliable tool usage.
Finally, the flat structure of the toolset creates ambiguity between tools that operate on a user's local environment (via an executor) and those that interact with remote GitLab resources, which can cause confusion for the model and result in degraded behavior.
- Original Finding: !3027 (comment 2649388900)
- Slack Thread: #1328 (comment 2652591727)
Problem 1: Tool Proliferation and Performance Degradation
The number of tools provided to the language model in every request is growing.
The Anthropic documentation outlines that:
When you call the Anthropic API with the tools parameter, we construct a special system prompt from the tool definitions, tool configuration, and any user-specified system prompt. The constructed prompt is designed to instruct the model to use the specified tool(s) and provide the necessary context for the tool to operate properly.
A recent analysis found as many as 28 distinct tools being included in the system prompt for a single request. This practice can have direct, negative consequences on performance, as outlined in Anthropic's documentation on tool use. Every tool's name, description, and schema are added to the prompt, which increases the input token count.
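To make the overhead concrete, it can be measured directly. The sketch below uses Anthropic's token counting endpoint to compare the same prompt with and without a tool list bound; the tool definition and model ID are illustrative placeholders, not our production values.

```python
import anthropic

client = anthropic.Anthropic()

# Placeholder tool definition; in practice all ~28 tool schemas would be listed.
tools = [
    {
        "name": "get_merge_request",
        "description": "Fetch details about the merge request.",
        "input_schema": {
            "type": "object",
            "properties": {
                "project_id": {"type": "string"},
                "merge_request_iid": {"type": "integer"},
            },
            "required": ["project_id", "merge_request_iid"],
        },
    },
]

messages = [{"role": "user", "content": "Summarize gitlab-org/gitlab#123"}]

# Compare input token counts with and without the tool list in the request.
baseline = client.messages.count_tokens(model="claude-sonnet-4-20250514", messages=messages)
with_tools = client.messages.count_tokens(
    model="claude-sonnet-4-20250514", messages=messages, tools=tools
)
print(baseline.input_tokens, with_tools.input_tokens)
```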
Full Original Slack Message
Hi Team
- Is there a way to disable these tools provided to the LLM? It would be helpful to test the traversal in isolation. We've set up AIGW locally and can connect successfully with a hard-coded string, but is there a setting for filtering out tools sent to the LLM on the initial prompt call?
- Will adding additional tools to the number of tools here degrade the quality and speed of the results? We are looking to add 3-4 tools. I've attached our tool list for comparison with [redacted]'s tool list for the initial prompt. Thank you!
A recent benchmark from LangChain found:
- Both more context and more tools degrade agent performance
- Agents that require longer trajectories degrade more quickly
Other reading on this topic:
Additionally, the following questions were asked initially:
- It seems we input every tool in this list in every request to the provider. Have we performed evals for each of these tools?
- Do we already have a strategy for handling our growing list of tools? If so, can someone point me to the issue or discussion?
- If not, should we discuss this further in a new issue? Where is the best place to put this issue?
You can find answers to these questions in the Slack thread copied into the issue.
Problem 2: Tool Parameter Design
The expansive tool list described in Problem 1 is compounded by tool parameter design choices that further degrade performance. From what I can see, AIGW contains several tool implementations that accept a combination of GitLab global IDs (gids) or project paths (fullPath) along with internal IDs (iid).
The AI Gateway tool below uses project_id as a parameter; however, this may be problematic.
LLMs are trained mainly on publicly available data, such as GitLab URLs and the GitLab API documentation, which typically include project paths (e.g., `gitlab-org/gitlab`) and iids (e.g., `#123` for issues). Gids, which are internal identifiers like `gid://gitlab/Issue/123456`, are statistically likely to be less common in training data. Based on my own experience building the deep research agent, I have found that LLMs are not able to use gids (or "project ids") as well as project paths and iids. Below are examples of tools in the AI Gateway using gids and how we can explore changing them to improve tool call effectiveness.
Example of Tool Using Global/Numerical IDs vs. Project Paths
The following example is based on the get_merge_request tool in the AI Gateway repository's duo_workflow_service/tools/merge_request.py file. This tool fetches merge request details and uses project_id (a string that can be a numerical ID or global identifier) and merge_request_iid. Cross-reference this with this sample MCP server (src/main.ts), which uses fullPath (project path) and iid in its ServerItemToolArgsSchema.
- **`get_merge_request` Tool**
  - **File:** `duo_workflow_service/tools/merge_request.py`
  - **Current Implementation:** The tool uses `project_id` (str, often a numerical/global ID like `13` or `gid://gitlab/Project/1234`) and `merge_request_iid` (int) to identify the project and MR. The description allows either `project_id` or a URL, but the examples prioritize a numerical `project_id`. Validation resolves URLs to a `project_id` for API calls.

    ```python
    class GetMergeRequest(DuoBaseTool):
        name: str = "get_merge_request"
        description: str = f"""Fetch details about the merge request.

        {MERGE_REQUEST_IDENTIFICATION_DESCRIPTION}

        For example:
        - Given project_id 13 and merge_request_iid 9, the tool call would be:
          get_merge_request(project_id=13, merge_request_iid=9)
        - Given the URL https://gitlab.com/namespace/project/-/merge_requests/103, the tool call would be:
          get_merge_request(url="https://gitlab.com/namespace/project/-/merge_requests/103")
        """
        args_schema: Type[BaseModel] = MergeRequestResourceInput
    ```

  - **Potential Problem:** When `project_id` is used as a numerical/global ID (e.g., `13`), it does not match public training data. LLMs encounter URLs with paths like `gitlab-org/gitlab` and iids like `123` (e.g., `gitlab-org/gitlab!123`). Numerical IDs are internal and less common in training corpora. Even though URLs are supported, the tool defaults to resolving them to a `project_id` for API calls, and the examples emphasize numerical IDs.
  - **Cross-Reference with this sample MCP Server:** The sample MCP server handles merge requests via the `fetch_gitlab_resource` tool (`src/main.ts`), using `fullPath` (str, e.g., `gitlab-org/gitlab`) and `iid` (str) in `ServerItemToolArgsSchema` (`src/main.ts`):

    ```typescript
    export const ServerItemToolArgsSchema = z.array(
      z.object({
        itemType: z.enum([ItemType.Issue, ItemType.MergeRequest, ItemType.Epic]),
        iid: z.string(),
        fullPath: z.string(),
      }),
    );

    server.registerTool(
      "fetch_gitlab_resource",
      {
        title: "Fetch GitLab Resource",
        description:
          "Fetches detailed information about GitLab resources like issues, merge requests, and epics.",
        inputSchema: {
          items: ServerItemToolArgsSchema,
        },
      },
      async ({ items }) => {
        const fetchedResources = await fetchGitLabItems(items);
        const content = fetchedResources.map(formatItemAsXml).join("\n\n");
        return { content: [{ type: "text", text: content }] };
      },
    );
    ```

    In `fetchGitLabItems` (`src/tools.ts`), it uses `fullPath` and `iid` for API fetches:

    ```typescript
    switch (itemType) {
      case ItemType.MergeRequest:
        resource = await cachedFetcher.getMergeRequestDetailsByFullPath(fullPath, iid, true);
        // ...
    }
    ```

    This aligns with public data (e.g., parsing `gitlab-org/gitlab!123`).
Why use publicly accessible ID formats?
- **LLM Training Data:** LLMs see public GitLab URLs with paths (`gitlab-org/gitlab`) and iids (`!123`). Numerical/global IDs are internal and rare in training data, which can lead to inaccurate tool calls. While models are expected to become smarter with reasoning over time, we should be mindful of customers with custom models or models with fewer parameters.
- **Convention:** Moving to parameters like `fullPath` and `iid` (which can be optional in the schema) further opens the possibility of using the same tool for several resources like issues, epics, etc. A minimal schema sketch follows this list.
- **Current Architecture:** Based on what I see in the overall AIGW/Workflow architecture, the AIGW will always execute resource fetching and iteration through the GitLab API, meaning that the agentic code is not executed in Rails. Because of this, I don't see using public IDs as a one-way door.
- **API Limitations:** Based on my experience writing both the GitLab Research Agent and gitlab-wrapped, Epics cannot be fetched directly via `gid` and are only accessible via `fullPath` and `iid`. The group ID, which is the full path, is always required.
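To illustrate the direction, here is a minimal sketch of an input schema aligned with public identifier formats. The class and field names (`GitLabResourceInput`, `full_path`, `iid`) are hypothetical and would need to be adapted to the existing AIGW schema conventions.

```python
from typing import Optional

from pydantic import BaseModel, Field


class GitLabResourceInput(BaseModel):
    """Hypothetical args schema that mirrors how resources appear in public URLs."""

    full_path: str = Field(
        description="Full project or group path, e.g. 'gitlab-org/gitlab'."
    )
    iid: Optional[int] = Field(
        default=None,
        description=(
            "Internal ID as it appears in URLs and references, e.g. 123 for "
            "gitlab-org/gitlab!123. Optional for project- or group-level lookups."
        ),
    )
```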
Problem 3: Ambiguity Between Local and Remote Tools
When GitLab Duo operates in an environment with a local code repository (e.g., in an IDE), the LLM is presented with tools for both local file operations (provided by the executor) and remote GitLab repository operations (provided by the AI Gateway). With both Problem 1 and Problem 2, this can create ambiguity and can lead to the model choosing the wrong tool for the task.
This issue was highlighted by @elwyn-gitlab (link):
Can we discourage duo from reading remote repository files when running in a local repo environment? It's slower and risks duo not understanding the actual local state of the codebase it's working on. Example workflow 1032915
Also highlighted by @jfypk (link):
the run_command isn't allowed to run git commands. there's a separate run_git_command tool that isn't available in the tool yet. there's some discussion here: gitlab-org/gitlab#556137 (comment 2633861243)
@lkorbasiewicz thanks @jeff Park yeah that I understand, my point is that it would be nice if the extension knew about it and didn’t try using a tool it doesn’t have
@jfypk we actually had a conversation with Anthropic about this yesterday. Some of the problem lies in the sheer number of tools we have available for the extension. cc: @jessie Young
This confusion arises because the LLM may see two tools with similar purposes but different contexts:
- **Local File Tool:** `read_file` from `duo_workflow_service/tools/filesystem.py`, which reads a file from the local disk via the executor.
- **Remote File Tool:** `get_repository_file` from `duo_workflow_service/tools/repository_files.py`, which reads a file from a remote GitLab repository via the GitLab API.
For a task like "read the contents of main.py," the model might incorrectly choose get_repository_file, fetching the version from the remote default branch instead of the user's local, potentially uncommitted version.
This can potentially be caused by:
- Problem 1: an expansive tool list covering every GitLab and Coding Agent use case
- Problem 2: many parameters for every tool in that expansive list
- Problem 3 (the compounded problem): a flat tool structure with many tools and parameters, where tools can be confused with each other
Proposed Solution
While the system works today, we can and should explore further enhancements to provide a best-in-class user experience. Here are four potential solution paths the team has discussed; they can be implemented independently or in combination.
Option 1: Tool Consolidation
We can reduce the number of tools by combining multiple similar tools into a single, more generic tool with a parameter that specifies the desired action. This approach simplifies the toolset presented to the LLM and reduces the number of tokens used.
A primary candidate for consolidation is fetching GitLab resources. The tools get_issue, get_merge_request, and get_epic can be merged into a single fetch_gitlab_resource tool.
This sample MCP server at michaelangeloio/gitlab-mcp provides a working example of this.
MCP Server Tool Definition (src/main.ts):
server.registerTool(
"fetch_gitlab_resource",
{
title: "Fetch GitLab Resource",
description: "Fetches detailed information about GitLab resources like issues, merge requests, and epics.",
inputSchema: {
items: ServerItemToolArgsSchema, // Uses the schema with itemType, iid, and fullPath
},
},
async ({ items }) => {
const fetchedResources = await fetchGitLabItems(items);
const content = fetchedResources.map(formatItemAsXml).join("\n\n");
return {
content: [{ type: "text", text: content }],
};
},
);
The implementation in src/tools.ts uses a switch statement on the itemType to call the appropriate API endpoint.
export async function fetchGitLabItems(items: FetchItemsArgs): Promise<FetchedResource[]> {
// ...
for (const item of items) {
const { itemType, fullPath, iid } = item;
// ...
try {
switch (itemType) {
case ItemType.MergeRequest:
resource = await cachedFetcher.getMergeRequestDetailsByFullPath(fullPath, iid, true);
// ...
break;
case ItemType.Issue:
resource = await cachedFetcher.getIssueDetailsByFullPath(fullPath, iid);
// ...
break;
case ItemType.Epic:
resource = await cachedFetcher.getEpicDetailsByFullPath(fullPath, iid);
// ...
break;
// ...
}
// ...
}
}
}
To implement this in the AI Gateway, we could replace duo_workflow_service/tools/issue.py, merge_request.py, and epic.py with a single gitlab_resource.py file containing the consolidated tool.
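A rough sketch of what such a consolidated tool could look like follows. The `ResourceType` enum, the `GitLabResourceInput` schema, the import path, and the `_arun` body are assumptions for illustration; only the `name`/`description`/`args_schema` attributes mirror the existing `DuoBaseTool` pattern shown above.

```python
from enum import Enum
from typing import Optional, Type

from pydantic import BaseModel, Field

# DuoBaseTool is the existing AIGW base class; the import path here is assumed.
from duo_workflow_service.tools.duo_base_tool import DuoBaseTool


class ResourceType(str, Enum):
    ISSUE = "issue"
    MERGE_REQUEST = "merge_request"
    EPIC = "epic"


class GitLabResourceInput(BaseModel):
    resource_type: ResourceType
    full_path: str = Field(description="Project or group path, e.g. 'gitlab-org/gitlab'.")
    iid: Optional[int] = Field(default=None, description="Resource iid, e.g. 123.")


class FetchGitLabResource(DuoBaseTool):
    name: str = "fetch_gitlab_resource"
    description: str = """Fetch details about a GitLab issue, merge request, or epic.

    For example:
    - fetch_gitlab_resource(resource_type="merge_request", full_path="gitlab-org/gitlab", iid=123)
    """
    args_schema: Type[BaseModel] = GitLabResourceInput

    async def _arun(
        self, resource_type: ResourceType, full_path: str, iid: Optional[int] = None
    ) -> str:
        # Dispatch on resource_type, mirroring the switch statement in the
        # sample MCP server's fetchGitLabItems. Actual API calls omitted.
        ...
```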
Additionally, we can consider adding more resources to this tool:
- remote repository file fetching, by adding a `RepositoryFile` type
- retrieving project information (replacing `project.py`), since `iid` is optional and the LLM is already trained on GitLab project paths
- any other GitLab resources that can be fetched via `fullPath` and `iid`
Furthermore, if we adopt @mikolaj_wawrzyniak's proposal, this resource tool can be exposed by the official GitLab MCP server: gitlab-org&18413 (closed)
How this addresses tool confusion: A single, well-described, semantically distinct `fetch_gitlab_resource` can help with:
- Problem 1, by reducing the token count
- Problem 2, by using identifier formats the LLM has seen in public training data, which it can supply more reliably
- Problem 3, by being semantically distinct from `read_file`, which reduces the chance of the LLM confusing a request to read a local file with a request to fetch a remote GitLab entity
Option 2: Tool Retrieval
This approach, outlined by @jshobrook1, introduces a preliminary LLM call that acts as a "retrieval" or "routing" step. Based on the user's initial prompt, this step can select a small, relevant subset of tools to provide to the main agent for execution.
Example Flow:
- **User Prompt:** "Summarize the discussion in issue `gitlab-org/gitlab#123`."
- **Tool Retrieval LLM Call:**
  - Input: User prompt + full list of all possible tools.
  - Output: A list of relevant tool names, e.g., `['fetch_gitlab_resource']`.
- **Main Agent LLM Call:**
  - Input: User prompt + only the `fetch_gitlab_resource` tool definition.
  - Output: A call to `fetch_gitlab_resource` with the correct arguments.
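A minimal sketch of the retrieval step, assuming a generic `llm_complete` callable rather than any specific client, could look like this; the prompt wording and the `max_tools` cutoff are illustrative only.

```python
from typing import Callable, Dict, List

ToolDef = Dict[str, str]  # e.g. {"name": "fetch_gitlab_resource", "description": "..."}


def retrieve_relevant_tools(
    llm_complete: Callable[[str], str],
    user_prompt: str,
    all_tools: List[ToolDef],
    max_tools: int = 5,
) -> List[ToolDef]:
    """Preliminary, cheap LLM call that selects the subset of tools to expose."""
    catalog = "\n".join(f"- {t['name']}: {t['description']}" for t in all_tools)
    routing_prompt = (
        f"User request: {user_prompt}\n\n"
        f"Available tools:\n{catalog}\n\n"
        f"Reply with a comma-separated list of at most {max_tools} tool names."
    )
    selected = {name.strip() for name in llm_complete(routing_prompt).split(",")}
    # The main agent call then binds only this filtered subset as tools.
    return [tool for tool in all_tools if tool["name"] in selected]
```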
Other examples of similar concepts at play:
- `semantic-router`: https://github.com/aurelio-labs/semantic-router
- `llama-index` routers: https://docs.llamaindex.ai/en/stable/module_guides/querying/router/
- `langchain` routing: https://python.langchain.com/v0.1/docs/use_cases/query_analysis/techniques/routing/
How this addresses tool confusion: This method acts as a dynamic filter. For a prompt about local file modifications, the retrieval step would select only local filesystem tools (`read_file`, `edit_file`). For a prompt about a remote MR, it would select only GitLab resource tools. By ensuring the main agent only sees the tools relevant to the current task, we reduce the chance of it choosing an inappropriate tool.
Option 3: Sub-Agent System
We can explore routing tasks to specialized "sub-agents," each equipped with a limited and specific toolset.
This pattern is supported natively by tools like Claude Code, as detailed in Anthropic's documentation on sub-agents.
Sub agents are pre-configured AI personalities that Claude Code can delegate tasks to. Each sub agent:
- Has a specific purpose and expertise area
- Uses its own context window separate from the main conversation
- Can be configured with specific tools it’s allowed to use
- Includes a custom system prompt that guides its behavior
When Claude Code encounters a task that matches a sub agent's expertise, it can delegate that task to the specialized sub agent, which works independently and returns results.
The GitLab Research Agent is another example of this architecture, which was built using LangGraph.
- An Orchestrator Agent (`orchestrator_agent/graph.ts`) analyzes the user's request and determines which GitLab items need to be researched.
- It then invokes a Server Item Agent (`server_item_agent/graph.ts`) for each item. This sub-agent has a specialized toolset for deep research on GitLab resources, defined in `server_item_agent/nodes/research_iteration.ts`:

  ```typescript
  // gitlab_research_agent_source/server_item_agent/nodes/research_iteration.ts
  const AllToolSchemas = {
    [TOOL_GRAPH_DB_BFS]: tool({
      description: "Search the graph database...",
      parameters: GraphDBToolArgsSchema,
    }),
    [TOOL_FETCH_COMMENTS]: tool({
      description: "Fetch all comments for a given GitLab Issue, Merge Request, or Epic.",
      parameters: FetchCommentsToolArgsSchema,
    }),
    [TOOL_FETCH_MR_FILE_CONTENT]: tool({
      description: "Fetch the content of one or more files from a given Merge Request.",
      parameters: FetchMrFileToolArgsSchema,
    }),
    // ... and other specialized tools
  };
  ```
This architecture can be extended to create distinct sub-agents for different domains:
- **Codebase Agent:** Equipped only with local file tools (`read_file`, `edit_file`, `list_dir`, `grep`).
- **GitLab Resource Agent:** Equipped only with GitLab resource tools (`fetch_gitlab_resource`, `update_gitlab_resource`).
- **GitLab Knowledge Graph Agent:** Equipped with specialized tools for the Knowledge Graph.
For graphs like chat, we can pass the conversation history to each agent as they decide the next step in the graph node.
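As a sketch of how restricted toolsets per sub-agent could be wired (independent of LangGraph specifics), the sub-agent names, toolsets, and the `run_agent` callable below are hypothetical:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubAgent:
    name: str
    system_prompt: str
    tools: List[str]  # each sub-agent is bound only to its own small toolset


SUB_AGENTS: Dict[str, SubAgent] = {
    "codebase": SubAgent(
        name="codebase_agent",
        system_prompt="You operate on the user's local working tree via the executor.",
        tools=["read_file", "edit_file", "list_dir", "grep"],
    ),
    "gitlab_resources": SubAgent(
        name="gitlab_resource_agent",
        system_prompt="You operate on remote GitLab resources via the API.",
        tools=["fetch_gitlab_resource", "update_gitlab_resource"],
    ),
}


def route_task(domain: str, task: str, run_agent: Callable[[SubAgent, str], str]) -> str:
    """Orchestrator delegates the task to the sub-agent for the chosen domain."""
    agent = SUB_AGENTS[domain]
    # Only agent.tools is bound to the LLM call, which keeps the prompt small
    # and avoids the local-vs-remote ambiguity described in Problem 3.
    return run_agent(agent, task)
```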
Option 4: Session-Based Tool Routing via MCP Server
The official GitLab MCP server, Language Server (node executor), and AIGW could implement a session-based routing mechanism that allows the LLM to control its available toolset dynamically during a conversation. This was previously proposed by @john-slaughter.
This would involve a meta-tool that the LLM can call to switch its "context" or active toolset.
- The LLM analyzes the user's request.
- It calls a special meta-tool on the MCP server, such as `activate_toolset(toolset_name='ci_operations')`.
- For the duration of the session (or until deactivated), the MCP server modifies the context provided to the LLM to only include tools from the activated toolset (e.g., `read_pipeline`, `run_job`). Other tools are hidden from the LLM's view.
- The LLM can switch contexts by calling the meta-tool again (e.g., `activate_toolset(toolset_name='gitlab_resources')`).
With @mikolaj_wawrzyniak's ADR for making MCP a first-class citizen, we can expand this methodology to the IDE and language server. This would allow the Language Server to become an MCP server with the same session-based tool router in addition to the Rails MCP Router.
Example Flow:
- **User:** "Refactor the `utils.py` file in my current project."
- **LLM -> MCP:** `activate_toolset(toolset_name='local_filesystem')`
- **MCP -> LLM:** The next prompt from the MCP server to the LLM will only contain definitions for `read_file`, `edit_file`, `list_dir`.
- **LLM -> MCP:** `read_file(file_path='utils.py')`
- (...refactoring continues...)
- **User:** "Now, create a merge request with these changes."
- **LLM -> MCP:** `activate_toolset(toolset_name='gitlab_resources')`
- **MCP -> LLM:** The next prompt from the MCP server to the LLM will only contain definitions for `create_gitlab_resource`, `fetch_gitlab_resource`, etc.
- **LLM -> MCP:** `create_gitlab_resource(...merge_request)`
Architecture Diagram
sequenceDiagram
participant User
participant LS as Language Server<br/>(MCP)
participant AIGW as AI Gateway
participant LLM
participant Rails as GitLab Rails<br/>(MCP)
User->>LS: "Read utils.py"
LS->>AIGW: Request + all tools
AIGW->>LLM: Request + all tools
LLM->>AIGW: activate_toolset('local')
AIGW->>LS: activate_toolset('local')
LS-->>AIGW: local tools only
AIGW->>LLM: Updated context
LLM->>AIGW: read_file()
AIGW->>LS: read_file()
LS-->>AIGW: file content
AIGW-->>LS: Response
LS-->>User: Response
User->>LS: "Create MR"
LS->>AIGW: Request
AIGW->>LLM: Request
LLM->>AIGW: activate_toolset('gitlab')
AIGW->>LS: activate_toolset('gitlab')
LS->>Rails: GET /mcp/tools (gitlab)
Rails-->>LS: gitlab tools
LS-->>AIGW: gitlab tools only
AIGW->>LLM: Updated context
LLM->>AIGW: create_merge_request()
AIGW->>LS: create_merge_request()
LS->>Rails: POST /create_merge_request
Rails-->>LS: MR created
LS-->>AIGW: MR created
AIGW-->>LS: Response
LS-->>User: Response
How this addresses tool confusion: This is a direct and explicit solution where the LLM itself manages its context. By activating a specific toolset, it removes tool ambiguity and ensures it can only call tools that are appropriate for the next immediate task.
Pseudo Language Server Code
// GitLab Language Server - Maintains toolset state per session
class GitLabLanguageServerMCP extends McpServer {
private currentToolset: string = 'default';
private sessionId: string;
private toolRegistry = {
// Local filesystem tools
'read_file': {
name: 'read_file',
description: 'Read contents of a file from local filesystem',
inputSchema: { /* ... */ },
handler: this.readFile.bind(this),
toolsets: ['local_filesystem', 'default']
},
'edit_file': {
name: 'edit_file',
description: 'Edit a file in the local filesystem',
inputSchema: { /* ... */ },
handler: this.editFile.bind(this),
toolsets: ['local_filesystem', 'default']
},
'list_dir': {
name: 'list_dir',
description: 'List contents of a directory',
inputSchema: { /* ... */ },
handler: this.listDirectory.bind(this),
toolsets: ['local_filesystem', 'default']
},
// GitLab resource tools (these are derived from MCP protocol from GitLab Rails)
'fetch_gitlab_resource': {
// ...
},
'create_gitlab_resource': {
// ...
},
// Special toolset activation tool
'activate_toolset': {
name: 'activate_toolset',
description: 'Switch the active toolset context',
inputSchema: {
type: 'object',
properties: {
toolset_name: {
type: 'string',
enum: ['local_filesystem', 'gitlab_resources', 'ci_operations', 'default'],
description: 'The toolset to activate'
}
},
required: ['toolset_name']
},
handler: this.activateToolset.bind(this),
toolsets: ['*'] // Always available
}
};
async handleRequest(request: JSONRPCRequest) {
switch (request.method) {
case 'tools/list':
return this.listAvailableTools();
case 'tools/call':
return this.callTool(request.params);
default:
throw new Error(`Unknown method: ${request.method}`);
}
}
async listAvailableTools() {
// Filter tools based on current toolset state
const availableTools = Object.values(this.toolRegistry).filter(tool =>
tool.toolsets.includes(this.currentToolset) ||
tool.toolsets.includes('*')
);
return {
tools: availableTools.map(tool => ({
name: tool.name,
description: tool.description,
inputSchema: tool.inputSchema
})),
metadata: {
activeToolset: this.currentToolset,
sessionId: this.sessionId
}
};
}
async activateToolset(args: { toolset_name: string }) {
const previousToolset = this.currentToolset;
this.currentToolset = args.toolset_name;
// Get newly available tools
const availableTools = await this.listAvailableTools();
return {
content: [{
type: 'text',
text: `Toolset switched from '${previousToolset}' to '${args.toolset_name}'. Available tools: ${availableTools.tools.map(t => t.name).join(', ')}`
}],
metadata: {
toolsetChanged: true,
previousToolset,
currentToolset: this.currentToolset,
toolCount: availableTools.tools.length
}
};
}
async callTool(params: { name: string; arguments: any }) {
const tool = this.toolRegistry[params.name];
if (!tool) {
throw new Error(`Tool not found: ${params.name}`);
}
// Verify tool is available in current toolset
if (!tool.toolsets.includes(this.currentToolset) && !tool.toolsets.includes('*')) {
throw new Error(`Tool '${params.name}' is not available in toolset '${this.currentToolset}'`);
}
return await tool.handler(params.arguments);
}
}
Caveats
It's important to call out any caveats with the proposed options, as some of them would require prerequisite approvals or architecture decisions. This proposal involves a few assumptions:
Decouple Tools from AI Gateway
A prerequisite for any dynamic tool management strategy is to stop hard-coding the list of available tools in the AI Gateway. The Duo Workflow ADR: Unify around MCP standard for all Duo Workflow integrations proposes a solution where the executor defines and advertises its available tools to the AI Gateway upon connection.
The contract.proto file in the AI Gateway already defines the gRPC messages for this mechanism:
message StartWorkflowRequest {
// ...
repeated McpTool mcpTools = 8;
}
message McpTool {
string name = 1;
string description = 2;
string inputSchema = 3;
}
By implementing this ADR, the AI Gateway can dynamically build its list of available tools from various sources (executors, MCP servers), which is a necessary prerequisite for implementing tool retrieval, multi-agent systems, or session-based routing.
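As an illustration of that prerequisite, the AI Gateway could translate the `McpTool` messages it receives into its internal tool definitions roughly like this; how the resulting list is attached to the agent is an assumption:

```python
import json
from typing import Any, Dict, List


def build_dynamic_tools(mcp_tools) -> List[Dict[str, Any]]:
    """Convert repeated McpTool messages from StartWorkflowRequest into tool defs."""
    tools = []
    for mcp_tool in mcp_tools:
        tools.append(
            {
                "name": mcp_tool.name,
                "description": mcp_tool.description,
                # inputSchema is a JSON string per the proto definition above.
                "input_schema": json.loads(mcp_tool.inputSchema),
            }
        )
    return tools
```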
Re-evaluate Unit Primitives
The current implementation ties tool permissions to Unit Primitives (UPs), which would complicate tool consolidation. I believe there is an ongoing discussion about moving away from this feature-based pricing model toward a volume-based one, so we would need to double check with product about this.
From @brytannia
For UPs I’d advocate that we scrap them all together from tool permission and re-think our pricing policy to make it by volume of requests. This already causes issues when agent doesn’t have access to one tool, it’s trying to use another one which helps no one.
A product decision to move away from unit primitives at the tool level would simplify the complexity of implementing any of the options above. It would remove the need for mapping from a single consolidated tool (like fetch_gitlab_resource) back to multiple individual UPs (like ASK_ISSUE, ASK_MERGE_REQUEST).
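For illustration, this is the kind of mapping a consolidated tool would otherwise need to carry if tool-level Unit Primitives stay in place (the `ASK_EPIC` name is assumed):

```python
# One consolidated tool call, but the unit primitive to check/charge would still
# depend on the argument value, keeping permission logic per-resource.
RESOURCE_TYPE_TO_UNIT_PRIMITIVE = {
    "issue": "ASK_ISSUE",
    "merge_request": "ASK_MERGE_REQUEST",
    "epic": "ASK_EPIC",  # assumed UP name, for illustration only
}
```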
GitLab MCP Server as SSOT for Remote Data
Additionally, rather than re-implementing the tools directly in the AIGW, @bastirehm recommended:
I’m also wondering if we shouldn’t just 100% focus on this for any GitLab related tools and then always automatically connect to that (Other vendors seem to do it the same way at least for their Coding Agent).
Tool Evaluation and Monitoring
If we choose to implement any of these options, we should make evaluation tests a top priority. Tool routing evaluation would help us maintain quality as we scale up and add more tools.
You can find more details about this here #1327 (closed)
Implementation Plan
Here is what an execution plan might look like:
- **Implement Prerequisites:**
  - Prioritize the implementation of the MCP Unification ADR to enable dynamic tool registration.
  - Finalize the decision on Unit Primitives vs. volume-based pricing to clarify permission requirements for consolidated tools.
- **Step 1: Tool Consolidation and Parameter Alignment:**
  - Begin by implementing Option 1: Tool Consolidation. This provides immediate benefits in reducing prompt size and improving LLM accuracy.
  - Refactor `get_issue`, `get_merge_request`, and `get_epic` into a single `fetch_gitlab_resource` tool (or similar, based on what the team finds best).
  - Update the tool's schema to use `fullPath` and `iid` instead of `project_id` or `group_id`.
  - Apply the same consolidation logic to write operations (e.g., `update_issue`, `update_merge_request`), merging them into a single `update_gitlab_resource` tool.
  - Note: this would happen in the official MCP server rather than the AIGW if MCP is chosen as the SSOT.
- **Step 2: Implement an Advanced Routing Mechanism:**
  - Evaluate and implement one of the advanced routing solutions.
  - With Option 2, define a routing mechanism for the initial tool sets.
  - With Option 3, define specialized sub-agents (e.g., `CodebaseAgent`, `GitLabKnowledgeAgent`) with distinct, non-overlapping toolsets to resolve the local vs. remote tool ambiguity.
  - With Option 4, integrate a session-based toolset activator/router.
- **Step 3: Integrate Evaluation and Monitoring:**
  - Implement the Tool Routing Evaluation Framework (#1327).
  - Use this framework to measure the impact of the changes on tool selection accuracy, latency, and cost, and to guide future improvements.
