Improve read_file tool to support chunked reading and size limits
## Problem
Currently, the `read_file` tool can attempt to read files of any size, which can lead to:
- Tool responses that exceed reasonable size limits and overflow context windows
- Inefficient token usage when only a portion of a large file is needed
- No mechanism for the LLM to read specific portions of large files
When an LLM attempts to read a large file (e.g., a large CSV, log file, or generated code), the entire content is returned, which may be truncated aggressively or cause context issues.
We have recently moved from a `read_file` tool (single file read) to a `read_files` tool that reads multiple files at once.
This problem is likely also connected to "Agentic Duo Chat unnecessarily runs `sed` or ot..." (#557751), where the LLM might try to work around the limitations of our read file tool.
## Desired Outcome
The `read_file`/`read_files` tool should:
- Support reading files in chunks by accepting `offset` and `limit` parameters
- Refuse to read files larger than a reasonable threshold (e.g., 2MB) without chunking
- Return a helpful error message when a file is too large, instructing the LLM to use chunked reading instead
- Allow the LLM to efficiently navigate large files by reading specific portions
This would enable the LLM to:
- Read the beginning of a file to understand its structure
- Navigate to specific sections of interest
- Handle large files without overwhelming the context window
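As a rough illustration of this behaviour, here is a minimal sketch of a chunked read handler. The function name, the line-based semantics of `offset`/`limit`, and the 2MB constant are assumptions made for illustration only, not the actual tool implementation:

```python
import os

# Hypothetical threshold above which unchunked reads are refused (2MB).
MAX_UNCHUNKED_BYTES = 2 * 1024 * 1024


def read_file(path: str, offset: int = 0, limit: int | None = None) -> str:
    """Read up to `limit` lines starting at line `offset` (0-based).

    If the file exceeds MAX_UNCHUNKED_BYTES and no chunking parameters were
    given, return an instructive error instead of the full content.
    """
    size = os.path.getsize(path)
    if size > MAX_UNCHUNKED_BYTES and offset == 0 and limit is None:
        return (
            f"Error: {path} is {size} bytes, which exceeds the "
            f"{MAX_UNCHUNKED_BYTES}-byte limit for a single read. "
            f"Re-invoke the tool with `offset` and `limit` to read the file "
            f"in chunks, e.g. offset=0, limit=200 for the first 200 lines."
        )

    # Stream the file so only the requested window is held in memory.
    lines: list[str] = []
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        for i, line in enumerate(f):
            if i < offset:
                continue
            if limit is not None and len(lines) >= limit:
                break
            lines.append(line)
    return "".join(lines)
```

Whether `offset`/`limit` should count lines or bytes, and how the error message should be phrased so the LLM reliably retries with chunked reads, are among the questions the review below should settle.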
## Proposal
Completely re-review both our `read_file` and `read_files` implementations against industry standards, and ask Anthropic for suggestions on an efficient file-reading tool.
TBD