Bail out code parsing if exceeding max duration
What does this merge request do and why?
Certain code content, or very large content, can cause CPU saturation (related issue) when parsed with tree-sitter.
This MR uses the parser timeout mechanism in tree-sitter
to limit parsing time to 500ms.
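As a rough illustration of the bail-out pattern, the sketch below wraps a parser that enforces a time budget and converts its failure into a descriptive error. It does not use tree-sitter: `StubParser`, `parse_with_timeout`, `micros_per_byte`, and `DEFAULT_PARSE_TIMEOUT_MICROS` are hypothetical names invented for this example, and the stub simulates parsing cost from input length instead of measuring real time. Only the 500ms limit and the error messages are taken from this MR.

```python
DEFAULT_PARSE_TIMEOUT_MICROS = 500_000  # 500 ms, the limit described in this MR


class StubParser:
    """Hypothetical stand-in for a tree-sitter parser (not the real API).

    It pretends each input byte costs a fixed number of microseconds and
    fails once the simulated cost exceeds the configured budget.
    """

    def __init__(self, timeout_micros: int, micros_per_byte: int = 1) -> None:
        self.timeout_micros = timeout_micros
        self.micros_per_byte = micros_per_byte

    def parse(self, source: bytes) -> str:
        simulated_cost = len(source) * self.micros_per_byte
        if simulated_cost > self.timeout_micros:
            # Same exception type and message as in the traceback in this MR.
            raise ValueError("Parsing failed")
        return f"<tree for {len(source)} bytes>"


def parse_with_timeout(
    content: str, timeout_micros: int = DEFAULT_PARSE_TIMEOUT_MICROS
) -> str:
    """Parse content, bailing out with a descriptive error on timeout."""
    parser = StubParser(timeout_micros)
    try:
        return parser.parse(bytes(content, "utf8"))
    except ValueError as ex:
        # Chain the original error so the root cause stays visible.
        raise ValueError(f"Failed to parse code content: {ex}") from ex
```

With the default budget, `parse_with_timeout("def foo():")` returns a tree placeholder, while `parse_with_timeout("x" * 100, timeout_micros=10)` raises `ValueError` because the simulated cost exceeds the budget.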
How to set up and validate locally
- Check out this merge request's branch.
- Ensure a local Docker image builds successfully.
docker buildx build --platform linux/amd64 \
  -t ai-gateway:dev .
- Open a shell in the Docker container
docker run -it --platform linux/amd64 --rm -v $PWD:/app ai-gateway:dev bash
- Start a REPL session
poetry run python
>>> from ai_gateway.code_suggestions.processing.base import LanguageId
>>> from ai_gateway.prompts.parsers import CodeParser
>>> CodeParser.from_language_id("\n".join(["def foo():"] * 100), LanguageId.JS, parse_timeout_micros=1)
Traceback (most recent call last):
  File "/app/ai_gateway/prompts/parsers/treesitter.py", line 139, in from_language_id
    tree = parser.parse(bytes(content, "utf8"))
ValueError: Parsing failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/app/ai_gateway/prompts/parsers/treesitter.py", line 141, in from_language_id
    raise ValueError(f"Failed to parse code content: {str(ex)}")
ValueError: Failed to parse code content: Parsing failed
>>> CodeParser.from_language_id("def foo():", LanguageId.JS, parse_timeout_micros=1)
<ai_gateway.prompts.parsers.treesitter.CodeParser object at 0x4002ff72e0>
Merge request checklist
- Tests added for new functionality. If not, please raise an issue to follow up.
- Documentation added/updated, if needed.
Edited by Tan Le