Bail out code parsing if exceeding max duration

Tan Le requested to merge set-parser-timeout into main

What does this merge request do and why?

Certain code content, or very large input, can cause CPU saturation (related issue) when parsed with tree-sitter.

This MR uses the parser timeout mechanism in tree-sitter to limit parsing time to 500ms.
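The bail-out behaviour can be sketched without tree-sitter itself. The block below is a minimal illustration, not the code in this MR: `StubParser`, `parse_or_raise`, and `DEFAULT_TIMEOUT_MICROS` are hypothetical names, and the stub only imitates a parser that gives up with `ValueError("Parsing failed")` once its time budget is spent. The wrapper mirrors the re-raise visible in the traceback in the validation steps.

```python
import time

DEFAULT_TIMEOUT_MICROS = 500_000  # 500 ms, the limit named in this MR


class StubParser:
    """Hypothetical stand-in for a tree-sitter parser; not the real API."""

    def __init__(self, timeout_micros: int) -> None:
        self.timeout_micros = timeout_micros

    def parse(self, source: bytes) -> list[bytes]:
        deadline = time.monotonic() + self.timeout_micros / 1_000_000
        lines = []
        for line in source.splitlines():
            # Check the time budget between units of work; give up once it is spent.
            if time.monotonic() >= deadline:
                raise ValueError("Parsing failed")
            lines.append(line.strip())
        return lines


def parse_or_raise(
    content: str, parse_timeout_micros: int = DEFAULT_TIMEOUT_MICROS
) -> list[bytes]:
    """Parse content, converting a timed-out parse into a descriptive error."""
    parser = StubParser(parse_timeout_micros)
    try:
        return parser.parse(bytes(content, "utf8"))
    except ValueError as ex:
        # Re-raise with context so callers can bail out instead of burning CPU.
        raise ValueError(f"Failed to parse code content: {str(ex)}") from ex
```

With a budget of zero the stub gives up on the first line, and the wrapper raises `ValueError: Failed to parse code content: Parsing failed`. With the default budget, a short input parses normally.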

How to set up and validate locally

  1. Check out this merge request's branch.
  2. Build a local Docker image and make sure the build succeeds.
    docker buildx build --platform linux/amd64 \
      -t ai-gateway:dev .
  3. Start a shell in the Docker container, using the image tag built in the previous step.
    docker run -it --platform linux/amd64 --rm -v "$PWD":/app ai-gateway:dev bash
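  4. Start a REPL session and parse with `parse_timeout_micros=1`. The large input times out and raises an error, while the small input still parses.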
    poetry run python
    >>> from ai_gateway.code_suggestions.processing.base import LanguageId
    >>> from ai_gateway.prompts.parsers import CodeParser
    >>> CodeParser.from_language_id("\n".join(["def foo():"] * 100), LanguageId.JS, parse_timeout_micros=1)
    
    Traceback (most recent call last):
      File "/app/ai_gateway/prompts/parsers/treesitter.py", line 139, in from_language_id
        tree = parser.parse(bytes(content, "utf8"))
    ValueError: Parsing failed
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/app/ai_gateway/prompts/parsers/treesitter.py", line 141, in from_language_id
        raise ValueError(f"Failed to parse code content: {str(ex)}")
    ValueError: Failed to parse code content: Parsing failed
    
    >>> CodeParser.from_language_id("def foo():", LanguageId.JS, parse_timeout_micros=1)
    <ai_gateway.prompts.parsers.treesitter.CodeParser object at 0x4002ff72e0>

Merge request checklist

  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
Edited by Tan Le
