
Investigate bundling tree-sitter in gitlab-org/gitlab

Overview

We should consider adding the ability to use tree-sitter in the main GitLab project for a couple of reasons.

  1. Parsing magic comments with tree-sitter may be more performant than the current plan of using regular expressions. (This hasn't been measured, but seems quite likely.)
  2. It would also let other code-level features within GitLab (AI or otherwise) make inferences about files.
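To make the comparison in point 1 concrete, here is a minimal sketch of what the regular-expression baseline might look like. It assumes a magic comment is a whole-line `# key: value` comment, which is an illustrative guess rather than the format the current plan targets; `MAGIC_COMMENT` and `magic_comments` are hypothetical names.

```ruby
# Hypothetical baseline: extract whole-line `# key: value` magic comments.
# The comment format and all names here are assumptions for illustration.
MAGIC_COMMENT = /\A\s*#\s*([A-Za-z_][\w-]*)\s*:\s*(.+?)\s*\z/

# Returns one hash per matching line, with a 1-based line number.
def magic_comments(source)
  source.each_line.with_index(1).filter_map do |line, number|
    match = MAGIC_COMMENT.match(line.chomp)
    { line: number, key: match[1], value: match[2] } if match
  end
end
```

A tree-sitter version would instead walk the comment nodes of the parsed syntax tree, so a benchmark could run both over the same set of files to test the performance claim.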

Questions to answer

  1. Is tree-sitter the right choice for this?
  2. Is it necessary?
  3. Are there alternatives to it that we have not considered?

Assuming we want to push ahead:

  • What is the best way to bundle tree-sitter with the monolith application?
    • The Ruby bindings for tree-sitter appear to be old and poorly maintained.
    • Compiled parsers are architecture-specific. How do we work with that? (We can't just add a gem to the Gemfile and run bundle install.)
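One possible way to handle architecture-specific parsers is to ship one prebuilt library per platform and pick the right one at runtime. The sketch below assumes a hypothetical `vendor/tree-sitter/<os>-<cpu>/<language>.<ext>` layout and a hypothetical `parser_library_path` helper. It only resolves a path and does not load anything.

```ruby
require "rbconfig"

# Hypothetical helper: resolve the prebuilt parser library for this platform.
# The directory layout and the supported platform list are assumptions.
def parser_library_path(language,
                        host_os: RbConfig::CONFIG["host_os"],
                        host_cpu: RbConfig::CONFIG["host_cpu"],
                        root: "vendor/tree-sitter")
  os, ext =
    case host_os
    when /darwin/ then %w[darwin dylib]
    when /linux/ then %w[linux so]
    else raise ArgumentError, "unsupported OS: #{host_os}"
    end

  cpu =
    case host_cpu
    when /x86_64|amd64/ then "x86_64"
    when /arm64|aarch64/ then "arm64"
    else raise ArgumentError, "unsupported CPU: #{host_cpu}"
    end

  File.join(root, "#{os}-#{cpu}", "#{language}.#{ext}")
end
```

For example, `parser_library_path("ruby", host_os: "linux-gnu", host_cpu: "aarch64")` resolves to `vendor/tree-sitter/linux-arm64/ruby.so`. Whether prebuilt binaries are preferable to compiling grammars at install time is still an open question for this investigation.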

Available dependencies to evaluate

Grammars

Bindings

Alternatives to tree-sitter

  • Yet to investigate...

Questions

Why is this so much easier in Python?

Edited by Max Woolf