Investigate bundling tree-sitter in gitlab-org/gitlab
Overview
We should consider adding the ability to use tree-sitter in the main GitLab project for a couple of reasons.
- Specifically, parsing magic comments may be more performant with tree-sitter than with the current plan of using regular expressions. (This isn't empirically proven, but seems quite likely.)
- This would also allow for other code-level capabilities within GitLab (AI, or not) to make inferences about files.
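For comparison, here is a minimal sketch of what the regular-expression approach might look like. It assumes "magic comments" means Ruby-style `# key: value` comments at the top of a file; the `MAGIC_COMMENT` pattern and the `magic_comments` helper are hypothetical names for illustration, not existing GitLab code.

```ruby
# Hypothetical baseline: extract Ruby-style magic comments with a regex.
# Assumption: magic comments look like "# key: value" and appear only in
# the leading comment block, before the first line of code.
MAGIC_COMMENT = /\A#\s*(?:-\*-\s*)?([A-Za-z_][\w-]*)\s*:\s*([^\s;]+)/

def magic_comments(source)
  comments = {}

  source.each_line do |line|
    stripped = line.strip
    next if stripped.empty?

    # Stop scanning at the first non-comment line.
    break unless stripped.start_with?('#')

    match = MAGIC_COMMENT.match(stripped)
    # Normalise "frozen-string-literal" style keys to underscores.
    comments[match[1].tr('-', '_')] = match[2] if match
  end

  comments
end
```

A pattern like this also matches ordinary comments that happen to contain `word: value`, which is the kind of ambiguity a real parser would avoid. It could serve as the baseline when measuring whether tree-sitter is actually faster.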
Questions to answer
- Is tree-sitter the right choice for this?
- Is it necessary?
- Are there alternatives to it that we have not considered?
Assuming we want to push ahead:
- What is the best way to bundle tree-sitter with the monolith application?
- Ruby bindings for tree-sitter appear to be old and poorly maintained.
- Parsers are architecture specific. How do we work with that? (We can't just add a gem to the Gemfile and run bundle.)
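One possible way to handle architecture-specific parsers is to ship one prebuilt library per platform and pick the right one at runtime. The sketch below only illustrates the idea: the `vendor/tree-sitter/<os>-<cpu>/` layout and the `parser_library_path` helper are assumptions, not an existing convention in any of the gems listed here.

```ruby
require 'rbconfig'

# Hypothetical layout:
#   vendor/tree-sitter/<os>-<cpu>/libtree-sitter-<language>.<ext>
# host_os and host_cpu default to the values Ruby was built with, but can
# be overridden, which makes the lookup easy to check for other platforms.
def parser_library_path(language,
                        host_os: RbConfig::CONFIG['host_os'],
                        host_cpu: RbConfig::CONFIG['host_cpu'])
  os, ext =
    case host_os
    when /darwin/ then %w[darwin dylib]
    when /linux/ then %w[linux so]
    else raise ArgumentError, "unsupported OS: #{host_os}"
    end

  cpu =
    case host_cpu
    when /x86_64|amd64/ then 'x86_64'
    when /arm64|aarch64/ then 'arm64'
    else raise ArgumentError, "unsupported CPU: #{host_cpu}"
    end

  File.join('vendor', 'tree-sitter', "#{os}-#{cpu}", "libtree-sitter-#{language}.#{ext}")
end
```

For example, `parser_library_path('ruby', host_os: 'linux-gnu', host_cpu: 'x86_64')` resolves to a `.so` file under `vendor/tree-sitter/linux-x86_64/`. Whether the libraries are built in CI or vendored is a separate question this sketch does not answer.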
Available dependencies to evaluate
Grammars
- https://github.com/tree-sitter/tree-sitter-ruby - Seems to be official and functional. Also seems to be well maintained.
Bindings
- https://github.com/tree-sitter/ruby-tree-sitter.old - Literally has "old" in the URL. Not updated in over 3 years.
- https://github.com/calicoday/ruby-tree-sitter-ffi - Updated in the last year, but only just. Requires building and bundling runtime as a separate file. I am unsure how we would achieve that.
- https://github.com/Faveod/ruby-tree-sitter - Appears to be better maintained, but I'm unable to get it to work with a Rails application after installing.
- https://github.com/Shopify/tree_stand - Shopify have their own bindings. Appears to be dependent on the above gem. Currently investigating...
  - This looks potentially promising. I had to build tree-sitter from source, rather than relying on the brew repository, which is fine. Let's see where this goes...
Alternatives to tree-sitter
- Yet to investigate...
Questions
Why is this so much easier in Python?
- https://github.com/tree-sitter/py-tree-sitter - It's better maintained and has no external dependencies. It appears to just work™️.
Edited by Max Woolf