Investigate tree-sitter grammar build failures

Problem to solve

The CI pipeline fails while building the tree-sitter grammars, as seen in this job:

https://gitlab.com/gitlab-org/modelops/ai-model-validation-and-research/prompt-library/-/jobs/5876852342

The failure can also be reproduced locally by running the following Docker build:

docker build --target=client --no-cache --tag promptlib-client:0.2.0 .

[+] Building 349.2s (17/22)
 => [internal] load .dockerignore                                                                                                                                                                     0.0s
 => => transferring context: 197B                                                                                                                                                                     0.0s
 => [internal] load build definition from Dockerfile                                                                                                                                                  0.0s
 => => transferring dockerfile: 1.08kB                                                                                                                                                                0.0s
 => [internal] load metadata for docker.io/library/python:3.11.6-slim                                                                                                                                 4.2s
 => [internal] load metadata for docker.io/library/python:3.11.6                                                                                                                                      3.3s
 => [builder  1/12] FROM docker.io/library/python:3.11.6@sha256:2a725c9721f737a2944244c98c714d24f8bcfaddd9f5c15083cbaa024f7fce54                                                                     39.2s
 => => resolve docker.io/library/python:3.11.6@sha256:2a725c9721f737a2944244c98c714d24f8bcfaddd9f5c15083cbaa024f7fce54                                                                                0.0s
 => => sha256:ef55ff88816b389767bc8dbb5a9e57088c59874b38b2fc06e449e2a26b3c68e7 7.49kB / 7.49kB                                                                                                        0.0s
 => => sha256:5cdd9a70365f741a6b9f7a4e32cdb7d4aa29ac73da0b78ca0a83e937f285fdd5 63.99MB / 63.99MB                                                                                                     14.7s
 => => sha256:df2021ddb7d686bdbb125598b2a6163d63035f080356b3014595f354ea0b40d6 49.61MB / 49.61MB                                                                                                     13.3s
 => => sha256:8d647f1dd7e741209a8a75083ccc889e39cb3e94c17f45441eae96e1a679d971 23.58MB / 23.58MB                                                                                                      4.8s
 => => sha256:2a725c9721f737a2944244c98c714d24f8bcfaddd9f5c15083cbaa024f7fce54 2.14kB / 2.14kB                                                                                                        0.0s
 => => sha256:7f64e6a715a2394e7f25dfdd54e111f856c4cfbc577c116e7ab1b7c03e14c49d 2.01kB / 2.01kB                                                                                                        0.0s
 => => sha256:95089c600b361807380090316c250b0b8eaf4fa2175b11ac8f49bb7581c61125 202.45MB / 202.45MB                                                                                                   29.1s
 => => extracting sha256:df2021ddb7d686bdbb125598b2a6163d63035f080356b3014595f354ea0b40d6                                                                                                             3.6s
 => => sha256:031bfcddba4a0d9962728adf894b6e5bb4effcf5df35cdacd98177bee657a7c6 6.24MB / 6.24MB                                                                                                       15.6s
 => => sha256:7a2f0d5b0b056032259c906191d705e2ea2ab8f41955040ecb0158fa705789d9 19.44MB / 19.44MB                                                                                                     17.7s
 => => sha256:b0f1034fa6a0f9bf69269765dda3fc3121a911e418c876cdf63363277f0a6418 238B / 238B                                                                                                           15.9s
 => => sha256:3bf98da35d8a63fc6184eb3958b8677a0b5d6c7014348ca79b3f5edcba5bf7d1 3.11MB / 3.11MB                                                                                                       16.8s
 => => extracting sha256:8d647f1dd7e741209a8a75083ccc889e39cb3e94c17f45441eae96e1a679d971                                                                                                             0.8s
 => => extracting sha256:5cdd9a70365f741a6b9f7a4e32cdb7d4aa29ac73da0b78ca0a83e937f285fdd5                                                                                                             3.9s
 => => extracting sha256:95089c600b361807380090316c250b0b8eaf4fa2175b11ac8f49bb7581c61125                                                                                                             7.8s
 => => extracting sha256:031bfcddba4a0d9962728adf894b6e5bb4effcf5df35cdacd98177bee657a7c6                                                                                                             0.4s
 => => extracting sha256:7a2f0d5b0b056032259c906191d705e2ea2ab8f41955040ecb0158fa705789d9                                                                                                             0.7s
 => => extracting sha256:b0f1034fa6a0f9bf69269765dda3fc3121a911e418c876cdf63363277f0a6418                                                                                                             0.0s
 => => extracting sha256:3bf98da35d8a63fc6184eb3958b8677a0b5d6c7014348ca79b3f5edcba5bf7d1                                                                                                             0.3s
 => CACHED [client 1/5] FROM docker.io/library/python:3.11.6-slim@sha256:cc758519481092eb5a4a5ab0c1b303e288880d59afc601958d19e95b300bc86b                                                             0.0s
 => [internal] load build context                                                                                                                                                                     0.1s
 => => transferring context: 558.05kB                                                                                                                                                                 0.1s
 => [client 2/5] RUN curl -sSL https://sdk.cloud.google.com | bash                                                                                                                                    0.3s
 => [client 3/5] WORKDIR /eval/                                                                                                                                                                       0.0s
 => [builder  2/12] WORKDIR /eval/                                                                                                                                                                    0.4s
 => [builder  3/12] RUN pip install poetry==1.7.1                                                                                                                                                     6.9s
 => [builder  4/12] RUN pip wheel --no-cache-dir --use-pep517 "tree-sitter (==0.20.2)"                                                                                                                7.1s
 => [builder  5/12] COPY pyproject.toml pyproject.toml                                                                                                                                                0.0s
 => [builder  6/12] RUN poetry install --only main                                                                                                                                                   64.3s
 => [builder  7/12] COPY scripts scripts                                                                                                                                                              0.0s
 => [builder  8/12] RUN ./scripts/install-tree-sitter-deps                                                                                                                                            3.0s
 => ERROR [builder  9/12] RUN poetry run python scripts/build.py                                                                                                                                    224.0s
------
 > [builder  9/12] RUN poetry run python scripts/build.py:
#0 0.570 Cloning into 'tree-sitter-c'...
#0 6.423 Cloning into 'tree-sitter-cpp'...
#0 15.18 Cloning into 'tree-sitter-c-sharp'...
#0 29.09 Cloning into 'tree-sitter-go'...
#0 31.53 Cloning into 'tree-sitter-php'...
#0 64.24 Cloning into 'tree-sitter-python'...
#0 68.01 Cloning into 'tree-sitter-java'...
#0 73.68 Cloning into 'tree-sitter-javascript'...
#0 79.74 Cloning into 'tree-sitter-ruby'...
#0 170.6 Cloning into 'tree-sitter-rust'...
#0 178.1 Cloning into 'tree-sitter-scala'...
#0 187.9 Cloning into 'tree-sitter-typescript'...
#0 224.0 Traceback (most recent call last):
#0 224.0   File "/eval/scripts/build.py", line 73, in <module>
#0 224.0     sys.exit(main())
#0 224.0 Checking out grammars in /eval/scripts/vendor/Linux-6.5.0-14-generic-aarch64-with-glibc2.36
#0 224.0 Cloning https://github.com/tree-sitter/tree-sitter-c
#0 224.0 Cloning https://github.com/tree-sitter/tree-sitter-cpp
#0 224.0 Cloning https://github.com/tree-sitter/tree-sitter-c-sharp
#0 224.0 Cloning https://github.com/tree-sitter/tree-sitter-go
#0 224.0 Cloning https://github.com/tree-sitter/tree-sitter-php
#0 224.0 Cloning https://github.com/tree-sitter/tree-sitter-python
#0 224.0 Cloning https://github.com/tree-sitter/tree-sitter-java
#0 224.0 Cloning https://github.com/tree-sitter/tree-sitter-javascript
#0 224.0 Cloning https://github.com/tree-sitter/tree-sitter-ruby
#0 224.0 Cloning https://github.com/tree-sitter/tree-sitter-rust
#0 224.0 Cloning https://github.com/tree-sitter/tree-sitter-scala
#0 224.0 Cloning https://github.com/tree-sitter/tree-sitter-typescript
#0 224.0 Building /eval/promptlib/prompt_engine/.treesitter_lib/tree-sitter-languages.so
#0 224.0              ^^^^^^
#0 224.0   File "/eval/scripts/build.py", line 67, in main
#0 224.0     Language.build_library(lib, language_directories)
#0 224.0   File "/root/.cache/pypoetry/virtualenvs/promptlib-XoeQhzs_-py3.11/lib/python3.11/site-packages/tree_sitter/__init__.py", line 87, in build_library
#0 224.0     source_mtimes = [path.getmtime(__file__)] + [path.getmtime(path_) for path_ in source_paths]
#0 224.0                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
#0 224.0   File "/root/.cache/pypoetry/virtualenvs/promptlib-XoeQhzs_-py3.11/lib/python3.11/site-packages/tree_sitter/__init__.py", line 87, in <listcomp>
#0 224.0     source_mtimes = [path.getmtime(__file__)] + [path.getmtime(path_) for path_ in source_paths]
#0 224.0                                                  ^^^^^^^^^^^^^^^^^^^^
#0 224.0   File "<frozen genericpath>", line 55, in getmtime
#0 224.0 FileNotFoundError: [Errno 2] No such file or directory: '/eval/scripts/vendor/Linux-6.5.0-14-generic-aarch64-with-glibc2.36/tree-sitter-php/src/parser.c'
------
Dockerfile:18
--------------------
  16 |     COPY scripts scripts
  17 |     RUN ./scripts/install-tree-sitter-deps
  18 | >>> RUN poetry run python scripts/build.py
  19 |
  20 |     # Build promptlib wheel
--------------------
ERROR: failed to solve: process "/bin/sh -c poetry run python scripts/build.py" did not complete successfully: exit code: 1
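
To see which grammars are affected before Language.build_library raises, a small pre-flight check along these lines might help. This is only a sketch: the function name find_missing_parsers is hypothetical and not part of scripts/build.py, and it assumes every grammar is expected to ship src/parser.c directly under its checkout directory, as the traceback above suggests.

```python
from pathlib import Path


def find_missing_parsers(language_directories):
    """Return the grammar directories that do not contain src/parser.c.

    Hypothetical helper: assumes the same list of directories that is
    passed to Language.build_library in scripts/build.py.
    """
    missing = []
    for directory in language_directories:
        parser = Path(directory) / "src" / "parser.c"
        if not parser.is_file():
            missing.append(str(directory))
    return missing
```

Calling this on the list of cloned grammar directories just before the build step would report every checkout whose layout differs from the expected one, rather than stopping at the first missing file.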

Proposal

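The traceback shows that tree-sitter-php/src/parser.c is missing from the fresh checkout, so the failure is likely caused by recent changes in the upstream grammars. We might want to pin the tree-sitter grammar versions, similar to gitlab-org/modelops/applied-ml/code-suggestions/ai-assist#235 (closed).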
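
A minimal sketch of what pinning could look like in the build script, assuming the grammars continue to be cloned from github.com/tree-sitter as in the log above. The names PINNED_GRAMMARS, checkout_commands and checkout_pinned are hypothetical, and the revision shown is a placeholder that would need to be replaced with a commit or tag known to build.

```python
import subprocess
from pathlib import Path

# Hypothetical pins: each value is a placeholder, not a verified revision.
PINNED_GRAMMARS = {
    "tree-sitter-php": "<commit-sha-or-tag>",
}


def checkout_commands(name, ref, vendor_dir):
    """Build the git commands that clone a grammar and check out a pinned ref."""
    url = f"https://github.com/tree-sitter/{name}"
    target = str(Path(vendor_dir) / name)
    return [
        ["git", "clone", url, target],
        # Detach HEAD at the pinned commit or tag so upstream changes
        # on the default branch no longer affect the build.
        ["git", "-C", target, "checkout", "--detach", ref],
    ]


def checkout_pinned(pins, vendor_dir):
    """Clone every grammar in pins at its pinned ref, failing on any git error."""
    for name, ref in pins.items():
        for command in checkout_commands(name, ref, vendor_dir):
            subprocess.run(command, check=True)
```

Keeping the pins in one mapping would make a later grammar upgrade a one-line change that can be reviewed and tested on its own.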

Links / references

Similar MR gitlab-org/modelops/applied-ml/code-suggestions/ai-assist!456 (merged)

Edited by Tan Le