
Draft: Add API for consolidating simple shards for a repo

John Mason requested to merge jm-merge-api into main

What does this MR do and why?

Add API for consolidating simple shards for a repo

Related to gitlab#432240 (closed)

How to set up and validate locally

  1. Download the snapshot of the gitlab-org/gitlab zoekt shard data from gitlab#432240 (comment 1670419266).
  2. Untar the contents into ~/Downloads/gitlab-data.
  3. Build the binary and start the server pointing at this directory:
make && ./bin/gitlab-zoekt-indexer --index_dir ~/Downloads/gitlab-data
  4. In another terminal session, make an API call to merge the gitlab-org/gitlab data:
curl -vvv -d '{ "RepoID": 278964 }' localhost:6060/indexer/merge
  5. Verify that merging occurred. Only a handful of shards should remain, each roughly the target shard size:
$> ls -lth ~/Downloads/gitlab-data
total 3338656
-rw-rw-rw-  1 m  staff   381M Dec 11 22:01 278964_v17.00003.zoekt
-rw-rw-rw-  1 m  staff   386M Dec 11 22:01 278964_v17.00002.zoekt
-rw-rw-rw-  1 m  staff   387M Dec 11 22:01 278964_v17.00001.zoekt
-rw-rw-rw-  1 m  staff   476M Dec 11 22:01 278964_v17.00000.zoekt
-rw-r--r--  1 m  staff    36B Dec 11 22:00 node.uuid
  6. To check that search still works on these shards, copy them into the GDK index directory:
cp ~/Downloads/gitlab-data/*.zoekt /Users/m/gitlab-development-kit/zoekt-data/development/index
  7. Go to zoekt-webserver on localhost:6090 and perform a search. Verify that results come back for 278964.
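
The shard check in the verification step can also be scripted instead of done by eye. The helpers below are a minimal sketch, not part of this MR: the function names `list_shards` and `count_shards` are hypothetical, and the sketch assumes shard files are named `<RepoID>_v<version>.<NNNNN>.zoekt`, as in the `ls` listing above.

```shell
# Hypothetical helpers for checking shard files; not part of gitlab-zoekt-indexer.

# Print the shard files for a repo ID found directly in an index directory,
# one path per line, sorted by name.
# Assumes file names like 278964_v17.00000.zoekt (taken from the listing above).
list_shards() {
  find "$1" -maxdepth 1 -type f -name "${2}_v*.zoekt" | sort
}

# Print the number of shard files for a repo ID in an index directory.
count_shards() {
  list_shards "$1" "$2" | wc -l | tr -d '[:space:]'
}
```

Running `count_shards ~/Downloads/gitlab-data 278964` before and after the merge call should show the count dropping; for the listing above it would report 4.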
Edited by John Mason
