Integrate ipynbdiff into GitLab
ipynbdiff (https://gitlab.com/gitlab-org/incubation-engineering/mlops/ipynbdiff) creates a markdown version of Jupyter notebooks.
The flow we expect is:
- User commits a change to one or more notebooks
- When creating the diff, use ipynbdiff as the diff driver
- A diff_batch.json is generated using a regular viewer for text
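As a rough illustration of the preprocessing step in this flow, the sketch below flattens a notebook into markdown-like text and can be called as a git textconv command. It is not ipynbdiff's actual conversion logic: the function name notebook_to_markdown and the output layout are assumptions made for this example, and it only looks at the "cells", "cell_type" and "source" fields of the notebook JSON.

```python
import json
import sys

# Built at runtime so this example does not embed a literal code fence.
FENCE = "`" * 3


def notebook_to_markdown(notebook: dict) -> str:
    """Flatten a parsed .ipynb document into markdown-like text (hypothetical layout)."""
    chunks = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # The notebook format allows "source" to be a string or a list of strings.
        if isinstance(source, list):
            source = "".join(source)
        source = source.rstrip("\n")
        if cell.get("cell_type") == "code":
            # Wrap code cells in a fence so they stay distinguishable from prose.
            chunks.append(FENCE + "\n" + source + "\n" + FENCE)
        else:
            chunks.append(source)
    return "\n\n".join(chunks) + "\n"


if __name__ == "__main__" and len(sys.argv) == 2:
    # git calls a textconv command with one argument: the path of the file to convert.
    with open(sys.argv[1], encoding="utf-8") as handle:
        sys.stdout.write(notebook_to_markdown(json.load(handle)))
```

Saved under a hypothetical name such as ipynb_textconv.py, this could be wired into stock git by setting `diff.ipynb.textconv` to `python3 ipynb_textconv.py` with `git config` and adding `*.ipynb diff=ipynb` to `.gitattributes`. Whether the same mechanism is usable inside Gitaly is exactly the open question of this issue.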
Tasks:
- Implement ipynb2md as a git driver
- Integrate the driver into Gitaly (or find a better place)
Current summary of discussions:
Blocked: Gitaly might not be a suitable place for this change, since it is a highly optimized service and the team doesn't have the capacity for an in-depth analysis of how to integrate it. See #3787 (comment 687700222)
Alternative suggestions:
Implement this on the Rails side, which would mean recreating the diff
Pros:
- Faster iteration
- No Gitaly
Cons:
- Need to reimplement the differ
- Worse performance, since the entire preprocessing/diffing will be done at request time
- Only Rails consumers will benefit from the improved diff experience
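To make "recreating the diff" concrete, here is a minimal sketch of what the request-time step could look like, assuming both versions of the notebook have already been converted to text. It uses Python's standard difflib rather than anything from Rails or ipynbdiff, and diff_preprocessed is a hypothetical name chosen for this example.

```python
import difflib


def diff_preprocessed(old_text: str, new_text: str, path: str) -> str:
    """Return a unified diff between two preprocessed versions of one notebook."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="a/" + path,
        tofile="b/" + path,
    )
    # unified_diff yields lines lazily; join them into a single diff string.
    return "".join(diff)
```

Both the preprocessing and this diff would run on every request unless the result is cached, which is the performance concern listed above.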
Shadow repos containing the preprocessed versions of the notebooks
For repositories that have notebooks, create a mirror containing the preprocessed Jupyter notebooks, and run git diff on the mirror
Pros:
- Can use all the goodies from stock git diff
- No Gitaly
Cons:
- Additional Storage for the repos (we could measure how often
- Dealing with changes in the preprocessing script, and backporting
- Mapping commits from the original repo to the shadow one
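A minimal sketch of the shadow-repo idea, under several assumptions: the shadow tree mirrors the original paths with an .md suffix, the conversion function is passed in by the caller, and the commit mapping is kept in a plain JSON file. The names mirror_notebooks and record_commit_mapping are hypothetical, and creating the actual commits in the shadow repository is left out.

```python
import json
from pathlib import Path
from typing import Callable, List


def mirror_notebooks(source_root: Path, shadow_root: Path,
                     convert: Callable[[dict], str]) -> List[Path]:
    """Write a preprocessed copy of every notebook under source_root into shadow_root."""
    written = []
    for notebook_path in sorted(source_root.rglob("*.ipynb")):
        # Keep the same relative path so files in both trees are easy to match.
        relative = notebook_path.relative_to(source_root)
        target = shadow_root / relative.with_suffix(".md")
        target.parent.mkdir(parents=True, exist_ok=True)
        notebook = json.loads(notebook_path.read_text(encoding="utf-8"))
        target.write_text(convert(notebook), encoding="utf-8")
        written.append(target)
    return written


def record_commit_mapping(mapping_file: Path, original_sha: str, shadow_sha: str) -> None:
    """Remember which shadow commit corresponds to which original commit."""
    mapping = {}
    if mapping_file.exists():
        mapping = json.loads(mapping_file.read_text(encoding="utf-8"))
    mapping[original_sha] = shadow_sha
    mapping_file.write_text(json.dumps(mapping, indent=2, sort_keys=True), encoding="utf-8")
```

If the preprocessing script changes, every mirrored file would have to be regenerated with the new convert function, which is the backporting cost noted in the list above.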
Discussion here: #3787 (comment 684842240)
Edited by Eduardo Bonet