Integrate rb-ipynbdiff into GitLab

Context

&6589 (closed)

Work to do

Integrate rb-ipynbdiff into the Rails app to generate diffs for Jupyter Notebooks.

When receiving a patch, we check whether it contains diffs for Jupyter Notebooks. If it does, we request the original files and use IpynbDiff.diff to create a new diff, which is injected into diff_patches.json. This behavior should be wrapped in a feature flag.
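The flow above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the actual integration: the `FEATURE_FLAGS` hash, the `:ipynb_diff` flag name, the entry hash shape, and the `fetch_blob` and `differ` callables are all hypothetical. In the real app, `differ` would presumably wrap IpynbDiff.diff, whose exact signature is not given here.

```ruby
# Hypothetical stand-in for a feature flag lookup; the real flag name is not specified.
FEATURE_FLAGS = { ipynb_diff: true }

def feature_enabled?(name)
  FEATURE_FLAGS.fetch(name, false)
end

# An entry is assumed to be a Hash with :old_path, :new_path and :diff keys.
def notebook_diff?(entry)
  [entry[:old_path], entry[:new_path]].compact.any? { |path| path.end_with?('.ipynb') }
end

# Replaces the raw JSON diff of one notebook entry with a readable diff.
# Falls back to the untouched entry if anything goes wrong.
def transform_entry(entry, fetch_blob, differ)
  return entry unless notebook_diff?(entry)

  old_raw = fetch_blob.call(entry[:old_path], :old)
  new_raw = fetch_blob.call(entry[:new_path], :new)
  readable = differ.call(old_raw, new_raw)

  readable ? entry.merge(diff: readable, transformed: true) : entry
rescue StandardError
  entry
end

# fetch_blob: callable (path, side) -> raw file content
# differ:     callable (old_raw, new_raw) -> String diff, or nil
def transform_patch(entries, fetch_blob:, differ:)
  return entries unless feature_enabled?(:ipynb_diff)

  entries.map { |entry| transform_entry(entry, fetch_blob, differ) }
end
```

Falling back to the original entry on failure keeps the regular diff visible even when a notebook cannot be parsed, which seems like the safer default while the feature sits behind a flag.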

Concerns

Since we are pulling raw files and computing another diff during the request, this operation might be too heavy to run at request time. If so, there are some alternatives we can pursue:

  1. Move to a worker

Whenever a commit is made, we fire a worker that a) creates and caches the markdown version of the document and b) caches the diff. The benefit here is that when opening the notebook on GitLab we can improve performance by rendering the markdown version. It also allows us to extract the embedded images from the notebook into their own files, further improving the diff.
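A rough sketch of what such a worker body might do, assuming an in-memory `CACHE` hash in place of a real shared cache and a hypothetical `differ` callable in place of rb-ipynbdiff. The notebook conversion relies only on the notebook JSON layout, where each cell has a `cell_type` and a `source` that is either a string or a list of strings. Image extraction is left out.

```ruby
require 'json'

# Hypothetical in-memory cache; a real deployment would use a shared store.
CACHE = {}

# Flattens a notebook's cells into Markdown-like text.
# Code cells are indented by four spaces so they render as code.
def notebook_to_markdown(raw)
  cells = JSON.parse(raw).fetch('cells', [])
  rendered = cells.map do |cell|
    source = Array(cell['source']).join
    if cell['cell_type'] == 'code'
      source.lines.map { |line| "    #{line.chomp}" }.join("\n")
    else
      source.chomp
    end
  end
  rendered.join("\n\n")
end

# Hypothetical worker body, run once per changed notebook in a commit.
# differ: callable (old_raw, new_raw) -> String diff
def process_commit(sha:, parent_sha:, path:, old_raw:, new_raw:, differ:)
  CACHE[[:markdown, sha, path]] = notebook_to_markdown(new_raw)
  CACHE[[:diff, parent_sha, sha, path]] = differ.call(old_raw, new_raw)
end
```

Keying the cached diff on both the parent and the commit SHA means a request for that commit pair can read the result directly instead of recomputing it.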

  2. Shadow Repos

Not so much an alternative as an evolution of the worker solution: we can create a shadow repo for each repo, containing changes to files aimed at performance or usability improvements. The advantages here are that they would likely use less space, and that we could benefit from stock git diff rather than relying on the rb-ipynbdiff implementation.

Edited by Eduardo Bonet