Allow Code Reviews for Jupyter Notebook

Tasks:

  • Create a sample repo to showcase the issues with notebook MRs
  • Explore nbdime as backend for notebook diff

Context

Because of their file structure, Jupyter Notebooks can be challenging to code review. https://gitlab.com/eduardobonet/ipynb-mr-sample gives some examples:

  1. On https://gitlab.com/eduardobonet/ipynb-mr-sample/-/merge_requests/1, nothing changes in the code. The only difference is the order in which the cells were run. That alone caused the entire file to show up as changed, which makes the MR impossible to review.


  2. On https://gitlab.com/eduardobonet/ipynb-mr-sample/-/merge_requests/2, only a single line of code changed, which in turn changed an output image. Because the change is buried within the notebook's JSON structure, spotting what actually changed becomes really hard.

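To illustrate both cases, here is a minimal sketch (not part of the sample repo) that compares only the source of the code cells and ignores outputs and execution counts. It relies on the standard notebook JSON layout (`cells`, `cell_type`, `source`, `execution_count`); the helper names `cell_sources`, `source_diff` and `make_notebook`, as well as the one-cell notebooks, are hypothetical and exist only for this example.

```python
import difflib
import json


def cell_sources(notebook_json: str) -> list[str]:
    """Return the source of every code cell, ignoring outputs and execution counts."""
    nb = json.loads(notebook_json)
    sources = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        src = cell.get("source", "")
        # The notebook format stores source either as one string or as a list of lines.
        sources.append("".join(src) if isinstance(src, list) else src)
    return sources


def source_diff(old_json: str, new_json: str) -> list[str]:
    """Unified diff over the concatenated code cell sources of two notebooks."""
    old_lines = "\n".join(cell_sources(old_json)).splitlines()
    new_lines = "\n".join(cell_sources(new_json)).splitlines()
    return list(difflib.unified_diff(old_lines, new_lines, "old", "new", lineterm=""))


def make_notebook(source: str, execution_count: int) -> str:
    """Build a tiny one-cell notebook (hypothetical sample data)."""
    return json.dumps({
        "cells": [{
            "cell_type": "code",
            "execution_count": execution_count,
            "metadata": {},
            "outputs": [],
            "source": source.splitlines(keepends=True),
        }],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    })


# Case 1: same code, only the execution count differs. The raw JSON differs,
# but the source diff is empty.
rerun_only = source_diff(make_notebook("y = np.sin(x)\n", 2),
                         make_notebook("y = np.sin(x)\n", 5))

# Case 2: a single line of code changed. The source diff shows only that line.
one_line = source_diff(make_notebook("y = 2 * np.sin(x)\n", 2),
                       make_notebook("y = np.sin(x)\n", 2))
print("\n".join(one_line))
```

This deliberately throws away outputs, so it would hide the image change in the second MR; a real review view would still need to surface output changes in some readable form.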

External tools

  • ReviewNB is a service that provides a code review experience for notebooks on top of GitHub repositories

Testing nbdime

On https://gitlab.com/eduardobonet/ipynb-mr-sample/-/tree/nbdime-test, we test using nbdime to diff the notebook contents between branch2 and another-branch. nbdime_output.txt holds the output of nbdiff (the color escape characters should be removed later). The output is plain text, but it could be parsed into an MR code review view:

nbdiff notebook.ipynb old_notebook.ipynb
--- notebook.ipynb  2021-08-13 16:12:00.478714
+++ old_notebook.ipynb  2021-08-13 16:26:22.487417
## replaced /cells/2/execution_count:
-  5
+  2

## modified /cells/2/outputs/0/data/text/plain:
-  [<matplotlib.lines.Line2D at 0x130e389d0>]
+  [<matplotlib.lines.Line2D at 0x1307c1970>]

## replaced /cells/2/outputs/0/execution_count:
-  5
+  2

## inserted before /cells/2/outputs/1:
+  output:
+    output_type: display_data
+    data:
+      image/png: iVBORw0K...<snip base64, md5=ecea473e60784aa2...>
+      text/plain: <Figure size 432x288 with 1 Axes>
+    metadata (unknown keys):
+      needs_background: light

## deleted /cells/2/outputs/1:
-  output:
-    output_type: display_data
-    data:
-      image/png: iVBORw0K...<snip base64, md5=13bfe1623a694f6d...>
-      text/plain: <Figure size 432x288 with 1 Axes>
-    metadata (unknown keys):
-      needs_background: light

## modified /cells/2/source:
@@ -1,4 +1,4 @@
 x = np.linspace(0, 4*np.pi,50)
-y = 2 * np.sin(x)
+y = np.sin(x)
 
 plt.plot(x, y)
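As a rough sketch of what parsing this output could look like, the snippet below strips color escape sequences and groups the text into one section per `## <operation> <path>:` header. The header pattern is inferred only from the sample output above, not from a documented nbdime format, and the specific color codes in `sample` are assumptions made for illustration. `parse_nbdiff` is a hypothetical helper name.

```python
import re

# Matches ANSI color sequences such as "\x1b[31m" (assumed to be what
# "color characters" refers to in the captured output).
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")

# Header shape inferred from the sample output: "## <operation> <path>:"
SECTION_HEADER = re.compile(r"^## (?P<op>.+?) (?P<path>/\S*):$")


def parse_nbdiff(text: str) -> list[dict]:
    """Group nbdiff text output into sections keyed by operation and JSON path."""
    sections = []
    current = None
    for raw_line in text.splitlines():
        line = ANSI_ESCAPE.sub("", raw_line)
        match = SECTION_HEADER.match(line)
        if match:
            current = {"op": match["op"], "path": match["path"], "lines": []}
            sections.append(current)
        elif current is not None and line.strip():
            # Blank lines are dropped; lines before the first header are ignored.
            current["lines"].append(line)
    return sections


# Hypothetical excerpt of the output above, with made-up color codes added.
sample = (
    "\x1b[34m## replaced /cells/2/execution_count:\x1b[0m\n"
    "\x1b[31m-  5\x1b[0m\n"
    "\x1b[32m+  2\x1b[0m\n"
    "\n"
    "## modified /cells/2/source:\n"
    "@@ -1,4 +1,4 @@\n"
    " x = np.linspace(0, 4*np.pi,50)\n"
    "-y = 2 * np.sin(x)\n"
    "+y = np.sin(x)\n"
)

sections = parse_nbdiff(sample)
for section in sections:
    print(section["op"], section["path"], len(section["lines"]))
```

With sections keyed by path, a review view could, for example, show `/cells/N/source` changes prominently and collapse `execution_count` changes by default.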

The web app version of the diff (nbdiff-web) seems a lot closer to our goal.


This seems to be a good candidate to power a first iteration.

Edited by Eduardo Bonet