
SEG MLOps Update - November 12th 2021

All Weekly Demos: #16

Recording

https://youtu.be/5Xkn0XHlVFE

Vision

Make GitLab a tool Data Scientists and Machine Learning Engineers love to use.

Mission

Identify opportunities in our portfolio to explore ways where GitLab can provide a better user experience for Data Science and Machine Learning across the entire Machine Learning life cycle (model creation, testing, deployment, monitoring, and iteration).

What Was Done

Jupyter Diff is Live!

Epic: gitlab-org&6589 (closed)

Very exciting update: Jupyter Diff is live and will ship with GitLab 14.5 for self-managed customers! We shared these results on Twitter and LinkedIn, and the reception from the user base has been amazing!

The feature can be seen live here: gitlab-org/incubation-engineering/mlops/ipynb-test-projects/diff-sample@542d42a0

Twitter Thread: https://twitter.com/ef_bonet/status/1458728905020522496

We are taking feedback for the next iteration. What is the next Jupyter enhancement you want to see us support on GitLab? Add it here

Hyperparameter Optimisation

Epic | Issue | Repo

We are continuing to explore how to implement hyperparameter optimisation using only GitLab. Creating a new MR triggers a pipeline that:

  • Fetches the data
  • Reads a file with possible values
  • Performs the optimisation
  • Posts the results on the MR
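The glue around those four steps could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the code in the linked repo: the search-space file is assumed to be JSON mapping each hyperparameter name to a list of candidate values, lower scores are assumed to be better, and every function name is hypothetical. The request targets GitLab's merge request notes endpoint using the predefined CI variables `CI_API_V4_URL`, `CI_PROJECT_ID` and `CI_MERGE_REQUEST_IID`; how the access token reaches the job is left to the project.

```python
import itertools
import json
import os
import urllib.request


def read_search_space(path):
    """Read a JSON file mapping each hyperparameter name to its candidate values.

    The JSON layout is an assumption made for this sketch.
    """
    with open(path) as handle:
        return json.load(handle)


def candidate_configs(space):
    """Yield every combination of candidate values, one dict per combination."""
    names = sorted(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def format_results(results):
    """Render (config, score) pairs as a Markdown table, lowest score first.

    Assumes a lower score is better (for example, a loss).
    """
    lines = ["| params | score |", "| --- | --- |"]
    for config, score in sorted(results, key=lambda item: item[1]):
        lines.append(f"| `{json.dumps(config, sort_keys=True)}` | {score:.4f} |")
    return "\n".join(lines)


def build_note_request(body, token):
    """Build, but do not send, the request that comments on the merge request.

    CI_MERGE_REQUEST_IID is only set in merge request pipelines.
    """
    url = "{}/projects/{}/merge_requests/{}/notes".format(
        os.environ["CI_API_V4_URL"],
        os.environ["CI_PROJECT_ID"],
        os.environ["CI_MERGE_REQUEST_IID"],
    )
    payload = json.dumps({"body": body}).encode("utf-8")
    headers = {"PRIVATE-TOKEN": token, "Content-Type": "application/json"}
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")
```

In a pipeline job, the script would score each dict from `candidate_configs`, pass the collected pairs to `format_results`, and send the request with `urllib.request.urlopen(build_note_request(table, token))`.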


We will be sharing our learnings along the way.

We are currently running a very simplistic scenario: the training dataset is generated, the optimisation runs sequentially, the data is small and training is very fast, we use Grid Search, and the results are not uploaded anywhere. But with this skeleton we can iterate on complexity:
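To make the scenario concrete, here is a minimal sketch of a sequential Grid Search over a generated dataset, using only the Python standard library. The toy one-weight model, the parameter names (`learning_rate`, `epochs`) and the grid values are assumptions chosen for illustration; they are not taken from the actual pipeline.

```python
import itertools
import random


def make_dataset(n=200, seed=0):
    """Generate a small noisy linear dataset (y = 3x + noise)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [3.0 * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def train_and_score(xs, ys, learning_rate, epochs):
    """Fit y = w * x by gradient descent and return the mean squared error.

    The score is computed on the training data itself; a real setup
    would hold out a validation split.
    """
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n


def grid_search(xs, ys, grid):
    """Try every combination in the grid, one after another."""
    names = sorted(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((params, train_and_score(xs, ys, **params)))
    best = min(results, key=lambda item: item[1])  # lowest error wins
    return best, results


if __name__ == "__main__":
    xs, ys = make_dataset()
    grid = {"learning_rate": [0.01, 0.1, 0.5], "epochs": [10, 50]}
    best, results = grid_search(xs, ys, grid)
    print("best params:", best[0], "mse:", best[1])
```

Because each combination is evaluated in a plain loop, the run time grows with the product of the grid sizes, which is one reason parallel or iterative pipelines appear among the next steps.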

  • Use real data
  • Implement an iterative pipeline
  • Stress test the runner using larger datasets
  • Test out GPU runners
  • Make it easier to build the pipeline (less commit-push-check)
  • Run the pipeline on a different vendor (Kubeflow, AWS)

What do you want to see us explore? Do you have any examples we should look into? You can comment on this issue and we will take a look!

Up Next

We will take a break from Jupyter Notebooks this week and focus a bit more on the pipelines. We will make the pipelines a bit more complex and see how GitLab handles them.

Edited by Eduardo Bonet