Who should we explore integrating with?
There are a lot of new players in the MLOps space, and we want to explore which vendors we should polish integrations with.
Popular GitHub Repos
One way to identify candidates is to look for popular open source repos tagged with ml, machine learning or machinelearning. The code can be found here: https://gitlab.com/gitlab-org/incubation-engineering/mlops/top-ml-projects
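As a rough illustration of this kind of lookup, here is a minimal sketch that queries the GitHub repository search API by topic and merges the results by star count. It is not the code from the linked project, which may work differently. The topic spellings (GitHub topics cannot contain spaces, so "machine learning" is assumed to be `machine-learning`), the helper names and the limit of 200 are all assumptions made for the example.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.github.com/search/repositories"
# Assumed topic spellings; GitHub topics use hyphens instead of spaces.
TOPICS = ["ml", "machine-learning", "machinelearning"]


def build_search_url(topic, per_page=100, page=1):
    """Build a search URL for repos with the given topic, most-starred first."""
    params = {
        "q": f"topic:{topic}",
        "sort": "stars",
        "order": "desc",
        "per_page": per_page,
        "page": page,
    }
    return f"{SEARCH_URL}?{urllib.parse.urlencode(params)}"


def fetch_repos(topic, per_page=100):
    """Fetch one page of results for a topic (unauthenticated network call)."""
    request = urllib.request.Request(
        build_search_url(topic, per_page),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["items"]


def top_repos(repo_lists, limit=200):
    """Merge per-topic results, drop duplicates, keep the most-starred repos."""
    unique = {}
    for repos in repo_lists:
        for repo in repos:
            # A repo tagged with several topics appears once, keyed by name.
            unique[repo["full_name"]] = repo
    ranked = sorted(
        unique.values(), key=lambda r: r["stargazers_count"], reverse=True
    )
    return ranked[:limit]
```

Calling `top_repos(fetch_repos(topic) for topic in TOPICS)` would return the merged list. Unauthenticated search requests are rate limited, so a real run would likely need a token and pagination.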
In this document I started categorising each of the top 200 repositories, and the results were quite surprising:
- About 40% are repos with learning resources: awesome-X style repos, tutorials, books, courses, etc
- 13% are application repos (e.g. deepfake, voice recognition)
- 7% are dataset repos
- Framework repos (tensorflow, keras, etc.) represent 16% of the top 200 repos, and they are good candidates for integration at the model integration step
Only 17 repos (8.5%) focus on MLOps, and they sit far down the list. This could have many explanations (MLOps comes much later on the ML learning path, it's new, the repos don't invest in marketing, etc.), but it limits the number of options we have, which is good for now.