
Adds remaining endpoints for MLFlow API compatibility

Eduardo Bonet requested to merge 370478-add-mlflow-endpoints-5 into master

What does this MR do and why?

Adds the remaining MLFlow endpoints and changes the API response so that the MLFlow client works.

Queries

  • Ml::Experiment.by_project_id
SELECT "ml_experiments".* FROM "ml_experiments" WHERE "ml_experiments"."project_id" = 31
Index Scan using index_ml_experiments_on_project_id_and_name on ml_experiments  (cost=0.15..5.22 rows=4 width=88) (actual time=0.029..0.034 rows=29 loops=1)
  Index Cond: (project_id = 31)
Planning Time: 0.607 ms
Execution Time: 0.057 ms

How to Reproduce

GET experiments/list

  1. Create a project and a project access token with the api scope:

    export PROJECT_ID=<Your Project Id>
    export GITLAB_PAT=<your api token>
  2. Enable the Feature flag

    echo "Feature.enable(:ml_experiment_tracking)" | bundle exec rails c
  3. Create an Experiment:

    curl -X POST -H "Authorization: Bearer $GITLAB_PAT" -d name=my_cool_experiment http://gdk.test:3000/api/v4/projects/$PROJECT_ID/ml/mlflow/api/2.0/mlflow/experiments/create
  4. List the experiments

    curl -X GET -H "Authorization: Bearer $GITLAB_PAT" http://gdk.test:3000/api/v4/projects/$PROJECT_ID/ml/mlflow/api/2.0/mlflow/experiments/list
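
The two curl calls above can also be scripted. The following is a minimal Python sketch, not code from this MR: it assumes the same local GDK host and the `ml/mlflow` path segment that the tracking URI further down uses, and the helper name `mlflow_request` is hypothetical. It only builds the requests, using the standard library.

```python
import urllib.parse
import urllib.request

# Assumption: local GDK host, as in the curl examples.
GITLAB_HOST = "http://gdk.test:3000"


def mlflow_request(project_id, token, endpoint, form=None):
    """Build (but do not send) a request to the MLflow-compatible API.

    Hypothetical helper for illustration; not part of GitLab or MLflow.
    """
    url = (
        f"{GITLAB_HOST}/api/v4/projects/{project_id}"
        f"/ml/mlflow/api/2.0/mlflow/{endpoint}"
    )
    # A form-encoded body makes this a POST (mirrors `curl -d`);
    # without a body urllib issues a GET.
    data = urllib.parse.urlencode(form).encode() if form else None
    request = urllib.request.Request(url, data=data)
    request.add_header("Authorization", f"Bearer {token}")
    return request
```

To send a request, pass the result to `urllib.request.urlopen`, for example `urllib.request.urlopen(mlflow_request(project_id, token, "experiments/list"))`, with `project_id` and `token` read from `PROJECT_ID` and `GITLAB_PAT`.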

MLFlow Client working

  1. Create a project and a project access token with the api scope:

    export PROJECT_ID=<Your Project Id>
    export GITLAB_PAT=<your api token>
  2. Enable the Feature flag

    echo "Feature.enable(:ml_experiment_tracking)" | bundle exec rails c
  3. Clone the repository: https://gitlab.com/gitlab-org/incubation-engineering/mlops/mlflow_experiment

  4. Install dependencies: pip install -r requirements.txt

  5. Export the variables:

    export MLFLOW_TRACKING_URI="http://gdk.test:3000/api/v4/projects/"$PROJECT_ID"/ml/mlflow"
    export MLFLOW_TRACKING_TOKEN=$GITLAB_PAT 
  6. Run the script. This should create an experiment, train a model and store the parameters and metrics:

    python train.py
    2022/09/21 10:53:36 INFO mlflow.tracking.fluent: Experiment with name '2d53da12-c741-4711-a93c-00780262f0fe' does not exist. Creating a new experiment.
    Experiment name: 2d53da12-c741-4711-a93c-00780262f0fe
    Elasticnet model (alpha=0.100000, l1_ratio=0.050000):
      RMSE: 0.7777686546195534
      MAE: 0.6098738158916734
      R2: 0.21869474555096913
  7. Check that the data was created, using the generated experiment id (which maps to the name in the GitLab table):

    Ml::Candidate.joins(:experiment).find_by(experiment: {name: '2d53da12-c741-4711-a93c-00780262f0fe'}).params
    [#<Ml::CandidateParam:0x00007ff0c112fde0
      id: 30,
      created_at: Wed, 21 Sep 2022 08:53:38.393273000 UTC +00:00,
      updated_at: Wed, 21 Sep 2022 08:53:38.393273000 UTC +00:00,
      candidate_id: 31,
      name: "alpha",
      value: "0.01">,
     #<Ml::CandidateParam:0x00007ff0b48184c0
      id: 31,
      created_at: Wed, 21 Sep 2022 08:53:38.720678000 UTC +00:00,
      updated_at: Wed, 21 Sep 2022 08:53:38.720678000 UTC +00:00,
      candidate_id: 31,
      name: "l1_ratio",
      value: "0.05">]
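
For readers who do not want to clone the example repository, the client-side flow in the steps above can be sketched as follows. This is an illustrative sketch, not the contents of `train.py`: the helper names `tracking_env` and `log_run` are hypothetical, the host is the local GDK address used above, and `log_run` needs the `mlflow` package installed. It assumes the client picks up `MLFLOW_TRACKING_URI` and `MLFLOW_TRACKING_TOKEN` from the environment, as in step 5.

```python
def tracking_env(project_id, token, host="http://gdk.test:3000"):
    """Return the environment variables exported in step 5."""
    return {
        "MLFLOW_TRACKING_URI": f"{host}/api/v4/projects/{project_id}/ml/mlflow",
        "MLFLOW_TRACKING_TOKEN": token,
    }


def log_run(experiment_name, params, metrics):
    """Log one run with its parameters and metrics through the MLflow client."""
    import mlflow  # imported lazily so tracking_env works without mlflow

    # Selects the experiment, creating it if it does not exist yet.
    mlflow.set_experiment(experiment_name)
    with mlflow.start_run():
        for name, value in params.items():
            mlflow.log_param(name, value)
        for name, value in metrics.items():
            mlflow.log_metric(name, value)
```

A possible usage: call `os.environ.update(tracking_env(project_id, token))` first, then `log_run("my_cool_experiment", {"l1_ratio": 0.05}, {"rmse": 0.78})`; the logged parameters should then be visible through the same `Ml::Candidate` query as in step 7.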

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #370478 (closed)

Edited by Eduardo Bonet
