Adds LogParam and LogBatch endpoints to MLflow
What does this MR do and why?
Enables logging params to an ML experiment
Database
Migrations
- Up
❯ bin/rails db:migrate:main RAILS_ENV=test
main: == 20220914080716 AddIndexToCandidateIdAndNameOnMlCandidateParams: migrating ==
main: -- index_exists?(:ml_candidate_params, [:candidate_id, :name], {:name=>"index_ml_candidate_params_on_candidate_id_on_name"})
main:    -> 0.0085s
main: -- transaction_open?()
main:    -> 0.0000s
main: -- index_exists?(:ml_candidate_params, [:candidate_id, :name], {:unique=>true, :name=>"index_ml_candidate_params_on_candidate_id_on_name", :algorithm=>:concurrently})
main:    -> 0.0022s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0007s
main: -- add_index(:ml_candidate_params, [:candidate_id, :name], {:unique=>true, :name=>"index_ml_candidate_params_on_candidate_id_on_name", :algorithm=>:concurrently})
main:    -> 0.0046s
main: -- execute("RESET statement_timeout")
main:    -> 0.0007s
main: == 20220914080716 AddIndexToCandidateIdAndNameOnMlCandidateParams: migrated (0.0279s)
- Down
❯ bin/rails db:rollback:main RAILS_ENV=test
main: == 20220914080716 AddIndexToCandidateIdAndNameOnMlCandidateParams: reverting ==
main: -- index_exists?(:ml_candidate_params, [:candidate_id, :name], {:name=>"index_ml_candidate_params_on_candidate_id_on_name"})
main:    -> 0.0052s
main: -- transaction_open?()
main:    -> 0.0000s
main: -- indexes(:ml_candidate_params)
main:    -> 0.0026s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0006s
main: -- remove_index(:ml_candidate_params, {:algorithm=>:concurrently, :name=>"index_ml_candidate_params_on_candidate_id_on_name"})
main:    -> 0.0065s
main: -- execute("RESET statement_timeout")
main:    -> 0.0007s
main: == 20220914080716 AddIndexToCandidateIdAndNameOnMlCandidateParams: reverted (0.0249s)
How to Reproduce
- Create a project and a project access token with api scope, then export both:
  export PROJECT_ID=<Your Project Id>
  export GITLAB_PAT=<your api token>
- Enable the feature flag:
  echo "Feature.enable(:ml_experiment_tracking)" | bundle exec rails c
- Create an experiment:
  curl -X POST -H "Authorization: Bearer $GITLAB_PAT" -d name=my_cool_experiment http://gdk.test:3000/api/v4/projects/$PROJECT_ID/ml/mflow/api/2.0/mlflow/experiments/create
- Create a run, and make a note of the run id returned:
  curl -X POST -H "Authorization: Bearer $GITLAB_PAT" -d experiment_id=1 http://gdk.test:3000/api/v4/projects/$PROJECT_ID/ml/mflow/api/2.0/mlflow/runs/create
- Log a param:
  curl -X POST -H "Authorization: Bearer $GITLAB_PAT" -d '{"run_id":"<RUN_ID>","key":"hello","value":"world"}' http://gdk.test:3000/api/v4/projects/$PROJECT_ID/ml/mflow/api/2.0/mlflow/runs/log-parameter
- Get the run and check that it now has a param in the run.data.params field:
  curl -X GET -H "Authorization: Bearer $GITLAB_PAT" "http://gdk.test:3000/api/v4/projects/$PROJECT_ID/ml/mflow/api/2.0/mlflow/runs/get?run_id=<RUN_ID>"
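The log and get steps above can also be scripted. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the explicit Content-Type: application/json header is an assumption that does not appear in the curl commands above. The functions build the requests without sending them.

```python
import json
import urllib.parse
import urllib.request


def mlflow_base(host: str, project_id: str) -> str:
    # Path copied from the curl commands above; host and project id are placeholders.
    return f"{host}/api/v4/projects/{project_id}/ml/mflow/api/2.0/mlflow"


def log_param_request(host, project_id, token, run_id, key, value):
    """Build (but do not send) the POST runs/log-parameter request."""
    body = json.dumps({"run_id": run_id, "key": key, "value": value}).encode()
    return urllib.request.Request(
        f"{mlflow_base(host, project_id)}/runs/log-parameter",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumption, not in the curl steps
        },
    )


def get_run_request(host, project_id, token, run_id):
    """Build (but do not send) the GET runs/get request."""
    query = urllib.parse.urlencode({"run_id": run_id})
    return urllib.request.Request(
        f"{mlflow_base(host, project_id)}/runs/get?{query}",
        method="GET",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Passing either request to urllib.request.urlopen would send it to a running GDK instance with the feature flag enabled.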
Difference between APIs
POST /runs/log-parameter
When run exists
| | MLflow | GitLab |
| --- | --- | --- |
| URL | http://127.0.0.1:5000/api/2.0/mlflow/runs/log-parameter | http://gdk.test:3000/api/v4/projects/31/ml/mflow/api/2.0/mlflow/runs/log-parameter | 
| Params | {} | {} | 
| Body | {'run_id': 'c65c0b0aa4874c53b04f753444b1ae1a', 'key': 'hello', 'value': 'SomeParameter'} | {'run_id': '4b09659f-e25d-4055-9193-c37623553beb', 'key': 'hello', 'value': 'SomeParameter'} | 
| Status Code | 200 | 201 | 
| Response | {} | {} | 
When key is not passed
| | MLflow | GitLab |
| --- | --- | --- |
| URL | http://127.0.0.1:5000/api/2.0/mlflow/runs/log-parameter | http://gdk.test:3000/api/v4/projects/31/ml/mflow/api/2.0/mlflow/runs/log-parameter | 
| Params | {} | {} | 
| Body | {'run_id': 'c65c0b0aa4874c53b04f753444b1ae1a', 'value': 'SomeParameter'} | {'run_id': '4b09659f-e25d-4055-9193-c37623553beb', 'value': 'SomeParameter'} | 
| Status Code | 400 | 400 | 
| Response | b'{"error_code": "INVALID_PARAMETER_VALUE", "message": "Missing value for required parameter \'key\'. See the API docs for more information about request parameters."}' | b'{"error":"key is missing"}' | 
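The two servers report the same 400 with different JSON shapes: MLflow uses error_code and message, GitLab uses a single error field. A client that talks to both could normalize them as in the sketch below. The function name is hypothetical, and the sketch handles only the two shapes shown in the table above.

```python
import json


def error_message(body: bytes) -> str:
    """Return a readable message from either error shape shown above."""
    payload = json.loads(body)
    # MLflow shape: {"error_code": ..., "message": ...}
    if "message" in payload:
        return payload["message"]
    # GitLab shape: {"error": ...}
    return payload.get("error", "")
```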
When param is repeated
| | MLflow | GitLab |
| --- | --- | --- |
| URL | http://127.0.0.1:5000/api/2.0/mlflow/runs/log-parameter | http://gdk.test:3000/api/v4/projects/31/ml/mflow/api/2.0/mlflow/runs/log-parameter | 
| Params | {} | {} | 
| Body | {'run_id': 'c65c0b0aa4874c53b04f753444b1ae1a', 'key': 'hello', 'value': 'SomeParameter'} | {'run_id': '4b09659f-e25d-4055-9193-c37623553beb', 'key': 'hello', 'value': 'SomeParameter'} | 
| Status Code | 200 | 201 | 
| Response | {} | {} | 
POST /runs/log-batch
When run exists
| | MLflow | GitLab |
| --- | --- | --- |
| URL | http://127.0.0.1:5000/api/2.0/mlflow/runs/log-batch | http://gdk.test:3000/api/v4/projects/31/ml/mflow/api/2.0/mlflow/runs/log-batch | 
| Params | {} | {} | 
| Body | {'run_id': 'c65c0b0aa4874c53b04f753444b1ae1a', 'metrics': [{'key': 'metric2', 'value': 1.0, 'timestamp': 12345678, 'step': 0}, {'key': 'metric3', 'value': 100.0, 'timestamp': 12345678, 'step': 0}], 'params': [{'key': 'param1', 'value': 'ValueForParam1'}]} | {'run_id': '4b09659f-e25d-4055-9193-c37623553beb', 'metrics': [{'key': 'metric2', 'value': 1.0, 'timestamp': 12345678, 'step': 0}, {'key': 'metric3', 'value': 100.0, 'timestamp': 12345678, 'step': 0}], 'params': [{'key': 'param1', 'value': 'ValueForParam1'}]} | 
| Status Code | 200 | 201 | 
| Response | {} | {} | 
When a metric is passed without key
| | MLflow | GitLab |
| --- | --- | --- |
| URL | http://127.0.0.1:5000/api/2.0/mlflow/runs/log-batch | http://gdk.test:3000/api/v4/projects/31/ml/mflow/api/2.0/mlflow/runs/log-batch | 
| Params | {} | {} | 
| Body | {'run_id': 'c65c0b0aa4874c53b04f753444b1ae1a', 'metrics': [{'value': 1.0, 'timestamp': 12345678, 'step': 0}, {'key': 'metric4', 'value': 100.0, 'timestamp': 12345678, 'step': 0}]} | {'run_id': '4b09659f-e25d-4055-9193-c37623553beb', 'metrics': [{'value': 1.0, 'timestamp': 12345678, 'step': 0}, {'key': 'metric4', 'value': 100.0, 'timestamp': 12345678, 'step': 0}]} | 
| Status Code | 500 | 400 | 
| Response | b'\n\n500 Internal Server Error\n Internal Server Error\nThe server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.\n' | b'{"error":"metrics[0][key] is missing"}' | 
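Because MLflow answers this malformed batch with a 500, a client may prefer to catch the problem before sending. The sketch below is a hypothetical client-side pre-check. It only imitates the message format seen in the GitLab response above and is not taken from either server's implementation.

```python
def first_missing_key(entries, field):
    """Return an error such as 'metrics[0][key] is missing', or None if all entries have a key."""
    for index, entry in enumerate(entries):
        if "key" not in entry:
            return f"{field}[{index}][key] is missing"
    return None
```

Called with the metrics list from the Body row above and the field name "metrics", it reports the first entry as the offending one.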
When a parameter is duplicated
| | MLflow | GitLab |
| --- | --- | --- |
| URL | http://127.0.0.1:5000/api/2.0/mlflow/runs/log-batch | http://gdk.test:3000/api/v4/projects/31/ml/mflow/api/2.0/mlflow/runs/log-batch | 
| Params | {} | {} | 
| Body | {'run_id': 'c65c0b0aa4874c53b04f753444b1ae1a', 'params': [{'key': 'param1', 'value': 'AnotherValue'}, {'key': 'param12', 'value': 'ValueForParam2'}]} | {'run_id': '4b09659f-e25d-4055-9193-c37623553beb', 'params': [{'key': 'param1', 'value': 'AnotherValue'}, {'key': 'param12', 'value': 'ValueForParam2'}]} | 
| Status Code | 400 | 201 | 
| Response | b'{"error_code": "INVALID_PARAMETER_VALUE", "message": "Changing param values is not allowed. Params were already logged=\'[{\'key\': \'param1\', \'old_value\': \'ValueForParam1\', \'new_value\': \'AnotherValue\'}, {\'key\': \'param12\', \'old_value\': None, \'new_value\': \'ValueForParam2\'}]\' for run ID=\'c65c0b0aa4874c53b04f753444b1ae1a\'."}' | {} | 
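The difference in this case goes beyond the status code. The sketch below models the two behaviors with an in-memory dict. It is inferred only from the responses in this description (including the GET /runs/get output, where GitLab still reports param1 as ValueForParam1 and has added param12, while MLflow has neither change), not from either implementation, and the function names are hypothetical.

```python
def log_params_mlflow(existing: dict, batch: list) -> int:
    """Reject the whole batch with 400 if any already-logged key would change value."""
    changed = [
        p for p in batch
        if p["key"] in existing and existing[p["key"]] != p["value"]
    ]
    if changed:
        return 400  # nothing from the batch is stored
    for p in batch:
        existing[p["key"]] = p["value"]
    return 200


def log_params_gitlab(existing: dict, batch: list) -> int:
    """Keep already-logged values, store only new keys, and return 201."""
    for p in batch:
        existing.setdefault(p["key"], p["value"])
    return 201
```

Under this model a client cannot rely on a GitLab 201 to mean that every value in the batch was stored.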
GET /runs/get
When run exists
| | MLflow | GitLab |
| --- | --- | --- |
| URL | http://127.0.0.1:5000/api/2.0/mlflow/runs/get | http://gdk.test:3000/api/v4/projects/31/ml/mflow/api/2.0/mlflow/runs/get | 
| Params | {'run_id': 'c65c0b0aa4874c53b04f753444b1ae1a'} | {'run_id': '4b09659f-e25d-4055-9193-c37623553beb'} | 
| Body | {} | {} | 
| Status Code | 200 | 200 | 
| Response | { "run": { "info": { "run_uuid": "c65c0b0aa4874c53b04f753444b1ae1a", "experiment_id": "2", "user_id": "", "status": "RUNNING", "start_time": 1234, "artifact_uri": "./mlruns/2/c65c0b0aa4874c53b04f753444b1ae1a/artifacts", "lifecycle_stage": "active", "run_id": "c65c0b0aa4874c53b04f753444b1ae1a" }, "data": { "metrics": [ { "key": "hello", "value": 10.0, "timestamp": 12345678, "step": 3 }, { "key": "metric2", "value": 1.0, "timestamp": 12345678, "step": 0 }, { "key": "metric3", "value": 100.0, "timestamp": 12345678, "step": 0 } ], "params": [ { "key": "hello", "value": "SomeParameter" }, { "key": "param1", "value": "ValueForParam1" } ] } } } | { "run": { "info": { "run_id": "4b09659f-e25d-4055-9193-c37623553beb", "run_uuid": "4b09659f-e25d-4055-9193-c37623553beb", "experiment_id": "24", "start_time": 1234, "end_time": 12345678, "status": "FAILED", "artifact_uri": "not_implemented", "lifecycle_stage": "active", "user_id": "1" }, "data": { "metrics": [ { "key": "hello", "value": 10.0, "timestamp": 12345678, "step": 3 }, { "key": "metric2", "value": 1.0, "timestamp": 12345678, "step": 0 }, { "key": "metric3", "value": 100.0, "timestamp": 12345678, "step": 0 } ], "params": [ { "key": "hello", "value": "SomeParameter" }, { "key": "param1", "value": "ValueForParam1" }, { "key": "param12", "value": "ValueForParam2" } ] } } } | 
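The two responses are long enough that differences are easy to miss by eye. A small sketch like the one below can flatten the params and list the run.info fields that differ. Both helper names are hypothetical, and the functions assume the response shape shown in the table above.

```python
def params_as_dict(run_response: dict) -> dict:
    """Flatten run.data.params from a runs/get response into {key: value}."""
    return {p["key"]: p["value"] for p in run_response["run"]["data"]["params"]}


def info_differences(left: dict, right: dict) -> dict:
    """Map each run.info field whose value differs to a (left, right) pair."""
    a, b = left["run"]["info"], right["run"]["info"]
    return {
        field: (a.get(field), b.get(field))
        for field in sorted(a.keys() | b.keys())
        if a.get(field) != b.get(field)
    }
```

A field present on only one side shows up with None for the other, which is how end_time would appear when comparing the two responses above.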
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Related to #370478 (closed)
Edited by Eduardo Bonet