Epic (Draft): Inference pipeline
Problem to solve
When a user finishes training a model, it would be ideal for other users to be able to test the model's results.
We call this process inference. It is usually implemented as a REST service that receives an image, text, or CSV file and returns a response in JSON format.
The problems to solve:
- How to create inference pipelines to test the models.
- How to receive external inputs to feed the model.
- How to show the model's response in the UI.
Intended users
All users
User experience goal
Users would be able to run models trained by other users with a no-code approach. This means they will have a view where they input the test data and see the response, without any configuration or coding.
Proposal for technical solution
Inference requires receiving external input data, so we can offer three pre-made options (widgets):
- Image input
- CSV input
- Text input
These are added to the inference tab of the project; the project owner must configure the widget to receive the proper input for their code.
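As an illustration, the per-widget configuration might be no more than a small mapping stored with the project. This is a minimal sketch; all field names here are hypothetical, not an existing schema:

```python
# Hypothetical per-widget configuration (field names are assumptions,
# not an existing schema).
INFERENCE_WIDGETS = {
    "image": {"accept": ["image/png", "image/jpeg"], "max_size_mb": 10},
    "csv": {"accept": ["text/csv"], "has_header": True},
    "text": {"accept": ["text/plain"], "max_length": 10_000},
}

def validate_input(widget_type: str, content_type: str) -> bool:
    """Reject inputs whose MIME type does not match the configured widget."""
    widget = INFERENCE_WIDGETS.get(widget_type)
    return widget is not None and content_type in widget["accept"]
```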
The project owner must have a tab to design the inference form. This creation process involves several actions:
- Choose the code repo to run the inference. A code repo of type ALGORITHM will have two stages in it; the details of each are chosen during the publishing process.
  - This means that we now have to publish a code repository with two entry points, one for TRAIN and one for INFERENCE.
- The code owner will choose a branch from the MLProject to select either the model (artifact) or the input data.
- The code owner will set up the parameter values and a type of widget in a new tab named "show".
- The type of input of the project: CSV, image, or text. (We can support others later.)
- A command to start the service contained in the code repo, e.g. `test.py --port 5000 ...` (see the service sketch after this list).
- The type of output: it must be JSON; we can print it as it comes, or perhaps parse it if it follows certain criteria.
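For reference, a minimal sketch of what such a `test.py` service could look like, assuming Flask and a pickled model; the model file name, endpoint, and request fields are assumptions, not an agreed interface:

```python
# test.py -- minimal inference service sketch (assumes Flask is installed;
# the model file name and the request/response fields are hypothetical).
import argparse
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

with open("model.pkl", "rb") as f:  # artifact downloaded by the pipeline
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Accept raw text for simplicity; CSV/image widgets would decode
    # the uploaded file before calling the model.
    text = request.get_json(force=True).get("input", "")
    prediction = model.predict([text])
    # The response is always JSON, as required by the proposal.
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--port", type=int, default=5000)
    args = parser.parse_args()
    app.run(host="0.0.0.0", port=args.port)
```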
The backend will create a pipeline, the same as it does for an experiment, but this time it will serve a model.
The tester will enter the inference tab and start the pipeline. Once the pipeline is running, the inference service starts, and the tester can enter input data using the widget and see the response. The pipeline can be programmed to shut down automatically if no activity is detected after 5 minutes.
Inference pipeline (draft)
- Download the model to the server: `wget <model artifact>`
- Same as the publish process: prepare the machine with the environment, requirements, etc.
- Launch the service on the server: `python3 test.py --port 5000`
- If there is no activity detected in 10 minutes, shut down the server.
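A rough sketch of these steps as a driver script. The artifact URL, the `/last-activity` endpoint, and the polling interval are all placeholders, not a final design:

```python
# Inference pipeline driver sketch. The artifact URL and the
# /last-activity endpoint are assumptions, not an agreed design.
import subprocess
import time

import requests

MODEL_URL = "https://example.com/artifacts/model.pkl"  # placeholder
IDLE_TIMEOUT = 10 * 60  # seconds; shut down after 10 minutes of inactivity

# 1. Download the model artifact to the server.
subprocess.run(["wget", "-O", "model.pkl", MODEL_URL], check=True)

# 2. Prepare the machine, same as the publish process.
subprocess.run(["pip", "install", "-r", "requirements.txt"], check=True)

# 3. Launch the service contained in the code repo.
service = subprocess.Popen(["python3", "test.py", "--port", "5000"])

# 4. Shut the server down once no activity is seen for IDLE_TIMEOUT.
#    The service is assumed to expose a (hypothetical) /last-activity
#    endpoint reporting the timestamp of its last request.
last_activity = time.time()
try:
    while time.time() - last_activity < IDLE_TIMEOUT:
        time.sleep(30)
        try:
            resp = requests.get("http://localhost:5000/last-activity", timeout=5)
            last_activity = float(resp.json()["timestamp"])
        except requests.RequestException:
            pass  # service not reachable yet; keep the last known value
finally:
    service.terminate()  # shut down the server when idle
```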
Frontend
- Starts the pipeline.
- When the pipeline is running, the frontend unblocks the widget and receives external input.
- The widget sends the data to the service that was started by the inference pipeline; the frontend receives the response and shows it in the inference tab (sketched below).
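What the widget does on the wire could look like the following, expressed as a Python client; the endpoint and payload shape match the hypothetical service sketch above:

```python
# What the frontend widget does, expressed as a Python client
# (endpoint and payload shape follow the hypothetical service sketch above).
import requests

response = requests.post(
    "http://localhost:5000/predict",  # service started by the inference pipeline
    json={"input": "some test data from the widget"},
    timeout=30,
)
response.raise_for_status()
# The JSON body is what the inference tab renders, e.g. {"prediction": [...]}.
print(response.json())
```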
Permissions and Security
Documentation
Availability, Testing & Test Cases
What does success look like, and how can we measure that?
Additional Notes
What is the type of buyer?
Is this a cross-stage feature?
Links / references
/cc @si-ge-st