Discussion: Model registry data topology
Context
Model registry artefact management will use the Package Registry, but the Package Registry doesn't handle metadata and other features we might need. We also need a path to promote a Candidate (which will use the same Model package type) into a model version.
Model Experiments data topology
Model experiments is in a similar situation: it uses the package registry to hold the artefacts, and adds a layer of tables to provide information for the UI:
- Well-defined GitLab metadata (such as Merge Request ID or Build ID) goes into Ml::Candidate
- MlCandidateMetadata gives users a flexible way to add any metadata that is not covered by GitLab. An improvement to MlCandidateMetadata would be to add a type column, so that we can render some specific types better (e.g. image_url, image, int)
erDiagram
Project ||--o{ MlExperiment : owns
Project ||--o{ MlCandidate : owns
User ||--|{ MlCandidate : creates
User ||--|{ MlExperiment : creates
MlExperiment ||--o{ MlCandidate : compares
MlExperiment ||--o{ MlExperimentMetadata : has
MlCandidate ||--o{ MlCandidateParam : has
MlCandidate ||--o{ MlCandidateMetric : has
MlCandidate ||--o{ MlCandidateMetadata : has
MlCandidate ||--o{ PackagesPackage : stores
MlCandidateParam {
string name
string value
}
MlCandidateMetric {
string name
float value
int step
}
MlCandidateMetadata {
string name
string value
}
MlExperimentMetadata {
string name
string value
}
MlCandidate {
bigint id
bigint iid
string name
uuid eid
}
MlExperiment {
bigint id
bigint iid
string name
}
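The type column suggested for MlCandidateMetadata could look roughly like the following. This is only a sketch under stated assumptions: the field name value_type, the default of "string", and the render helper are hypothetical and not part of the existing tables.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class CandidateMetadata:
    name: str
    value: str                  # stored as a string, as in the diagram
    value_type: str = "string"  # hypothetical name for the proposed type column


def render(entry: CandidateMetadata) -> str:
    """Pick a display form based on the entry's declared type."""
    if entry.value_type == "int":
        # Integers get thousands separators.
        return f"{entry.name}: {int(entry.value):,}"
    if entry.value_type == "image_url":
        # Image URLs become an image tag; the URL is escaped first.
        return f'{entry.name}: <img src="{escape(entry.value, quote=True)}">'
    # Unknown or plain types fall back to the raw string.
    return f"{entry.name}: {entry.value}"
```

Entries without a declared type keep today's behaviour (plain text), so the column could be added without touching existing rows.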
Suggested solution:
Follow the same general idea from above:
- Add Ml::Model. This has the description of a model, and other model specific attributes. It has a name, iid, description, and 0 or many versions.
- A model version is a Packages::Package of type ml_model, where the package name is the model name.
- Add Packages::MlModelMetadata, which has general metadata about that specific version of the model (e.g. model_id).
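A minimal in-memory sketch of the relationship described above, where each version of a model is a package of type ml_model named after the model. The class and method names (Model, Package, add_version) are illustrative stand-ins rather than the actual Rails models, and the metadata dictionary stands in for Packages::MlModelMetadata rows.

```python
from dataclasses import dataclass, field
from typing import Dict, List

ML_MODEL = "ml_model"  # package type named in the proposal


@dataclass
class Package:
    name: str
    version: str
    package_type: str
    # Stand-in for per-version metadata rows (name -> value).
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Model:
    iid: int
    name: str
    description: str = ""
    versions: List[Package] = field(default_factory=list)  # 0 or many

    def add_version(self, version: str, **metadata: str) -> Package:
        """Create a version as an ml_model package that reuses the model name."""
        package = Package(self.name, version, ML_MODEL, dict(metadata))
        self.versions.append(package)
        return package
```

For example, `Model(iid=1, name="churn").add_version("1.0.0", framework="sklearn")` yields a package named "churn" of type ml_model carrying one metadata entry.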
Eventually:
Connecting Model Experiments to Model Registry:
- Add association from Ml::Experiment to Ml::Model.
- Each model has a default Ml::Experiment.
- Each Ml::ModelVersion can have 0 or 1 Ml::Candidate.
From the data layer perspective, promoting a Candidate to Version would mean:
- The package associated with the candidate becomes the package for the model version. Candidates without a package cannot be promoted.
- The metadata for the Ml::ModelVersion is the Ml::ModelVersionMetadata and the Ml::CandidateMetadata associated to the ModelVersion. Same for Metrics and Params.
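The promotion rules above can be sketched in a few lines. All names here (Candidate, ModelVersion, promote) are hypothetical, the package is represented by a plain string identifier, and the precedence on a metadata name clash (the version's own value wins) is an assumption the discussion does not settle.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Candidate:
    name: str
    package: Optional[str] = None  # identifier of the stored package, if any
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class ModelVersion:
    version: str
    package: str
    candidate: Optional[Candidate] = None  # 0 or 1 candidate per version
    own_metadata: Dict[str, str] = field(default_factory=dict)

    def metadata(self) -> Dict[str, str]:
        """Combine the candidate's metadata with the version's own metadata."""
        merged = dict(self.candidate.metadata) if self.candidate else {}
        # Assumption: on a name clash the version's own value wins.
        merged.update(self.own_metadata)
        return merged


def promote(candidate: Candidate, version: str) -> ModelVersion:
    """Turn a candidate into a model version that reuses its package."""
    if candidate.package is None:
        raise ValueError("candidates without a package cannot be promoted")
    return ModelVersion(version=version, package=candidate.package, candidate=candidate)
```

Metrics and Params could be exposed through the same pattern, reading from the linked candidate instead of copying rows at promotion time.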
Edited by Eduardo Bonet