
WIP: Preprocessing extension API

Chris Bethune requested to merge preprocessing_api into devel

This extension lets TA3 systems specify a pre-processing pipeline that a TA2 system can execute on demand and prepend to the ML pipelines it creates. It uses the pipeline structure defined by the meta-learning working group, although a few sections that are not relevant in this context have not been implemented. The protobuf structures are defined to map to the meta-learning structure as closely as possible, and could be simplified in some cases.

The extension adds two new calls: CompilePipeline and SetPreprocessing. CompilePipeline takes a pipeline definition as an argument and passes it to TA2 for validation. TA2 validates the pipeline and returns a pipeline ID to TA3. The ID can be passed to the ExecutePipeline call, allowing a TA3 system to run a TA1 pipeline over input data. The SetPreprocessing call takes a pipeline ID as input and notifies TA2 that the associated pipeline is to be used as a pre-processing step when generating pipelines as part of a CreatePipelines call. This ensures that pipelines produced by TA2 include all necessary transformations when exported. Example usages are included under the example/go directory. While most performers are using Python, the intended use should still be apparent from the Go examples.
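
The call sequence described above can be sketched roughly as follows. This is not the code under example/go and does not use the generated gRPC client. It is a self-contained, in-memory stand-in: `Pipeline`, `PipelineStep`, `FakeTA2`, the primitive names, and the ID format are all hypothetical, chosen only to illustrate the order of calls and the prepending behaviour.

```go
package main

import (
	"errors"
	"fmt"
)

// PipelineStep and Pipeline are hypothetical stand-ins for the protobuf
// pipeline messages; the real messages carry far more detail.
type PipelineStep struct {
	Primitive string
}

type Pipeline struct {
	Name  string
	Steps []PipelineStep
}

// FakeTA2 is an in-memory stand-in for a TA2 system (an assumption for
// illustration, not part of the API).
type FakeTA2 struct {
	compiled      map[string]Pipeline
	preprocessing string
	nextID        int
}

func NewFakeTA2() *FakeTA2 {
	return &FakeTA2{compiled: make(map[string]Pipeline)}
}

// CompilePipeline validates the definition and returns a pipeline ID.
func (t *FakeTA2) CompilePipeline(p Pipeline) (string, error) {
	if len(p.Steps) == 0 {
		return "", errors.New("pipeline has no steps")
	}
	for i, s := range p.Steps {
		if s.Primitive == "" {
			return "", fmt.Errorf("step %d has no primitive", i)
		}
	}
	t.nextID++
	id := fmt.Sprintf("pipeline-%d", t.nextID)
	t.compiled[id] = p
	return id, nil
}

// SetPreprocessing marks a compiled pipeline as the pre-processing step.
func (t *FakeTA2) SetPreprocessing(id string) error {
	if _, ok := t.compiled[id]; !ok {
		return fmt.Errorf("unknown pipeline ID %q", id)
	}
	t.preprocessing = id
	return nil
}

// CreatePipelines mimics TA2 generating a pipeline: the registered
// pre-processing steps, if any, are placed before the model steps.
func (t *FakeTA2) CreatePipelines(modelSteps []PipelineStep) []PipelineStep {
	var out []PipelineStep
	if p, ok := t.compiled[t.preprocessing]; ok {
		out = append(out, p.Steps...)
	}
	return append(out, modelSteps...)
}

func main() {
	ta2 := NewFakeTA2()
	prep := Pipeline{Name: "prep", Steps: []PipelineStep{{Primitive: "impute"}, {Primitive: "encode"}}}

	id, err := ta2.CompilePipeline(prep)
	if err != nil {
		panic(err)
	}
	if err := ta2.SetPreprocessing(id); err != nil {
		panic(err)
	}
	for _, s := range ta2.CreatePipelines([]PipelineStep{{Primitive: "classifier"}}) {
		fmt.Println(s.Primitive)
	}
}
```

In the real extension the same ID returned by CompilePipeline could also be handed to ExecutePipeline; that step is omitted here to keep the sketch short.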

This work focuses on a pre-processing extension to the core, but it can be reworked to form a more general TA3-TA2 API, as discussed in #52 (closed); the protobuf definition of the pipeline represents the majority of the work, and that can be carried forward.

Edited by Chris Bethune
