WIP: Preprocessing extension API
This is an extension that lets TA3 systems specify a pre-processing pipeline that a TA2 system can execute on demand and prepend to the ML pipelines it creates. It uses the pipeline structure defined by the meta-learning working group, although a few sections that are not relevant in this context have not been implemented. The protobuf structures are defined to map to the meta-learning structure as closely as possible, and could be simplified in some cases.
The extension adds two new calls, CompilePipeline and SetPreprocessing. CompilePipeline takes a pipeline definition as an argument and passes it to TA2 for validation. TA2 validates the pipeline and returns a pipeline ID to TA3. The ID can be passed into the ExecutePipeline call, allowing a TA3 system to run a TA1 pipeline over input data. The SetPreprocessing call takes a pipeline ID as input and notifies TA2 that the associated pipeline is to be used as a pre-processing step when generating pipelines as part of a CreatePipelines call. This ensures that pipelines produced by TA2 include all necessary transformations when exported. Example usages are included under the example/go directory. While most performers are using Python, the intended use should still be apparent.
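The call order described above can be sketched with an in-memory stand-in for TA2. This is an illustration only, not the code under example/go: the FakeTA2 type, the Pipeline and PipelineStep structs, the method signatures, and the primitive names are all hypothetical, and the real calls are gRPC methods that take the protobuf messages defined by this extension. ExecutePipeline is omitted to keep the sketch short.

```go
package main

import (
	"errors"
	"fmt"
)

// PipelineStep and Pipeline are hypothetical stand-ins for the protobuf
// pipeline messages; the real definitions live in the extension's proto files.
type PipelineStep struct {
	Primitive string // name of a TA1 primitive (illustrative values only)
}

type Pipeline struct {
	Name  string
	Steps []PipelineStep
}

// FakeTA2 is an in-memory stand-in for a TA2 server. It only demonstrates
// the order of the calls and is not the real gRPC client or server.
type FakeTA2 struct {
	compiled      map[string]Pipeline // validated pipelines, keyed by ID
	preprocessing string              // ID registered via SetPreprocessing
	nextID        int
}

func NewFakeTA2() *FakeTA2 {
	return &FakeTA2{compiled: make(map[string]Pipeline)}
}

// CompilePipeline validates the definition and returns a pipeline ID.
// The only validation performed here is a non-empty step list.
func (t *FakeTA2) CompilePipeline(p Pipeline) (string, error) {
	if len(p.Steps) == 0 {
		return "", errors.New("pipeline has no steps")
	}
	t.nextID++
	id := fmt.Sprintf("pipeline-%d", t.nextID)
	t.compiled[id] = p
	return id, nil
}

// SetPreprocessing marks a previously compiled pipeline as the
// pre-processing step for later CreatePipelines calls.
func (t *FakeTA2) SetPreprocessing(id string) error {
	if _, ok := t.compiled[id]; !ok {
		return fmt.Errorf("unknown pipeline ID %q", id)
	}
	t.preprocessing = id
	return nil
}

// CreatePipelines returns the steps of one generated pipeline, with the
// registered pre-processing steps (if any) prepended.
func (t *FakeTA2) CreatePipelines() []PipelineStep {
	generated := []PipelineStep{{Primitive: "hypothetical.classifier"}}
	pre, ok := t.compiled[t.preprocessing]
	if !ok {
		return generated
	}
	steps := append([]PipelineStep{}, pre.Steps...)
	return append(steps, generated...)
}

func main() {
	ta2 := NewFakeTA2()

	// 1. TA3 submits a pre-processing pipeline for validation.
	id, err := ta2.CompilePipeline(Pipeline{
		Name: "preprocess",
		Steps: []PipelineStep{
			{Primitive: "hypothetical.impute"},
			{Primitive: "hypothetical.scale"},
		},
	})
	if err != nil {
		fmt.Println("compile failed:", err)
		return
	}

	// 2. TA3 registers the compiled pipeline as the pre-processing step.
	if err := ta2.SetPreprocessing(id); err != nil {
		fmt.Println("set preprocessing failed:", err)
		return
	}

	// 3. Pipelines generated afterwards start with the pre-processing steps.
	for _, s := range ta2.CreatePipelines() {
		fmt.Println(s.Primitive)
	}
}
```

Keeping the pre-processing pipeline behind an ID, rather than resending the full definition, means the same compiled pipeline can be reused by both ExecutePipeline and SetPreprocessing without being validated again.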
This work focuses on a pre-processing extension to the core API, but it could be reworked into a more general TA3-TA2 API, as discussed in #52 (closed). The protobuf definition of the pipeline represents the majority of the work, and it can be carried forward.