Add messages and rpc actions to enable PrependPrimitive capabilities

Fred requested to merge add-prepend-primitive-capabilities into devel

Objectives

  1. Address TA3 needs by allowing human-specified preprocessing steps to be encoded in produced pipelines, so that new data evaluated by those pipelines undergoes the same transformations.

  2. Investigate the problem as quickly as possible. Given that supporting unordered datasets is a very high priority before the next evaluation, a complete rewrite of the TA2-TA3 API to support this feature would create more problems than it might solve. Modifications to the existing API should be sufficient to begin supporting and investigating this TA3 need.

Changes

Add a message type, PrependPrimitive, which describes a primitive's id and its inputs and outputs. It will also contain optional fields, such as name and description, which will be populated by the primitive's own metadata.

Add an RPC call, ListPrependPrimitives, which enumerates the primitives a TA2 has available for prepending in PipelineCreateRequest calls.
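As a rough illustration of the listing behaviour, the Python sketch below builds one entry per available primitive and fills the optional name and description from that primitive's metadata when present. The PrependPrimitiveInfo class, the list_prepend_primitives helper, and the metadata dictionary layout are all hypothetical stand-ins, not part of the proposed API or of any TA2 implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PrependPrimitiveInfo:
    """Hypothetical stand-in for the PrependPrimitive message."""
    id: str
    name: Optional[str] = None
    description: Optional[str] = None


def list_prepend_primitives(
    metadata: Dict[str, Dict[str, str]]
) -> List[PrependPrimitiveInfo]:
    """Build one entry per primitive id, sorted by id.

    `metadata` maps a primitive id to its own metadata (assumed layout).
    Missing "name" or "description" keys leave the optional field unset.
    """
    return [
        PrependPrimitiveInfo(
            id=pid,
            name=meta.get("name"),
            description=meta.get("description"),
        )
        for pid, meta in sorted(metadata.items())
    ]
```

A TA3 client could use such a listing to decide which ids to place in a PipelineCreateRequest.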

Add a field, prepend_primitives, to PipelineCreateRequest, which contains an ordered list of primitive instructions to be prepended to any pipeline created by the call, as well as an optional bool flag, require_prepend_primitives, which indicates whether all returned pipelines MUST contain the specified prepended primitives.
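One possible reading of the flag is sketched below in Python. It assumes a pipeline can be modelled as a list of step ids and that, when the flag is unset, a TA2 is free to return pipelines both with and without the prepended steps. That interpretation, the simplified PrependPrimitive class, and the candidate_pipelines helper are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PrependPrimitive:
    """Hypothetical, reduced stand-in for the PrependPrimitive message."""
    id: str
    args: List[str] = field(default_factory=list)


def candidate_pipelines(
    searched: List[List[str]],
    prepend: List[PrependPrimitive],
    require: bool,
) -> List[List[str]]:
    """Combine searched pipelines with the requested prepended steps.

    The prepended ids keep the order given in the request.
    """
    prefix = [p.id for p in prepend]
    with_prefix = [prefix + steps for steps in searched]
    if require or not prefix:
        # Flag set (or nothing to prepend): only prefixed pipelines.
        return with_prefix
    # Flag unset: also offer the pipelines without the prepended steps.
    return with_prefix + [list(steps) for steps in searched]
```

With the flag set, every returned pipeline starts with the requested steps in the requested order. With it unset, the same search result yields both variants.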

The PipelineCreateRequest response will be unchanged from the current PipelineCreateResult. No explicit validation will occur until pipeline searching begins and PipelineCreateResult messages arrive.

Add an ExecutePrependPrimitivesRequest so that TA3 may view the output of primitives without requesting a full pipeline call, to aid data investigation, as well as a streaming response, ExecutePrependPrimitiveResponse, to indicate status.
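The streaming behaviour might look roughly like the following Python generator, which applies the primitives in order and yields one status update per step. The registry of callables, the in-memory rows, and the status strings "RUNNING", "ERRORED", and "COMPLETED" are placeholders chosen for this sketch. They are not taken from the Progress enum or from any existing TA2 implementation, and a real service would write its result to result_uri rather than return it.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Rows = List[object]


def run_prepend_primitives(
    rows: Rows,
    primitive_ids: List[str],
    registry: Dict[str, Callable[[Rows], Rows]],
) -> Iterator[Tuple[str, object]]:
    """Apply primitives in order, yielding (status, payload) updates."""
    for pid in primitive_ids:
        if pid not in registry:
            # Unknown primitive: report the failing id and stop the stream.
            yield ("ERRORED", pid)
            return
        rows = registry[pid](rows)
        yield ("RUNNING", pid)
    # Final message carries the transformed data.
    yield ("COMPLETED", rows)
```

Because the function is a generator, a caller sees each update as soon as the corresponding primitive finishes, which mirrors the intent of a server-streaming RPC.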

...

message PrependPrimitive {
	string id = 1;
	repeated Feature input_features = 2;
	repeated Feature output_features = 3;
	repeated string args = 4; // repeated fields may be empty; 'optional repeated' is not valid protobuf
	optional string name = 5;
	optional string description = 6;
}

message PipelineCreateRequest {
	SessionContext context = 1;
	string dataset_uri = 2; 
	TaskType task = 3;
	TaskSubtype task_subtype = 4;
	string task_description = 5;
	OutputType output = 6;
	repeated PerformanceMetric metrics = 7;
	repeated Feature target_features = 8;
	repeated Feature predict_features = 9;
	int32 max_pipelines = 10;
	repeated PrependPrimitive prepend_primitives = 11; // ordered list of PrependPrimitive actions to apply
	optional bool require_prepend_primitives = 12; // indication if all returned pipelines MUST apply specified PrependPrimitives
}

message PrependPrimitivesRequest {}

message PrependPrimitivesList {
	repeated PrependPrimitive prepend_primitives = 1;
}

message ExecutePrependPrimitivesRequest {
	SessionContext context = 1;
	string dataset_uri = 2;
	string result_uri = 3; // should write full 'datasetDoc.json' style output to specified directory
	repeated PrependPrimitive prepend_primitives = 4;
}

message ExecutePrependPrimitiveResponse {
	Response response_info = 1;
	Progress progress_info = 2;
	string action_id = 3;
}

service Core {
	...
	rpc ListPrependPrimitives(PrependPrimitivesRequest) returns (PrependPrimitivesList) {}
	
	rpc RunPrependPrimitive(ExecutePrependPrimitivesRequest) returns (stream ExecutePrependPrimitiveResponse) {}
	...
}
