Add a mixin/signal to indicate that a produce method operates on the whole set of inputs and not on each input sample
For fitting, we have different methods which allow one to specify whether a primitive should fit on all inputs at once or on subsets (batches) of inputs and then continue. This is important because it is not always possible to have all samples in memory at once.
But we do not have anything like this for produce methods. The idea there is that in most cases it should not matter whether you call produce 100x on inputs with 1 sample, or 1x on inputs with 100 samples. This makes batching simple: you just decide how many input samples you want to use per call. This is important for some transformers which operate on raw data. (And whole pipelines should probably run on batches at some point.)
But not all produce methods operate like this. For example, for the clustering transformer and distance primitives, produce runs on all input data. So this seems to be the less common case, but it exists. For it, we should be able to signal to the caller that batching has to be special-cased.
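To make the distinction concrete, here is a minimal sketch in plain Python. It does not use the real primitive interfaces: `scale`, `pairwise_distances`, and `produce_in_batches` are hypothetical stand-ins invented for illustration. It shows that batching leaves a per-sample produce unchanged but changes the result of a produce that needs all inputs at once.

```python
from typing import Callable, List, Sequence


def scale(inputs: Sequence[float]) -> List[float]:
    # Hypothetical per-sample produce: each output depends on one sample only.
    return [2.0 * x for x in inputs]


def pairwise_distances(inputs: Sequence[float]) -> List[List[float]]:
    # Hypothetical whole-input produce: each output row depends on every sample.
    return [[abs(a - b) for b in inputs] for a in inputs]


def produce_in_batches(produce: Callable, inputs: Sequence[float], batch_size: int) -> list:
    # Naive batching: call produce on consecutive slices and concatenate outputs.
    outputs = []
    for start in range(0, len(inputs), batch_size):
        outputs.extend(produce(inputs[start:start + batch_size]))
    return outputs


samples = [1.0, 4.0, 6.0, 9.0]

# Per-sample produce: batched and unbatched calls agree.
scale_matches = produce_in_batches(scale, samples, 2) == scale(samples)

# Whole-input produce: batching only yields distances within each batch,
# so the result differs from the full distance matrix.
distance_matches = (
    produce_in_batches(pairwise_distances, samples, 2) == pairwise_distances(samples)
)
```

In this sketch `scale_matches` is true and `distance_matches` is false, which is the case the caller currently has no way to detect.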
We should probably use the same signal as we will use when resolving primitive-interfaces#46 (moved).
Then the following base produce methods should have this signal:
- ClusteringDistanceMatrixMixin's produce_distance_matrix
- PairwiseDistanceLearnerPrimitiveBase's produce
- PairwiseDistanceTransformerPrimitiveBase's produce
- clustering transformer's produce
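One possible shape for such a signal, sketched under heavy assumptions: a marker mixin that the caller checks before deciding whether to batch. The names `WholeInputsProduceMixin`, `ScalePrimitive`, `DistancePrimitive`, and `run_produce` are all hypothetical and are not part of the existing interfaces; the actual signal may well end up different, especially if it is shared with primitive-interfaces#46.

```python
from typing import List, Sequence


class WholeInputsProduceMixin:
    """Hypothetical marker: produce must see all inputs in a single call."""


class ScalePrimitive:
    # Hypothetical per-sample primitive, safe to call on batches.
    def produce(self, inputs: Sequence[float]) -> List[float]:
        return [2.0 * x for x in inputs]


class DistancePrimitive(WholeInputsProduceMixin):
    # Hypothetical distance primitive, needs the whole input set.
    def produce(self, inputs: Sequence[float]) -> List[List[float]]:
        return [[abs(a - b) for b in inputs] for a in inputs]


def run_produce(primitive, inputs: Sequence[float], batch_size: int) -> list:
    # The caller special-cases batching when the marker is present.
    if isinstance(primitive, WholeInputsProduceMixin):
        return primitive.produce(inputs)
    outputs = []
    for start in range(0, len(inputs), batch_size):
        outputs.extend(primitive.produce(inputs[start:start + batch_size]))
    return outputs
```

A class-level marker like this cannot distinguish between several produce methods on the same primitive (for example produce versus produce_distance_matrix), so a per-method signal may be the better fit.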