Add subMappers to Power GEMM packing - simplifies the address calculations (10% faster)

Add subMappers to the Power GEMM packing routines, which simplifies the address calculations (10% faster). Add the missing getSubMapper and getLinearMapper for TensorContractionMapper, which fixes compilation issues. Also includes other minor improvements to complex packing.
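For readers unfamiliar with the idea, here is a minimal sketch of why a sub-mapper simplifies address calculations in a packing loop. It is not the code from this merge request: `SimpleMapper` and `packBlock` are hypothetical names, and the column-major layout is an assumption made for illustration. The point is only that the block offset is applied once when the sub-mapper is created, so the inner loop indexes from (0, 0).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical column-major mapper, for illustration only (not Eigen's class).
struct SimpleMapper {
    const float* data;
    std::size_t stride;  // distance in elements between consecutive columns

    const float& operator()(std::size_t i, std::size_t j) const {
        return data[i + j * stride];
    }

    // Returns a view whose origin is shifted to (i, j); the offset is paid once here.
    SimpleMapper getSubMapper(std::size_t i, std::size_t j) const {
        return SimpleMapper{data + i + j * stride, stride};
    }
};

// Copies a rows x cols block starting at (row0, col0) into a contiguous buffer.
std::vector<float> packBlock(const SimpleMapper& m, std::size_t row0, std::size_t col0,
                             std::size_t rows, std::size_t cols) {
    assert(row0 + rows <= m.stride);  // the block must fit inside one column
    const SimpleMapper sub = m.getSubMapper(row0, col0);
    std::vector<float> packed;
    packed.reserve(rows * cols);
    for (std::size_t j = 0; j < cols; ++j) {
        for (std::size_t i = 0; i < rows; ++i) {
            packed.push_back(sub(i, j));  // no row0/col0 arithmetic in the inner loop
        }
    }
    return packed;
}
```

Without the sub-mapper, every access in the inner loop would have to compute `m(row0 + i, col0 + j)`, repeating the same offset arithmetic for each element.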
