Add pload_partial, pstore_partial (and unaligned versions), pgather_partial, pscatter_partial, loadPacketPartial and storePacketPartial.

Add ploadN, pstoreN (and unaligned versions), pgatherN, pscatterN, loadPacketN and storePacketN.

Useful for:

  1. memory access - prevents reading or writing past the end of the data (only the needed elements are touched).
  2. performance - eliminates masking, uses one Packet instead of N scalars, and reduces the complexity of edge-condition functions/templates (better i-cache usage), etc.
  3. partial Packet operations - a single simplified Packet operation replaces the sequence: read scalars, merge into a Packet, operate, extract scalars, write scalars.
  4. consistent results - reduces variation between scalar and packet operations.
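
To illustrate points 1 and 3, here is a minimal standalone sketch of the idea, not the code in this merge request. `Packet4f`, `load_partial`, `store_partial`, `mul` and `scale_in_place` are hypothetical names invented for the example, and the packet is modeled as a plain 4-element array rather than a real SIMD register.

```cpp
#include <array>
#include <cstddef>

// Hypothetical 4-lane packet standing in for a SIMD register (an assumption,
// not the library's packet type).
struct Packet4f {
  std::array<float, 4> lane{};  // value-initialized to zero
};

constexpr std::size_t kLanes = 4;

// Partial load sketch: reads only the first n elements (clamped to kLanes)
// and leaves the remaining lanes zero, so memory past from[n-1] is never read.
inline Packet4f load_partial(const float* from, std::size_t n) {
  Packet4f p;
  for (std::size_t i = 0; i < n && i < kLanes; ++i) p.lane[i] = from[i];
  return p;
}

// Partial store sketch: writes only the first n lanes (clamped to kLanes),
// so memory past to[n-1] is never written.
inline void store_partial(float* to, const Packet4f& p, std::size_t n) {
  for (std::size_t i = 0; i < n && i < kLanes; ++i) to[i] = p.lane[i];
}

// Lane-wise multiply by a scalar.
inline Packet4f mul(const Packet4f& a, float s) {
  Packet4f r;
  for (std::size_t i = 0; i < kLanes; ++i) r.lane[i] = a.lane[i] * s;
  return r;
}

// Scales `size` floats in place. Full packets and the tail go through the
// same packet code path; only the element count differs for the tail.
inline void scale_in_place(float* data, std::size_t size, float s) {
  std::size_t i = 0;
  for (; i + kLanes <= size; i += kLanes) {
    store_partial(data + i, mul(load_partial(data + i, kLanes), s), kLanes);
  }
  const std::size_t tail = size - i;
  if (tail > 0) {
    store_partial(data + i, mul(load_partial(data + i, tail), s), tail);
  }
}
```

With a buffer of 6 elements, the loop handles the first 4 as a full packet and the last 2 as a partial packet, without a separate scalar cleanup loop and without touching element 7.
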
Edited by Chip Kerchner
