tuning for NVIDIA CC 9.x GPUs
Summary
Both H100 and AD10x (consumer) GPUs need detailed testing, evaluation, and kernel tuning.
Use cases
Better performance.
Impact
Users of recent GPUs.
Detailed description
- disable texture use
- disable cj shmem prefetch
- test async mem operations
- test further unrolling to maximize icache use
- check PME analytical vs tabulated
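
For the last item, a small CPU-side reference can help sanity-check accuracy before timing anything on the GPU. The following is only a sketch under stated assumptions, not the actual nonbonded kernel code: it compares the textbook real-space Ewald force, evaluated analytically with `erfc`, against a linearly interpolated table. All names (`ewaldForceAnalytical`, `ForceTable`, `buildForceTable`, `ewaldForceTabulated`, `maxRelativeTableError`) are hypothetical and chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Analytical real-space Ewald force magnitude per unit charge product:
//   F(r) = erfc(beta*r)/r^2 + (2*beta/sqrt(pi)) * exp(-(beta*r)^2) / r
double ewaldForceAnalytical(double r, double beta)
{
    const double twoOverSqrtPi = 1.1283791670955126;
    const double br            = beta * r;
    return std::erfc(br) / (r * r) + twoOverSqrtPi * beta * std::exp(-br * br) / r;
}

// Hypothetical uniform-spacing table of F(r) on [rMin, rMax].
struct ForceTable
{
    double              rMin;
    double              spacing;
    std::vector<double> values;
};

ForceTable buildForceTable(double rMin, double rMax, int numPoints, double beta)
{
    assert(numPoints >= 2 && rMin > 0.0 && rMax > rMin);
    ForceTable table{ rMin, (rMax - rMin) / (numPoints - 1), std::vector<double>(numPoints) };
    for (int i = 0; i < numPoints; ++i)
    {
        table.values[i] = ewaldForceAnalytical(rMin + i * table.spacing, beta);
    }
    return table;
}

// Linear interpolation between the two table points that bracket r.
double ewaldForceTabulated(const ForceTable& table, double r)
{
    const double x    = (r - table.rMin) / table.spacing;
    const int    last = static_cast<int>(table.values.size()) - 2;
    const int    i    = std::min(std::max(static_cast<int>(x), 0), last);
    const double frac = x - i;
    return (1.0 - frac) * table.values[i] + frac * table.values[i + 1];
}

// Largest relative deviation of the table from the analytical form,
// sampled at cell-offset points so that interpolation error is visible.
double maxRelativeTableError(const ForceTable& table, double rMax, int numSamples, double beta)
{
    double maxErr = 0.0;
    for (int s = 0; s < numSamples; ++s)
    {
        const double r     = table.rMin + (rMax - table.rMin) * (s + 0.5) / numSamples;
        const double exact = ewaldForceAnalytical(r, beta);
        maxErr = std::max(maxErr, std::abs(ewaldForceTabulated(table, r) - exact) / exact);
    }
    return maxErr;
}
```

As a usage example with purely illustrative values, `buildForceTable(0.2, 1.0, 801, 3.12)` followed by `maxRelativeTableError(table, 1.0, 10000, 3.12)` reports how much accuracy a 0.001 nm table spacing costs relative to the analytical form. Whether the table lookup or the extra arithmetic is faster on a given GPU still has to be measured in the real kernels.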
Edited by Alan Gray