[x86] Zen processors now marked with CPUX86_HAS_FAST_BT_MEM flag
Summary
This merge request adds the CPUX86_HAS_FAST_BT_MEM feature flag for Zen processors (-CpZEN and -OpZEN), since the BT instruction reading a memory operand is known to be efficient on these processors.
System
- Processor architecture: i386, x86_64
What is the current bug behavior?
N/A
What is the behavior after applying this patch?
Future optimisations that check the CPUX86_HAS_FAST_BT_MEM flag will now be active on programs compiled for Zen processors.
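As a rough illustration of what "checking the flag" means, here is a minimal sketch in C of a per-CPU flag table and a gate that an optimisation pass might consult. Every name in it (cpu_type, FLAG_FAST_BT_MEM, cpu_flags, may_use_bt_mem) is hypothetical and only stands in for the compiler's real structures, which are not shown in this merge request.

```c
#include <stdbool.h>

/* Hypothetical model of a per-CPU feature-flag table. None of these
   names come from the compiler; they only illustrate the idea. */
enum cpu_type { CPU_GENERIC, CPU_ZEN, CPU_TYPE_COUNT };

enum cpu_flag {
    FLAG_FAST_BT_MEM = 1u << 0  /* stands in for CPUX86_HAS_FAST_BT_MEM */
};

/* One flag word per target CPU. Only the Zen entry carries the flag,
   mirroring what this merge request describes. */
static const unsigned cpu_flags[CPU_TYPE_COUNT] = {
    [CPU_GENERIC] = 0u,
    [CPU_ZEN]     = FLAG_FAST_BT_MEM
};

/* An optimisation would ask this before emitting BT with a memory
   operand; for targets without the flag it would skip the rewrite. */
static bool may_use_bt_mem(enum cpu_type target)
{
    return (cpu_flags[target] & FLAG_FAST_BT_MEM) != 0u;
}
```

With this table, may_use_bt_mem(CPU_ZEN) is true and may_use_bt_mem(CPU_GENERIC) is false, so a gated optimisation would only fire for the Zen target.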
Additional notes
According to Agner Fog's Instruction Tables (https://www.agner.org/optimize/instruction_tables.pdf), page 90, BT mem, imm instructions (Intel notation) run in a single clock cycle. Annoyingly, this slows down to 2 clock cycles on Zen3 and later.
It is admittedly difficult to interpret some of the tables because the format differs between processors. It does seem to be the case, though, that BTx mem, reg instructions (i.e. with a variable bit index), including plain BT, are very inefficient no matter what the processor is.
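To make the distinction between the two forms concrete, the sketch below shows the corresponding access patterns in C. It is an illustration only: the function names and the choice of bit 5 are arbitrary, and it makes no claim about which instructions a particular compiler emits. With a constant index, a single fixed word is read. With a variable index into a bit string, the word to read depends on the index, which resembles what BT mem, reg does implicitly and may be part of why that form is slow. The real instruction treats the register index as signed, which this sketch does not model.

```c
#include <stdint.h>

/* Constant bit index: exactly one fixed 32-bit word is read.
   This is the access pattern of BT mem, imm. */
static int test_bit_const(const uint32_t *word)
{
    return (int)((*word >> 5) & 1u);  /* bit 5, chosen arbitrarily */
}

/* Variable bit index into a bit string: the word that is read depends
   on the index. Only non-negative indices are modelled here. */
static int test_bit_var(const uint32_t *bits, uint32_t index)
{
    return (int)((bits[index / 32u] >> (index % 32u)) & 1u);
}
```

For example, with a two-word array whose second word has bit 3 set, test_bit_var(bits, 35) selects word 1, bit 3.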