[x86] New "Fast LEA" hint and new "ICELAKE" processor option

Summary

To facilitate more accurate optimisation involving LEA instructions, a new optimisation hint and new processor options have been introduced to the compiler, along with a new benchmark test.

  • CPUX86_HINT_FAST_3COMP_ADDR - this indicates that complex LEA instructions (commonly those that have a base, index and offset) are at least as fast as two ADD instructions in a dependency chain. On many Intel processors from the 2000s and 2010s, a complex LEA has a latency of at least 3 cycles, so the hint is not set for them.
  • New processor options ICELAKE, ICELAKE-CLIENT (synonym for ICELAKE for GCC compatibility) and ICELAKE-SERVER for Ice Lake architectures.
  • New benchmark test at tests/bench/blea.pp where LEA and ADD timings can be tested on a custom-made checksum routine.
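
To make the comparison behind CPUX86_HINT_FAST_3COMP_ADDR concrete, here is a minimal C sketch of one checksum step written both ways. It is not the routine in tests/bench/blea.pp; the function names and the checksum itself are assumptions chosen purely for illustration. The first form is a single three-component expression that an x86 compiler may encode as one LEA, and the second splits it into two dependent additions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical checksum: each step adds the next byte plus a constant.
   Written as one three-component expression (sum + byte + 1), which an
   x86 code generator may encode as a single LEA with base, index and
   offset, e.g. lea eax,[eax+ecx+1]. */
uint32_t checksum_3comp(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = sum + data[i] + 1u;
    return sum;
}

/* The same arithmetic as two additions in a dependency chain:
   the second ADD cannot start until the first has produced its result. */
uint32_t checksum_two_adds(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += data[i]; /* first ADD */
        sum += 1u;      /* second ADD, depends on the first */
    }
    return sum;
}
```

Both functions return the same value for any input. A C compiler is free to choose either encoding for either function, so this sketch shows only the arithmetic equivalence; measuring the real latency difference needs hand-written assembly, as a dedicated benchmark would use.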

System

  • Processor architecture: i386, x86_64

What is the current bug behavior?

N/A

What is the behavior after applying this patch?

The flag is not currently used, but its use is planned for !134 and !501 (merged).

Additional notes

Based on user tests and Agner Fog's reports, the flag is assigned to the available processors as follows:

 Processor    Fast (32/64)
-----------   ------------
   80386           *
   80486           *
  PENTIUM          *
  PENTIUM2         *
  PENTIUM3         *
  PENTIUM4
  PENTIUMM         *
   ATHLON          *
   COREI           *
   BOBCAT          *
  COREAVX
   JAGUAR          *
 PILEDRIVER        *
 EXCAVATOR         *
  COREAVX2
    ZEN            *
    ZEN2           *
  ICELAKE          *
ICELAKE-CLIENT     *
ICELAKE-SERVER     *
    ZEN3           *

The Pentium 4 was a bit of an odd case; early versions had good 16-bit speed but not 32-bit speed, while Prescott Pentium 4s had poor latency all round. All Intel processors starting from Sandy Bridge (COREAVX) had poor LEA latency but this was finally addressed in Ice Lake (ICELAKE). AMD processors generally had good 32/64-bit performance through all iterations but at the cost of 16-bit performance.
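
As an illustration of how the table above could be consulted, the following C sketch stores the assignments in a lookup table. The struct, the table and the function name are hypothetical and do not reflect the compiler's actual data structures; only the processor names and their markings are taken from the table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry: a processor option name and whether the
   fast three-component address hint is marked for it. */
struct cpu_entry {
    const char *name;
    bool fast_3comp_addr;
};

/* Values copied from the table in this description. */
static const struct cpu_entry cpu_table[] = {
    {"80386", true},          {"80486", true},
    {"PENTIUM", true},        {"PENTIUM2", true},
    {"PENTIUM3", true},       {"PENTIUM4", false},
    {"PENTIUMM", true},       {"ATHLON", true},
    {"COREI", true},          {"BOBCAT", true},
    {"COREAVX", false},       {"JAGUAR", true},
    {"PILEDRIVER", true},     {"EXCAVATOR", true},
    {"COREAVX2", false},      {"ZEN", true},
    {"ZEN2", true},           {"ICELAKE", true},
    {"ICELAKE-CLIENT", true}, {"ICELAKE-SERVER", true},
    {"ZEN3", true},
};

/* Returns true if the named processor is marked as having a fast
   three-component LEA. Unknown names return false, the conservative
   choice, since it avoids emitting a potentially slow LEA. */
bool has_fast_3comp_addr(const char *name)
{
    for (size_t i = 0; i < sizeof cpu_table / sizeof cpu_table[0]; i++) {
        if (strcmp(cpu_table[i].name, name) == 0)
            return cpu_table[i].fast_3comp_addr;
    }
    return false;
}
```

An optimisation pass could then prefer a single complex LEA when has_fast_3comp_addr returns true for the target and fall back to separate ADD instructions otherwise, although any such pass would also have to account for ADD modifying the flags register while LEA does not.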

Edited by J. Gareth "Kit" Moreton
