Some fixes/cleanups for numeric_limits & fix for related bug in psqrt

From majnemer@google.com:

Some fixes/cleanups for numeric_limits

BFloat16:

  • Set the highest payload bit instead of the lowest for signaling_NaN to match Half
  • Set has_denorm to denorm_present, since this type supports denormals. Without denormal support, the standard would instead require denorm_min to equal min.
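
To make the signaling_NaN change concrete, here is a minimal sketch of the bit patterns involved. It is not taken from the Eigen sources: the helper name `signaling_nan_bits` is hypothetical, and it assumes the common convention that the most significant mantissa bit is the quiet bit, so a signaling NaN leaves that bit clear and sets some payload bit below it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an Eigen API): bit pattern of a positive
// signaling NaN for a 16-bit binary format with 1 sign bit, `exp_bits`
// exponent bits, and (15 - exp_bits) mantissa bits.
// Assumption: the top mantissa bit is the quiet bit and stays clear;
// the highest payload bit (just below it) is set.
constexpr std::uint16_t signaling_nan_bits(int exp_bits) {
  const int mantissa_bits = 15 - exp_bits;
  const unsigned exponent_all_ones = ((1u << exp_bits) - 1u) << mantissa_bits;
  const unsigned highest_payload_bit = 1u << (mantissa_bits - 2);
  return static_cast<std::uint16_t>(exponent_all_ones | highest_payload_bit);
}

// bfloat16: 8 exponent bits, 7 mantissa bits.
static_assert(signaling_nan_bits(8) == 0x7FA0, "bfloat16 sNaN, highest payload bit");
// half: 5 exponent bits, 10 mantissa bits.
static_assert(signaling_nan_bits(5) == 0x7D00, "half sNaN, highest payload bit");
```

Under the same assumptions, setting the lowest payload bit instead would give 0x7F81 for bfloat16, which is also a signaling NaN but does not mirror the half pattern.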

Half:

  • epsilon was defined incorrectly
  • tinyness_before should be identical to the other C++ floating point types
  • is_bounded was defined as false instead of true; is_bounded == false is only appropriate for types with arbitrary precision
  • traps should be set to whatever float uses, which is likely false
  • is_iec559 should be set to true for both types (long double has this true and it has a much weirder encoding)
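
As a rough cross-check of the traits listed above, the sketch below derives epsilon from the number of significand digits and compares it against `std::numeric_limits<float>`. The helper `epsilon_from_digits` is hypothetical, the sketch assumes radix 2, and the digit counts used for half (11) and bfloat16 (8) are the usual format parameters rather than values read from Eigen.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: for a radix-2 type with `digits` significand digits
// (including the implicit leading bit), epsilon is the gap between 1 and the
// next representable value, i.e. 2^(1 - digits).
constexpr double epsilon_from_digits(int digits) {
  double eps = 1.0;
  for (int i = 1; i < digits; ++i) eps /= 2.0;
  return eps;
}

// float serves as the reference type for the traits discussed above.
static_assert(std::numeric_limits<float>::is_bounded, "float is a bounded type");
static_assert(epsilon_from_digits(std::numeric_limits<float>::digits) ==
                  std::numeric_limits<float>::epsilon(),
              "formula agrees with float (assumes radix 2)");

// Assumed format parameters: half has 11 digits, bfloat16 has 8.
constexpr double kHalfEpsilon = epsilon_from_digits(11);     // 2^-10
constexpr double kBFloat16Epsilon = epsilon_from_digits(8);  // 2^-7
```

The values of `traps` and `tinyness_before` are implementation-defined, so the sketch does not assert them; they can be read from `std::numeric_limits<float>` on the target platform.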

From rmlarsen@google.com:

Add a workaround to the AVX implementation of psqrt since _mm256_rsqrt_ps appears to flush negative denormal values to zero.
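
The following scalar sketch illustrates why flushing matters for sqrt; it is not the AVX code from this change. Both function names are hypothetical, and `sqrt_flushing_denormals` is only a model of the reported symptom: it treats every denormal input as a signed zero, so a negative denormal yields -0 where a conforming sqrt returns NaN.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical model of the symptom (not the real intrinsic): denormal
// inputs are replaced by a zero of the same sign before taking the root.
inline float sqrt_flushing_denormals(float x) {
  if (std::fabs(x) < std::numeric_limits<float>::min()) {
    x = std::copysign(0.0f, x);
  }
  return std::sqrt(x);
}

// Compares a candidate against std::sqrt, treating NaN as equal to NaN.
inline bool sqrt_matches_reference(float (*candidate)(float), float x) {
  const float expected = std::sqrt(x);
  const float actual = candidate(x);
  if (std::isnan(expected)) return std::isnan(actual);
  return expected == actual;
}
```

A check of this shape, run over negative denormals, normal negatives, zeros, and ordinary positive values, is one way to confirm that a vectorized psqrt agrees with the scalar result on the edge cases.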

Closes #2409 (closed)

Edited by Rasmus Munk Larsen
