CONFIG_OPTFLOW_REFINEMENT: int32 overflow causes different behaviour in debug and release with some compilers

How was the issue detected?

What version / commit were you testing with?

What steps will reproduce the problem?

We are making verification streams for CONFIG_OPTFLOW_REFINEMENT, but have found a problem in the reference code.

We have a test stream that decodes differently on debug and release builds.

The issue is within av1_opfl_mv_refinement_nxn_interp_grad. This has an SSE4.1-optimized version and a C reference implementation.

The problem is related to the following lines in av1_opfl_mv_refinement_interp_grad:

  // Clamp su2, sv2, suv, suw, and svw to avoid overflow in det, det_x, and
  // det_y
  su2 = (int64_t)clamp((int)su2, -OPFL_COV_CLAMP_VAL, OPFL_COV_CLAMP_VAL);
  sv2 = (int64_t)clamp((int)sv2, -OPFL_COV_CLAMP_VAL, OPFL_COV_CLAMP_VAL);
  suv = (int64_t)clamp((int)suv, -OPFL_COV_CLAMP_VAL, OPFL_COV_CLAMP_VAL);
  suw = (int64_t)clamp((int)suw, -OPFL_COV_CLAMP_VAL, OPFL_COV_CLAMP_VAL);
  svw = (int64_t)clamp((int)svw, -OPFL_COV_CLAMP_VAL, OPFL_COV_CLAMP_VAL);

These variables are declared as int64_t, but the (int) cast narrows each one to 32 bits before the clamp is applied. Any value outside the range of int is therefore altered before it can be clamped.
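To make the narrowing concrete, here is a minimal sketch of the cast-then-clamp pattern. The names demo_clamp, clamp_after_int_cast and DEMO_CLAMP_VAL are hypothetical, the bound is a placeholder rather than the real OPFL_COV_CLAMP_VAL, and demo_clamp is only assumed to behave like the clamp() used above.

```c
#include <stdint.h>

/* Placeholder bound for illustration only; not the codec's real value. */
#define DEMO_CLAMP_VAL (1 << 28)

/* Assumed to have the same shape as the clamp() used in the snippet above. */
static int demo_clamp(int value, int low, int high) {
  return value < low ? low : (value > high ? high : value);
}

/* Mirrors the reported pattern: narrow to int first, then clamp. */
static int64_t clamp_after_int_cast(int64_t v) {
  /* The (int) conversion is implementation-defined when v does not fit in
   * int, so the value reaching demo_clamp may no longer be related to v. */
  return (int64_t)demo_clamp((int)v, -DEMO_CLAMP_VAL, DEMO_CLAMP_VAL);
}
```

For inputs that fit in int the result is what one would expect. For an input such as 2^32 + 5, the mathematically clamped result would be the upper bound, but an implementation that keeps only the low 32 bits passes 5 to the clamp instead.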

The corresponding SSE4.1 code is in calc_mv and calc_mv_process.

The problem is that converting a value that does not fit in the destination type is implementation-defined behaviour (unless options like -fwrapv are used), so the compiler is free to choose different implementations. With recent versions of Clang (e.g. 15), this means that the release build behaves differently from the debug build.

Fix A

Make the clamping work with 64-bit inputs.
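A minimal sketch of what Fix A could look like, assuming a 64-bit clamp helper is acceptable in the reference code. The names sketch_clamp64, sketch_clamp_cov and SKETCH_COV_CLAMP_VAL are hypothetical, and the bound is a placeholder rather than the real OPFL_COV_CLAMP_VAL.

```c
#include <stdint.h>

/* Placeholder bound; the real OPFL_COV_CLAMP_VAL comes from the codec headers. */
#define SKETCH_COV_CLAMP_VAL ((int64_t)1 << 28)

/* Hypothetical 64-bit clamp: compares at full width, so nothing is narrowed. */
static int64_t sketch_clamp64(int64_t value, int64_t low, int64_t high) {
  return value < low ? low : (value > high ? high : value);
}

/* Clamps one covariance sum (su2, sv2, suv, suw or svw) without a cast to int. */
static int64_t sketch_clamp_cov(int64_t sum) {
  return sketch_clamp64(sum, -SKETCH_COV_CLAMP_VAL, SKETCH_COV_CLAMP_VAL);
}
```

With this shape, a line such as su2 = sketch_clamp_cov(su2); would saturate large sums instead of wrapping them. That changes the decoded result for streams that currently rely on wraparound, and the SSE4.1 path would presumably need the same change to stay bit-exact.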

Fix B

Use 32-bit wraparound, but change the code to avoid undefined or implementation-defined behaviour.
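One possible way to express Fix B is to do the wraparound in unsigned arithmetic, where reduction modulo 2^32 is fully defined by the C standard, and then map the result back to a signed value using only in-range operations. The helper name sketch_wrap_to_int32 is hypothetical, and the sketch assumes that 32-bit two's complement wraparound is the intended behaviour.

```c
#include <stdint.h>

/* Reduces a 64-bit value to the int32_t that shares its low 32 bits,
 * without relying on out-of-range signed conversions. */
static int32_t sketch_wrap_to_int32(int64_t v) {
  /* Signed-to-unsigned conversion is defined as reduction modulo 2^N. */
  uint32_t u = (uint32_t)(uint64_t)v;
  if (u <= (uint32_t)INT32_MAX) return (int32_t)u; /* value fits as-is */
  /* u is in [2^31, 2^32 - 1]; the wrapped value is u - 2^32. Compute it as
   * (u - 2^31) + INT32_MIN so that every intermediate value is in range. */
  return (int32_t)(u - 0x80000000u) + INT32_MIN;
}
```

The existing clamp could then take sketch_wrap_to_int32(su2) in place of (int)su2, which should give the same result in debug and release builds on any conforming compiler.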

Fix C

Use 32-bit wraparound, and require the use of options like -fwrapv that specify the wraparound behaviour.