
NEON port of svt_estimate_noise_fp16_c

Description

This MR adds a NEON implementation of the function below, ported from the AVX2 version of the same algorithm:

  • svt_estimate_noise_fp16_c

The 8-bit and 10-bit test sequences can be downloaded from:

wget http://ultravideo.fi/video/Bosphorus_3840x2160_120fps_420_8bit_YUV_Y4M.7z
wget http://ultravideo.fi/video/Bosphorus_1920x1080_120fps_420_8bit_YUV_Y4M.7z
wget https://ultravideo.fi/video/Bosphorus_3840x2160_120fps_420_10bit_YUV_RAW.7z

for f in Bosphorus_*.7z; do 7za e "$f"; done

chmod 0444 *
rm Bosphorus_Copyright* Bosphorus_*.7z

8bit-4k:

SvtAv1EncApp -i Bosphorus_3840x2160.y4m --output Bosphorus_3840x2160.mkv --tile-columns 1 --tile-rows 1

8bit-1080:

SvtAv1EncApp -i Bosphorus_1920x1080.y4m --output Bosphorus_1920x1080.mkv

10bit-4k-bare:

SvtAv1EncApp -i Bosphorus_3840x2160_120fps_420_10bit_YUV.yuv --output Bosphorus_3840x2160_120fps_420_10bit_YUV.mkv --input-depth 10 --width 3840 --height 2160 --tile-columns 1 --tile-rows 1 --frames 200
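The results in the next section cover presets 4 to 13 at CRF 30. A minimal sketch of how such timings could be collected is shown here. `time_runs` is a hypothetical helper, not part of SVT-AV1, and it assumes that "e2e 3x encoding time" means the total wall-clock time of three back-to-back encodes, which the MR does not state. It also assumes GNU `date` (for `%N`) and `awk` are available.

```shell
# Hypothetical helper (not part of SVT-AV1): run a command N times back to
# back and print the total wall-clock time in seconds, to two decimals.
# Assumes GNU date (for %N) and awk.
time_runs() {
    runs=$1
    shift
    start=$(date +%s.%N)
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Stop early and report failure if any run fails.
        "$@" > /dev/null 2>&1 || return 1
        i=$((i + 1))
    done
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" 'BEGIN { printf "%.2f\n", e - s }'
}

# Example (assumes the encoder and the input file are in place):
# for preset in 4 6 8 10 12 13; do
#     time_runs 3 SvtAv1EncApp -i Bosphorus_1920x1080.y4m \
#         --output Bosphorus_1920x1080.mkv --preset "$preset" --crf 30
# done
```

Running the same loop once on the baseline build and once on the build with this MR would give one Before and one After value per preset.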

Performance improvement

Filename                                      Preset  CRF  e2e 3x encoding time  Improvement
                                                               After     Before
Bosphorus_3840x2160.y4m                            4   30     146.59     145.47       -0.76%
Bosphorus_3840x2160.y4m                            6   30      51.10      50.85       -0.49%
Bosphorus_3840x2160.y4m                            8   30      24.21      24.41        0.83%
Bosphorus_3840x2160.y4m                           10   30      11.20      11.25        0.45%
Bosphorus_3840x2160.y4m                           12   30       7.48       7.53        0.67%
Bosphorus_3840x2160.y4m                           13   30       7.50       7.58        1.07%
Bosphorus_1920x1080.y4m                            4   30      69.65      69.94        0.42%
Bosphorus_1920x1080.y4m                            6   30      25.71      25.62       -0.35%
Bosphorus_1920x1080.y4m                            8   30      12.97      12.99        0.15%
Bosphorus_1920x1080.y4m                           10   30       5.67       5.68        0.18%
Bosphorus_1920x1080.y4m                           12   30       2.87       2.87        0.00%
Bosphorus_1920x1080.y4m                           13   30       2.36       2.41        2.12%
Bosphorus_3840x2160_120fps_420_10bit_YUV.yuv       4   30     183.85     181.45       -1.31%
Bosphorus_3840x2160_120fps_420_10bit_YUV.yuv       6   30      45.41      45.88        1.04%
Bosphorus_3840x2160_120fps_420_10bit_YUV.yuv       8   30      22.37      22.76        1.74%
Bosphorus_3840x2160_120fps_420_10bit_YUV.yuv      10   30      10.53      10.65        1.14%
Bosphorus_3840x2160_120fps_420_10bit_YUV.yuv      12   30       7.35       7.36        0.14%
Bosphorus_3840x2160_120fps_420_10bit_YUV.yuv      13   30       7.36       7.37        0.14%
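
The Improvement column is consistent with (Before - After) / After x 100, so positive values mean the After run finished sooner. This formula is inferred from the reported numbers, not stated in the MR. A small sketch that reproduces it, where `improvement_pct` is a hypothetical helper name:

```shell
# Hypothetical helper: improvement_pct AFTER BEFORE
# Prints (BEFORE - AFTER) / AFTER * 100 with two decimals and a percent sign.
improvement_pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f%%\n", (b - a) / a * 100 }'
}

# First 4K 8-bit row of the table (preset 4):
improvement_pct 146.59 145.47   # prints -0.76%
```

Normalizing by the After time instead of the Before time shifts the result only in the second decimal for differences this small.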

Author(s)

@glpuga @rjcausarano

Performance impact

  • quality
  • memory
  • speed
  • 8 bit
  • 10 bit
  • N/A

Test set

  • obj-1-fast can be found here
  • other
  • N/A

Merge method

  • Allow the maintainer to squash and merge when PR is ready to create a 1-commit to the master branch. The maintainer will be able to fix typos / combine commit messages to create a more readable 1-commit message or use whatever is stated in the 'Description' section
  • I will clean up my commits and the maintainer shall use 'rebase and merge' to the master branch
Edited by Rodrigo Causarano
