Description
This merge request reverts commit 1d842ee6, which causes a speed regression on Windows.
After this revert, the macOS stack overflow issue cannot be reproduced in CI, so option 2 is not applied in this MR. Another MR is prepared here; its tests show similar performance.
If option 2 is required, we will close this MR and create another MR that includes option 2.
Issue
Closes Issue 1979
Author(s)
Performance impact
- quality
- memory
- speed
- 8 bit
- 10 bit
- N/A
Test Case:
AMD Ryzen 7 3700X 8-Core Processor (8C16T) @ 3.60 GHz + 32GB DRAM
Windows 10 Pro 21H1
SvtAv1EncApp.exe -i tf2_lossless_10bit.yuv -w 1920 -h 1080 --input-depth 10 --preset 7 --crf 14 --tune 0
Runs: 10
Commit d92858be is the regressed commit on master, used as the baseline.
Commit 91a3aa39 reverts only the patch from MR1947 that moved memory from the stack to the heap.
Test No. | master commit d92858be FPS | revert commit 91a3aa39 FPS |
---|---|---|
1 | 11.037 | 13.978 |
2 | 10.892 | 13.909 |
3 | 10.887 | 13.909 |
4 | 10.929 | 13.899 |
5 | 10.869 | 13.936 |
6 | 10.896 | 13.879 |
7 | 10.953 | 13.920 |
8 | 10.878 | 13.945 |
9 | 10.917 | 13.911 |
10 | 10.968 | 13.889 |
AVG. | 10.923 | 13.918 |
Deviation | 27.42% |
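For reviewers who want to check the summary rows, here is a minimal sketch that recomputes the averages and the deviation from the ten runs above. It assumes the deviation is the relative change of the revert average versus the master average; the variable and function names are illustrative only and are not part of the encoder or its tooling.

```python
from statistics import mean

# FPS values copied from the table above (10 runs each).
master_fps = [11.037, 10.892, 10.887, 10.929, 10.869,
              10.896, 10.953, 10.878, 10.917, 10.968]
revert_fps = [13.978, 13.909, 13.909, 13.899, 13.936,
              13.879, 13.920, 13.945, 13.911, 13.889]


def percent_deviation(baseline, candidate):
    """Relative change of the candidate mean versus the baseline mean, in percent.

    Assumption: this is how the 'Deviation' row was derived.
    """
    return (mean(candidate) / mean(baseline) - 1.0) * 100.0


avg_master = mean(master_fps)
avg_revert = mean(revert_fps)
deviation = percent_deviation(master_fps, revert_fps)

print(f"master d92858be avg: {avg_master:.3f} FPS")
print(f"revert 91a3aa39 avg: {avg_revert:.3f} FPS")
print(f"deviation: {deviation:.2f}%")
```

Under that assumption the result agrees with the 27.42% reported in the table.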
Test set
- obj-1-fast can be found here
- other
- N/A
Table 1 | 10-bit ref scale off revert version vs. master | PSNR | SSIM | VMAF | AVG PSNR/SSIM/VMAF | Cycles Dev | Max Value Memory Deviation | Abs Max Clip Memory Deviation | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ALL | svt_M5_91a3aa39_off_10bit | vs. | svt_M5_master_off_10bit | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 1.08% | -0.11% | 0.27% |
ALL | svt_M8_91a3aa39_off_10bit | vs. | svt_M8_master_off_10bit | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.97% | 0.17% | -1.32% |
ALL | svt_M12_91a3aa39_off_10bit | vs. | svt_M12_master_off_10bit | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.98% | 0.10% | 0.28% |
Table 2 | 10-bit ref scale random mode revert version vs. master | PSNR | SSIM | VMAF | AVG PSNR/SSIM/VMAF | Cycles Dev | Max Value Memory Deviation | Abs Max Clip Memory Deviation | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ALL | svt_M5_91a3aa39_resize_random_10bit | vs. | svt_M5_master_resize_random_10bit | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | -0.28% | 0.68% | 1.96% |
ALL | svt_M8_91a3aa39_resize_random_10bit | vs. | svt_M8_master_resize_random_10bit | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | -0.61% | -1.12% | 2.33% |
ALL | svt_M12_91a3aa39_resize_random_10bit | vs. | svt_M12_master_resize_random_10bit | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | -0.79% | 0.37% | -2.31% |
Tables 1 and 2 show no quality loss relative to master (0.00% deviation in PSNR, SSIM, and VMAF) in both the default case and the random resize mode case.
Merge method
- Allow the maintainer to squash and merge when the PR is ready, creating a single commit on the master branch. The maintainer will be able to fix typos / combine commit messages to create a more readable single commit message, or use whatever is stated in the 'Description' section
- I will clean up my commits and the maintainer shall use 'rebase and merge' to the master branch