GCC version 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)
##Operating System
Kernel OS Version: Darwin Kernel Version 16.7.0: Tue Jan 30 11:27:06 PST 2018; root:xnu-3789.73.11~1/RELEASE_X86_64
```
### Constant Shifts
```
test number   description   absolute time   operations per second   ratio with test0
0 "uint8_t constant right shift" 1.88 sec 25499.09 M 1.00
1 "uint8_t repeated constant right shift" 1.82 sec 26325.17 M 0.97
2 "uint8_t constant left shift" 1.83 sec 26186.88 M 0.97
3 "uint8_t repeated constant left shift" 1.87 sec 25679.06 M 0.99
4 "uint8_t identity" 1.06 sec 45319.23 M 0.56
5 "uint8_t right shift zero" 1.06 sec 45305.97 M 0.56
6 "uint8_t left shift zero" 1.05 sec 45829.73 M 0.56
7 "int8_t constant right shift" 2.63 sec 18224.42 M 1.40
8 "int8_t repeated constant right shift" 2.62 sec 18310.80 M 1.39
9 "int8_t constant left shift" 1.83 sec 26296.72 M 0.97
10 "int8_t repeated constant left shift" 1.87 sec 25720.25 M 0.99
11 "int8_t identity" 1.05 sec 45662.62 M 0.56
12 "int8_t right shift zero" 1.06 sec 45179.53 M 0.56
13 "int8_t left shift zero" 1.06 sec 45469.27 M 0.56
14 "uint16_t constant right shift" 3.13 sec 15358.68 M 1.66
15 "uint16_t repeated constant right shift" 3.13 sec 15315.37 M 1.66
16 "uint16_t constant left shift" 3.12 sec 15387.59 M 1.66
17 "uint16_t repeated constant left shift" 3.13 sec 15332.28 M 1.66
18 "uint16_t identity" 2.09 sec 22952.15 M 1.11
19 "uint16_t right shift zero" 2.09 sec 22948.65 M 1.11
20 "uint16_t left shift zero" 2.10 sec 22817.07 M 1.12
21 "int16_t constant right shift" 2.82 sec 17030.54 M 1.50
22 "int16_t repeated constant right shift" 2.80 sec 17160.90 M 1.49
23 "int16_t constant left shift" 3.14 sec 15304.76 M 1.67
24 "int16_t repeated constant left shift" 3.11 sec 15425.08 M 1.65
25 "int16_t identity" 2.10 sec 22823.30 M 1.12
26 "int16_t right shift zero" 2.11 sec 22729.25 M 1.12
27 "int16_t left shift zero" 2.11 sec 22770.46 M 1.12
28 "uint32_t constant right shift" 5.88 sec 8166.28 M 3.12
29 "uint32_t repeated constant right shift" 5.90 sec 8128.79 M 3.14
30 "uint32_t constant left shift" 5.88 sec 8164.42 M 3.12
31 "uint32_t repeated constant left shift" 5.88 sec 8166.45 M 3.12
32 "uint32_t identity" 4.37 sec 10972.18 M 2.32
33 "uint32_t right shift zero" 4.38 sec 10966.44 M 2.33
34 "uint32_t left shift zero" 4.39 sec 10926.02 M 2.33
35 "int32_t constant right shift" 5.93 sec 8095.01 M 3.15
36 "int32_t repeated constant right shift" 5.95 sec 8068.02 M 3.16
37 "int32_t constant left shift" 5.98 sec 8030.75 M 3.18
38 "int32_t repeated constant left shift" 5.96 sec 8053.61 M 3.17
39 "int32_t identity" 4.31 sec 11124.79 M 2.29
40 "int32_t right shift zero" 4.31 sec 11126.72 M 2.29
41 "int32_t left shift zero" 4.32 sec 11109.39 M 2.30
42 "uint64_t constant right shift" 16.60 sec 2890.73 M 8.82
43 "uint64_t repeated constant right shift" 16.60 sec 2892.34 M 8.82
44 "uint64_t constant left shift" 16.54 sec 2902.61 M 8.78
45 "uint64_t repeated constant left shift" 16.56 sec 2898.08 M 8.80
46 "uint64_t identity" 13.91 sec 3451.25 M 7.39
47 "uint64_t right shift zero" 13.92 sec 3449.14 M 7.39
48 "uint64_t left shift zero" 13.93 sec 3445.21 M 7.40
49 "int64_t constant right shift" 28.75 sec 1669.71 M 15.27
50 "int64_t repeated constant right shift" 28.70 sec 1672.75 M 15.24
51 "int64_t constant left shift" 16.61 sec 2889.52 M 8.82
52 "int64_t repeated constant left shift" 16.56 sec 2899.09 M 8.80
53 "int64_t identity" 13.96 sec 3438.47 M 7.42
54 "int64_t right shift zero" 14.05 sec 3416.72 M 7.46
55 "int64_t left shift zero" 14.18 sec 3384.46 M 7.53
```
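* Constant and repeated constant shifts take the same time within each type, which suggests the repeated shifts are being folded into one. The signed results are less even: `int8_t` right shift (2.63 sec) is slower than `uint8_t` (1.88 sec), `int16_t` left shift (3.14 sec) is slower than right (2.82 sec), `int32_t` is equal in both directions, and `int64_t` right shift (28.75 sec) is much slower than left (16.61 sec).
* Identity operations (plain identity and shifts by zero) match within each type, until we get to `int64_t`, where the shift-by-zero tests are slightly slower (14.05 and 14.18 sec versus 13.96 sec for identity).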
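
To make the four test categories above concrete, here is a minimal sketch of the kind of expression each one exercises. The function names and the shift count of 3 are assumptions for illustration; the benchmark's actual kernels are not shown in this report.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the four constant-shift categories.
template <typename T>
constexpr T constant_right_shift(T x) { return static_cast<T>(x >> 3); }

template <typename T>
constexpr T repeated_constant_right_shift(T x) {
    // Three shifts by 1; an optimizer can fold these into one shift by 3.
    return static_cast<T>(((x >> 1) >> 1) >> 1);
}

template <typename T>
constexpr T identity_op(T x) { return x; }

template <typename T>
constexpr T right_shift_zero(T x) { return static_cast<T>(x >> 0); }

// Verifies the expected equivalences for every uint8_t value.
inline void self_check() {
    for (int v = 0; v <= 255; ++v) {
        const auto x = static_cast<std::uint8_t>(v);
        assert(repeated_constant_right_shift(x) == constant_right_shift(x));
        assert(right_shift_zero(x) == identity_op(x));
    }
}
```

If a compiler recognizes both equivalences, the "repeated" tests should cost the same as the single shifts and the "shift zero" tests the same as identity, which is what most of the table shows.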
### Variable Shifts
```
test number   description   absolute time   operations per second   ratio with test0
0 "uint8_t variable right shift" 55.67 sec 862.22 M 1.00
1 "uint8_t repeated variable right shift" 49.30 sec 973.61 M 0.89
2 "uint8_t variable left shift" 10.84 sec 4428.76 M 0.19
3 "uint8_t repeated variable left shift" 49.49 sec 969.94 M 0.89
4 "int8_t variable right shift" 11.27 sec 4260.22 M 0.20
5 "int8_t repeated variable right shift" 50.57 sec 949.20 M 0.91
6 "int8_t variable left shift" 10.91 sec 4398.09 M 0.20
7 "int8_t repeated variable left shift" 49.70 sec 965.79 M 0.89
8 "uint16_t variable right shift" 52.05 sec 922.11 M 0.94
9 "uint16_t repeated variable right shift" 75.41 sec 636.48 M 1.35
10 "uint16_t variable left shift" 11.10 sec 4325.64 M 0.20
11 "uint16_t repeated variable left shift" 75.28 sec 637.61 M 1.35
12 "int16_t variable right shift" 10.96 sec 4380.58 M 0.20
13 "int16_t repeated variable right shift" 74.38 sec 645.29 M 1.34
14 "int16_t variable left shift" 10.96 sec 4381.56 M 0.20
15 "int16_t repeated variable left shift" 75.04 sec 639.69 M 1.35
16 "uint32_t variable right shift" 43.21 sec 1110.93 M 0.78
17 "uint32_t repeated variable right shift" 50.05 sec 959.14 M 0.90
18 "uint32_t variable left shift" 6.59 sec 7283.61 M 0.12
19 "uint32_t repeated variable left shift" 50.07 sec 958.73 M 0.90
20 "int32_t variable right shift" 20.79 sec 2308.36 M 0.37
21 "int32_t repeated variable right shift" 50.39 sec 952.63 M 0.91
22 "int32_t variable left shift" 6.55 sec 7326.63 M 0.12
23 "int32_t repeated variable left shift" 49.80 sec 963.82 M 0.89
24 "uint64_t variable right shift" 17.08 sec 2810.33 M 0.31
25 "uint64_t repeated variable right shift" 49.74 sec 965.04 M 0.89
26 "uint64_t variable left shift" 17.25 sec 2781.84 M 0.31
27 "uint64_t repeated variable left shift" 50.00 sec 960.01 M 0.90
28 "int64_t variable right shift" 25.67 sec 1869.81 M 0.46
29 "int64_t repeated variable right shift" 50.87 sec 943.55 M 0.91
30 "int64_t variable left shift" 17.53 sec 2738.72 M 0.31
31 "int64_t repeated variable left shift" 50.07 sec 958.59 M 0.90
```
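* We start off with a problem: `uint8_t` variable right shift (55.67 sec) is about five times slower than the other single 8 bit variable shifts (about 11 sec). The same problem shows up for unsigned right shift at 16 bit (52.05 sec) and 32 bit (43.21 sec), but not at 64 bit, where the signed right shift is the slow one instead (25.67 sec versus about 17 sec).
* Repeated variable shifts are not being optimized down to a single shift by LLVM: they take about 50 sec (about 75 sec for 16 bit) in every case.
* For some reason the single variable 32 bit shifts are faster than their 8 or 16 bit counterparts (left shifts take about 6.6 sec versus about 11 sec), except for signed variable right shift (20.79 sec versus about 11 sec). Something funny is going on in the optimizer here.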
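
One detail worth keeping in mind when reading the repeated variable shift results: in C++, shifting by a count greater than or equal to the bit width is undefined, so `(x >> a) >> b` can only be rewritten as `x >> (a + b)` when `a + b` is known to be below the width. The sketch below illustrates this for `uint32_t`. The function names are hypothetical, and whether this is what actually blocks the optimization in the benchmark is an assumption, not something this report confirms.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical kernel: two variable shifts in a row; each count must be < 32.
constexpr std::uint32_t repeated_variable_right_shift(std::uint32_t x,
                                                      unsigned a, unsigned b) {
    return (x >> a) >> b;
}

// Folded form. A single shift is only defined while a + b < 32, so the
// out-of-range case has to be handled explicitly (the result is 0 there).
constexpr std::uint32_t folded_right_shift(std::uint32_t x,
                                           unsigned a, unsigned b) {
    return (a + b) < 32 ? (x >> (a + b)) : 0u;
}

// Compares both forms for every pair of valid shift counts.
inline void self_check(std::uint32_t x) {
    for (unsigned a = 0; a < 32; ++a)
        for (unsigned b = 0; b < 32; ++b)
            assert(repeated_variable_right_shift(x, a, b) ==
                   folded_right_shift(x, a, b));
}
```

Calling `self_check(0xFFFFFFFFu)` exercises every count pair, including those where the combined count reaches 32 or more and the plain single shift would no longer be valid.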
### Constant Mask Low
```
test number   description   absolute time   operations per second   ratio with test0
0 "uint8_t constant mask low" 1.47 sec 32660.17 M 1.00
1 "uint8_t constant mask low by shift" 1.65 sec 29169.19 M 1.12
2 "int8_t constant mask low" 1.47 sec 32690.29 M 1.00
3 "int8_t constant mask low by shift" 1.46 sec 32908.49 M 0.99
4 "uint16_t constant mask low" 3.00 sec 15986.67 M 2.04
5 "uint16_t constant mask low by shift" 2.96 sec 16210.45 M 2.01
6 "int16_t constant mask low" 3.03 sec 15853.79 M 2.06
7 "int16_t constant mask low by shift" 2.98 sec 16094.92 M 2.03
8 "uint32_t constant mask low" 6.16 sec 7798.49 M 4.19
9 "uint32_t constant mask low by shift" 6.12 sec 7839.48 M 4.17
10 "int32_t constant mask low" 6.29 sec 7625.15 M 4.28
11 "int32_t constant mask low by shift" 6.35 sec 7562.09 M 4.32
12 "uint64_t constant mask low" 17.31 sec 2773.06 M 11.78
13 "uint64_t constant mask low by shift" 17.31 sec 2772.56 M 11.78
14 "int64_t constant mask low" 17.34 sec 2768.74 M 11.80
15 "int64_t constant mask low by shift" 17.32 sec 2771.50 M 11.78
```
* There are some speed glitches (`uint8_t` mask low by shift is about 12% slower than the plain mask), but overall the constant mask low operations appear to be optimized correctly.
### Variable Mask Low
```
test number   description   absolute time   operations per second   ratio with test0
0 "uint8_t variable mask low" 1.47 sec 32681.39 M 1.00
1 "uint8_t variable mask low by shift" 82.66 sec 580.67 M 56.28
2 "int8_t variable mask low" 1.53 sec 31308.97 M 1.04
3 "int8_t variable mask low by shift" 25.82 sec 1858.88 M 17.58
4 "uint16_t variable mask low" 3.07 sec 15652.88 M 2.09
5 "uint16_t variable mask low by shift" 76.46 sec 627.79 M 52.06
6 "int16_t variable mask low" 2.93 sec 16381.21 M 2.00
7 "int16_t variable mask low by shift" 24.58 sec 1952.78 M 16.74
8 "uint32_t variable mask low" 5.88 sec 8167.71 M 4.00
9 "uint32_t variable mask low by shift" 25.26 sec 1900.58 M 17.20
10 "int32_t variable mask low" 5.99 sec 8007.81 M 4.08
11 "int32_t variable mask low by shift" 25.31 sec 1896.78 M 17.23
12 "uint64_t variable mask low" 16.49 sec 2910.83 M 11.23
13 "uint64_t variable mask low by shift" 26.94 sec 1781.46 M 18.35
14 "int64_t variable mask low" 16.45 sec 2918.53 M 11.20
15 "int64_t variable mask low by shift" 26.89 sec 1784.96 M 18.31
```
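* Wow, the variable mask low operations are NOT being optimized correctly! The "by shift" versions are slower than the plain masks in every type, from about 1.6 times slower for 64 bit to about 56 times slower for `uint8_t`.
* How the heck are the unsigned mask operations slower? `uint8_t` and `uint16_t` by shift take 76 to 83 sec, roughly three times as long as the signed versions (about 25 sec).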
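
For reference, here is a sketch of the two forms being compared, under the assumption that "mask low" means clearing the lowest `n` bits and that "by shift" means doing so with a right shift followed by a left shift. The benchmark source is not shown here, so both the names and the exact expressions are illustrative guesses. The sketch is restricted to unsigned types, where both forms are fully defined.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumed "mask low": clear the lowest n bits with an AND mask (n < bit width).
template <typename T>
constexpr T mask_low(T x, unsigned n) {
    return static_cast<T>(x & (std::numeric_limits<T>::max() << n));
}

// Assumed "mask low by shift": shift the low bits out, then shift back.
template <typename T>
constexpr T mask_low_by_shift(T x, unsigned n) {
    return static_cast<T>((x >> n) << n);
}

// Checks that both forms agree for every uint8_t value and shift count.
inline void self_check() {
    for (unsigned n = 0; n < 8; ++n)
        for (unsigned v = 0; v <= 255; ++v) {
            const auto x = static_cast<std::uint8_t>(v);
            assert(mask_low(x, n) == mask_low_by_shift(x, n));
        }
}
```

Since the two forms compute the same value, an optimizer that recognizes the shift pair could emit the same code for both. The timings above suggest that this happens when the count is a constant but not when it is a variable.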