Improve generic CompareByte.
The generic `CompareByte` can be improved in a way similar to !360. (Tricks like `PtrUint(ptr) := PtrUint(ptr) div 4 * 4` aren't required for this, but they give slightly better code as well, and I hope they are valid on every platform.)
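For readers who have not looked at the patch, here is a minimal sketch of the general idea: compare a machine word at a time and fall back to bytes only at the edges. This is not the code from `CompareByte.patch`. The name `CompareBytesWordwise` and the overall structure are hypothetical, and the sketch assumes `len >= 0`. It rounds the end pointer down to a word boundary with a mask, which has the same effect as the `div 4 * 4` trick mentioned above.

```pascal
{$mode objfpc}
program WordwiseCompareDemo;

const
  W = SizeOf(PtrUInt); // machine word size in bytes

// Hypothetical illustration, not the patch itself. Assumes len >= 0.
// Returns 0 if equal, otherwise the difference of the first mismatching bytes.
function CompareBytesWordwise(const buf1, buf2; len: SizeInt): SizeInt;
var
  p1, p2, pEnd, pAlignedEnd: PByte;
begin
  p1 := @buf1;
  p2 := @buf2;
  pEnd := p1 + len;
  if len >= 2 * W then
  begin
    // Bytewise until p1 sits on a word boundary (at most W - 1 steps).
    while (PtrUInt(p1) and (W - 1)) <> 0 do
    begin
      if p1^ <> p2^ then
        Exit(SizeInt(p1^) - SizeInt(p2^));
      Inc(p1);
      Inc(p2);
    end;
    // Round the end of buf1 down to a word boundary.
    pAlignedEnd := pEnd - (PtrUInt(pEnd) and (W - 1));
    // Word-at-a-time loop; p2 may be misaligned, hence unaligned().
    while p1 < pAlignedEnd do
    begin
      if PPtrUInt(p1)^ <> unaligned(PPtrUInt(p2)^) then
        Break; // the bytewise loop below locates the exact byte
      Inc(p1, W);
      Inc(p2, W);
    end;
  end;
  // Tail bytes, or the word in which a mismatch was detected.
  while p1 < pEnd do
  begin
    if p1^ <> p2^ then
      Exit(SizeInt(p1^) - SizeInt(p2^));
    Inc(p1);
    Inc(p2);
  end;
  Result := 0;
end;

var
  a, b: array[0 .. 99] of Byte;
  i: SizeInt;
begin
  for i := 0 to High(a) do
  begin
    a[i] := Byte(i);
    b[i] := Byte(i);
  end;
  WriteLn('equal buffers:    ', CompareBytesWordwise(a, b, Length(a)));
  b[99] := 200;
  WriteLn('last byte differs: ', CompareBytesWordwise(a, b, Length(a)));
end.
```

Breaking out of the word loop and letting the bytewise loop find the exact byte keeps the sketch independent of endianness, at the cost of up to `W` extra byte comparisons on a mismatch.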
Patch: [CompareByte.patch](/uploads/b7987627caf8b794ae56d98d683c4d28/CompareByte.patch).
With the patch, the platform-specific implementations for `i386` and `x86_64` become worse than the generic one for me: `i386` uses a bytewise loop and **three** `REP CMPS` instructions (one would already be bad enough), and `x86_64` uses a bytewise loop exclusively. So, unless someone comes up with an SSE version, I propose to also remove [both](https://gitlab.com/freepascal.org/fpc/source/-/blob/55deefbab5a5f3f203587cfdb1f065251d3321f4/rtl/i386/i386.inc#L467) of [them](https://gitlab.com/freepascal.org/fpc/source/-/blob/55deefbab5a5f3f203587cfdb1f065251d3321f4/rtl/x86_64/x86_64.inc#L634).
Benchmark: [CompareByte.pas](/uploads/cccda7cef3d1427d60e5476988c32843/CompareByte.pas).
My results are below. Note the `(!)` case, where the second byte already differs, but the `i386` version sees `len > 57` and still issues three `REP CMPS` instructions.
```
x86-64/win64                            i386/win32
CompareByteGeneric: 288 b               CompareByteGeneric: 304 b
CompareByteGenericV2: 208 b             CompareByteGenericV2: 176 b

Different byte #0 of 1                  Different byte #0 of 1
System.CompareByte: 1.8 ns/call         System.CompareByte: 2.2 ns/call
CompareByteGeneric: 3.0 ns/call         CompareByteGeneric: 2.9 ns/call
CompareByteGenericV2: 1.9 ns/call       CompareByteGenericV2: 3.0 ns/call

Different byte #7 of 8                  Different byte #7 of 8
System.CompareByte: 4.9 ns/call         System.CompareByte: 5.3 ns/call
CompareByteGeneric: 11 ns/call          CompareByteGeneric: 7.8 ns/call
CompareByteGenericV2: 4.4 ns/call       CompareByteGenericV2: 6.3 ns/call

Different byte #15 of 16                Different byte #15 of 16
System.CompareByte: 7.8 ns/call         System.CompareByte: 8.9 ns/call
CompareByteGeneric: 19 ns/call          CompareByteGeneric: 9.4 ns/call
CompareByteGenericV2: 6.2 ns/call       CompareByteGenericV2: 8.0 ns/call

Different byte #23 of 24                Different byte #23 of 24
System.CompareByte: 9.9 ns/call         System.CompareByte: 11 ns/call
CompareByteGeneric: 26 ns/call          CompareByteGeneric: 11 ns/call
CompareByteGenericV2: 6.4 ns/call       CompareByteGenericV2: 9.5 ns/call

Different byte #1 of 100                Different byte #1 of 100
System.CompareByte: 1.8 ns/call         System.CompareByte: 40 ns/call (!)
CompareByteGeneric: 5.2 ns/call         CompareByteGeneric: 5.8 ns/call
CompareByteGenericV2: 2.9 ns/call       CompareByteGenericV2: 4.3 ns/call

Different byte #99 of 100               Different byte #99 of 100
System.CompareByte: 43 ns/call          System.CompareByte: 53 ns/call
CompareByteGeneric: 20 ns/call          CompareByteGeneric: 24 ns/call
CompareByteGenericV2: 10 ns/call        CompareByteGenericV2: 15 ns/call

Different byte #999 of 1000             Different byte #999 of 1000
System.CompareByte: 288 ns/call         System.CompareByte: 163 ns/call
CompareByteGeneric: 87 ns/call          CompareByteGeneric: 199 ns/call
CompareByteGenericV2: 51 ns/call        CompareByteGenericV2: 93 ns/call
```
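For anyone who wants a quick sanity check without the attached benchmark, a ns/call figure of this kind can be approximated roughly as follows. This is only a sketch and is not taken from `CompareByte.pas`. The constants `Len` and `Calls` and the `sink` accumulator are assumptions made for illustration, and `GetTickCount64` has only millisecond resolution, so the call count has to be large.

```pascal
{$mode objfpc}
program CompareByteTiming;

uses
  SysUtils; // GetTickCount64

const
  Len = 100;        // buffer length, as in the "#99 of 100" case
  Calls = 10000000; // assumed call count; large enough for a ms-resolution timer

var
  a, b: array[0 .. Len - 1] of Byte;
  i, sink: SizeInt;
  t0, t1: QWord;
begin
  FillChar(a, SizeOf(a), 1);
  FillChar(b, SizeOf(b), 1);
  b[Len - 1] := 2; // only the last byte differs

  sink := 0; // accumulate results so the calls have a visible effect
  t0 := GetTickCount64;
  for i := 1 to Calls do
    sink := sink + CompareByte(a, b, Len);
  t1 := GetTickCount64;

  // milliseconds * 1e6 = nanoseconds; divide by the number of calls
  WriteLn('System.CompareByte: ', (t1 - t0) * 1e6 / Calls : 0 : 1,
    ' ns/call (sink = ', sink, ')');
end.
```

Numbers from such a loop include loop overhead and depend on the CPU and compiler options, so they are only comparable between variants measured in the same way.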