Commit 48d538fd authored by Christoph Conrads

README: improve integer factorization hints

parent 92675524
@@ -40,40 +40,8 @@ The Bash script `find-awc-swb-prngs.sh` finds AWC/SWB parameters and block sizes
* `swb` is a subtract-with-borrow PRNG with base `b`, long lag `r`, and short lag `s`. Its recurrence is `x_n = x_{n-r} - x_{n-s} - c`, where `c` is the carry bit.
* `swc` is a subtract-with-borrow PRNG with recurrence `x_n = x_{n-s} - x_{n-r} - c` (note the different order of the terms). This generator is also an SWB, but because the C++11 standard library calls the class implementing this generator `subtract_with_carry`, I decided to use the acronym `swc`.
* `t` is the time parameter from Lüscher's chaotic dynamical system discussion of AWC and SWB PRNGs.
* `log2period` and `log10period` show the base-2 and base-10 logarithm of the period length bounds. **THE PERIOD LENGTHS ARE ONLY APPROXIMATIONS**.
* `log2period` and `log10period` show the base-2 and base-10 logarithm of the period length bounds.
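The `swb` and `swc` recurrences from the list above can be sketched in a few lines of Python. This is only an illustration and not code from this repository: the function name `subtract_with_borrow`, its parameters, and the `variant` switch are made up for this example, and the borrow `c` is assumed to be set to 1 exactly when the difference is negative.

```python
from collections import deque


def subtract_with_borrow(seed, b, r, s, variant="swc", carry=0):
    """Yield the sequence x_n of an SWB generator (illustrative sketch).

    variant="swb": x_n = x_{n-r} - x_{n-s} - c  (mod b)
    variant="swc": x_n = x_{n-s} - x_{n-r} - c  (mod b)
    The borrow c is assumed to be 1 if the difference was negative, else 0.
    """
    if len(seed) != r or not 0 < s < r:
        raise ValueError("need r seed values and 0 < s < r")
    if any(not 0 <= x < b for x in seed):
        raise ValueError("seed values must lie in [0, b)")
    if variant not in ("swb", "swc"):
        raise ValueError("variant must be 'swb' or 'swc'")

    # state[0] is x_{n-r} and state[-1] is x_{n-1}
    state = deque(seed, maxlen=r)
    c = carry
    while True:
        long_lag, short_lag = state[0], state[-s]
        if variant == "swb":
            diff = long_lag - short_lag - c
        else:
            diff = short_lag - long_lag - c
        c = 1 if diff < 0 else 0
        x = diff % b
        state.append(x)  # the oldest value drops out automatically
        yield x
```

For example, `gen = subtract_with_borrow([1, 2], b=10, r=2, s=1)` followed by repeated `next(gen)` calls steps through the `swc` sequence for a toy generator with base 10.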
The period lengths are only approximations because I am only superficially familiar with number theory. Moreover, computing the period length requires decomposing large integers into their prime factors. This is a hard problem and in many cases we do not even attempt to factorize numbers. In fact, the prime factorizations hard-coded in `compute-awc-swb-period-length.py` were computed by [GMP-ECM](http://ecm.gforge.inria.fr/).
Computing the period length requires decomposing large integers into their prime factors. This is a hard problem and in many cases I did not even attempt to factorize numbers. In fact, the prime factorizations hard-coded in `compute-awc-swb-period-length.py` were mostly computed by [GMP-ECM](http://ecm.gforge.inria.fr/) and some by [Cado-NFS](http://cado-nfs.gforge.inria.fr/). To factorize numbers, first find small factors with GMP-ECM using the procedure outlined in the section "How to use P-1, P+1, and ECM efficiently?" of the GMP-ECM README, then run Cado-NFS. The idea here is that the run-time of ECM depends on the number of digits of the desired prime factor whereas the run-time of Cado-NFS is a function of the size of the number being factored. I strongly suggest always enabling primality testing when running GMP-ECM (`ecm -primetest ...`), or you may waste hours of computing time.
To compute the block size `p`, we suggest using the smallest prime number larger than `t * r`, e.g., for the popular 24-bit ranlux, we have `b=2^24`, `r=24`, `s=10`, and `t = 16`. It holds that `t * r = 16 * 24 = 384` so `p = 389`. Note that in practice, prime values larger than `t*r / 4` are often sufficient to pass all empirical random number generator tests.
# Appendix: Integer Factorization with GMP-ECM
GMP-ECM reads the integers to factorize from standard input and it can also parse input like `2^8+1`.
Find small factors first using Pollard's p-1 algorithm (option `-pm1`):
```sh
ecm -q -pm1 -c 107 11e3
```
Check for prime factors using `openssl prime`, e.g.,
```sh
openssl prime 3
```
Pass only one number at a time and remember that this program prints its input in hexadecimal. This can be very confusing, e.g., the decimal input 295 is echoed as hexadecimal 127:
```
$ openssl prime 295
127 is not prime
```
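To double-check which decimal number such an output line refers to, the conversion can be reproduced in Python. The helper name `as_openssl_hex` is made up for this example, and it assumes that the program echoes the number as upper-case hexadecimal digits without a `0x` prefix.

```python
def as_openssl_hex(n: int) -> str:
    """Format a decimal integer as upper-case hexadecimal digits, no prefix."""
    return format(n, "X")


def from_openssl_hex(digits: str) -> int:
    """Convert echoed hexadecimal digits back to a decimal integer."""
    return int(digits, 16)


print(as_openssl_hex(295))
print(from_openssl_hex("127"))
```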
Take all non-prime factors and attempt to decompose them with the elliptic curve method (ECM), using the parameters found in [Optimal parameters for ECM](https://members.loria.fr/pzimmermann/records/ecm/params.html):
```sh
echo 123 | ecm -q -c 107 11e3
echo 123 | ecm -q -c 261 5e4
...
```
I suggest prefixing the ecm call with `time` because it shows the ecm return value, and the return value contains information about the computed factors:
```
$ echo '2^(64*6)-1' | time -p /tmp/ecm/bin/ecm -q -pm1 -c 107 11e3
7341968051412347753761237695 6700417 22253377 65537 ((((2^(64*6)-1)/7341968051412347753761237695)/6700417)/22253377)/65537
Command exited with non-zero status 6
```
Your shell may have a built-in version of `time`. See [man(1) ecm](https://manpages.debian.org/jessie/gmp-ecm/gmp-ecm.1.en.html) for the meaning of the return values; everything else on this website is outdated.
To compute the block size `p`, it is suggested to use the smallest prime number larger than `t * r`, e.g., for the popular 24-bit ranlux, we have `b=2^24`, `r=24`, `s=10`, and `t = 16`. It holds that `t * r = 16 * 24 = 384` so `p = 389`. Note that in practice, prime values larger than `t*r / 4` are often sufficient to pass all empirical random number generator tests.
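The block size rule above is easy to reproduce. The following is a minimal sketch, not part of this repository; the function names `is_prime`, `next_prime_above`, and `block_size` are made up for this example, and plain trial division is assumed to be fast enough because `t * r` is small.

```python
def is_prime(n: int) -> bool:
    """Primality test by trial division; fine for small n such as t * r."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def next_prime_above(n: int) -> int:
    """Return the smallest prime strictly larger than n."""
    candidate = n + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate


def block_size(t: int, r: int) -> int:
    """Suggested block size p: the smallest prime larger than t * r."""
    return next_prime_above(t * r)


print(block_size(16, 24))
```

For the 24-bit ranlux parameters `t = 16` and `r = 24`, this prints 389. The relaxed bound corresponds to `next_prime_above(16 * 24 // 4)`.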