Optimizing The Linear Congruential Generator

Dan
Dozens Disciple
Joined: Aug 8 2005, 02:45 PM

Nov 7 2014, 04:41 AM #1

This topic is closely related to Optimal base for random guessing by humans, but I decided to create a new thread because I wanted to (1) focus on one particular algorithm, and (2) discuss "computer-scale" parameters rather than just "human-scale" ones.


The Linear Congruential Generator (LCG) is one of the most common algorithms for deterministic generation of "random" numbers. It is defined by (in Python syntax):

Code:

def random_sequence(multiplier, increment, modulus, seed):
    # Yield one full period of the LCG: x -> (multiplier * x + increment) mod modulus.
    value = seed
    for index in range(modulus):
        value = (multiplier * value + increment) % modulus
        yield value
(The sequence is periodic, so technically the "for" statement should be "while True", but the sequence will be easier to analyze by treating it as finite.)

Each value of the sequence is an integer between 0 (inclusive) and the modulus (exclusive).

For dozenal counting, an obvious choice for the modulus is 12 (*10). This yields a sequence of single-digit numbers. The hard part is deciding on the multiplier and increment.

Some choices are quite obviously bad. For example, the multiplier a=3 and increment c=3 (starting with a seed of 0) gives the sequence [3, 0, 3, 0, 3, 0, ...]. Not only is this pattern far too obvious, it also uses only two of the twelve possible values.

The most basic desirable property of an LCG is to have a full period of m values, each used once per period. This ensures a uniform distribution of values.

With m=12, there are only four possible combinations of parameters meeting this criterion:
  • a=1, c=1: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 0, ...
  • a=1, c=5: 5, 10, 3, 8, 1, 6, 11, 4, 9, 2, 7, 0, ...
  • a=1, c=7: 7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5, 0, ...
  • a=1, c=11: 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, ...
But none of these sequences is "random enough" (a concept I'll define more formally in a later post). So, m=12 just isn't a good choice.

So, let's try two dozenal digits. With m=144 (*100), there are 576 (*400) combinations of a and c that satisfy the full-period criterion. How do we choose?
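
As a sanity check, here's a quick brute-force sketch (illustrative only, not meant as an efficient test) that counts the (a, c) combinations giving a full period; it confirms the four combinations listed above for m=12 and the 576 for m=144.

Code:

def has_full_period(multiplier, increment, modulus, seed=0):
    # Run one full period and check that every value in range(modulus)
    # appears exactly once.
    value = seed
    seen = set()
    for _ in range(modulus):
        value = (multiplier * value + increment) % modulus
        seen.add(value)
    return len(seen) == modulus

for m in (12, 144):
    good = [(a, c) for a in range(m) for c in range(m) if has_full_period(a, c, m)]
    print(m, len(good))   # 12 -> 4 combinations, 144 -> 576 combinations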

To be continued...

Dan
Dozens Disciple
Joined: Aug 8 2005, 02:45 PM

Nov 17 2014, 12:19 AM #2

It's hard to find an example of a good random number generator, but it's easy to find an example of a bad one. Let X_i = the day of the week of January 1 of the year 2000+i (Sunday=0, ..., Saturday=6). This isn't an LCG, but it is periodic, with a period of n=400.

Computing this "random" sequence over the full period, two of the seven days occur 56 times out of 400, two occur 57 times, and three occur 58 times. The relative frequencies are close to the naively-expected 1/7 = 0.142 857 142 857..., deviating only because of a mathematical quirk of the Gregorian leap-day adjustment. (On the Julian calendar, they'd be exactly 1/7.)

To quantify this, define the disuniformity of a periodic PRNG as the sum, over all possible outcomes, of (actual relative frequency - expected relative frequency)^2, measured over a whole number of full periods. (Note that this is similar to the chi-squared test statistic, http://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test, but is independent of the number of periods taken.)

So, for this particular "random" number generator, the disuniformity is 2*(0.14 - 1/7)^2 + 2*(0.1425 - 1/7)^2 + 3*(0.145 - 1/7)^2 = 17/560000 = 0.0000303 571428 571428...

Again, this is close to uniform, and would be a passable PRNG if one-dimensional uniformity were the only criterion. But random numbers aren't always used one at a time; they're often used in pairs: for example, the roll of a pair of dice in a board game, or the generation of random points in a plane.

In theory, taking two random numbers from the range 0-6 gives 49 possible results. But this particular PRNG can produce only 14 of them, because for each of the 7 possible days a year can start on, there are only 2 possibilities for the day the next year starts on, depending on whether the year is a leap year or not. The disuniformity of this PRNG applied to pairs of values works out to 12646433/192080000 = 0.06583940545605997.
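
Here's a rough Python sketch of how these figures can be reproduced (the helper functions are just illustrative, and I'm taking pairs as consecutive values wrapping around one full period):

Code:

import datetime
import itertools
from collections import Counter
from fractions import Fraction

def new_year_weekday(year):
    # Day of the week of January 1 (Sunday = 0, ..., Saturday = 6).
    return (datetime.date(year, 1, 1).weekday() + 1) % 7

def disuniformity(samples, outcomes):
    # Sum of (actual - expected relative frequency)^2 over every possible outcome.
    counts = Counter(samples)
    n = len(samples)
    expected = Fraction(1, len(outcomes))
    return sum((Fraction(counts[x], n) - expected) ** 2 for x in outcomes)

# One full 400-year Gregorian cycle, starting at 2000.
days = [new_year_weekday(2000 + i) for i in range(400)]
print(disuniformity(days, range(7)))   # 17/560000

# Pairs of consecutive values, wrapping around the period.
pairs = [(days[i], days[(i + 1) % 400]) for i in range(400)]
print(len(set(pairs)))                 # 14 distinct pairs out of 49
print(disuniformity(pairs, list(itertools.product(range(7), repeat=2))))
# should match the 12646433/192080000 quoted above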

Dan
Dozens Disciple
Joined: Aug 8 2005, 02:45 PM

Mar 17 2016, 02:56 AM #3

For a general-purpose random-number generator, we can't assume that the range will be the same as the modulus. Sometimes, we'll want to flip a (2-sided) coin. Sometimes, we'll want to roll a 6-sided die. Sometimes, we'll want to draw a card from a deck of 52.

In computer programming, there are two main ways of generating a range(k) random number from a range(m) generator.

One is the C-style way, rand() % k. But this is a bad approach, especially in combination with the LCG method: when k divides the modulus, the remainders r % k themselves form a tiny LCG with period at most k, so the low-order part of each output is its least random part.

The other is the BASIC way, INT(RND() * k), where RND is a function that returns a floating-point number between 0 (inclusive) and 1 (exclusive). We can convert an integer in range(m) to such a number simply by dividing by m. So, given a generator that returns a random number r in range(m), we can scale it to range(k) by doing INT(k * r / m). If we use a floored integer division operator (like Python's //), then we can simply write k * r // m and avoid troublesome floating-point arithmetic altogether.

This still gives "subtly non-uniform" results, but at least it avoids the systematic bias towards smaller numbers that rand() % k has when k is not a divisor of m.
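
To illustrate why the remainder approach interacts especially badly with an LCG, here's a small sketch. (The parameters a=13, c=5 are just an arbitrary full-period choice for m=144, not a recommendation.) The low bit obtained with % locks into a strict alternation, while the bit obtained with the scaling method does not.

Code:

def lcg_values(multiplier, increment, modulus, seed, count):
    # First `count` outputs of the LCG from the opening post.
    value = seed
    out = []
    for _ in range(count):
        value = (multiplier * value + increment) % modulus
        out.append(value)
    return out

m = 144
values = lcg_values(13, 5, m, 0, 16)

print([v % 2 for v in values])       # remainder method: strictly alternates 1, 0, 1, 0, ...
print([2 * v // m for v in values])  # scaling method: no such lockstep pattern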

Dan
Dozens Disciple
Joined: Aug 8 2005, 02:45 PM

Mar 17 2016, 03:41 AM #4

As illustrated by my second post in this thread, it's not enough for individual "random" numbers to be uniformly distributed. We also need sequences of random numbers to be as uniformly distributed as possible.

But which sequences? We cannot reasonably expect, for example, pairs of numbers from range(144) to be uniformly distributed over the 20 736 possible values, since our generator has only 144 possible states.

But we can expect pairs of numbers from range(12) to be uniformly distributed, and likewise pairs from range(d) for any divisor d of 12. For other ranges smaller than 12, we can't ensure exact uniformity, but we can strive to minimize the disuniformity (as defined above).

By similar reasoning, the 125 possible triples of numbers from range(5) should be as uniformly distributed as possible (5 being the floor of the cube root of 144), and likewise quadruples from range(3) and 5- to 7-tuples from range(2).
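
Here's a rough sketch of such a test, using the k * r // m scaling from the previous post and taking overlapping tuples over one full period (with wraparound). The parameters a=13, c=5 are again just a placeholder, not a recommendation.

Code:

import itertools
from collections import Counter
from fractions import Fraction

def lcg_period(multiplier, increment, modulus, seed=0):
    # One full period of the LCG from the opening post.
    value = seed
    out = []
    for _ in range(modulus):
        value = (multiplier * value + increment) % modulus
        out.append(value)
    return out

def tuple_disuniformity(values, k, t):
    # Disuniformity of t-tuples of consecutive outputs, each scaled to range(k).
    m = len(values)
    scaled = [k * v // m for v in values]
    tuples = [tuple(scaled[(i + j) % m] for j in range(t)) for i in range(m)]
    counts = Counter(tuples)
    expected = Fraction(1, k ** t)
    return sum((Fraction(counts[key], m) - expected) ** 2
               for key in itertools.product(range(k), repeat=t))

values = lcg_period(13, 5, 144)
for k, t in [(12, 2), (5, 3), (3, 4), (2, 5), (2, 6), (2, 7)]:
    print(k, t, tuple_disuniformity(values, k, t))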

Dan
Dozens Disciple
Joined: Aug 8 2005, 02:45 PM

Mar 20 2016, 03:11 AM #5

Another common use case of random numbers is shuffling a deck of cards. The standard algorithm for this is the Fisher–Yates shuffle, which for a deck of n cards requires n-1 random integers: the first from range(n), the second from range(n-1), etc., down to range(2). Given an ideal random number generator, all n! permutations should be possible.

Because 6! = 720 but our proposed PRNG has only 144 possible states, there will be unreachable permutations of 6 or more cards. So let's evaluate shuffles of 5 cards or fewer. We can ignore the degenerate cases of "shuffling" 1 card (which is a no-op) or 2 cards (which is equivalent to a coin toss).
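
Here's a rough sketch of that evaluation, again with a=13, c=5 as a placeholder choice of parameters and the k * r // m scaling from earlier. It counts how many distinct permutations are reachable from the 144 possible starting states.

Code:

from itertools import permutations

def lcg_period(multiplier, increment, modulus, seed=0):
    # One full period of the LCG from the opening post.
    value = seed
    out = []
    for _ in range(modulus):
        value = (multiplier * value + increment) % modulus
        out.append(value)
    return out

def fisher_yates(n, stream, m):
    # Fisher-Yates shuffle of n cards, drawing n-1 numbers from the PRNG stream
    # and scaling each from range(m) down to the range currently needed.
    deck = list(range(n))
    draws = iter(stream)
    for size in range(n, 1, -1):
        r = next(draws)
        j = size * r // m          # index in range(size)
        deck[size - 1], deck[j] = deck[j], deck[size - 1]
    return tuple(deck)

m = 144
for n in (3, 4, 5):
    reached = {fisher_yates(n, lcg_period(13, 5, m, seed), m) for seed in range(m)}
    print(n, len(reached), "of", len(list(permutations(range(n)))))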

jrus
Regular
Joined: Oct 23 2015, 12:31 AM

Mar 20 2016, 11:43 PM #6

Forget Linear Congruential Generators. Just use http://www.pcg-random.org :-)

Dan
Dozens Disciple
Joined: Aug 8 2005, 02:45 PM

Mar 22 2016, 12:23 AM #7

Code:

uint32_t pcg32_random_r(pcg32_random_t* rng)
{
    uint64_t oldstate = rng->state;
    // Advance internal state
    rng->state = oldstate * 6364136223846793005ULL + (rng->inc|1);
    // Calculate output function (XSH RR), uses old state for max ILP
    uint32_t xorshifted = ((oldstate >> 18u) ^ oldstate) >> 27u;
    uint32_t rot = oldstate >> 59u;
    return (xorshifted >> rot) | (xorshifted << ((-rot) & 31));
}
Nice simple algorithm. Looks like an enhanced version of Xorshift. I'll have to try it the next time I do a Monte Carlo simulation.

But this thread was supposed to be about human-generated random numbers. Bitshifts and XOR may be fast for a computer, but not so much for pencil-and-paper (unless you're already using a power-of-two base).

jrus
Regular
Joined: Oct 23 2015, 12:31 AM

Mar 23 2016, 03:12 AM #8

Do you have some context where you need to generate random base-twelve numbers by hand, or is this just a fun curiosity?

Dan
Dozens Disciple
Joined: Aug 8 2005, 02:45 PM

Mar 23 2016, 03:13 AM #9

In summary, the tests I have proposed for a 144-state random number generator are:
  • A full period of 144, with each value of range(144) occurring exactly once per period.
  • Minimal disuniformity of pairs scaled to range(12) (with exact uniformity for any divisor of 12), triples scaled to range(5), quadruples scaled to range(3), and 5- to 7-tuples scaled to range(2).
  • Minimal disuniformity over the permutations produced by Fisher–Yates shuffles of 3, 4, and 5 cards.
So, the question is: Is there any combination of multiplier and increment that passes all of these tests?

Dan
Dozens Disciple
Joined: Aug 8 2005, 02:45 PM

Apr 12 2016, 02:39 AM #10

The answer turns out to be "no". So the question is: How close can we get? To be continued...

Piotr

Dec 27 2016, 12:07 PM #11

The problem is that 144 is highly composite, so an LCG will fall into small cycles once you take the result mod a divisor. For example, every LCG with an even modulus either produces results of constant parity or strictly alternates between odd and even. A good LCG can be built with a modulus of 2147483647 (a prime) and a multiplier of 16807, with no increment. That gives a uniformly distributed generator over 1 to 2147483646. Primes near powers of 2 are the most convenient (3, 7, 17, 31, 127, 257, 8191, 65537, ...).
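
For reference, here's a minimal sketch of this generator (the classic Lehmer "minimal standard" parameters; the function name is just illustrative):

Code:

def minstd(seed):
    # Multiplicative LCG: modulus 2**31 - 1 (prime), multiplier 16807, no increment.
    # The seed must be between 1 and 2**31 - 2; a seed of zero would stay at zero.
    modulus = 2147483647
    multiplier = 16807
    value = seed
    while True:
        value = (multiplier * value) % modulus
        yield value

gen = minstd(1)
print([next(gen) for _ in range(4)])   # [16807, 282475249, ...]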