Pseudorandom number generators
Random numbers are useful for a variety of purposes, such as generating data encryption keys, simulating and modeling complex phenomena and for selecting random samples from larger data sets. They have also been used aesthetically, for example in literature and music, and are of course ever popular for games and gambling. When discussing single numbers, a random number is one that is drawn from a set of possible values, each of which is equally probable, i.e., a uniform distribution. When discussing a sequence of random numbers, each number drawn must be statistically independent of the others.^{[1]}
This chapter aims to explain why it is hard, and very important (besides interesting), to understand how to get a computer to generate proper random numbers. Since most currently used security and encryption standards depend on random numbers, it is easy to imagine that choosing insecure algorithms exposes many aspects of our digital lives to clever programmers and to companies interested in data analysis or in less legal practices (identity theft, surveillance, bank fraud and so on). One must realize that the privacy of, for example, one's banking activities, email communication and social networking heavily depends on the randomness used in the applied security mechanisms.
In cryptography, a pseudorandom generator (or PRG) is a procedure that outputs a sequence computationally indistinguishable from a truly random, uniformly distributed sequence. The prefix pseudo (from Greek ψευδής "lying, false") is used to mark something as false, fraudulent, or pretending to be something it is not. Pseudorandom generators find application in many fields besides cryptography, such as applied mathematics, physics and simulations. Simulations often require mechanisms producing sequences of random values. These procedures are certainly nontrivial and often require significant amounts of computational time.
Definition
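A pseudorandom generator (or PRG) is a (deterministic) map $G : \{0,1\}^\ell \to \{0,1\}^n$, where $n \ge \ell$. Here $\ell$ is the 'seed length' and $n - \ell \ge 0$ is the 'stretch'. We typically think that $n \gg \ell$ and that $G$ is efficiently computable in some model. If $f : \{0,1\}^n \to \{0,1\}$ is any 'statistical test', we say that $G$ '$\epsilon$-fools' $f$ if
$\left| \Pr[f(U_n) = 1] - \Pr[f(G(U_\ell)) = 1] \right| \le \epsilon$
where $U_m$ denotes a uniformly random string in $\{0,1\}^m$. Here the string $U_\ell$ is called the 'seed'. If $\mathcal{C}$ is a class of tests, we say that $G$ '$\epsilon$-fools $\mathcal{C}$', or is an '$\epsilon$-PRG against $\mathcal{C}$', if $G$ $\epsilon$-fools $f$ for every $f \in \mathcal{C}$.^{[2]}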
In other words, we are trying to convince any outside party (let's call them an adversary) that the sequences returned from the PRG are chosen at random. The adversary may run statistical tests on the outputs of the PRG and on uniformly random sequences at the same time. A PRG ensures that both kinds of output look the same to the adversary.
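To make the idea of a statistical test concrete, here is a minimal sketch. It assumes a deliberately weak toy generator (one that merely repeats its seed) and a hypothetical test `halves_match`; neither comes from the cited lecture. Because the seed is only 4 bits long, both acceptance probabilities can be computed exactly by enumeration.

```python
from itertools import product

def toy_generator(seed):
    """Deliberately weak generator (an assumption for this demo): repeat the seed."""
    return seed + seed

def halves_match(bits):
    """Statistical test: output 1 when the first half equals the second half."""
    half = len(bits) // 2
    return 1 if bits[:half] == bits[half:] else 0

def acceptance_probability(test, strings):
    """Fraction of the given strings on which the test outputs 1."""
    strings = list(strings)
    return sum(test(s) for s in strings) / len(strings)

SEED_LENGTH = 4
OUTPUT_LENGTH = 2 * SEED_LENGTH

# Enumerate every seed and every possible output string exhaustively.
seeds = list(product((0, 1), repeat=SEED_LENGTH))
uniform_outputs = product((0, 1), repeat=OUTPUT_LENGTH)
generator_outputs = (toy_generator(seed) for seed in seeds)

p_generator = acceptance_probability(halves_match, generator_outputs)
p_uniform = acceptance_probability(halves_match, uniform_outputs)
advantage = abs(p_uniform - p_generator)
print(p_generator, p_uniform, advantage)  # 1.0 0.0625 0.9375
```

The test accepts every generator output but only 16 of the 256 uniformly random strings, so the gap between the two probabilities is 0.9375. This toy generator is therefore very far from fooling the test.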
Required properties
A reliable PRG should have all of the following properties^{[3]}:
Unbiased (uniform distribution)
By the definition of the word unbiased, the PRG shows no prejudice for or against any value. In PRG terms this means that all values are equiprobable, whatever sample size is collected. This property ensures the independence of the generator and its stability against certain types of attacks.
Unpredictable (independence)
It is impossible to predict what the next output will be, given all the previous outputs but not the internal state. If this is not guaranteed, then basically anyone can pose as the generator. This weakness is exploited, for example, in man-in-the-middle attacks.
Unreproducible
Two instances of the same generator, given the same starting conditions, will produce different outputs. Certain types of spoofing attacks try to reproduce a subset of the real production environment in order to exploit a generator that lacks this property.
Long period
The generator should have a long period, because the period directly influences the randomness of the generated outputs. This is crucial, for example, for simulations, since they are meant to model the dynamic behavior and states of an environment, not only its cyclic stages.
Fast computation
The generator should be reasonably fast. It is always a good idea to consider the users of the generator, since they may be working in a constrained environment, limited either by hardware specifications or by required time performance.
Security
The generator should be secure. Basically, the security of the generator ensures that no one can break it in reasonable time, either by brute force or by more clever approaches. If there is no polynomial-time algorithm that, given the first $k$ bits of the output sequence, can predict the $(k+1)$-th bit with probability significantly greater than 0.5, we consider the generator to be secure.
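The unpredictability and security requirements can be illustrated by a generator that fails them. The sketch below assumes a linear congruential generator with a power-of-two modulus and the widely quoted constants 1103515245 and 12345, chosen here only for illustration. With an odd multiplier and an odd increment the lowest bit of the state simply alternates, so a trivial predictor guesses the next low bit every time.

```python
def lcg_low_bits(seed, count, a=1103515245, c=12345, m=2**31):
    """Yield the lowest bit of each state of an LCG with a power-of-two modulus."""
    x = seed
    for _ in range(count):
        x = (a * x + c) % m
        yield x & 1

def predict_next_bit(previous_bit):
    """Predictor: with odd a and odd c the lowest bit flips on every step."""
    return 1 - previous_bit

bits = list(lcg_low_bits(seed=2024, count=1000))
hits = sum(predict_next_bit(prev) == nxt for prev, nxt in zip(bits, bits[1:]))
success_rate = hits / (len(bits) - 1)
print(success_rate)  # 1.0
```

A success rate of 1.0 is as far above 0.5 as possible, so the low bit of such a generator is not secure in the sense described above.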
Construction of simple PRG
There are many ways to construct PRGs, and one of the simplest is to use a pseudorandom function and expand the key. A pseudorandom function (or PRF) is any function defined over $(K, X, Y)$:
$F: K \times X \to Y$
where:
 $K$ is the key space
 $X$ is the input space
 $Y$ is the output space
such that there exists an efficient algorithm to evaluate $F(k, x)$.^{[4]}
On the other hand, a pseudorandom permutation (or PRP) is any function defined over $(K, X)$:
$E: K \times X \to X$
such that^{[4]}:
 There exists an efficient deterministic algorithm to evaluate $E(k, x)$
 The function $E(k, \cdot)$ is one-to-one
 There exists an efficient inversion algorithm $D(k, y)$
Any pseudorandom permutation is also a pseudorandom function, namely one in which $X = Y$ and the function is efficiently invertible.
So let $F: K \times X \to Y$ be a PRF, and define:
$\begin{cases} \mathrm{Functions}[X, Y]: \text{the set of all functions from } X \text{ to } Y \\ S_F = \{F(k, \cdot) \text{ s.t. } k \in K\} \subseteq \mathrm{Functions}[X, Y] \end{cases}$
For a PRF to be suitable for use in a PRG it must be secure, that is, computationally indistinguishable from a random function $f \in \mathrm{Functions}[X, Y]$: the adversary cannot distinguish whether an output came from $F(k, \cdot)$ or from some random $f(\cdot)$.
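If $F: K \times \{0,1\}^n \to \{0,1\}^n$ is a secure PRF, we can use key expansion to construct a secure PRG $G: K \to \{0,1\}^{nt}$ defined as
$G(k) = F(k, 0) \,\|\, F(k, 1) \,\|\, \dots \,\|\, F(k, t-1)$
where:
 $n$ is the number of bits in each block
 $t$ is the number of generated blocks
The return value is obtained by key expansion. A great advantage of this approach is that the blocks can be computed independently of each other, so multiple CPU cores can work in parallel (for example, odd-indexed blocks are computed by core 1 and even-indexed blocks by core 2). The security of the PRG follows from the fact that $F(k, \cdot)$ is indistinguishable from a random function $f(\cdot)$.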
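The key-expansion idea can be sketched in a few lines. The example below assumes HMAC-SHA256 as a stand-in for the PRF (the text does not prescribe any particular function), so each block is 256 bits long; the names `prf` and `prg` are hypothetical helpers.

```python
import hashlib
import hmac

BLOCK_BYTES = 32  # 256 bits per block: the HMAC-SHA256 output size (assumed PRF)

def prf(key, counter):
    """Stand-in PRF: HMAC-SHA256 of the counter encoded as 8 big-endian bytes."""
    message = counter.to_bytes(8, "big")
    return hmac.new(key, message, hashlib.sha256).digest()

def prg(key, blocks):
    """Concatenate prf(key, 0), prf(key, 1), ..., prf(key, blocks - 1)."""
    return b"".join(prf(key, i) for i in range(blocks))

key = b"example seed, not a real secret"
stream = prg(key, blocks=4)
print(len(stream), stream[:8].hex())
```

Each block depends only on the key and its own counter value, which is why the blocks could be handed to separate cores and concatenated afterwards.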
Linear methods
Linear Congruential Generator


The Linear Congruential Generator (or LCG) is one of the best-known PRGs. It is defined as follows:
$X_{n+1} = (a X_n + c) \bmod m$
where:
 $X$ is the sequence of pseudorandom values
 $X_0$ is the seed
 $a$ is the multiplier
 $c$ is the increment
 $m$ is the modulus
$m$, $a$, $c$ and $X_0$ are integer constants that specify the generator.^{[5]}
The range of output values is restricted, since after at most $m$ values the sequence starts to repeat itself. The most significant element in terms of the length of the period is the multiplier. The longest possible period, $m$, is obtained when:
 $a - 1$ is divisible by all the primes that divide the modulus $m$
 $a - 1$ is a multiple of 4 when the modulus $m$ is a multiple of 4
 the modulus $m$ and the increment $c$ have no common divisor (other than 1)
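The full-period conditions (known as the Hull-Dobell theorem) are easy to check mechanically. The following sketch uses hypothetical helper names and compares the theoretical check against a brute-force measurement of the cycle length; the small parameters 5, 3 and 16 are assumptions picked for the demonstration.

```python
from math import gcd

def prime_factors(n):
    """Return the set of prime factors of n by trial division."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def has_full_period(a, c, m):
    """Check the three conditions under which an LCG has period m."""
    if gcd(c, m) != 1:                       # c and m share no divisor
        return False
    if any((a - 1) % p != 0 for p in prime_factors(m)):
        return False                         # a - 1 divisible by each prime of m
    if m % 4 == 0 and (a - 1) % 4 != 0:      # extra condition when 4 divides m
        return False
    return True

def measured_period(a, c, m, seed=0):
    """Length of the cycle the sequence eventually enters, found by brute force."""
    seen = {}
    x, step = seed, 0
    while x not in seen:
        seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - seen[x]

print(has_full_period(5, 3, 16), measured_period(5, 3, 16))  # True 16
print(has_full_period(4331, 3492, 7902))                     # False
```

For the constants 4331, 3492 and 7902 the check fails because the increment and the modulus share the divisor 18, so that generator cannot reach the full period of 7902.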
Figure: example of LCG output exported from the WolframAlpha application.^{[6]}
Example:
The following example shows how to compute the first five output values for the LCG with $a = 4331$, $c = 3492$, $m = 7902$ and $X_0 = 1477$.
$X_1 = (4331 \cdot X_0 + 3492) \bmod 7902 = (4331 \cdot 1477 + 3492) \bmod 7902 = 7661$
$X_2 = (4331 \cdot X_1 + 3492) \bmod 7902 = (4331 \cdot 7661 + 3492) \bmod 7902 = 2785$
$X_3 = (4331 \cdot X_2 + 3492) \bmod 7902 = (4331 \cdot 2785 + 3492) \bmod 7902 = 6875$
$X_4 = (4331 \cdot X_3 + 3492) \bmod 7902 = (4331 \cdot 6875 + 3492) \bmod 7902 = 4381$
$X_5 = (4331 \cdot X_4 + 3492) \bmod 7902 = (4331 \cdot 4381 + 3492) \bmod 7902 = 4901$
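The worked example can be reproduced with a short generator function. This is a minimal sketch using the constants that appear in the computations above (multiplier 4331, increment 3492, modulus 7902, seed 1477); the function name `lcg` is a hypothetical helper.

```python
from itertools import islice

def lcg(a, c, m, seed):
    """Yield successive LCG values: each is (a * previous + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

outputs = list(islice(lcg(a=4331, c=3492, m=7902, seed=1477), 5))
print(outputs)  # [7661, 2785, 6875, 4381, 4901]
```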
Multiplicative Congruential Generator
The Multiplicative Congruential Generator (or MCG) is a simplified version of the LCG: setting $c = 0$ in the LCG gives the MCG.^{[5]} This generator is defined as follows:
$X_{n+1} = a X_n \bmod m$
where:
 $X$ is the sequence of pseudorandom values
 $X_0$ is the seed
 $a$ is the multiplier
 $m$ is the modulus
The most natural choice for $m$ is one that equals the capacity of a computer word: $m = 2^b$ on a binary machine, where $b$ is the number of bits in the computer word, or $m = 10^d$ on a decimal machine, where $d$ is the number of digits in the computer word.
Figure: example of output of a 59-bit multiplicative congruential generator, exported from the WolframAlpha application.^{[7]}
Example:
The following example shows how to compute the first five output values for the MCG with $a = 3414$, $m = 5037$ and $X_0 = 1739$.
$X_1 = (X_0 \cdot 3414) \bmod 5037 = (1739 \cdot 3414) \bmod 5037 = 3360$
$X_2 = (X_1 \cdot 3414) \bmod 5037 = (3360 \cdot 3414) \bmod 5037 = 1791$
$X_3 = (X_2 \cdot 3414) \bmod 5037 = (1791 \cdot 3414) \bmod 5037 = 4593$
$X_4 = (X_3 \cdot 3414) \bmod 5037 = (4593 \cdot 3414) \bmod 5037 = 321$
$X_5 = (X_4 \cdot 3414) \bmod 5037 = (321 \cdot 3414) \bmod 5037 = 2865$
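A multiplicative generator needs even less code than the LCG. The sketch below uses the multiplier 3414 and seed 1739 from the example together with the modulus 5037, which is the modulus for which the five listed outputs (3360, 1791, 4593, 321, 2865) are reproduced; the function name `mcg` is a hypothetical helper.

```python
from itertools import islice

def mcg(a, m, seed):
    """Yield successive MCG values: each is (a * previous) mod m."""
    x = seed
    while True:
        x = (a * x) % m
        yield x

outputs = list(islice(mcg(a=3414, m=5037, seed=1739), 5))
print(outputs)  # [3360, 1791, 4593, 321, 2865]
```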
Lagged Fibonacci Generator


The Lagged Fibonacci generator (or LFG) is one of the fastest PRGs providing a long period. An LFG is defined by the following parameters:
 $j$ is a lag coefficient,
 $k$ is a lag coefficient, with $0 < j < k$,
 $\star$ is a binary operation such as addition or subtraction modulo $m$, multiplication modulo $m$, or bitwise exclusive or ($\oplus$)
and the sequence is defined by
$X_n = (X_{n-j} \star X_{n-k}) \bmod m$
For generator to work, and must be odd numbers. Generator stores used values in a lag table. In order to achieve the maximum length of the period and fair degree of randomness parameters need to be set in the following way:
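 $j$ and $k$ are the exponents of a primitive polynomial $x^k + x^j + 1$
producing, for $m = 2^M$, a period of length:
 $(2^k - 1) \cdot 2^{M-1}$ for addition or subtraction
 $(2^k - 1) \cdot 2^{M-3}$ for multiplication
 $2^k - 1$ for exclusive or^{[8]}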
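A lagged Fibonacci generator with its lag table can be sketched as follows. The lags 7 and 10, the modulus $2^{16}$ and the seed values are assumptions chosen for illustration (7 and 10 are a commonly quoted pair); the lag table is kept in a fixed-length `deque`.

```python
from collections import deque
from itertools import islice
import operator

def lagged_fibonacci(seed_values, j, k, m, op=operator.add):
    """Yield values combining the entries j and k steps back, reduced mod m."""
    if not 0 < j < k:
        raise ValueError("lags must satisfy 0 < j < k")
    if len(seed_values) != k:
        raise ValueError("exactly k seed values are required")
    table = deque(seed_values, maxlen=k)  # lag table: the k most recent values
    while True:
        value = op(table[-j], table[-k]) % m
        table.append(value)  # the oldest entry drops out automatically
        yield value

J, K, M = 7, 10, 2**16
seeds = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]  # arbitrary demo seed values
outputs = list(islice(lagged_fibonacci(seeds, J, K, M), 20))
print(outputs[:5])  # [8, 12, 16, 20, 24]
```

Passing `operator.mul` or `operator.xor` as `op` gives the multiplicative and exclusive-or variants with the same lag table.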
Figure: example of LFG output.^{[9]}
Mersenne Twister


The Mersenne twister (or MT) is one of the modern PRGs, and it provides pseudorandom numbers of very high quality. It was developed by Makoto Matsumoto and Takuji Nishimura in 1997 with the aim of fixing some known faults of older PRGs. MT is founded on a matrix linear recurrence over a finite binary field. Nowadays, the most used versions are MT with 32-bit and 64-bit word length. Thanks to its design, MT is well suited to Monte Carlo simulations in many fields of science.
Since the description of the algorithm is quite lengthy, readers are referred to the paper from George Mason University (see External links), where the internal mechanics of MT are explained in detail.
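For experimenting with MT without implementing it, one option is Python's standard `random` module, which uses the Mersenne Twister as its core generator. The short sketch below only shows that the generator is deterministic for a given seed and peeks at the size of its internal state; the seed value 2012 is arbitrary.

```python
import random

first = random.Random(2012)
second = random.Random(2012)

# Identical seeds give identical 32-bit output streams.
stream_a = [first.getrandbits(32) for _ in range(5)]
stream_b = [second.getrandbits(32) for _ in range(5)]
print(stream_a == stream_b)  # True

# The internal state holds 624 32-bit words plus a position index.
version, internal_state, gauss_next = first.getstate()
print(len(internal_state))  # 625
```

The large state is what gives the generator its very long period, but it also means the generator is not intended for cryptographic use.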
Figure: example of output of a Mersenne twister shift register generator, exported from the WolframAlpha application.^{[7]}
Nonlinear methods
Blum Blum Shub


Blum Blum Shub (or BBS) is a PRG developed by Lenore Blum, Manuel Blum and Michael Shub in 1986. It is defined as follows:
$x_{n+1} = x_n^2 \bmod M$
where:
 $p$ is a random large prime
 $q$ is a random large prime
 $M$ is the product of $p$ and $q$
 $x_0$ is the seed; usually an integer that is coprime to $M$^{[10]}
BBS is not used in simulations because of its running time. On the other hand, due to its security properties it is appropriate for use in cryptography.
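The recurrence is simple enough to try with toy numbers. The sketch below assumes the small primes 11 and 23 and the seed 3, which are far too small for real use; it also checks that both primes are congruent to 3 modulo 4, a condition from the original BBS paper, and extracts one output bit (the least significant bit) per step.

```python
from math import gcd

def blum_blum_shub(p, q, seed, count):
    """Return count successive states, each the square of the previous mod p * q."""
    if p % 4 != 3 or q % 4 != 3:
        raise ValueError("p and q should both be congruent to 3 mod 4")
    modulus = p * q
    if gcd(seed, modulus) != 1:
        raise ValueError("seed must be coprime to the modulus")
    states = []
    x = seed
    for _ in range(count):
        x = pow(x, 2, modulus)
        states.append(x)
    return states

states = blum_blum_shub(p=11, q=23, seed=3, count=6)
bits = [x & 1 for x in states]  # one output bit per step
print(states)  # [9, 81, 236, 36, 31, 202]
print(bits)    # [1, 1, 0, 0, 1, 0]
```

Producing only one bit per modular squaring is what makes the generator slow compared with the linear methods.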
Figure: example of BBS output constructed in Maple 14 (seed=10, range=1000).
Testing
Testing is a very important means of ensuring the reliability of PRGs and of determining their possible areas of use. There are many tests for PRGs. First of all, let's take a look at the basic categories of tests.
Theoretic tests
These tests aim at a detailed study of the internal structure of a PRG, its parameters and inner workings. They are typically used when the theoretical concepts of the PRG are publicly known. From cryptography we know that the best way to test any security concept is to assume that the adversary already knows the structure and inner workings of the tested concept, so that security relies only on the complexity of the underlying mathematics.
These tests aim at:
 finding logical gaps in proposed solutions
 finding shortcomings in the ways parameters are processed
 detailed analysis and possible test coverage (in terms of software development testing)
Examples:
 Autocorrelation test
  Analyzes the serial correlation of the members of the sequence. A tutorial on how to approach this problem, with a simple applet for computing the autocorrelation test, is listed under External links.
 Spectral test
  This test detects periodic aspects of the produced sequences. An application of this test to LCGs is listed under External links.
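As a rough illustration of the autocorrelation idea, the sketch below computes the sample autocorrelation coefficient of a sequence at a given lag. The formula used (lagged covariance sum divided by the variance sum) is one common definition and is an assumption here, not necessarily the one used in the linked tutorial.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation coefficient of the sequence at the given lag."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    variance_sum = sum(d * d for d in deviations)
    covariance_sum = sum(
        deviations[i] * deviations[i + lag] for i in range(n - lag)
    )
    return covariance_sum / variance_sum

alternating = [0, 1] * 50  # a strongly patterned "random" sequence
print(autocorrelation(alternating, 1))  # close to -1: neighbours always differ
print(autocorrelation(alternating, 2))  # close to +1: period-2 pattern
```

For a good generator the coefficients at all non-zero lags should stay close to 0; values near -1 or +1, as in this alternating sequence, reveal serial dependence.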
Black-box testing
Sometimes the adversary does not have access to the PRG in use, or simply cannot determine what kind of PRG (if any) is used. In these cases the adversary takes another approach and tries to identify the PRG by supplying certain groups of parameters and analysing the resulting sequences. The analysis is based on finding similarities and patterns in the result sets. For a less secure PRG this alone lets the adversary determine whether they are interacting with a PRG or with a truly random function. An adversary might even be able to determine the kind of PRG, or the concrete implementation, from the randomness of the gathered outputs.
These tests aim at:
 faults in the PRG's inner workings
 repeating sequences and patterns in result sequences
 outputs of certain combinations of entry parameters
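One simple black-box check that needs nothing but the outputs is a chi-square frequency test. The sketch below is an illustrative assumption rather than a test named in the text: it splits the output range into ten equal buckets and measures how far the observed counts are from a uniform expectation.

```python
def chi_square_statistic(samples, modulus, buckets=10):
    """Chi-square statistic of bucket counts against a uniform expectation."""
    counts = [0] * buckets
    for value in samples:
        counts[value * buckets // modulus] += 1
    expected = len(samples) / buckets
    return sum((count - expected) ** 2 / expected for count in counts)

evenly_spread = list(range(100))  # every bucket receives exactly 10 samples
stuck_at_zero = [0] * 100         # every sample lands in the first bucket
print(chi_square_statistic(evenly_spread, modulus=100))  # 0.0
print(chi_square_statistic(stuck_at_zero, modulus=100))  # 900.0
```

With ten buckets there are nine degrees of freedom, and a statistic above roughly 16.9 rejects uniformity at the 5% level. Note that a perfectly even count such as 0.0 is itself suspicious for a supposedly random source, which is why such tests are usually read two-sidedly.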
References
 1. Introduction to Randomness and Random Numbers, 2012, Random.org
 2. Lecture 16: Nisan's PRG for small space, 15-855: Intensive Intro to Complexity Theory, Spring 2009, Carnegie Mellon University, USA, page 1
 3. Xiang-Yang Li, Pseudorandom Number, Cryptography and Network Security. Illinois Institute of Technology, USA
 4. Boneh, D. (2012). Lecture 3: Block ciphers, Introduction to Cryptography. Stanford, USA.
 5. Knuth, D. E. (1997). The Art of Computer Programming, volume 2. Addison-Wesley, third edition.
 6. WolframAlpha, Linear Congruential Generators (2012)
 7. WolframAlpha, Mersenne Twister and Friends (2012)
 8. Mikulka, Z. (2008). Random Number Generators, Bachelor's thesis. University of Technology, Faculty of Electrical Engineering and Communication, Department of Telecommunications, Brno, page 14
 9. SPRNG Libraries Documentation
 10. Lenore Blum, Manuel Blum, and Michael Shub. "A Simple Unpredictable Pseudo-Random Number Generator", SIAM Journal on Computing, volume 15, pages 364–383, May 1986.
External links
 WolframAlpha  Linear Congruential Generator
 WolframAlpha  Mersenne Twister simulator
 Mersenne Twister – A Pseudo Random Number Generator and its Variants
 Autocorrelation test
 Spectral test
 Web dedicated to the 'randomness'
Self test
1. What is the plaintext for the ciphertext "33 63 66 15 41 79 85 15 65 58 85", given the following encryption properties:
Let's assume that a secret message has been prepared by converting the letters into digits following the rule:
Letter       A   B   C   D   E   F   G   H   I   J   K   L   M   N   O   P   Q   R   S   T   U   V   W   X   Y   Z
Letter code  01  02  03  04  05  06  07  08  09  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26
Then the successive digits are added, modulo 10 to the successive digits of the output of a LCG with following properties: , , and .
If , then cipher for plaintext AB is:
Plaintext       A   B
Plaintext code  01  02
Key digits      12  34
Ciphertext      13  36
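The digit-wise scheme from question 1 can be sketched as follows. The function names are hypothetical helpers, and the key digits are passed in directly instead of being drawn from an LCG, so the example reproduces only the small AB table above.

```python
def letters_to_digits(text):
    """Encode A..Z as two-digit codes 01..26 and return the list of digits."""
    digits = []
    for letter in text:
        code = ord(letter) - ord("A") + 1
        digits.extend(divmod(code, 10))
    return digits

def digits_to_letters(digits):
    """Decode pairs of digits (01..26) back to letters."""
    pairs = zip(digits[0::2], digits[1::2])
    return "".join(chr(ord("A") + 10 * tens + ones - 1) for tens, ones in pairs)

def encrypt(plaintext, key_digits):
    """Add key digits to plaintext digits, digit by digit, modulo 10."""
    plain = letters_to_digits(plaintext)
    return [(p + k) % 10 for p, k in zip(plain, key_digits)]

def decrypt(cipher_digits, key_digits):
    """Subtract key digits from ciphertext digits, digit by digit, modulo 10."""
    plain = [(c - k) % 10 for c, k in zip(cipher_digits, key_digits)]
    return digits_to_letters(plain)

key = [1, 2, 3, 4]
cipher = encrypt("AB", key)
print(cipher)                # [1, 3, 3, 6]
print(decrypt(cipher, key))  # AB
```

Decryption is the same operation with subtraction instead of addition, which is the step needed to solve the exercise once the generator outputs are known.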
2. Let $F: K \times X \to \{0,1\}^{128}$ be a secure PRF. Is the following $G$ a secure PRF?
$G(k, x) = \begin{cases} 0^{128} & \text{if } x = 0 \\ F(k, x) & \text{otherwise} \end{cases}$
 a) No, it is easy to distinguish G from a random function
 b) Yes, an attack on G would also break F
 c) It depends on F
3. Which required property of a PRG is not fulfilled, based on the following output from a generator:
Input   235     803     186     597     931     235     274     727
Output  345812  971486  207319  349183  729460  345812  367428  319708
 a) unbiased
 b) unpredictable
 c) unreproducible
 d) none of the above
4. What is value of for MCG defined as follows: , , and .
 a) 4572
 b) 3649
 c) 2892
 d) 6217
5. Categorize test based on its description.
Birthday spacings: Choose random points on a large interval. The spacings between the points should be asymptotically exponentially distributed. The name is based on the birthday paradox.
 a) theoretic test
 b) black-box testing
Solution
 1. Answer: LCGINACTION
 First determine the first six outputs of the generator. Recover the letter codes by digit-wise subtraction modulo 10 and translate them back to letters.
 2. Answer: a
 When the adversary queries G at x = 0 they always get $0^{128}$, so they know they are interacting with G and not with a truly random function.
 3. Answer: c
 The generator produced the same output for two identical inputs.
 4. Answer: c
 5. Answer: a
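The attack behind the answer to question 2 can be sketched in code. The example assumes truncated HMAC-SHA256 as a stand-in for the secure 128-bit PRF and models the truly random function as a lazily filled table; both are illustrative assumptions, and all names are hypothetical.

```python
import hashlib
import hmac
import os

ZERO_BLOCK = bytes(16)  # the all-zero 128-bit string

def prf(key, x):
    """Stand-in for a secure PRF with 128-bit output (truncated HMAC-SHA256)."""
    return hmac.new(key, x.to_bytes(8, "big"), hashlib.sha256).digest()[:16]

def modified_prf(key, x):
    """The function from question 2: all zeros at x = 0, the PRF otherwise."""
    return ZERO_BLOCK if x == 0 else prf(key, x)

def make_random_function():
    """Lazily sampled random function from integers to 128-bit blocks."""
    table = {}
    def f(x):
        if x not in table:
            table[x] = os.urandom(16)
        return table[x]
    return f

def distinguisher(oracle):
    """Guess 'modified PRF' exactly when the oracle returns all zeros on input 0."""
    return oracle(0) == ZERO_BLOCK

key = os.urandom(16)
print(distinguisher(lambda x: modified_prf(key, x)))  # True
print(distinguisher(make_random_function()))  # False, except with probability 2^-128
```

A single query is enough: the modified function always answers with zeros at 0, whereas a truly random function does so only with negligible probability.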