Random number generation

A random number generator (RNG) is a computational or physical device designed to generate a sequence of numbers or symbols that can not be reasonably predicted better than by a random chance.
Various applications of randomness have led to the development of several different methods for generating random data, of which some existed since ancient times, includingdice, coin flipping, the shuffling of playing cards, the use of yarrow stalks (by divination) in the I Ching, and many other techniques. Because of the mechanical nature of these techniques, generating large numbers of
sufficiently random numbers (important in statistics) required a lot of work and/or time. Thus, results would sometimes be collected and distributed as random number tables. Nowadays, after the advent of computational random number generators, a growing number of government-run lotteries, and lottery games, are using RNGs instead of more traditional drawing methods. RNGs are also used to determine the odds of modern slot machines.
Several computational methods for random number generation exist. Many fall short of the goal of true randomness, although they may meet, with varying success, some of thestatistical tests for randomness intended to measure how unpredictable their results are (that is, to what degree their patterns are discernible). However, carefully designed cryptographically secure computationally based methods of generating random numbers do exist, such as those based on the Yarrow algorithm and the Fortuna (PRNG) and others.
Random number generators have applications in gambling, statistical sampling, computer simulation, cryptography, completely randomized design, and other areas where producing an unpredictable result is desirable. Generally, in applications having unpredictability as the paramount, such as in security applications, hardware generators are generally preferred over pseudo-random algorithms, where feasible.
Random number generators are very useful in developing Monte Carlo-method simulations, as debugging is facilitated by the ability to run the same sequence of random numbers again by starting from the same random seed. They are also used in cryptography – so long as the seed is secret. Sender and receiver can generate the same set of numbers automatically to use as keys.
The generation of pseudo-random numbers is an important and common task in computer programming. While cryptography and certain numerical algorithms require a very high degree of apparent randomness, many other operations only need a modest amount of unpredictability. Some simple examples might be presenting a user with a "Random Quote of the Day", or determining which way a computer-controlled adversary might move in a computer game. Weaker forms of randomness are used in hash algorithms and in creating amortized searching and sorting algorithms.
Some applications which appear at first sight to be suitable for randomization are in fact not quite so simple. For instance, a system that "randomly" selects music tracks for a background music system must only appear random, and may even have ways to control the selection of music: a true random system would have no restriction on the same item appearing two or three times in succession.
There are two principal methods used to generate random numbers. The first method measures some physical phenomenon that is expected to be random and then compensates for possible biases in the measurement process. Example sources include measuring atmospheric noise, thermal noise, and other external electromagnetic and quantum phenomena. For example, cosmic background radiation or radioactive decay as measured over short timescales represent sources of natural entropy.
The speed at which entropy can be harvested from natural sources is dependent on the underlying physical phenomena being measured. Thus, sources of naturally occurring "true" entropy are said to be blocking – they are rate-limited until enough entropy is harvested to meet the demand. On some Unix-like systems, including most Linux distributions, the pseudo device file /dev/random will block until sufficient entropy is harvested from the environment. Due to this blocking behavior, large bulk reads from/dev/random, such as filling a hard disk drive with random bits, can often be slow on systems that use this type of entropy source.
The second method uses computational algorithms that can produce long sequences of apparently random results, which are in fact completely determined by a shorter initial value, known as a seed value or key. As a result, the entire seemingly random sequence can be reproduced if the seed value is known. This type of random number generator is often called a pseudorandom number generator. This type of generator typically does not rely on sources of naturally occurring entropy, though it may be periodically seeded by natural sources. This generator type is non-blocking, so they are not rate-limited by an external event, making large bulk reads a possibility.
Some systems take a hybrid approach, providing randomness harvested from natural sources when available, and falling back to periodically re-seeded software-basedcryptographically secure pseudorandom number generators (CSPRNGs). The fallback occurs when the desired read rate of randomness exceeds the ability of the natural harvesting approach to keep up with the demand. This approach avoids the rate-limited blocking behavior of random number generators based on slower and purely environmental methods.
While a pseudorandom number generator based solely on deterministic logic can never be regarded as a "true" random number source in the purest sense of the word, in practice they are generally sufficient even for demanding security critical applications. Indeed, carefully designed and implemented pseudo-random number generators can be certified for security-critical cryptographic purposes, as is the case with the yarrow algorithm and fortuna. The former is the basis of the /dev/random source of entropy onFreeBSD, AIX, OS X, NetBSD and others. OpenBSD also uses a pseudo-random number algorithm based on ChaCha20 known as arc4random.
The earliest methods for generating random numbers, such as dice, coin flipping and roulette wheels, are still used today, mainly in games and gambling as they tend to be too slow for most applications in statistics and cryptography.
A physical random number generator can be based on an essentially random atomic or subatomic physical phenomenon whose unpredictability can be traced to the laws ofquantum mechanics. Sources of entropy include radioactive decay, thermal noise, shot noise, avalanche noise in Zener diodes, clock drift, the timing of actual movements of ahard disk read/write head, and radio noise. However, physical phenomena and tools used to measure them generally feature asymmetries and systematic biases that make their outcomes not uniformly random. A randomness extractor, such as a cryptographic hash function, can be used to approach a uniform distribution of bits from a non-uniformly random source, though at a lower bit rate.
Various imaginative ways of collecting this entropic information have been devised. One technique is to run a hash function against a frame of a video stream from an unpredictable source. Lavarand used this technique with images of a number of lava lamps. HotBits measures radioactive decay with Geiger–Muller tubes,whileRandom.org uses variations in the amplitude of atmospheric noise recorded with a normal radio.
Another common entropy source is the behavior of human users of the system. While people are not considered good randomness generators upon request, they generate random behavior quite well in the context of playing mixed strategy games.Some security-related computer software requires the user to make a lengthy series of mouse movements or keyboard inputs to create sufficient entropy needed to generate random keys or to initialize pseudorandom number generators.
Pseudorandom number generators (PRNGs) are algorithms that can automatically create long runs of numbers with good random properties but eventually the sequence repeats (or the memory usage grows without bound[citation needed]). The series of values generated by such algorithms is generally determined by a fixed number called a seed.One of the most common PRNG is the linear congruential generator, which uses the recurrence
X_{n+1} = (a X_n + b)\, \textrm{mod}\, m
to generate numbers, where a, b and m are large integers, and X_{n+1} is the next in X as a series of pseudo-random numbers. The maximum number of numbers the formula can produce is the modulus, m. To avoid certain non-random properties of a single linear congruential generator, several such random number generators with slightly different values of the multiplier coefficient, a, can be used in parallel, with a "master" random number generator that selects from among the several different generators.[citation needed]
A simple pen-and-paper method for generating random numbers is the so-called middle square method suggested by John von Neumann. While simple to implement, its output is of poor quality.
Most computer programming languages include functions or library routines that provide random number generators. They are often designed to provide a random byte or word, or a floating point number uniformly distributed between 0 and 1.
The quality i.e. randomness of such library functions varies widely from completely predictable output, to cryptographically secure. The default random number generator in many languages, including Python, Ruby, R, IDL and PHP is based on the Mersenne Twister algorithm and is not sufficient for cryptography purposes, as is explicitly stated in the language documentation. Such library functions often have poor statistical properties and some will repeat patterns after only tens of thousands of trials. They are often initialized using a computer's real time clock as the seed, since such a clock generally measures in milliseconds, far beyond the person's precision. These functions may provide enough randomness for certain tasks (for example video games) but are unsuitable where high-quality randomness is required, such as in cryptography applications, statistics or numerical analysis.[citation needed]
Much higher quality random number sources are available on most operating systems; for example /dev/random on various BSD flavors, Linux, Mac OS X, IRIX, and Solaris, orCryptGenRandom for Microsoft Windows. Most programming languages, including those mentioned above, provide a means to access these higher quality sources.