Hardware random number generator
In
computing, a
hardware random number generator is an apparatus that generates
random numbers from a physical process. Such devices are typically based on microscopic phenomena such as
thermal noise or the
photoelectric effect or other
quantum phenomena. These processes are, in theory, completely unpredictable, and the theory's assertions of unpredictability are subject to experimental test. A quantum-based hardware random number generator typically contains an
amplifier to bring the output of the physical process into the
macroscopic realm, and a
transducer to convert the output into a digital signal.
Random numbers generators can also be obtained from macroscopic phenomena, such as
cards,
dice, and the
roulette wheel. This unpredictability can be justified by the theory of unstable
dynamical systems and
chaos theory. These theories suggest that even though macroscopic phenomena are deterministic in theory under
Newtonian mechanics, real-world systems evolve in ways that cannot be predicted in practice because one would need to know the initial conditions to an accuracy that
grows exponentially over time.
Although dice are mostly used for
gambling, the
Victorian scientist
Francis Galton described a way to use dice to generate random numbers for scientific purposes in 1890. Even though some gamblers believe they can control their throws of dice enough to win at
craps (a claim which remains a topic of debate), no one (outside the
paranormal community) claims that someone not party to the throwing can predict the outcomes.
Hardware random number generators are often relatively slow, and they may produce a biased sequence (i.e., some values are more common than others). Whether a hardware random number generator is suitable for a particular application depends on the details of both the application and the generator.
Most computer "random number generators" are not hardware devices, but are software routines implementing
algorithms. Often they are supplied as
library routines in
programming language implementations, or as part of the
operating system. These are more properly called
pseudo-random number generators, and cannot produce truly random outputs.
Algorithmic information theory defines a sequence of
bits as non-random if it can be produced by some computer program that is shorter than that sequence (
Chaitin-Kolmogorov randomness). Pseudo-random number generators fail that test dramatically. They can usually be programmed in a few thousand bits, but can produce far larger sequences, with periods so long no plausible computer or combination of them could exhaust a single period before the
heat death of the Universe.
There are several informal definitions of
randomness, usually based on either a lack of discernible
patterns in a sequence, or the unpredictability of the sequence or various aspects of it by, generally, the most puissant possible adversary. Output from well-designed pseudo-random number generators should pass assorted statistical tests probing for non-randomness (see
NIST Special Publication 800-22,
Knuth,
The Art of Computer Programming, vol. 2, and RFC 1750 for details of many such tests).
The sequences such generators produce always have patterns, in one sense, since the algorithm that generates them has a finite state, and must, eventually, repeat one of those states. Given the original state of the generator, and the implementation of the algorithm, a pseudo-random number generator of this sort is totally predictable. Given even partial knowledge of that state, they are often insecure for many purposes. On the other hand, it has become relatively easy to produce pseudo-random number generators that are guaranteed not to repeat on any conceivable computer within a time-frame that is millions of times longer than the
age of the universe. It is an open question whether it is always possible, in some practical way, to distinguish the output of such a pseudo-random number generator from a that of a perfectly random source without knowledge of the generator's internal state. Much of modern cryptography rests on the assumption that
ciphers can be constructed whose output is indistinguishable from random noise without knowledge of a secret
key used in the algorithm.
"Unpredictable" random numbers were first investigated in the context of
gambling, and many randomizing devices such as
dice,
shuffling playing cards, and
roulette wheels, were first developed for use in gambling. Fairly produced random numbers are vital to electronic gambling and ways of creating them are sometimes regulated by governmental gaming commissions.
"Random" numbers are also used for non-gambling purposes, both where their use is mathematically important, such as sampling for
opinion polls, and in situations where "fairness" is approximated by
randomization, such as selecting
jurors and
military draft lotteries. Their use is antique as well. In the
Book of Numbers (33:54),
Moses commands the Israelites to apportion the land by lot (×'ורל). And the drawing of lots, often pottery shards, are well attested in the Classical world of Greece and Rome. Some of these very shards have been recovered.
One early way of producing random numbers was by a variation of the same machines used to play
keno or select
lottery numbers. Basically, these mixed numbered ping-pong balls with blown air, and used some method to withdraw balls from the mixing chamber. This method gives reasonable results, but the random numbers generated by this means are expensive. The method is inherently slow, and is unusable in most mechanized situations, (i.e., with computers).
In
1927,
Cambridge University Press published a book of
Random sampling numbers, arranged by a statistician, Leonard Henry Caleb Tippett, which contained 41,600 digits taken from English parishes listed in census records. Other random number tables were published in that era, including one by
R. A. Fischer and another by the U.S.
Interstate Commerce Commission in 1949 with over 100,000 random digits.
The culminating printed work is
A Million Random Digits with 100,000 Normal Deviates, published by the
RAND Corporation in
1955. They created an electronic simulation of a roulette wheel attached to a
key punch, the results of which were then carefully filtered and tested before being used to generate the table. The RAND table was a significant breakthrough in delivering random numbers because such a large and carefully prepared table had never before been available. It was useful for simulations and modeling. But, having been published, it is not usable as cryptographic keys, or for preparing them, or as a seed in some cryptographic pseudo-random number generator. However, since it was published long before modern cryptography, using values from it for the random (if not unknown) constants for initializing algorithms would demonstrate that the constants had not been selected for (in B. Schneier's words) "nefarious purpose(es)."
Khufu and Khafre do this, for example (ref Applied Cryptography - B. Schneier).
See: Nothing up my sleeve numbers.
There are two fundamental sources of physical randomness: quantum mechanics at the atomic or sub-atomic level and thermal noise (some of which is quantum mechanical in origin). Quantum mechanics predicts that certain physical phenomena, such as the
nuclear decay of atoms, are fundamentally random and cannot, in principle, be predicted. (For a discussion of empirical verification of quantum unpredictability, see
Bell test experiments.) And because we live at a finite temperature, every system has some random variation in its state; for instance, molecules of air are constantly bouncing off each other in a random way. (See
statistical mechanics.) This randomness is a quantum phenomenon as well (See
phonon.)
Because the outcome of quantum-mechanical events cannot in principle be predicted, they are the 'gold standard' for randomness. Some quantum phenomena used for random number generation include:
* A
nuclear decay radiation source (as, for instance, from some kinds of commercial
smoke detectors), detected by a
Geiger counter attached to a PC.
*
Photons travelling through a
semi-transparent mirror, as in the commercial product, Quantis from id Quantique SA. The mutually exclusive events (reflection â€" transmission) are detected and associated to "0" or "1" bit values respectively.
Thermal phenomena are easier to detect. They might be (somewhat) vulnerable to attack by lowering the temperature of the system, though most systems will stop operating at temperatures (e.g., ~150 K) low enough to reduce noise by a factor of two. Some of the thermal phenomena used include:
*
thermal noise from a
resistor, amplified to provide a random voltage source.
*
Avalanche noise generated from an
avalanche diode or Zener breakdown noise from a reverse-biased
zener diode.
*
Atmospheric noise, detected by a radio receiver attached to a PC (though much of it (like lightning noise) is not, properly, thermal noise)
Another variable physical phenomenon that is easy to measure is
clock drift.
In the absence of quantum or thermal noise, other phenomena that tend to be random, although in ways not easily characterized by laws of physics, can be used. With several such sources, combined carefully (as in, for example, the
Yarrow algorithm or
Fortuna CSPRNGs), enough entropy can be collected for the occasional creation of cryptographic keys and
nonces. The advantage is that this needs, in principle, no special hardware. The primary source of randomness normally used in such approaches is the precise timing of the
interrupts caused by mechanical input/output devices, such as keyboards and
disk drives, various system information, etc.
This last approach must be implemented carefully and may be subject to attack if it is not. For instance, the generator built into the Linux kernel may be vulnerable to an attack [
1]. The random number generator used cryptographically in an early version of the
Netscape browser was certainly vulnerable (and was promptly changed).
One approach in using physical randomness is to convert a noise source into random bits in a separate device that is then connected to the computer through an I/O port. The acquired noise signal is amplified, filtered, and then run through a high-speed voltage comparator to produce a logic signal that alternates states at random intervals. Care must always be taken when amplifying low-level noise to keep out spurious signals, such as power line hum and unwanted broadcast transmissions, and to avoid adding bias during acquisition and amplification. In some simple designs, the fluctuating logic value is converted to an
RS-232 type signal and presented to a computer's serial port. Software then sees this series of logic values as bursts of "
line noise" characters on an I/O port. More sophisticated systems may format the bit values before passing them into a computer.
Another approach is to feed an analog noise signal to an
analog to digital converter, such as the audio input port built into most personal computers. The digitized signal may then be processed further in software to remove bias. However, digitization is itself often a source of bias, sometimes subtle, so this approach requires considerable care.
Some have suggested using digital cameras, such as
webcams, to photograph chaotic macroscopic phenomena. A group at
Silicon Graphics imaged
Lava lamps to generate random numbers. One problem was determining whether the chaotic shapes generated were random -- the team decided that they are in properly operating Lava lamps. Other chaotic scenes could be employed, such as streamers blown by a computer's exhaust fan or, probably, bubbles in a
fish tank (fish optional). The digitized image will generally contain additional noise resulting from the video to digital conversion process.
The bit-stream from such systems
is prone to be biased, with 1s or 0s predominating. There are two approaches to dealing with bias and other artifacts. The first is to engineer the RNG to minimize bias. One method to correct this feeds back the generated bit stream, filtered by a low-pass filter, to adjust the bias of the generator. By the
central limit theorem, the feedback loop will tend to be well-adjusted almost all the time. Ultra-high speed random number generators use this method. Even then, the numbers generated are usually somewhat biased.
A higher quality device might use two sources and eliminate signals that are common to both—this eliminates interference from outside electric and magnetic fields. This is recommended for gambling devices, to prevent cheating by exploiting bias in the "random bit" stream.
An old
Intel 80802 Firmware Hub chip included a hardware RNG using two free running oscillators, one fast and one slow. A thermal noise source (non-commonmode noise from two diodes) is used to modulate the frequency of the slow oscillator, which then triggers a measurement of the fast oscillator. That output is then debiased using a
von Neumann type decorrelation step (see below). The output rate of this device is somewhat less than 100,000 bps. This chip was an optional component of the 840 chipset family that supported an earlier Intel bus. It is not included in modern PCs.
All
VIA C3 microprocessors have included a hardware RNG on the processor chip since 2003. Instead of using thermal noise, raw bits are generated by using four freerunning oscillators which are designed to run at different rates. The output of two are XORed to control the bias on a third oscillator, whose output clocks the output of the fourth oscillator to produce the raw bit. Minor variations in temperature, silicon characteristics, and local electrical conditions cause continuing oscillator speed variations and thus produce the entropy of the raw bits. To further insure randomness, there are actually two such RNGs on each chip, each positioned in different environments and rotated on the silicon. The final output is a mix of these two generators. The raw output rate is tens to hundreds of megabits per second, and the whitened rate is a few megabits per second. User software can access the generated random bit stream using new non-priveleged machine language instructions.
A software implementation of a related idea on ordinary hardware is included in
CryptoLib, a cryptographic routine library (JB Lacy, DP Mitchell, WM Schell, CrytoLib: Cryptography in software, Proc 4th USENIX Security Symp, pg 1-17, 1993). The algorithm is
truerand. Most modern computers have two crystal oscillators, one for the real-time clock and one for the primary CPU clock; truerand exploits this fact. It uses an operating system service that sets an alarm, running off the real-time clock. One subroutine sets that alarm to go off in one clock tick (usually 1/60th of a second). Another then enters a while loop waiting for the alarm to trigger. Since the alarm will not always trigger in exactly one tick, the least significant bit of a count of loop interations, between setting the alarm and its trigger, will vary randomly, possibly enough for some uses. truerand doesn't require additional hardware, but in a multi-tasking system great care must be taken to avoid non-randomizing interference from other processes (e.g., in the suspension of the counting loop process as the operating system scheduler starts and stops assorted processes).
Software whitening
A second approach to bias is to remove it (in software or hardware). Even if the above hardware bias reduction steps have been taken, the bit-stream should
still be assumed to contain bias and correlation. There are various methods used to try to reduce bias and correlation, often known by the name "
whitening" algorithms, by analogy with the related problem of producing white noise from a correlated signal.
John von Neumann invented a simple algorithm to fix simple bias, and reduce correlation: it considers bits two at a time, taking one of three actions: when two successive bits are the same, they are not used as a random bit, a sequence of 1,0 becomes a 1, and a sequence of 0,1 becomes a zero. This eliminates simple bias, and is easy to implement as a computer program or in digital logic. This technique works no matter how the bits have been generated. It cannot assure randomness in its output, however. What it can do (with significant numbers of discarded bits) is transform a random bit stream with a frequency of 1's different from 50% into a stream closer to that frequency.
Another technique for improving a near random bit stream is to the bit stream with the output of a high-quality
cryptographically secure pseudorandom number generator such as
Blum Blum Shub or a good
stream cipher. This can cheaply improve decorrelation and digit bias.
A related method which improves a near random bit stream is to take two or more uncorrelated near random bit streams, and
exclusive or them together. Let the probability of a bit stream producing a 0 be 1/2 +
e, where -1/2 ≤
e ≤ 1/2. Then
e is the bias of the bitstream. If two uncorrelated bit streams with bias
e are exclusive-or-ed together, then the bias of the result will be 2
e2. This may be repeated with more bit streams. (See also
Piling-up lemma.)
Some designs apply cryptographic
hash functions such as
MD5,
SHA-1, or
RIPEMD-160 or even a
CRC function to all or part of the bit stream, and then use the output as the random bit stream. This is attractive, partly because it is relatively fast compared to some other methods, but depends entirely on qualities in the hash output (i.e., randomness) for which there is little or no theoretical basis.
Other designs use "true random bits" as the
key for a high quality
block cipher algorithm, taking the encrypted output as the random bit stream. Care must be used in these cases to select an appropriate
block mode, however.
When a high-speed source of low quality random digits is needed, sometimes a true random source is used to generate the seed for a pseudo random number generator. The PRNG is run for a fixed number of digits, while the hardware generating device produces a new seed. This is unacceptable for most key generation purposes, since it effectively reduces the size of the key to the size of the PRNG seed.
There are mathematical techniques for estimating the
entropy of a sequence of symbols. These are useful for determining if there is enough entropy in a seed pool, for example, but they cannot, in general, distinguish between a true random source and a pseudo-random generator.
Software engineers without true random number generators often try to develop them by measuring physical events available to the software. An example is measuring the time between user keystrokes, and then taking the least significant bit (or two or three) of the count as a random digit. The time variation between keystokes has been found to be patternless, if measured with sufficiently high resolution. A similar approach measures task-scheduling, network hits, disk-head seek times and other internal events. One Microsoft design includes a very long list of such internal values (see the
CSPRNG article).
The method is risky when it uses computer-controlled events because a clever, malicious programmer might be able to predict a cryptographic key by controlling the external events. Several gambling frauds have been uncovered which rely on manipulating normally hidden events internal to the operation of computers or networks. It is also risky because the supposed user-generated event (e.g., keystrokes) can be
spoofed, allowing an attacker to control the "random values" used by the cryptography.
However, with sufficient care, a system can be designed that produces cryptographically secure random numbers from the sources of randomness available in a modern computer. The basic design is to maintain an "entropy pool" of random bits that are assumed to be unknown to an attacker. New randomness is added whenever available (for example, when the user hits a key) and an estimate of the number of bits in the pool that cannot be known to an attacker is kept. At this point there are several strategies:
* When random bits are requested, return that many bits derived from the entropy pool (by a cryptographic hash function, say) and decrement the estimate of the number of unknown bits. If not enough unknown bits are available, wait until enough are available. This is the design of the "
/dev/random" device in Linux and other Unix-like operating systems, which provides high-quality random numbers as long as the estimates of the input randomness are sufficiently cautious. The Linux "/dev/urandom" device is a simple modification that disregards estimates of input randomness and is less likely to have high entropy as a result.
* Maintain a
stream cipher with a key and IV obtained from an entropy pool. When enough bits of entropy have been collected, replace both key and IV with new random values and decrease the estimated entropy in the pool. This is the approach taken by the
yarrow library. It provides resistance against certain attacks and conserves hard-to-obtain entropy.
There is a
Perl module called
Math that generates random numbers from interrupt timing discrepancies.
It is very easy to misconstruct devices which generate random numbers. Also, they 'break' silently, often producing decreasingly random numbers as they degrade. An example might be the rapidly decreasing radioactivity of the smoke detectors mentioned earlier. As the radioactive intensity decreases, its sensor will be required to compensate, not an easily accomplished task. Failure modes in such devices are plentiful and are neither easy, quick, nor inexpensive to detect.
Because many entropy sources are often quite fragile, and fail silently, statistical tests on their output should be performed continuously. Many, but not all, such devices include some such tests into the software that reads the device.
Just as with other components of a cryptosystem, a software random number generator should be designed to resist certain attacks.
See: random number generator attack.
Defending against these attacks is difficult.
Hardware random number generators should be constantly monitored for proper operation. RFC 4086 and
FIPS Pub 140-2 include such tests. Also see the documentation for the New Zealand cryptographic software library
cryptlib. Statistical tests can detect failure of a noise source, or a radio station transmitting on a channel thought to be empty, for example. Generator output should be sampled for testing before being passed through a "whitener." This allows better verification of the underlying noise generation. Detecting some deviation from perfection would be a sign that a true noise source is being tested. Correlation of bias with other parameters such as internal temperature or bus voltage would be a further check.
*
Pseudorandom number generator*
Random number generator*
Randomness*
Bell test experiments*
ERNIE*
Lottery machine*
Peter Gutmann of the
University of Auckland has produced an extensive analysis of "random number" generation with computers (and of several actual designs) in the documentation for the
cryptlib cryptographic tool kit.
*
RFC 4086 on Randomness Recommendations for Security (Replaces earlier RFC 1750.)
*
A Statistical Test Suite for the Validation of Random Number Generators and Pseudo Random Number Generators for Cryptographic Applications NIST Special Publication 800-22
*
An article on the history of generating random numbers at the American Scientist Online.
*
Random Number Generator built from combinational and sequential logic blocksInternet services
World wide web based sources of random bits may be suitable for research purposes but should never be used for cryptographic security since their corruption will be impossible to detect.
*
Web Application Allowing Download of Quantum Random Numbers on Demand by
id Quantique and
University of Geneva*
HotBits provides private radioactively-produced random numbers.
*
random.org claims to deliver private random numbers generated from atmospheric radio noise.
*
RandomNumber.org claims to provide for download many large files of random numbers generated using the ComScire J1000KU hardware true random number generator.
*
KenoRND() claims to provide random numbers generated from live keno results from real casinos
* The
Global Correlations in Random Data has some
pictures of the hardware random number generators it uses, as well as an online database of the random numbers it has collected.
Manufacturers
*
ComScire*
Random HG324 *
Protego *
true-random.com Hardware random number generators and their theory
*
Intel RNG *
VIA RNG (Included on the CPU itself)
*
LavaRnd (Utilises readily available webcams to provide cryptographically strong random numbers)
*
RBI Quantum RNG (A quantum random number generator)
*
Quantis Quantum RNG (A quantum random number generator)
Code
*
iwrandom.c (C code for generating random numbers from Wi-Fi background noise)
*
random.c (Linux strong random number generator; the initial C code and theory of operation)
*
video_entropyd (Randomness from video)
*
audio_entropyd (Randomness from audio)
*
A Million Random Digits with 100,000 Normal Deviates by the
RAND Corporation*
Francis Galton. "Dice for statistical experiments",
Nature 42:13-14, 1890.
(Facsimile at: [2])