54

I was trying to solve a hobby problem that required generating a million random numbers, but I quickly realized that it is difficult to make them unique. I picked up The Algorithm Design Manual to read about random number generation.

It has the following paragraph, which I cannot fully understand.

Unfortunately, generating random numbers looks a lot easier than it really is. Indeed, it is fundamentally impossible to produce truly random numbers on any deterministic device. Von Neumann [Neu63] said it best: “Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.” The best we can hope for are pseudo-random numbers, a stream of numbers that appear as if they were generated randomly.

Why is it impossible to produce truly random numbers in any deterministic device? What does this sentence mean?

9 Answers

67

One should look for a cryptographically secure pseudo-random number generator (CSPRNG). Many common PRNGs are linear congruential generators (the next number is a linear function of the previous number, taken modulo a constant), so if you plot each number against the previous one you get a chart of parallel lines. A CSPRNG will not do that. The trade-off is that CSPRNGs are typically slower.
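The parallel-line pattern can be checked numerically without drawing the plot. The following is a minimal sketch using a toy generator; the constants A, C and M are illustrative assumptions, chosen small so the structure is easy to see, and are not taken from any real library.

```javascript
// Toy linear congruential generator: next = (A * prev + C) mod M.
// A, C and M are illustrative assumptions, not constants from a real library.
const A = 5, C = 3, M = 64;

function lcgNext(prev) {
  return (A * prev + C) % M;
}

// Every (prev, next) pair satisfies next = A * prev + C - k * M for some integer k.
// Each distinct k corresponds to one straight line of slope A in the scatter plot.
function lineIndices(seed, count) {
  const ks = new Set();
  let prev = seed;
  for (let i = 0; i < count; i++) {
    const next = lcgNext(prev);
    ks.add((A * prev + C - next) / M);
    prev = next;
  }
  return ks;
}

const lines = lineIndices(1, M);
console.log(`${M} points fall on ${lines.size} parallel lines`);
```

Because k can never exceed A - 1 here, all the points land on at most A lines, no matter how many numbers are drawn.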

I group random number generators into 3 categories:

  1. Good enough for homework.
  2. Good enough to bet your company on.
  3. Good enough to bet your country on.

Why is it impossible to produce truly random numbers in any deterministic device?

A deterministic device will always produce the same output when given the same starting conditions and inputs - that is what it means to be deterministic. "Truly random number" is more of a philosophical viewpoint, since what it means to be random is the crux of the philosophical navel-gazing (folks aren't even certain whether atomic decay is random or follows some pattern we just can't figure out yet). A cryptographically secure random number generator takes some external source of entropy to make the device non-deterministic.

Tangurena
  • 13,324
23

True randomness implies nondeterminism. If it's deterministic, it can be accurately predicted (this is what determinism means); if it can be predicted, it is not random.

The best thing you can get from a deterministic pseudo-random number generator is a stream of numbers that has a very long cycle (non-repeating is impossible unless your RNG device has unlimited storage) and which, for the length of the cycle, meets all the other properties of a random sequence (a uniform distribution of values being the most interesting one).
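The link between finite storage and a finite cycle can be seen with a generator that has only 16 possible states. This is a minimal sketch; the function names and constants are assumptions made for illustration.

```javascript
// Toy generator with only 16 possible states (constants are illustrative assumptions).
function step16(state) {
  return (5 * state + 3) % 16;
}

// Count how many steps it takes for the starting state to reappear.
function cycleLength(seed) {
  let state = step16(seed);
  let steps = 1;
  while (state !== seed) {
    state = step16(state);
    steps++;
  }
  return steps;
}

console.log(cycleLength(0)); // 16: every state is visited once, then the stream repeats
```

With 16 states the cycle cannot be longer than 16; real generators use a much larger state so that the cycle is too long to exhaust in practice.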

To solve this issue, many modern UNIXes and Unix-likes have kernel RNGs that use physical noise sources to generate true randomness.
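From application code, the operating system's generator is usually reached through a library call rather than directly. The sketch below assumes a Node.js environment, where `crypto.randomBytes` asks the operating system for random bytes; how the kernel gathers and mixes its entropy depends on the platform and is not shown here. The helper name `osRandomUint32` is made up for this example.

```javascript
// Assumes Node.js. crypto.randomBytes returns bytes from the operating system's
// cryptographically secure generator.
const crypto = require('crypto');

// Hypothetical helper: one unsigned 32-bit integer built from 4 OS-provided bytes.
function osRandomUint32() {
  return crypto.randomBytes(4).readUInt32BE(0);
}

console.log(osRandomUint32());
```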

Another common approach is to take the current time as the seed for a deterministic RNG (srand(time(NULL)); in C); cryptographically speaking, this is worthless, since the current time is no secret, but for things like physical simulations or video games, it is good enough.
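To see why a time-based seed offers no secrecy, here is a minimal sketch in which an observer who knows the approximate start time recovers the seed by trying every second in a window. The generator is a toy multiplicative one chosen for illustration, not the algorithm behind C's `rand`, and the fixed timestamp stands in for a real clock reading.

```javascript
const P = 2147483647; // 2^31 - 1, modulus of the toy generator

// Toy seeded generator, used only for illustration.
function outputsFromSeed(seed, count) {
  let state = (seed % (P - 1)) + 1; // keep the state in 1..P-1
  const out = [];
  for (let i = 0; i < count; i++) {
    state = (state * 48271) % P; // product stays below 2^53, so the arithmetic is exact
    out.push(state);
  }
  return out;
}

// Try every second in [guessTime - windowSeconds, guessTime + windowSeconds].
function recoverSeed(observed, guessTime, windowSeconds) {
  for (let t = guessTime - windowSeconds; t <= guessTime + windowSeconds; t++) {
    const candidate = outputsFromSeed(t, observed.length);
    if (candidate.every((v, i) => v === observed[i])) return t;
  }
  return null;
}

const secretSeed = 1700000123; // stands in for Math.floor(Date.now() / 1000)
const observed = outputsFromSeed(secretSeed, 3);
const recovered = recoverSeed(observed, 1700000000, 3600);
console.log(recovered === secretSeed); // true
```

Searching a one-hour window means only 7,201 candidate seeds, which takes a fraction of a second.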

tdammers
  • 52,936
10

The second chapter of the book Discrete-Event Simulation: A First Course by Lawrence Leemis gives a fantastic introduction to random number generators (or more accurately, pseudo-random number generators).

An excerpt from his book explains it well in my opinion:

Historically three types of random number generators have been advocated for computational applications: (a) 1950's-style table look-up generators like, for example, the RAND corporation table of a million random digits; (b) hardware generators like, for example, thermal "white noise" devices; and (c) algorithmic (software) generators. Of these three types, only algorithmic generators have achieved widespread acceptance. The reason for this is that only algorithmic generators have the potential to satisfy all of the following generally well-accepted random number generation criteria. A generator should be:

  • random - able to produce output that passes all reasonable statistical tests of randomness;
  • controllable - able to reproduce its output, if desired;
  • portable - able to produce the same output on a wide variety of computer systems;
  • efficient - fast, with minimal computer resource requirements;
  • documented - theoretically analyzed and extensively tested.

So while it could be possible to use a white-noise generator to get "better" random numbers, they have not gained acceptance because they do not follow most of the criteria above.

I would recommend that you get your hands on a copy of that book (or something similar). Understanding exactly how PRNGs work will definitely assist you in your efforts.

riwalk
  • 7,690
9

Because you need to write code to generate the random numbers, and code is NOT random. (It's deterministic.)

So you wind up starting with one or more "seed values" picked at "random" (usually the current timestamp), then use them in an algorithm to start generating numbers. But the entire set of numbers is based on the original seed value!

So if you run your code again with the exact same seed value(s), you will get the EXACT same SET of numbers! How can any reasonable person call that random? But it sure does LOOK random.
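A minimal sketch of that behaviour, using a toy seeded generator (the function name and constants are assumptions for illustration; JavaScript's built-in `Math.random` cannot be seeded, so the generator is written out by hand):

```javascript
// Toy seeded generator: all output is a pure function of the seed.
function makeGenerator(seed) {
  let state = (seed % 2147483646) + 1; // keep the state in 1..2147483646
  return function next() {
    state = (state * 48271) % 2147483647;
    return state;
  };
}

const runA = makeGenerator(42);
const runB = makeGenerator(42);

const firstRun = [runA(), runA(), runA()];
const secondRun = [runB(), runB(), runB()];

// Same seed, same numbers, in the same order.
console.log(JSON.stringify(firstRun) === JSON.stringify(secondRun)); // true
```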


Regarding making them unique: after generating a number, simply check whether you already have that number. If you do, throw it away and generate a new one.
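A minimal sketch of this check-and-discard approach, using a `Set` to remember what has already been drawn. The function name and the range of one billion are assumptions made for the example.

```javascript
// Draw `count` distinct integers from 0..rangeSize-1 by discarding repeats.
function uniqueRandomIntegers(count, rangeSize) {
  if (count > rangeSize) {
    throw new RangeError('cannot draw more unique values than the range holds');
  }
  const seen = new Set();
  while (seen.size < count) {
    // Adding a value that is already present leaves the Set unchanged,
    // which is the "throw it away" step.
    seen.add(Math.floor(Math.random() * rangeSize));
  }
  return Array.from(seen);
}

const numbers = uniqueRandomIntegers(1000000, 1000000000);
console.log(numbers.length); // 1000000
```

This works well while the range is much larger than the number of values needed. As `count` approaches `rangeSize`, more and more draws are discarded and the loop slows down.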

Morons
  • 14,706
6

I have a very simple definition of Pseudo Random:

Too many unknown variables to predict.

I also have a simple definition of True Random:

Infinite unknown variables.

The problem with a computer is that it always knows ALL variables. The random number is simply a mathematical function of some seed value.
The best we can do is to give the computer a pseudo-random seed value, which is usually based off a variable that we can't predict (such as exact time).

Even though a computer is absolutely unable to create a random number, it is good at introducing too many variables to predict!

5

Since you are generating random numbers, you should expect the generated values to be non-unique. This is a property of randomness - you can't say a sequence of truly random (or even pseudo-random) numbers is unique, because that requirement would allow the final value in the range to be predicted, as well as changing the probability of all the unchosen numbers each time a new one is selected.
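How likely a repeat is can be estimated with the birthday-problem calculation. The sketch below assumes the draws are independent and uniform over the range; the function name is made up for this example.

```javascript
// Probability that at least two of `draws` uniform, independent picks
// from `rangeSize` possible values are equal.
function duplicateProbability(draws, rangeSize) {
  let allDistinct = 1;
  for (let i = 0; i < draws; i++) {
    // The (i+1)-th pick must avoid the i values already taken.
    allDistinct *= (rangeSize - i) / rangeSize;
  }
  return 1 - allDistinct;
}

console.log(duplicateProbability(23, 365).toFixed(3)); // "0.507"
console.log(duplicateProbability(1000000, 2 ** 32) > 0.999999); // true
```

So a million draws from a 32-bit range will almost certainly contain at least one repeat, which is why uniqueness has to be enforced separately.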

James McLeod
  • 7,603
4

Generating truly random numbers in software is indeed not possible, as others have pointed out. However, it is possible to build a hardware device that generates truly random numbers*. There are quite a few examples of this on the internet, and a variety of methods are used, from reading the time between ticks on a Geiger counter to sampling the white noise (a small part of which is cosmic background radiation) of an untuned receiver. I have built a few myself using some of the available methods.

*Any good physics geek will point out that, given the way the universe operates, none of these are hyper-technically truly random, but there is no reasonable way to predict the results, so for the sake of this discussion they are sufficient.

Unkwntech
  • 141
3

There is no way you can produce a random number without special hardware. In my freshman year, a couple of classmates and I proposed a random number generator that is basically an AM receiver tuned to 4 different channels, with the inputs fed into an A-to-D converter and added together (modulo your max number). Since the combination of analog input from any arbitrary number of stations is random, and we could produce a large number of random numbers from the A-to-D converter, we proposed that this could be a good generator. Of course, even this is not truly random in a philosophical sense, though for most practical purposes it could work.

2

Determinism is essentially a function. Remember from algebra that a function is a correspondence between a domain and a range such that each member of the domain corresponds to exactly one member of the range.

So if f(x) = z, then f(x) != y unless y is z. That is a function. Consider this JavaScript:

function Add(A, B) {
    return A + B;
}

var addedNumber = Add(2, 3); // returns 5
addedNumber = Add(2, 3);     // still 5

No matter how many times you call Add(2,3) it will always return 5. In other words, Add() is a deterministic function.

External factors can make Add behave in a non-deterministic fashion, for example if you introduce multithreading into the equation. Human input also causes non-determinism.

Now, this is where things get interesting.

“Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.”

Note Von Neumann states, "arithmetical methods of producing [...]". This is not talking about human input, concurrency, sample wind speeds read from a precise instrument or other non-algorithmic ways of producing random input to a deterministic function.

This simply states a function or system of functions is not going to suddenly become non-deterministic. In other words, Add(2,3) will not somehow return 6 or anything other than 5 given the same inputs. That is impossible.

The quoting author takes it a step further.

The best we can hope for are pseudo-random numbers, a stream of numbers that appear as if they were generated randomly.

The context was previously defined to be "on any deterministic device". I could end the argument here. But what if we change the context by introducing a new element to the system? A non-deterministic element added as input makes the system non-deterministic. However, by removing the non-deterministic element we are reduced back to a deterministic system. If we can somehow trace or otherwise reproduce the inputs, we can reproduce a result. But this entire paragraph is tangential to what the author is saying. Remember the context.

One could argue over the meaning of non-determinism. Once again, that is tangential. Remember the context.

So he is correct. On any deterministic device it is impossible for a deterministic system to produce a true random result.

P.Brian.Mackey
  • 11,121