Pseudo random number generators (PRNGs) play a big role in software development. They aren’t, of course, random. Instead, a PRNG executes a deterministic algorithm in which the next number it produces is a function of its most recent value, and perhaps a seed value. The output appears random, meaning it would require a very large sample to determine that the series of generated values is not from an actual uniform random distribution over some defined interval.
The multiplicative congruential generator is a typical example of a PRNG:
A truly random process has a minute probability of generating a value that was recently generated, but a characteristic of the multiplicative congruential generator and other PRNGs is that they create a very long sequence of values. The sequence eventually will repeat, but only after every value in the range has been visited. In short, a PRNG does not produce duplicate values within a number of samples.
Code Acting Badly
In a recent software project, I was using a standard library method called rand() to return pseudo random numbers. I used these values as handles, to identify internal instances of classes that implemented various functions. These handles needed to be unique, since they identified specific internal instances of classes.
When I tried supporting concurrent operations, my code began acting erratically. There appeared to be race conditions. This was baffling, since the code was quite simple and the mutexes I used should have prevented any problems.
Finally, I was able to diagnose the problem: rand() was returning duplicate values within a very short span. I inadvertently had two or more threads that were given the same value for a handle. So unexpectedly, they operated on the same class instances, which was especially problematic after one of them closed and freed the internal instances and removed the handle from the collection of open handles.
The fix was straight-forward: keep generating candidate new handle values from rand() until one is found that is distinct from all the handles currently in use. The delay was thinking that rand() could possibly be the source of the problem.
Why Does This Happen?
As described in the opening section, the next pseudo random number returned by a PRNG is a function of the most recent pseudo random number it returned and perhaps some other initial values. My theory is that each thread keeps its own state for use by rand(), much as each thread has its own stack.
Support for this theory is that the Windows srand() function, which sets the seed value for rand(), only affects the thread from which it is called. The seed value in other threads remains unchanged.
SInce I needed unique handles rather than a realistic pseudo random sequence, I was able to solve my problem by calling rand() until it gave me a candidate handle value that was not currently in use. But what do you do if your multi-threaded app needs a pseudo random sequence that is used by different threads? Perhaps the approach would be to run rand() or your other PRNG from a dedicated thread, and have other threads retrieve pseudo random numbers from the generator through IPC or a producer/consumer queue.
See For Yourself!
I’ve written a tiny GitHub project, RepeatRand, that illustrates this behavior. It builds and runs on Windows.