Home

Blog

Contact

Resume for Steven E. Newton

Papers

Readings for Code Janitors

Services


Steven E. Newton
Crater Moon Development
© 2003-2023

Unit Testing Randomness

16 Oct 2022 569 words. (3 min)

The Problem

How do you unit test for randomness? Other than asking for n randoms and verifying that you do get n randoms, what else can you do?

In a manifestly typed languages it makes no sense to test that the randoms are of the correct type. You can test to see if the numbers are generated in the range you specify, eg, if you ask for a random number between 1 and 10 and you get 11 or -23904 you know something is wrong.

In this article we’ll look at the following possible techniques:

  • Simple minded counting
  • Runs
  • Statistical properties

Simple But Wrong

The simple-minded way isn’t sufficient: generate a whole bunch of randoms between 1 and 10, count the occurences of each, and see if it forms an equal distribution, as described in Test The Randomness of a Random Number Generator

The reason this isn’t sufficient is that a random number generator could be competely predictable and still generate an equal distribution, but the predictability makes it cryptographically non-random.

Another way is known as using runs:

Generate a bunch of randoms and check to make sure the results don’t look like 11111888888666663333366666 ... or 0123456789012345678901234567890123456789 ..

Note that while either of these results would pass the simple distribution test, they are clearly not random.

Defining Randomness

What does it mean to be random? Bruce Schneier discusses this in his Applied Cryptography. There are a number of statistical properties of randomness.

they should have about the same number of ones and zeros, about half the runs (sequences of the same bit) should be of length one, one quarter of length two, one eigth of length three, and so no. They should not be compressible. The distribution of run lengths for zeros and ones should be the same. These properties can be emprically measured and then compared to statistical expectations using a chi-square test. (p. 45)

See also Knuth, D The Art of Computer Programming: Volume 2, Seminumerical Algorithms

However, Schneier throws out two more properties that blow the idea of a unit test out of the water. First, a sequence must be unpredictable, or “computationally infeasible to predict what the next random bit will be”, and second, “It cannot be reliably reproduced”.

Conclusion

What does that leave? A unit test that is sophisticated enough to check the statistical properties of a run is likely to be complex enough that it will itself need unit tests to determine its correctness. A unit test that is simple doesn’t really determine randomness.

This question concretely applies to a Java implementation of a random number generator that replaces the psuedo-random generator in java.util.Random with a code that uses the /dev/random device on Linux (and other modern Unix-like operating systems) for use in a cryptographic application. It’s possible to write a test which verifies that the class adheres to the same interface as the built-in library generator, and that it generates the number of randoms requested and within a specified range, but to write a unit test that tests for true randomness is another matter.

Example

  one test case per class, named ClassNameTestCase
  one test method per public method, named testMethodName
  one test method per private algorithm method, named testMethodName
  in each test method, test good results and failures
    sometimes split test method into two:
      testMethodName
      testMethodNameFailures
   sometimes, in addition to one test case per class,
     add test cases for special "inter-class" logic
<-Functions on the Heap XPiphany ->