Chapter 34: Explaining Benford's Law

Solving Mystery #1

There are two main mysteries in Benford's law. The first is this: Where does the logarithmic pattern of leading digits come from? Is it some hidden property of Nature? We know that ost(g) has a constant value of 0.301 if Benford's law is being followed. Using Fig. 34-5 we can find where this number originates. By definition, the average value of ost(g) is OST(0); likewise, the average value of sf(g) is SF(0). Because ost(g) is formed by convolving sf(g) with pdf(g), their spectra multiply, and at zero frequency OST(0) = SF(0) × PDF(0). However, PDF(0) is always equal to one, since it is the total area under pdf(g), so OST(0) is always equal to SF(0). That is, the average value of ost(g) is equal to the average value of sf(g), and does not depend on the characteristics of pdf(g). As shown above, the average value of sf(g) is log(2) - log(1) = 0.301, which dictates that the average value of ost(g) is also 0.301. If we repeated this procedure looking for 2 as the leading digit, the average value of sf(g) would be log(3) - log(2) = 0.176. The remaining digits, 3 through 9, are handled in the same way. In answer to our question, the logarithmic pattern of leading digits derives solely from sf(g) and the convolution, and not at all from pdf(g). In short, the logarithmic pattern of leading digits comes from the manipulation of the data, and has nothing to do with patterns in the numbers being investigated.
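The arithmetic above can be checked with a few lines of code. The following is a minimal sketch in Python, which is not the language of the book's own programs; the function name benford_probability is chosen here purely for illustration. It computes log(d+1) - log(d), base ten, for each leading digit d from 1 through 9.

```python
import math

def benford_probability(digit):
    """Average value of sf(g) for one leading digit: the width of the
    interval from log10(digit) to log10(digit + 1) on the log axis.
    (Hypothetical helper name, used only for this illustration.)"""
    return math.log10(digit + 1) - math.log10(digit)

# One value per leading digit, 1 through 9
probabilities = {d: benford_probability(d) for d in range(1, 10)}

for d, p in probabilities.items():
    print(f"leading digit {d}: {p:.3f}")
```

The entries for digits 1 and 2 reproduce the 0.301 and 0.176 quoted above, and the nine values sum to one because the intervals tile the full span from log(1) to log(10). Note that nothing about pdf(g) enters the calculation.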

This result can be understood in a simple way, showing how Benford's law resembles a magician's sleight of hand. Say you tabulate a list of numbers appearing in a newspaper. You tally the histogram of leading digits and find that they follow the logarithmic pattern. You then wonder how this pattern could be hidden in the numbers. The key to this is realizing that something has been concealed: a big something.

Recall the program in Table 34-1, where lines 400-430 extract the leading digit of each number. This is done by multiplying or dividing each number repeatedly by a factor of ten until it is between 1 and 9.999999. This manipulation of the data is far from trivial or benign. You don't notice this procedure when manually tabulating the numbers because your brain is so efficient. But look at what this manipulation involves. For example, successive numbers might be multiplied by: 0.01, 100, 0.1, 1, 10, 1000, 0.001, and so on.
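To make the hidden step visible, here is a minimal sketch of the same idea in Python. It is not the program from Table 34-1, and the function name leading_digit is an assumption made for this illustration. Besides the leading digit, it reports the power of ten by which the number was effectively multiplied along the way.

```python
def leading_digit(value):
    """Return (digit, exponent): the leading digit of value, and the
    power of ten the value was effectively multiplied by to bring it
    into the range 1 to 9.999999.
    (Illustrative helper; not the code from Table 34-1.)"""
    x = abs(float(value))
    if x == 0.0:
        raise ValueError("zero has no leading digit")
    exponent = 0
    while x >= 10.0:      # too large: divide by ten
        x /= 10.0
        exponent -= 1
    while x < 1.0:        # too small: multiply by ten
        x *= 10.0
        exponent += 1
    return int(x), exponent

# Each number is silently rescaled by a different power of ten
for number in [250.0, 0.0042, 7.5, 4217.0]:
    digit, exponent = leading_digit(number)
    print(f"{number}: leading digit {digit}, multiplied by 10**{exponent}")
```

The second returned value is the part your brain never reports: every number in the list gets its own scaling factor before the digit is read off.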

This changes the numbers in a pattern based on powers of ten, i.e., the anti-logarithm. You then examine the processed data and marvel that it looks logarithmic. Not realizing that your brain has secretly manipulated the data, you attribute this logarithmic pattern to some hidden feature of the original numbers. Voila! The mystery of Benford's law!
