Take a collection of seemingly random numbers, for instance the gross domestic product of 212 countries, and examine the leading digit of each. For the number $435 million (the 2015 GDP of Tonga), the leading digit is 4. The leading digits will of course run from 1 to 9, since 0 cannot be a leading digit. How would you expect the digits 1 to 9 to be distributed? Evenly, with each digit appearing as a leading digit approximately 11 percent of the time? That is not the case. The distribution for many data sets follows Benford’s Law.
Above is the actual distribution of leading digits for the GDP of 212 countries, showing that it approximates what Benford’s Law predicts: 1 will appear as a leading digit 30.1% of the time, 2 will appear 17.6% of the time, 3 will appear 12.5% of the time, and the frequency continues to decrease for each larger digit. Here is the GDP data in table form:
| Leading digit | Count | Distribution | In Benford’s Law |
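The percentages quoted above come from a simple formula that the article does not spell out: the expected share of numbers with leading digit d is log10(1 + 1/d). The following is a minimal sketch of that formula; the function names `benford_probability` and `leading_digit` are made up for illustration.

```python
import math


def benford_probability(digit: int) -> float:
    """Expected share of values whose leading digit is `digit` (1 to 9)."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


def leading_digit(value: float) -> int:
    """First significant digit of a nonzero number, e.g. 435_000_000 -> 4."""
    if value == 0:
        raise ValueError("zero has no leading digit")
    # In scientific notation the first character is the first significant digit.
    return int(f"{abs(value):.15e}"[0])


if __name__ == "__main__":
    # Print the predicted share for each leading digit.
    for d in range(1, 10):
        print(f"{d}: {benford_probability(d):.1%}")
```

The nine shares add up to exactly 100 percent, because the logarithms telescope to log10(10) = 1.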
The result is not a quirk of the units used. The above numbers were in U.S. dollars, but you will get a similar result if the numbers are converted into rubles or yen. Similarly, the lengths of the rivers of the world follow Benford’s Law, whether they are measured in kilometers, miles, or feet. The law is scale invariant, so multiplying the numbers in a data set by any constant yields the same distribution of first digit frequencies.
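Scale invariance can be checked with a small experiment. The sketch below does not use the GDP or river data; it assumes synthetic values spread evenly across six orders of magnitude (an illustrative stand-in), and the conversion factors are arbitrary choices. All function names are hypothetical.

```python
import random
from collections import Counter


def leading_digit(value: float) -> int:
    """First significant digit of a nonzero number."""
    return int(f"{abs(value):.15e}"[0])


def digit_shares(values):
    """Share of values starting with each digit 1 to 9."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


def synthetic_amounts(n: int, seed: int = 0):
    """Assumed test data: values spread evenly (on a log scale) from 1 to 1,000,000."""
    rng = random.Random(seed)
    return [10 ** rng.uniform(0, 6) for _ in range(n)]


if __name__ == "__main__":
    amounts = synthetic_amounts(50_000)
    # Arbitrary rescalings, standing in for a change of currency or unit.
    for factor in (1.0, 0.621371, 3280.84):
        shares = digit_shares([a * factor for a in amounts])
        row = "  ".join(f"{d}:{shares[d]:.3f}" for d in range(1, 10))
        print(f"x{factor:<10} {row}")
```

Each row should come out close to the Benford percentages, whichever factor is applied, with only small differences from sampling noise.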
For an intuitive explanation of why Benford’s Law works, imagine that you have a savings account with $100 in it, and you begin to track your balance on a daily basis as you add to your savings. As the balance begins to approach $200, any balances in between must start with 1, and to get from $100 to $200, your account must double in size. This may take a while, so you will have many daily balances that start with 1. To get from $200 to $300, your account must increase by 50 percent, a lesser but still significant leap. By the time you get to $900, the jump to $1,000 only requires an 11 percent increase, so it happens relatively quickly, and there are fewer days with a balance starting with 9. But upon reaching $1,000, the next jump up to $2,000 again requires doubling the amount, so there will be many daily balances starting with 1.
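The savings account story is easy to simulate. The sketch below assumes the balance grows by a fixed percentage each day (0.1 percent here, an arbitrary choice) rather than by fixed deposits, since the argument above is phrased in terms of percentage increases. It counts how many days the balance starts with each digit on the way from $100 to $1,000,000. The function names are hypothetical.

```python
from collections import Counter


def leading_digit(value: float) -> int:
    """First significant digit of a nonzero number."""
    return int(f"{abs(value):.15e}"[0])


def simulate_balances(start=100.0, daily_growth=0.001, target=1_000_000.0):
    """Count the days on which the balance starts with each digit.

    Assumes the balance grows by the same percentage every day.
    """
    counts = Counter()
    balance = start
    while balance < target:
        counts[leading_digit(balance)] += 1
        balance *= 1 + daily_growth
    return counts


if __name__ == "__main__":
    counts = simulate_balances()
    total = sum(counts.values())
    for d in range(1, 10):
        print(f"{d}: {counts[d] / total:.1%} of days")
```

Under this growth assumption, the balance spends about 30 percent of its days starting with 1 and fewer days on each larger digit, in line with Benford’s Law.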
The phenomenon was first observed in 1881 by Simon Newcomb, who noticed that in his book of logarithm tables, the earlier pages, covering numbers starting with 1, were more worn than later pages. In 1938, Frank Benford tested the law on a variety of data, from U.S. populations to molecular weights, the surface areas of rivers, and numbers contained in an issue of Reader’s Digest.
The law does not apply to numbers drawn evenly from a fixed range, such as lottery numbers, or to numbers confined to a narrow range, such as human heights.
Benford’s Law is used to help detect accounting fraud. An embezzler making fraudulent withdrawals will often try to make the amounts seem random, and may thus use each leading digit with approximately the same frequency. The amounts of legitimate entries, however, tend to follow Benford’s Law, so a set of entries whose leading digits depart sharply from it may warrant a closer look.
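One common way to put a number on "departs from Benford’s Law" is a chi-squared statistic comparing observed and expected digit counts. The sketch below is one possible screening heuristic, not the method any particular auditor uses. The cutoff of 15.507 is the standard 5 percent critical value for 8 degrees of freedom, and both sample ledgers are invented for illustration. All function names are hypothetical.

```python
import math
from collections import Counter

# 5% critical value of the chi-squared distribution with 8 degrees of freedom.
CHI2_CRITICAL_5PCT_DF8 = 15.507


def leading_digit(value: float) -> int:
    """First significant digit of a nonzero number."""
    return int(f"{abs(value):.15e}"[0])


def benford_chi_squared(amounts) -> float:
    """Chi-squared distance between observed leading digits and Benford's Law."""
    amounts = [a for a in amounts if a != 0]  # zero has no leading digit
    if not amounts:
        raise ValueError("need at least one nonzero amount")
    counts = Counter(leading_digit(a) for a in amounts)
    statistic = 0.0
    for d in range(1, 10):
        expected = len(amounts) * math.log10(1 + 1 / d)
        statistic += (counts[d] - expected) ** 2 / expected
    return statistic


def looks_suspicious(amounts) -> bool:
    """Flag a set of amounts for review; a flag is not proof of fraud."""
    return benford_chi_squared(amounts) > CHI2_CRITICAL_5PCT_DF8


if __name__ == "__main__":
    # Invented ledger 1: amounts growing steadily from $100 toward $1,000.
    growing = [100 * 1.001 ** k for k in range(2304)]
    # Invented ledger 2: every leading digit used equally often.
    evenly_spread = [d * 100 + 50 for d in range(1, 10)] * 100
    print("growing:", looks_suspicious(growing))
    print("evenly spread:", looks_suspicious(evenly_spread))
```

A flag from a test like this is only a prompt for further checking, since honest data confined to a narrow range can also fail it.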