An Introduction to Benford's Law

by Arno Berger, Theodore P. Hill



Product Details

ISBN-13: 9780691163062
Publisher: Princeton University Press
Publication date: 05/26/2015
Pages: 256
Product dimensions: 6.00(w) x 9.40(h) x 0.80(d)

About the Author

Arno Berger is associate professor of mathematics at the University of Alberta. He is the author of Chaos and Chance: An Introduction to Stochastic Aspects of Dynamics. Theodore P. Hill is professor emeritus of mathematics at the Georgia Institute of Technology and research scholar in residence at the California Polytechnic State University.

Read an Excerpt


Copyright © 2015 Princeton University Press
All rights reserved.
ISBN: 978-1-4008-6658-8



Benford's law, also known as the First-digit or Significant-digit law, is the empirical gem of statistical folklore that in many naturally occurring tables of numerical data, the significant digits are not uniformly distributed as might be expected, but instead follow a particular logarithmic distribution. In its most common formulation, the special case of the first significant (i.e., first non-zero) decimal digit, Benford's law asserts that the leading digit is not equally likely to be any one of the nine possible digits 1, 2, ..., 9, but is 1 more than 30% of the time, and is 9 less than 5% of the time, with the probabilities decreasing monotonically in between; see Figure 1.1. More precisely, the exact law for the first significant digit is

Prob(D1 = d) = log10(1 + 1/d) for all d = 1, 2, ..., 9; (1.1)

here, D1 denotes the first significant decimal digit, e.g.,

D1(√2) = D1(1.414) = 1,

D1(π^-1) = D1(0.3183) = 3,

D1(e^π) = D1(23.14) = 2.

Hence, the two smallest digits occur as the first significant digit with a combined probability close to 50 percent, whereas the two largest digits together have a probability of less than 10 percent, since

Prob(D1 = 1) = log10 2 = 0.3010, Prob(D1 = 2) = log10(3/2) = 0.1760,

Prob(D1 = 8) = log10(9/8) = 0.05115, Prob(D1 = 9) = log10(10/9) = 0.04575.
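As an illustration, the probabilities in (1.1) can be tabulated with a few lines of Python. This is a minimal sketch rather than code from the book; the function name benford_first_digit_probs is an assumption made for this example. Note that Python's formatting rounds the last displayed digit, whereas the text shows correct (truncated) digits, so the final digit may differ.

```python
import math

def benford_first_digit_probs():
    """Return {d: log10(1 + 1/d)} for d = 1, ..., 9, as in (1.1)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

probs = benford_first_digit_probs()
for d, p in probs.items():
    print(f"Prob(D1 = {d}) = {p:.4f}")

# Combined probability of the two smallest and the two largest digits.
print("digits 1 and 2:", probs[1] + probs[2])
print("digits 8 and 9:", probs[8] + probs[9])
```

The nine values sum to 1 because the logarithms telescope: the sum equals log10(10/1) = 1.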

The complete form of Benford's law also specifies the probabilities of occurrence of the second and higher significant digits, and more generally, the joint distribution of all the significant digits. A general statement of Benford's law that includes the probabilities of all blocks of consecutive initial significant digits is this: For every positive integer m, and for all initial blocks of m significant digits (d1, d2, ..., dm), where d1 is in {1, 2, ..., 9}, and dj is in {0, 1, ..., 9} for all j ≥ 2,

Prob(D1 = d1, D2 = d2, ..., Dm = dm) = log10(1 + 1/(10^(m-1) d1 + 10^(m-2) d2 + ... + dm)), (1.2)


where D2, D3, D4, etc. represent the second, third, fourth, etc. significant decimal digits, e.g.,

D2(√2) = 4, D3(π^-1) = 8, D4(e^π) = 4.
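The digit examples above can be checked numerically. The following is a minimal sketch, assuming ordinary double-precision floats and a hypothetical helper named significant_digit (not a function from the book); it reads the digits off Python's scientific-notation formatting, so it is only reliable for roughly the first 15 significant digits.

```python
import math

def significant_digit(x, m=1):
    """Return the m-th significant decimal digit of a nonzero float x."""
    if x == 0:
        raise ValueError("zero has no significant digits")
    if not 1 <= m <= 16:
        raise ValueError("m must be between 1 and 16 for this sketch")
    text = f"{abs(x):.15e}"        # e.g. '1.414213562373095e+00'
    digits = text[0] + text[2:17]  # leading digit plus 15 decimals
    return int(digits[m - 1])

print(significant_digit(math.sqrt(2)))           # D1 of the square root of 2
print(significant_digit(1 / math.pi))            # D1 of 1/pi
print(significant_digit(math.exp(math.pi)))      # D1 of e^pi
print(significant_digit(math.exp(math.pi), 4))   # D4 of e^pi
```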

For example, (1.2) yields the probabilities for the individual second significant digits,

Prob(D2 = d) = sum over k = 1, ..., 9 of log10(1 + 1/(10k + d)) for all d = 0, 1, ..., 9, (1.3)


which also are not uniformly distributed on all the possible second digit values 0, 1, ..., 9, but are strictly decreasing, although they are much closer to uniform than the first digits; see Figure 1.1.

More generally, (1.2) yields the probabilities for longer blocks of digits as well. For instance, the probability that a number has the same first three significant digits as π = 3.141 is

Prob (D1 = 3, D2 = 1, D3 = 4)= log10 (1 + 1/314) = log10 315/314 = 0.001380.
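A short sketch of how such block probabilities can be computed follows. It assumes that the general law assigns to a block of leading digits the probability log10(1 + 1/n), where n is the integer spelled out by those digits, which agrees with the π example above (n = 314). The function names block_probability and second_digit_probability are illustrative choices, not names from the book.

```python
import math

def block_probability(digits):
    """Benford probability that the leading significant digits equal `digits`."""
    if not digits or not 1 <= digits[0] <= 9:
        raise ValueError("the first digit must be in 1..9")
    if any(not 0 <= d <= 9 for d in digits[1:]):
        raise ValueError("later digits must be in 0..9")
    n = int("".join(str(d) for d in digits))  # e.g. [3, 1, 4] -> 314
    return math.log10(1 + 1 / n)

def second_digit_probability(d):
    """Marginal probability of second digit d: sum over all first digits."""
    return sum(block_probability([k, d]) for k in range(1, 10))

print(block_probability([3, 1, 4]))
print([round(second_digit_probability(d), 4) for d in range(10)])
```

Summing the two-digit block probabilities over the first digit gives the second-digit distribution, which is strictly decreasing in d but much flatter than the first-digit distribution.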

A perhaps surprising corollary of the general form of Benford's law (1.2) is that the significant digits are dependent, and not independent as one might expect [74]. To see this, note that (1.3) implies that the (unconditional) probability that the second digit equals 1 is

Prob(D2 = 1) = sum over k = 1, ..., 9 of log10(1 + 1/(10k + 1)) = 0.1138,


whereas it follows from (1.2) that if the first digit is 1, the (conditional) probability that the second digit also equals 1 is

Prob(D2 = 1 | D1 = 1) = (log10 12 - log10 11)/log10 2 = 0.1255.
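The dependence can be verified with a quick calculation. This sketch assumes the block probability log10(1 + 1/n) for the two-digit block n = 10k + 1; the variable names are illustrative only.

```python
import math

# Unconditional: sum over all first digits k of Prob(D1 = k, D2 = 1).
p_second_is_1 = sum(math.log10(1 + 1 / (10 * k + 1)) for k in range(1, 10))

# Conditional: Prob(D1 = 1, D2 = 1) / Prob(D1 = 1).
p_second_is_1_given_first_is_1 = math.log10(12 / 11) / math.log10(2)

print(p_second_is_1)
print(p_second_is_1_given_first_is_1)
```

If the digits were independent, the two numbers would coincide; they differ in the second decimal place.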

Note. Throughout, real numbers such as √2 and π are displayed to four correct significant decimal digits. Thus an equation like √2 = 1.414 ought to be read as 1414 ≤ 1000 · √2 < 1415, and not as √2 = 1414/1000. The only exceptions to this rule are probabilities given in percent (as in Figure 1.1), as well as the numbers Δ and Δ∞, introduced later; all these quantities only attain values between 0 and 100, and are shown to two correct digits after the decimal point. Thus, for instance, Δ = 0.00 means 0 ≤ 100 · Δ < 1, but not necessarily Δ = 0.


The first known reference to the logarithmic distribution of leading digits dates back to 1881, when the American astronomer Simon Newcomb noticed "how much faster the first pages [of logarithmic tables] wear out than the last ones," and, after several short heuristics, deduced the logarithmic probabilities shown in the first two rows of Figure 1.1 for the first and second digits [111].

Some fifty-seven years later the physicist Frank Benford rediscovered the law [9], and supported it with over 20,000 entries from 20 different tables including such diverse data as catchment areas of 335 rivers, specific heats of 1,389 chemical compounds, American League baseball statistics, and numbers gleaned from front pages of newspapers and Reader's Digest articles; see Figure 1.2 (rows A, E, P, D and M, respectively).

Although P. Diaconis and D. Freedman offer convincing evidence that Benford manipulated round-off errors to obtain a better fit to the logarithmic law [47, p. 363], even the unmanipulated data are remarkably close. Benford's article attracted much attention and, Newcomb's article having been overlooked, the law became known as Benford's law and many articles on the subject appeared. As R. Raimi observed nearly half a century ago [127, p. 521],

This particular logarithmic distribution of the first digits, while not universal, is so common and yet so surprising at first glance that it has given rise to a varied literature, among the authors of which are mathematicians, statisticians, economists, engineers, physicists, and amateurs.

The online database [24] now references more than 800 articles on Benford's law, as well as other resources (books, websites, lectures, etc.).


Many tables of numerical data, of course, do not follow Benford's law in any sense. Telephone numbers in a given region typically begin with the same few digits, and never begin with a 1; lottery numbers in all common lotteries are distributed uniformly, not logarithmically; and tables of heights of human adults, whether given in feet or meters, clearly do not begin with a 1 about 30% of the time. Even "neutral" mathematical data such as square-root tables of integers do not follow Benford's law, as Benford himself discovered (see row K in Figure 1.2 above), nor do the prime numbers, as will be seen in later chapters.

On the other hand, since Benford's popularization of the law, an abundance of additional empirical evidence has appeared. In physics, for example, D. Knuth [90] and J. Burke and E. Kincanon [31] observed that of the most commonly used physical constants (e.g., the speed of light and the force of gravity listed on the inside cover of an introductory physics textbook), about 30% have leading significant digit 1; P. Becker [8] observed that the decimal parts of failure (hazard) rates often have a logarithmic distribution; and R. Buck et al., in studying the values of the 477 radioactive half-lives of unhindered alpha decays that were accumulated throughout the past century, and that vary over many orders of magnitude, found that the frequency of occurrence of the first digits of both measured and calculated values of the half-lives is in "good agreement" with Benford's law [29]. In scientific calculations, A. Feldstein and P. Turner called the assumption of logarithmically distributed mantissas "widely used and well established" [57, p. 241]; R. Hamming labeled the appearance of the logarithmic distribution in floating-point numbers "well-known" [70, p. 1609]; and Knuth observed that "repeated calculations with real numbers will nearly always tend to yield better and better approximations to a logarithmic distribution" [90, p. 262].

Additional empirical evidence of Benford's law continues to appear. M. Nigrini observed that the digital frequencies of certain entries in Internal Revenue Service files are an extremely good fit to Benford's law (see [113] and Figure 1.3); E. Ley found that "the series of one-day returns on the Dow-Jones Industrial Average Index (DJIA) and the Standard and Poor's Index (S&P) reasonably agrees with Benford's law" [98]; and Z. Shengmin and W. Wenchao found that "Benford's law reasonably holds for the two main Chinese stock indices" [148]. In the field of biology, E. Costas et al. observed that in a certain cyanobacterium, "the distribution of the number of cells per colony satisfies Benford's law" [39, p. 341]; S. Docampo et al. reported that "gross data sets of daily pollen counts from three aerobiological stations (located in European cities with different features regarding vegetation and climatology) fit Benford's law" [49, p. 275]; and J. Friar et al. found that "the Benford distribution produces excellent fits" to certain basic genome data [60, p. 1].

Figure 1.3 compares the probabilities of occurrence of first digits predicted by (1.1) to the distributions of first digits in four datasets: the combined data reported by Benford in 1938 (second-to-last row in Figure 1.2); the populations of the 3,143 counties in the United States in the 2010 census [102]; all numbers appearing on the World Wide Web as estimated using a Google search experiment [97]; and over 90,000 entries for Interest Received in U.S. tax returns from the IRS Individual Tax Model Files [113]. To instill in the reader a quantitative perception of closeness to, or deviation from, the first-digit law (1.1), for every distribution of the first significant decimal digit shown in this book, the number

Δ = 100 · max over d = 1, ..., 9 of |Prob(D1 = d) - log10(1 + 1/d)|

will also be displayed. Note that Δ is simply the maximum difference, in percent, between the probabilities of the first significant digits of the given distribution and the Benford probabilities in (1.1). Thus, for example, Δ = 0 indicates exact conformance to (1.1), and Δ = 12.08 indicates that the probability of some digit d in {1, 2, ..., 9} differs from log10(1 + 1/d) by 12.08%, and the probability of no other digit differs by more than this.
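For a finite dataset, Δ can be computed directly from the empirical first-digit frequencies. The sketch below is one possible implementation, not the authors' code; the helper names first_digit and delta are assumptions, and floats are handled through scientific-notation formatting, which is adequate only to double precision.

```python
import math
from collections import Counter

def first_digit(x):
    """First significant decimal digit of a nonzero int or float."""
    if isinstance(x, int):
        return int(str(abs(x))[0])      # exact for arbitrarily large ints
    return int(f"{abs(x):.15e}"[0])     # leading digit of the mantissa

def delta(values):
    """Maximum deviation, in percent, from the Benford first-digit law."""
    counts = Counter(first_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one nonzero value")
    return 100 * max(
        abs(counts[d] / total - math.log10(1 + 1 / d)) for d in range(1, 10)
    )

print(delta(range(1, 10)))                    # uniform first digits
print(delta([2 ** n for n in range(1000)]))   # powers of 2
```

Uniformly distributed first digits give a large Δ, driven by the digit 1, while the first thousand powers of 2 give a value well below 1.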

All these statistics aside, the authors also highly recommend that the justifiably skeptical reader perform a simple experiment, such as randomly selecting numerical data from front pages of several local newspapers, or from "a Farmer's Almanack" as Knuth suggests [90], or running a Google search similar to the Dartmouth classroom project described in [97].


Since the empirical significant-digit law (1.1) or (1.2) does not specify a well-defined statistical experiment or sample space, most early attempts to explain the appearance of Benford's law argued that it is "merely the result of our way of writing numbers" [67] or "a built-in characteristic of our number system" [159]. The idea was to first show that the set of real numbers satisfies (1.1) or (1.2), and then suggest that this explains the empirical statistical evidence. A common starting point has been to try to establish (1.1) for the positive integers, beginning with the prototypical set

{D1 = 1} = {1, 10, 11, ..., 18, 19, 100, 101, ..., 198, 199, 1000, 1001, ...},

the set of positive integers with first significant digit 1. The source of difficulty and much of the fascination of the first-digit problem is that the set {D1 = 1} does not have a natural density among the integers, that is, the proportion of integers in the set {D1 = 1} up to N, i.e., the ratio

#{1 ≤ n ≤ N : D1(n) = 1}/N, (1.4)

does not have a limit as N goes to infinity, unlike the sets of even integers or primes, say, which have natural densities 1/2 and 0, respectively. It is easy to see that the empirical density (1.4) of {D1 = 1} oscillates repeatedly between 1/9 and 5/9, and thus it is theoretically possible to assign any number between 1/9 and 5/9 as the "probability" of this set. Similarly, the empirical density of {D1 = 9} forever oscillates between 1/81 and 1/9; see Figure 1.4.
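The oscillation is easy to observe numerically. The following sketch counts, block by block, the integers up to N whose first digit is a given digit; the function name density_first_digit is an assumption for this example. Along N = 10^k - 1 the ratio for digit 1 equals 1/9, while along N = 2 · 10^k - 1 it approaches 5/9.

```python
def density_first_digit(N, digit=1):
    """Proportion of integers n in 1..N whose first digit equals `digit`."""
    if N < 1:
        raise ValueError("N must be a positive integer")
    count = 0
    power = 1
    while digit * power <= N:
        lo = digit * power                      # smallest number in the block
        hi = min((digit + 1) * power - 1, N)    # largest number in the block
        count += hi - lo + 1
        power *= 10
    return count / N

for N in (99, 199, 999, 1999, 9999, 19999):
    print(N, density_first_digit(N, 1))

for N in (899, 999, 8999, 9999):
    print(N, density_first_digit(N, 9))
```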

Many partial attempts to put Benford's law on a solid logical basis have been made, beginning with Newcomb's own heuristics, and continuing through the decades with various urn model arguments and mathematical proofs; Raimi [127] has an excellent review of these. But as the eminent logician, mathematician, and philosopher C. S. Peirce once observed, "in no other branch of mathematics is it so easy for experts to blunder as in probability theory" [63, p. 273], and the arguments surrounding Benford's law certainly bear that out. Even W. Feller's classic and hugely influential text [58] contains a critical flaw that apparently went unnoticed for half a century. Specifically, the claim by Feller and subsequent authors that "regularity and large spread implies Benford's Law" is fallacious for any reasonable definitions of regularity and spread (measure of dispersion) [21].


A crucial part of (1.1), of course, is an appropriate interpretation of Prob. In practice, this can take several forms. For sequences of real numbers (x1, x2, ...), Prob usually refers to the limiting proportion (or relative frequency) of elements in the sequence for which an event such as {D1 = 1} occurs. Equivalently, fix a positive integer N and calculate the probability that the first digit is 1 in an experiment where one of the elements x1, x2, ..., xN is selected at random (each with probability 1/N); if this probability has a limit as N goes to infinity, then the limiting probability is designated Prob(D1 = 1). Implicit in this usage of Prob is the assumption that all limiting proportions of interest actually exist. Similarly, for real-valued functions f : [0, +∞) → R, fix a positive real number T, choose a number τ at random uniformly between 0 and T, and calculate the probability that f(τ) has first significant digit 1. If this probability has a limit, as T → +∞, then Prob(D1 = 1) is that limiting probability.
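To make the function case concrete, consider the illustrative choice f(t) = e^t, which is an example selected for this sketch and not one taken from the passage above. Since e^τ has first digit 1 exactly when the fractional part of τ/ln 10 is below log10 2, the probability for τ uniform on (0, T] can be computed in closed form. The function name proportion_first_digit_one is an assumption.

```python
import math

LOG10_2 = math.log10(2)

def proportion_first_digit_one(T):
    """Probability that exp(tau) has first digit 1, tau uniform on (0, T]."""
    if T <= 0:
        raise ValueError("T must be positive")
    U = T / math.log(10)            # log10(exp(T))
    whole, frac = divmod(U, 1.0)    # completed decades, partial decade
    # Each completed decade contributes length log10(2); the partial
    # decade contributes at most that much.
    return (whole * LOG10_2 + min(frac, LOG10_2)) / U

for T in (1, 10, 100, 1000, 1e6):
    print(T, proportion_first_digit_one(T))
```

As T grows, the printed proportions settle near log10 2, so for this particular f the limiting probability in the sense described above exists and equals the Benford value.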

For a random variable or probability distribution, on the other hand, Prob simply denotes the underlying probability of the given event. Thus, if X is a random variable, then Prob (D1(X) = 1) is the probability that the first significant digit of X is 1. Finite datasets of real numbers can also be dealt with this way, with Prob being the empirical distribution of the dataset.

One of the main themes of this book is the robustness of Benford's law. In the context of sequences of numbers, for example, iterations of linear maps typically follow Benford's law exactly; Figure 1.5 illustrates the convergence of first-digit probabilities for the Fibonacci sequence (1, 1, 2, 3, 5, 8, 13, ...). As will be seen in Chapter 6, not only do iterations of most linear functions follow Benford's law exactly, but iterations of most functions close to linear also follow Benford's law exactly. Similarly, as will be seen in Chapter 8, powers and products of very general classes of random variables approach Benford's law in the limit; Figure 1.6 illustrates this starting with U(0,1), the standard random variable uniformly distributed between 0 and 1. Similarly, if random samples from different randomly-selected probability distributions are combined, the resulting meta-sample also typically converges to Benford's law; Figure 1.7 illustrates this by comparing two of Benford's original empirical datasets with the combination of all his data.
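The convergence for the Fibonacci sequence can be reproduced with a short experiment. This is a minimal sketch using exact integer arithmetic; the function names fibonacci and first_digit_frequencies are illustrative, and the sample sizes are arbitrary choices rather than those used for the book's figure.

```python
import math
from collections import Counter

def fibonacci(n_terms):
    """Yield the first n_terms Fibonacci numbers 1, 1, 2, 3, 5, ..."""
    a, b = 1, 1
    for _ in range(n_terms):
        yield a
        a, b = b, a + b

def first_digit_frequencies(numbers):
    """Relative frequency of each first digit 1..9 among positive integers."""
    counts = Counter(int(str(n)[0]) for n in numbers)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}

for n_terms in (10, 100, 1000):
    freqs = first_digit_frequencies(fibonacci(n_terms))
    worst = max(abs(freqs[d] - math.log10(1 + 1 / d)) for d in range(1, 10))
    print(n_terms, f"largest deviation from (1.1): {100 * worst:.2f}%")
```

The largest deviation printed on each line corresponds to the quantity Δ defined earlier, and it shrinks as more terms are included.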


Excerpted from An Introduction to Benford's Law by Arno Berger, Theodore P. Hill. Copyright © 2015 Princeton University Press. Excerpted by permission of PRINCETON UNIVERSITY PRESS.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Table of Contents

Preface

1 Introduction

1.1 History

1.2 Empirical evidence

1.3 Early explanations

1.4 Mathematical framework

2 Significant Digits and the Significand

2.1 Significant digits

2.2 The significand

2.3 The significand σ-algebra

3 The Benford Property

3.1 Benford sequences

3.2 Benford functions

3.3 Benford distributions and random variables

4 The Uniform Distribution and Benford's Law

4.1 Uniform distribution characterization of Benford's law

4.2 Uniform distribution of sequences and functions

4.3 Uniform distribution of random variables

5 Scale-, Base-, and Sum-Invariance

5.1 The scale-invariance property

5.2 The base-invariance property

5.3 The sum-invariance property

6 Real-valued Deterministic Processes

6.1 Iteration of functions

6.2 Sequences with polynomial growth

6.3 Sequences with exponential growth

6.4 Sequences with super-exponential growth

6.5 An application to Newton's method

6.6 Time-varying systems

6.7 Chaotic systems: Two examples

6.8 Differential equations

7 Multi-dimensional Linear Processes

7.1 Linear processes, observables, and difference equations

7.2 Nonnegative matrices

7.3 General matrices

7.4 An application to Markov chains

7.5 Linear difference equations

7.6 Linear differential equations

8 Real-valued Random Processes

8.1 Convergence of random variables to Benford's law

8.2 Powers, products, and sums of random variables

8.3 Mixtures of distributions

8.4 Random maps

9 Finitely Additive Probability and Benford's Law

9.1 Finitely additive probabilities

9.2 Finitely additive Benford probabilities

10 Applications of Benford's Law

10.1 Fraud detection

10.2 Detection of natural phenomena

10.3 Diagnostics and design

10.4 Computations and Computer Science

10.5 Pedagogical tool

List of Symbols

Bibliography

Index
