Every once in a while I am totally blown away by math, and this is definitely one of those moments.
Benford’s Law is about numbers and how they appear in nature. It says that if you take a random collection of naturally occurring numbers, like… the number of leaves on a tree, the height of mountains, the population of foxes, mathematical constants, etc… and you look at the first digit of those numbers, the distribution will not be even.
The first digit means the following:
324524 has a first digit of 3
29 has a first digit of 2
152923840 has a leading digit of 1
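If you want to pull out first digits by machine instead of by eye, here is a minimal sketch. The function name leading_digit is my own invention for illustration, and it only handles whole numbers.

```python
def leading_digit(n: int) -> int:
    """Return the first (most significant) digit of a nonzero whole number."""
    n = abs(n)
    if n == 0:
        raise ValueError("0 has no leading digit")
    # Chop off the last digit until only one digit is left.
    while n >= 10:
        n //= 10
    return n

for value in (324524, 29, 152923840):
    print(value, "has a first digit of", leading_digit(value))
```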
So wait… Benford’s suggesting what now?!?
That if you choose random numbers from nature, the probability that the number starts with 1 is different than say… 9?
Yes. In fact, 1’s are a lot more probable than 9’s. There’s about a 30% chance that the number will start with a 1 and only about a 4.6% chance that it will start with a 9. But why??
Well, basically numbers in nature tend to be spread out evenly on a logarithmic* scale, not a linear one. Let’s look at an example, say… the population of foxes. Let’s suppose you had 1000 foxes. If the foxes started reproducing, it would take a lot longer for their population numbers to go from 1000 to 2000 (a doubling) than it would to go from 8000 to 9000 (only a 12.5% increase). So the population spends a lot more time starting with a 1 than starting with an 8.
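To put rough numbers on the fox example, here is a small sketch. It assumes the foxes grow at a steady continuous rate, and the 10% per year figure is a number I made up purely for illustration. The helper name time_to_grow is also hypothetical.

```python
import math

def time_to_grow(start: float, end: float, rate: float = 0.10) -> float:
    """Years for a population growing continuously at `rate` per year
    to get from `start` to `end` (rate is an assumed, made-up value)."""
    return math.log(end / start) / rate

t_1 = time_to_grow(1000, 2000)   # time spent with a leading 1
t_8 = time_to_grow(8000, 9000)   # time spent with a leading 8
full = time_to_grow(1000, 10000) # one full trip through the digits 1 to 9

share_1 = t_1 / full  # fraction of the trip spent starting with a 1

print(f"1000 -> 2000 takes {t_1:.2f} years")
print(f"8000 -> 9000 takes {t_8:.2f} years")
print(f"share of time with a leading 1: {share_1:.1%}")
```

Notice that the growth rate cancels out of the last number. Whatever rate you pick, the share of time spent with a leading 1 comes out to log10(2), which is the roughly 30% from before.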
Another fun test is mathematical constants. I looked up some mathematical constants on Wikipedia. I jotted down the leading digits in Excel and plotted them here:
Wow! Nearly logarithmic! And with only 43 data points. CRAZY.
“Now wait a minute,” you might interject, “that’s all well and good for populations and math constants but earlier you said that this was also true for distances like the height of mountains… what if I changed the unit from feet to meters?”
Well, actually, nothing changes at all. The logarithmic pattern still holds regardless of the unit. Nature just seems to work logarithmically.
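You can try the unit swap yourself with a quick sketch like the one below. Big caveat: the "lengths" here are synthetic numbers I generated to be spread evenly on a log scale, not real mountain heights, and the helper names are hypothetical.

```python
import random
from collections import Counter

def leading_digit(x: float) -> int:
    """First significant digit of a positive number, read off its scientific notation."""
    return int(f"{x:e}"[0])

def digit_shares(values):
    """Fraction of values starting with each digit 1 through 9."""
    counts = Counter(leading_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}

random.seed(0)  # fixed seed so the run is repeatable
# Made-up lengths in feet, spread evenly on a log scale (not real measurements).
feet = [10 ** random.uniform(0, 6) for _ in range(100_000)]
meters = [f * 0.3048 for f in feet]  # 1 foot = 0.3048 meters

shares_feet = digit_shares(feet)
shares_meters = digit_shares(meters)

print("digit  feet   meters")
for d in range(1, 10):
    print(f"{d}      {shares_feet[d]:.3f}  {shares_meters[d]:.3f}")
```

The two columns should land close to each other, with 1 the most common leading digit in both.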
Does it also work if we step outside base ten? Heck yes. The actual equation is log_b((x+1)/x), where b is your base and x is the leading digit whose likelihood you want to find. So in base 10, log(10/9) is about 4.6% while log(2/1) is about 30%.
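Here is the equation as a short sketch, in case you want the whole table instead of just the 1 and the 9. The function name benford_probability is mine, not anything official.

```python
import math

def benford_probability(digit: int, base: int = 10) -> float:
    """Chance that a number's leading digit is `digit` in the given base,
    using log_b((x + 1) / x)."""
    if not 1 <= digit < base:
        raise ValueError("digit must be between 1 and base - 1")
    return math.log((digit + 1) / digit, base)

probabilities = [benford_probability(d) for d in range(1, 10)]

for d, p in enumerate(probabilities, start=1):
    print(d, f"{p:.1%}")
print("total:", f"{sum(probabilities):.1%}")
```

Pass a different base as the second argument to see the same pattern outside base ten, for example benford_probability(1, 8) for octal.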
So how is this useful? Well, this phenomenon also appears in accounting. New software packages are incorporating Benford’s law to analyze financial statements by checking the frequency of leading digits. Because people tend to think that the probability of every leading digit appearing is about equal, fraudulent numbers can become very apparent. These software packages help the authorities audit the right people. Neat, huh?
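I don't know what the real auditing packages do under the hood, but a toy version of the idea might look like this sketch. It scores a list of amounts with a chi-squared statistic against Benford's prediction. Both "ledgers" are made up: one grows by 7% per step, the other is what you get if every leading digit shows up equally often. All names are hypothetical.

```python
import math
from collections import Counter

def leading_digit(x: float) -> int:
    """First significant digit of a positive number, read off its scientific notation."""
    return int(f"{x:e}"[0])

def benford_chi_squared(values) -> float:
    """Chi-squared distance between observed leading digits and Benford's prediction.
    Bigger means a worse fit."""
    n = len(values)
    counts = Counter(leading_digit(v) for v in values)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10((d + 1) / d)
        stat += (counts[d] - expected) ** 2 / expected
    return stat

# Made-up data for illustration only.
organic = [100 * 1.07 ** k for k in range(500)]  # steady 7% growth
invented = list(range(100, 1000))                # every leading digit equally common

chi_organic = benford_chi_squared(organic)
chi_invented = benford_chi_squared(invented)

print(f"growing amounts:  {chi_organic:.1f}")
print(f"evenly spread:    {chi_invented:.1f}")
```

For nine digits the usual 5% cutoff for this statistic is about 15.5 (8 degrees of freedom), so a score far above that is a hint to look closer, not proof of fraud.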
A word of warning: while you can use this law for a lot of things, it’s not always as simple as it seems. The specific application might be framed as ‘naturally’ occurring but it may not be. Take, for example, the height of people. If you looked at the leading digit of anyone’s height in feet, you’d probably get a ton of 5’s, a bunch of 6’s, and a handful of 4’s. But height follows a bell curve and only covers a narrow range of values, so Benford’s law won’t work.
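A quick simulated version of the height example, as a sketch. The average of 5.6 feet and spread of 0.33 feet are numbers I assumed for illustration, not measured data.

```python
import random
from collections import Counter

random.seed(1)  # fixed seed so the run is repeatable

# Assumed bell curve for adult height in feet (illustrative values only).
MEAN_FT, SD_FT = 5.6, 0.33
heights = [random.gauss(MEAN_FT, SD_FT) for _ in range(10_000)]

# For heights between 1 and 10 feet, the whole-feet part is the leading digit.
counts = Counter(int(h) for h in heights)

for d in range(1, 10):
    print(d, counts[d])
```

Under these assumptions the 1 row stays at zero, which is about as far from Benford's 30% as you can get.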
Oh man, math and nature are so crazy!
*for any non-math folks, logarithmic distributions look something like this (see source 2):