Power law vs bell curve

How to truly understand and value science; an important lesson in mathematical modeling, from the world of complexity science.


Fair warning: today I’m going to talk about mathematical modeling, based on a paper from 2009 called “From Gaussian to Paretian Thinking: Causes and Implications of Power Laws in Organizations.”

So if you read research papers as often as I do, you’ll know that the data is frequently modeled using linear regression, which reports confidence intervals around statistically significant findings, and/or it’s analyzed assuming a Gaussian, or so-called normal, distribution, which treats the “long tail” of the data as negligible.

In other words, scientific studies tend to use math to find the important shared patterns. That’s an excellent way to study a scoped problem area, but less excellent if, let’s say, you have an extreme problem that the shared pattern doesn’t address. For example, I can tell you from lifetime experience that most studies on acne show findings that apply to most people, but not to me, a person with unusually reactive skin. We’ll come back to this.

Now this approach also doesn’t quite work out for studying complex systems, which are inherently not scoped problem areas, and this is where power laws come into play. A power law is a mathematical relationship in which one quantity varies as a power of another, and it lets you map out what’s happening without losing the extremes and outliers. So you aren’t simplifying what’s happening into a linear model with confidence intervals; you’re simplifying what’s happening into a relationship between variables that still shows the so-called “heavy tail” of a long-tailed distribution of data. In other words, the outliers and extremes are accounted for.
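To make the contrast concrete, here is a rough sketch of the two kinds of math side by side. The symbols are my own choice for illustration and are not taken from the paper: μ and σ are the mean and standard deviation of the bell curve, x_min is the smallest value the power law applies to, and α is the tail exponent.

```latex
% Gaussian (normal) density: the tails fall off exponentially fast
p_{\text{Gauss}}(x) = \frac{1}{\sigma\sqrt{2\pi}}
    \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

% Power-law (Pareto) density: the tails fall off only polynomially
p_{\text{Pareto}}(x) = \frac{\alpha\, x_{\min}^{\alpha}}{x^{\alpha+1}},
    \qquad x \ge x_{\min}

% Chance of seeing a value larger than x under the power law
P(X > x) = \left(\frac{x_{\min}}{x}\right)^{\alpha}

% Taking logs gives a straight line with slope -\alpha on log-log axes
\log P(X > x) = \alpha \log x_{\min} - \alpha \log x
```

The practical difference is in that last pair of lines: under a power law, making x ten times bigger only shrinks the probability by a fixed factor, so very large events stay on the map, whereas the exponential in the bell curve drives them toward zero almost immediately.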

And this, it turns out, is an excellent way to map out what’s happening inside organizations. This paper listed out 101 documented power laws that were identified as being relevant within organizations. The studies they listed ranged from #9. distribution of wealth, studied in 1897 and 1997; to #63. alliance networks among biotech firms, studied in 2003; all the way to #78. work incapacity from back pain, studied in 2004.

So why is this important? This paper does a great job speaking to that point from a lot of different angles, but I’ll focus on one in particular. It’s what this paper described as, quote, “the growing ineffectiveness between theory and practice.”

One example they gave is for earthquakes in California. Let’s say you plotted all the earthquakes over a 10-year period using the traditional Gaussian methods, and also using Pareto power law methods. Well, with a normal distribution you’d find that almost all earthquakes are totally fine and do almost no damage, and the model would treat the extreme ones as so rare that they barely register. But those outliers, the extreme earthquakes, cost billions of dollars and hundreds of lives. So we need the long tail of data to make sure, for example, that the right building codes are in place. The wrong kind of math can lead to completely the wrong conclusion.
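Here is a minimal sketch of that point in Python, using only the standard library. Nothing in it is real earthquake data or anything from the paper: the function names, the tail exponent of 3, and the event sizes are all made up for illustration. It gives a bell curve the same mean and spread as a power law, then asks each model how likely a big event is.

```python
import math


def gaussian_tail(x, mean, std):
    """P(X > x) for a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2))


def pareto_tail(x, x_min, alpha):
    """P(X > x) for a Pareto (power-law) distribution."""
    if x <= x_min:
        return 1.0
    return (x_min / x) ** alpha


# Hypothetical parameters, chosen only so the power law has a finite
# mean and variance (that requires alpha > 2).
X_MIN = 1.0
ALPHA = 3.0

# Mean and standard deviation of that Pareto distribution, so the
# Gaussian comparison is matched on both.
PARETO_MEAN = ALPHA * X_MIN / (ALPHA - 1)
PARETO_STD = math.sqrt(X_MIN**2 * ALPHA / ((ALPHA - 1) ** 2 * (ALPHA - 2)))

for size in (2.0, 5.0, 10.0):
    g = gaussian_tail(size, PARETO_MEAN, PARETO_STD)
    p = pareto_tail(size, X_MIN, ALPHA)
    print(f"size > {size:>4}: Gaussian {g:.2e}   power law {p:.2e}")
```

With these made-up numbers, the power law says an event bigger than 10 happens about one time in a thousand, while the matched bell curve puts it at effectively zero. Same average, same spread, and a completely different answer about whether the extreme event is worth planning for.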

Or in my personal example, of finding that almost no studies on acne provide data that’s relevant to me… I mean if I had low scientific literacy I might conclude, in my time of extreme need, that science doesn’t work. Thankfully I spent years working in materials science research labs, a few years working with cancer researchers, and then about a decade working with design researchers, so I have a decent idea of how to read research studies with the right context. Which means I spend a lot of time translating, both here and in my personal life, between research and action.

Now this paper is focused on the implications of research for organizational action, and as they state, “Researchers ignoring power-law effects risk drawing false conclusions and promulgating useless advice to practitioners. This is because what is important to most managers are the extremes they face, not the averages.”

The takeaway here is this: Science has nuance and it’s hard to understand nuance when we’re in extremes, especially when the nuance is in the form of which mathematical equations are being applied to the data. What I like about this paper is that it calls out the problem and points to the real-world implications of what math gets used when.

I hope this gives us all a bit more insight into how to understand research findings, because to truly understand and value science, we need the ability to recognize shared patterns as real while also recognizing that outliers and extremes may be important to look at. Thanks for listening.

Source: https://pubsonline.informs.org/doi/abs/10.1287/orsc.1090.0481 (PDF available via Google Scholar)