Regression to the Mean: Why Extremes Don't Last
Why extreme performance — sporting peaks, market gains, viral hits — almost always reverts to average. Galton's discovery, examples, and how to spot it.
Regression to the mean is the statistical phenomenon where unusually extreme measurements — outstanding performance, terrible results, surprise winners — tend to be followed by more average ones. It is one of the most underappreciated ideas in statistics, partly because it is invisible until you understand it, and partly because once you do, you cannot stop seeing it everywhere. Misreading regression to the mean is behind a huge fraction of bad business decisions, false faith in unproven medical treatments, and confused commentary about sports, markets, and education.
This guide covers what the effect actually is, where Francis Galton discovered it in 1886, four common real-world examples (Sports Illustrated cover jinx, fund manager performance, the placebo trap, regression to mediocrity in school cohorts), and a checklist for spotting it in your own thinking.
What regression to the mean actually means
Not magic — just maths plus a confusion about what 'extreme' implies
Most measurable outcomes — sales figures, exam scores, sporting performances, blood pressure readings — combine a stable underlying signal with random noise. Skill, conditions, sample peculiarities, and luck are all baked into the number you observe. When the number you measure is unusually high, it is high partly because the underlying signal is good and partly because the noise was favourable. Repeat the measurement, and the noise re-rolls. The next observation tends to be closer to the underlying signal than the extreme reading was.
The same logic runs the other way. If a fund manager has a terrible year, some of that is genuine underperformance and some is bad luck. The next year, the bad luck does not necessarily continue, so the manager looks better. Crucially, this is true even when nothing about the manager has changed.
Two technical points are worth holding on to. First, regression to the mean only happens when there is some genuine noise — if outcomes were entirely deterministic, extreme readings would just keep happening. Second, it works most strongly when the correlation between two measurements is weakest. The less the underlying signal explains, the more the second measurement looks like the population average rather than a near-copy of the first.
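Both points can be checked with a short simulation. The signal and noise spreads below are illustrative assumptions, not estimates from any real domain; the point is only that selecting on an extreme first measurement guarantees a less extreme second one.

```python
import random

random.seed(42)

N = 100_000
SKILL_SD = 1.0   # spread of the stable underlying signal (assumed)
NOISE_SD = 1.0   # spread of the random noise that re-rolls each time (assumed)

# Two measurements of the same individuals: same skill, fresh noise each time.
skills = [random.gauss(0, SKILL_SD) for _ in range(N)]
first  = [s + random.gauss(0, NOISE_SD) for s in skills]
second = [s + random.gauss(0, NOISE_SD) for s in skills]

# Select the individuals whose FIRST measurement was extreme (top 5%).
cutoff = sorted(first)[int(0.95 * N)]
extreme = [i for i in range(N) if first[i] >= cutoff]

avg_first = sum(first[i] for i in extreme) / len(extreme)
avg_second = sum(second[i] for i in extreme) / len(extreme)

print(f"top-5% group, first measurement:  {avg_first:.2f}")
print(f"same group,   second measurement: {avg_second:.2f}")
```

With equal signal and noise spreads the two measurements correlate at 0.5, so the selected group's second average lands about halfway back to the population mean of zero. Shrink `SKILL_SD` relative to `NOISE_SD` and the regression gets more dramatic, exactly as the second technical point says.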
Where the idea came from: Galton's sweet peas
The Victorian polymath who noticed offspring rarely matched their parents' extremes
Sir Francis Galton — Charles Darwin's half-cousin and a founder of much of modern statistics — first noticed the effect in the 1870s while studying the seeds of sweet peas, before publishing his famous account of it in 1886. He sorted parent peas by size and tracked the offspring. To his surprise, the children of giant peas were on average smaller than their parents, and the children of tiny peas were on average bigger. Both groups had drifted toward the population average.
He found the same thing in human heights. Tall fathers had sons who were tall but not as tall on average. Short fathers had sons who were short but not as short. He called this 'regression toward mediocrity' — a slightly miserable label that has stuck in the maths world ever since, although modern statisticians prefer the neutral 'regression toward the mean'.
Galton initially mistook this for an inheritance phenomenon, as if biology were pulling everyone toward an average size. The mathematician Karl Pearson later showed it was simply a property of imperfectly correlated measurements: any time two measurements share less than perfect correlation, the extreme values on one will drift toward average on the other. No biological mechanism required.
Example 1: the Sports Illustrated cover jinx
Why the next season tends to disappoint after a magazine cover
For decades, fans have joked about the Sports Illustrated cover jinx — the apparent curse where athletes who appear on the magazine's cover go on to underperform in the following season. There are dozens of high-profile cases, from injured stars to hitters who lose their batting eye, that fuel the legend.
The jinx is real in the sense that the average post-cover season really does tend to be worse. But the cause is not magic. Athletes appear on the cover precisely because they have just had an extraordinary run — a hot streak, a record-breaking month, a championship win. That extreme performance combines genuine skill with favourable random variation. The next season, the variation re-rolls. Their underlying ability has not changed, but the universe is no longer cooperating to the same degree, and they look worse.
The same effect explains why rookies-of-the-year often have weaker sophomore seasons, why the Madden NFL cover has its own apparent curse, and why a basketball player who scores 50 points one night usually scores fewer the next. Each of these involves selecting on extreme observations and then watching the noise drift back to baseline.
Example 2: fund manager performance and the star fund effect
Why last year's top fund is a poor pick for next year
Every year, a few mutual funds dramatically beat the index. Investors flock to them, magazines run features, the manager gets called a genius. The following year, those same funds usually drift back toward the average — not because the manager has changed, but because some of last year's outperformance was luck, and luck does not persist.
This pattern has been documented exhaustively. The S&P SPIVA persistence scorecards consistently show that of the top 25% of US equity funds in any given year, only a small fraction stay in the top 25% the next year — barely better than chance. Pick the very top fund, and it is roughly as likely to land in the bottom half next year as the top half. The same regression-to-mean logic explains why picking active funds on past performance does not beat the index after fees: the year-on-year persistence in 'skill' is too weak to overcome the cost drag.
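A toy simulation makes the SPIVA-style result unsurprising. The skill and luck spreads below are invented for illustration; the only assumption that matters is that year-to-year luck dwarfs the genuine differences between managers.

```python
import random

random.seed(1)

N_FUNDS = 2000
SKILL_SD = 0.01   # assumed: genuine annual edge varies little between funds
LUCK_SD = 0.05    # assumed: yearly luck is much larger than skill differences

skill = [random.gauss(0, SKILL_SD) for _ in range(N_FUNDS)]
year1 = [s + random.gauss(0, LUCK_SD) for s in skill]
year2 = [s + random.gauss(0, LUCK_SD) for s in skill]

# Funds in the top quartile each year.
q1 = sorted(year1, reverse=True)[N_FUNDS // 4 - 1]
top1 = {i for i in range(N_FUNDS) if year1[i] >= q1}
q2 = sorted(year2, reverse=True)[N_FUNDS // 4 - 1]
top2 = {i for i in range(N_FUNDS) if year2[i] >= q2}

repeat_rate = len(top1 & top2) / len(top1)
print(f"top-quartile funds repeating next year: {repeat_rate:.0%}")
```

With luck this dominant, the repeat rate hovers just above the 25% you would get from pure chance, even though real skill differences do exist in the simulation.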
If you want a deeper dive into how to think about this kind of probabilistic edge in investing, we cover related ideas in our guide to risk vs uncertainty and the Kelly criterion for bet sizing.
Example 3: the placebo trap in medicine
Why anecdotal cures look impressive even when treatments do nothing
Imagine you have a chronic back problem. Most weeks the pain hovers around a 4 out of 10. Occasionally it spikes to 8 — and on one of those bad days, you finally try a new supplement, an alternative therapy, or a fad treatment. A few days later, the pain is back to 4. You credit the treatment.
The trap is that you would be just as likely to feel better even if you had taken nothing. People typically seek treatment when symptoms are at their worst, and symptoms generally regress to the personal average over time, regardless of intervention. Because the treatment was applied at the peak, the natural drift back to baseline gets falsely attributed to the treatment.
This is one of the central reasons that medicine relies on randomised controlled trials with placebo arms. Without a control group experiencing the same regression to the mean without the active treatment, the apparent effect of any intervention will be inflated. Bayesian thinking is the main antidote — see our guide on Bayesian thinking for the underlying framework.
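The trap is easy to reproduce in a sketch. The simulation below assumes pain scores are independent draws around a stable personal baseline (a deliberate simplification) and applies a 'treatment' that does literally nothing.

```python
import random

random.seed(7)

BASELINE, SD, DAYS = 4.0, 1.5, 100_000  # assumed: pain fluctuates around a stable baseline

pain = [min(10.0, max(0.0, random.gauss(BASELINE, SD))) for _ in range(DAYS)]

# 'Treat' with something completely inert, but only on bad days (pain >= 7),
# then re-measure three days later.
treated_days = [d for d in range(DAYS - 3) if pain[d] >= 7]
on_day = sum(pain[d] for d in treated_days) / len(treated_days)
later = sum(pain[d + 3] for d in treated_days) / len(treated_days)

print(f"pain when the inert treatment was taken: {on_day:.1f}")
print(f"pain three days later:                   {later:.1f}")
```

Because the treatment is only ever taken at a peak, the follow-up reading is back near the baseline of 4, and the inert pill looks like it knocked several points off the pain score. A control group selected at the same peaks would show the identical 'improvement'.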
Example 4: the schools that 'turn around' versus the schools that 'collapse'
Education league tables are dominated by regression — not policy
Every year, a handful of schools post dramatically improved results, sometimes after a new headteacher has been appointed. Other schools see a sharp drop, often after a perceived management failure. The press attributes this to leadership, the new curriculum, or the latest reform. In reality, much of it is regression to the mean.
Consider a school that finished bottom of the league last year. That ranking reflects a mix of genuine challenges (deprivation, intake, staffing) and luck (a difficult cohort, an unusually low-achieving exam group). This year, the unlucky cohort has moved on; the new cohort is closer to the school's underlying baseline. The school's results bounce upward, and whoever is in charge takes the credit.
The reverse happens at top schools. A school riding a remarkable cohort to the top of the table will, all else equal, drop the following year as the next cohort regresses toward the school's actual baseline. This is one of the reasons that educational researchers warn against drawing causal conclusions from year-on-year league-table movements without a careful comparison group.
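The size of the effect can be gauged with a rough simulation of a 150-school league table. The quality and cohort-luck spreads are assumptions, chosen here to be equally large.

```python
import random

random.seed(3)

N, TRIALS = 150, 2000
QUALITY_SD = 5.0   # assumed: real differences between schools
COHORT_SD = 5.0    # assumed: cohort-to-cohort luck, just as large

year2_ranks = []
for _ in range(TRIALS):
    quality = [random.gauss(50, QUALITY_SD) for _ in range(N)]
    year1 = [q + random.gauss(0, COHORT_SD) for q in quality]
    year2 = [q + random.gauss(0, COHORT_SD) for q in quality]
    bottom = min(range(N), key=lambda i: year1[i])
    # Rank 1 = top of the table; count schools that beat last year's worst.
    year2_ranks.append(1 + sum(1 for s in year2 if s > year2[bottom]))

avg_rank = sum(year2_ranks) / TRIALS
print(f"last year's bottom school, average rank this year: {avg_rank:.0f} of {N}")
```

The school that finished dead last climbs a couple of dozen places on average with no change in leadership, curriculum, or policy. Any real intervention made in that year gets the credit by default.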
How to spot regression in your own thinking
Five questions that catch most cases
If you are about to make a decision based on a recent extreme — a personnel change, an investment decision, a treatment switch, a strategy pivot — run the observation through this checklist before committing.
- Was the case selected because it was extreme? If yes, the next observation will tend toward average regardless of any action you take.
- Could the underlying mechanism have changed since the extreme observation? If no, you are looking at noise. If yes, separate the skill change from the noise drift before judging effectiveness.
- Is there a control group? Anecdote, single-case stories, and individual testimonials cannot distinguish treatment effect from regression. A peer comparison without the intervention does.
- How much of the original measurement is signal? Quick rule: the lower the correlation between repeated measurements, the more dramatic the regression. Domains like sports, weather, finance, and mood readings have low autocorrelation and dramatic regression. Domains like physical measurements (height, weight) have high autocorrelation and small regression.
- What does the population average suggest? The expected next observation, in the absence of new information, is usually closer to the population mean than to the recent extreme. Anchor your forecast there.
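The last checklist item has a simple formula behind it, going back to Galton: shrink the extreme observation toward the population mean by the correlation between repeated measurements. A minimal sketch (the striker numbers are invented purely for illustration):

```python
def regressed_forecast(observed, population_mean, r):
    """Expected next observation after an extreme reading.

    r is the correlation between repeated measurements:
    r = 1 means no regression; r = 0 means forecast the mean.
    """
    return population_mean + r * (observed - population_mean)

# A striker scores 50 goals in a league averaging 15 per season,
# with an assumed season-to-season correlation of 0.6:
print(regressed_forecast(50, 15, 0.6))  # → 36.0
```

A naive forecast repeats the 50; the regressed forecast gives back most of the gap to the league average, and in low-correlation domains it gives back nearly all of it.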
Why this matters for everyday decisions
Three patterns where mistaking regression costs real money or wellbeing
Three real-world patterns repay the effort of internalising the idea.
Hiring and promotion decisions. Companies often promote people on the back of one outstanding quarter or year. If that performance was partly luck, the promoted person regresses to a more average level — and the company concludes that promotion ruined them. The Peter Principle (people rising to their level of incompetence) overlaps heavily with regression to the mean. The better approach is to average performance over several cycles before drawing conclusions.
Investment chasing. Investors who switch out of a fund after a bad year and into one after a great year are buying high and selling low — exactly the wrong move when both will tend to regress. The discipline of dollar-cost averaging into broad indexes sidesteps this trap entirely. Combine it with a probabilistic framework for evaluating choices and most of the noise becomes manageable.
Performance feedback in sport, education, and business. Praising someone after an exceptional performance and then watching them regress can feel like the praise hurt them. Criticising after a terrible performance and watching them improve can feel like the criticism worked. Both are mostly regression. Daniel Kahneman's classic observation of Israeli flight instructors found exactly this effect, and the false lesson the instructors drew was that punishment works better than praise — when in fact neither response was driving the change.
Frequently asked questions
Is regression to the mean the same as the gambler's fallacy?
No. The gambler's fallacy expects chance to actively compensate, as if a run of heads makes tails 'due'. Regression makes no such claim: it simply expects the favourable noise behind an extreme reading not to repeat, so the next observation drifts back toward the average.
Does regression mean my exceptional performance was luck?
Partly, in most domains. An extreme observation usually combines genuine skill with favourable noise. Regression says the noise will not persist; it does not say the skill is absent.
How do I decide whether an effect is regression or a real change?
Look for a comparison group that was selected on the same extreme but did not receive the intervention, or average over several repeated measurements. If the 'improvement' matches what untreated extreme cases show, you are looking at regression.
Can regression to the mean be used to make money?
Mostly by avoiding losses: not chasing last year's star fund, not buying after a euphoric run, and not paying a premium for a single extreme result. Any strategy that treats one extreme observation as the new normal is mispricing the noise.
Does regression apply to my own self-assessment over time?
Yes. Mood, energy, and self-rated performance are noisy measurements around a personal baseline, so your worst days are usually followed by better ones, whatever you did in between.
Regression to the mean is one of those ideas that, once internalised, quietly improves the quality of every decision involving uncertainty. It is the default explanation for the cover jinx, the star fund, the alternative therapy that worked, the school that bounced back. None of those stories require a hidden mechanism — just an awareness that extreme observations are partly luck, and that luck does not persist.
For a wider toolkit on probabilistic reasoning, see our guides to correlation vs causation and thinking in probabilities.
Want to think more clearly about uncertainty?
Our reading list has 15 books that build the foundations of probabilistic thinking, from Tversky and Kahneman to Howard Marks.