The Kelly Criterion: How to Size Your Bets Optimally
The Kelly Criterion tells you exactly how much to stake when you have an edge. Formula derivation, fractional Kelly, history, and worked examples for 2026.
Imagine you've found a coin that lands heads 60% of the time. You have £10,000 to bet on repeated flips, and each bet pays even money (win = double your stake, lose = lose your stake). You clearly have an edge. But how much should you bet on each flip?
Bet too little, and you're leaving money on the table. Bet too much, and a run of bad luck will wipe you out before your edge has time to compound. The extremes are obvious - betting 1% of your bankroll is overly conservative, while betting 100% guarantees eventual ruin (you only need one loss).
This is the bet-sizing problem - separate from the question of whether you have an edge at all (which is what expected value tells you) - and it has an elegant mathematical solution: the Kelly Criterion, developed by John L. Kelly Jr. at Bell Labs in 1956. Originally designed for information theory problems, it quickly found applications in gambling, investing, and anywhere else that decisions must be sized under uncertainty.
Who invented the Kelly Criterion?
From Bell Labs information theory to Las Vegas blackjack
John Larry Kelly Jr. (1923-1965) was a Texas-born physicist who joined Bell Labs in 1953 after a PhD from the University of Texas at Austin. He worked in the same group as Claude Shannon, the father of information theory, and the 1956 paper that gave us the Kelly Criterion sits squarely inside Shannon's intellectual world.
The paper - A New Interpretation of Information Rate, published in the Bell System Technical Journal - wasn't framed as a gambling formula. Kelly was answering a Shannon-style question: if you receive a noisy signal that gives you a probabilistic edge over the market price of a bet, what is the maximum rate at which information (and therefore money) can flow to you? The answer turned out to be a logarithmic utility function, and the optimal staking fraction that maximises long-run growth dropped out as a corollary.
Kelly himself never gambled on his own formula. He died of a brain haemorrhage in 1965 at age 41, having spent his career on speech synthesis and information theory rather than betting. The formula's translation to gambling was the work of Edward Thorp, an MIT mathematician who used it to size his blackjack bets at Reno and Lake Tahoe casinos in the early 1960s. Thorp later extended the same framework to convertible bond arbitrage at Princeton-Newport Partners, one of the first quantitative hedge funds.
The path from a 1956 information-theory paper to modern quantitative finance runs directly through Thorp's 1962 book Beat the Dealer, which made counting cards public and gave the Kelly Criterion its first popular audience. By the 1980s the formula was widely known on quant trading desks.
What is the Kelly formula?
Maximum long-term growth in a single equation
For a simple bet with two outcomes (win or lose), the Kelly Criterion says to stake the following fraction of your bankroll:
f* = (bp - q) / b
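Where:
f* = optimal fraction of bankroll to bet
b = net odds received (e.g., even money = 1, 2-to-1 = 2)
p = probability of winning
q = probability of losing (1 - p)
For the 60% coin at even money, b = 1, p = 0.60 and q = 0.40, so f* = (1 × 0.60 - 0.40) / 1 = 0.20.
Kelly says to bet exactly 20% of your current bankroll on each flip. This fraction maximises the expected logarithm of wealth - or equivalently, maximises the long-term compound growth rate of your bankroll.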
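As a minimal sketch, the formula can be written as a small Python function. The name kelly_fraction is hypothetical, p is the win probability and q = 1 - p, and returning zero when there is no edge is an assumption of this sketch rather than part of the formula.

```python
def kelly_fraction(p: float, b: float) -> float:
    """Kelly stake as a fraction of bankroll for a win/lose bet.

    p: probability of winning (strictly between 0 and 1)
    b: net odds received on a win (even money = 1.0)
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if b <= 0.0:
        raise ValueError("b must be positive")
    q = 1.0 - p
    f = (b * p - q) / b
    # A negative f means the bet has no edge. This sketch assumes you
    # simply do not bet in that case (an assumption, not part of the formula).
    return max(f, 0.0)


# The 60% coin at even money from the introduction.
print(round(kelly_fraction(0.60, 1.0), 4))
```

Calling it with p = 0.60 and b = 1.0 reproduces the 20% stake discussed here; a coin that favours the other side returns zero.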
How is the Kelly formula derived?
The math behind log-wealth optimisation
The derivation is short enough to do in one sitting and worth understanding because it explains why Kelly is uniquely correct rather than one option among many.
Setup. You bet a fraction f of your bankroll W on a wager that pays net odds b with win probability p. After one bet, your bankroll is either W(1 + bf) on a win or W(1 - f) on a loss.
Step 1: Expected log-wealth. Because gains compound multiplicatively rather than additively, we want to maximise the expected logarithm of wealth rather than expected wealth itself. After N bets, log-wealth grows linearly with the per-bet expected log-return:
E[log(W_N)] = log(W_0) + N × E[log(1 + return)]
Step 2: One-bet expected log-return.
G(f) = p × log(1 + bf) + q × log(1 - f)
This is the expected geometric growth rate per bet as a function of the stake fraction f.
Step 3: Differentiate and set to zero.
dG/df = pb / (1 + bf) - q / (1 - f) = 0
Multiplying through and rearranging:
pb(1 - f) = q(1 + bf)
pb - pbf = q + qbf
pb - q = pbf + qbf
pb - q = bf(p + q)
f* = (bp - q) / b (since p + q = 1)
Step 4: Second derivative check. The second derivative, -pb² / (1 + bf)² - q / (1 - f)², is negative everywhere on (0, 1), confirming f* is a maximum and not a minimum. Kelly is therefore the unique fraction that maximises expected log-wealth - any other fraction grows your bankroll more slowly in the long run.
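One way to sanity-check the derivation is to maximise G(f) by brute force and compare the result with the closed form. The sketch below does that on a grid; the function names and the example values of p and b are hypothetical, chosen only for illustration.

```python
import math


def growth_rate(f, p, b):
    """Expected log-return per bet: G(f) = p*log(1 + b*f) + q*log(1 - f)."""
    q = 1.0 - p
    return p * math.log(1.0 + b * f) + q * math.log(1.0 - f)


def numeric_argmax(p, b, steps=10_000):
    """Search a grid over [0, 1) for the stake fraction that maximises G."""
    # f = 1 is excluded because log(1 - f) is undefined there.
    grid = (i / steps for i in range(steps))
    return max(grid, key=lambda f: growth_rate(f, p, b))


p, b = 0.55, 1.10  # hypothetical example values
closed_form = (b * p - (1.0 - p)) / b
print(closed_form, numeric_argmax(p, b))
```

Because G is concave on [0, 1), the grid maximum should land within one grid step of the closed-form f*.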
What this proves. The derivation is constructive: nothing was assumed about your risk preferences or your time horizon beyond the geometric-growth framing. If you bet repeatedly and care about long-run multiplicative wealth, Kelly is optimal. If you care about something else - drawdown psychology, finite horizons, uncertain edge - that's where fractional Kelly comes in.
Why does Kelly maximise growth?
The mathematics of compounding under uncertainty
The Kelly Criterion doesn't maximise the expected value of your bankroll (that would mean betting everything). Instead, it maximises the expected geometric growth rate. This distinction is crucial.
With a 60% coin at even money, betting 20% each time:
After a win: bankroll × 1.20
After a loss: bankroll × 0.80
The expected growth factor per bet is:
G = (1.20)^0.60 × (0.80)^0.40 ≈ 1.020
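That's about 2% growth per bet. After 100 bets, a typical path with 60 wins and 40 losses ends at £10,000 × 1.20^60 × 0.80^40 ≈ £74,900. (The arithmetic expected bankroll is much higher, but it is dominated by a handful of very lucky paths.)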
Now consider what happens if you over-bet at 50% of bankroll:
After a win: bankroll × 1.50
After a loss: bankroll × 0.50
G = (1.50)^0.60 × (0.50)^0.40 ≈ 0.967
Despite having a clear edge, over-betting produces a growth factor below 1.0 - your bankroll will shrink over time! This is the tragedy of over-betting: you can have a genuine advantage and still go broke.
The deeper reason Kelly works - and why log-wealth optimisation matters - is the concept of ergodicity: for repeated personal bets, the time-averaged growth rate diverges from the ensemble expected value, and Kelly aligns you with the path you actually live.
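The gap between the ensemble average and the path you actually live can be made concrete without any simulation. The sketch below compares the average wealth multiple across all possible paths with the multiple along a path that wins its expected share of bets. The function names are hypothetical, and treating the "expected share of wins" path as the typical outcome is an approximation that improves as the number of bets grows.

```python
import math


def mean_multiple(p, b, f, n):
    """Ensemble-average wealth multiple after n bets: (1 + f*(p*b - q))**n."""
    q = 1.0 - p
    return (1.0 + f * (p * b - q)) ** n


def typical_multiple(p, b, f, n):
    """Wealth multiple along a path with the expected share of wins,
    i.e. exp(n * G(f)); for large n this tracks the median outcome."""
    q = 1.0 - p
    return math.exp(n * (p * math.log(1.0 + b * f) + q * math.log(1.0 - f)))


# 60% coin at even money, 100 bets, Kelly stake versus a 50% stake.
for f in (0.20, 0.50):
    print(f,
          round(mean_multiple(0.6, 1.0, f, 100), 1),
          round(typical_multiple(0.6, 1.0, f, 100), 4))
```

Under these assumptions the 50% stake has the higher average across paths, yet its typical path ends below the starting bankroll, while the 20% stake grows on the typical path.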
Growth factor per bet at different stake sizes, for the 60% coin at even money:
5% of bankroll
Growth factor: 1.009 per bet (conservative)
10% of bankroll
Growth factor: 1.015 per bet
20% of bankroll (Kelly)
Growth factor: 1.020 per bet (optimal)
30% of bankroll
Growth factor: 1.015 per bet (over-betting)
40% of bankroll
Growth factor: 0.998 per bet (severely over-betting, bankroll slowly shrinks)
50% of bankroll
Growth factor: 0.967 per bet (bankroll shrinks!)
100% of bankroll
Growth factor: 0.000 (guaranteed ruin)
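The growth factor at any stake size can be recomputed directly from G = (1 + bf)^p × (1 - f)^q. A short sketch, assuming the 60% coin at even money as defaults (the function name growth_factor is hypothetical):

```python
def growth_factor(f, p=0.60, b=1.0):
    """Geometric growth factor per bet: (1 + b*f)**p * (1 - f)**(1 - p)."""
    return (1.0 + b * f) ** p * (1.0 - f) ** (1.0 - p)


for f in (0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 1.00):
    print(f"{f:>4.0%} of bankroll -> growth factor {growth_factor(f):.3f} per bet")
```

Changing p or b shows how quickly the safe range of stake sizes narrows as the edge shrinks.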
Full Kelly or fractional Kelly: which should you use?
Why every serious practitioner stakes less than the formula says
In theory, full Kelly is optimal. In practice, almost everyone uses a fraction of Kelly - typically between one-quarter and one-half. Here's why:
1. Probability estimates are uncertain
Kelly assumes you have the true probability p in hand. In practice you almost never do: you have an estimate of p, and the gap between measurable risk and unmeasurable uncertainty matters enormously here. The distinction between risk and uncertainty explains why fractional Kelly is the safe default rather than a fudge factor.
Your estimate may simply be wrong - and as we explore in Thinking in Probabilities, our brains are systematically biased in how we assess likelihood. If you think your win probability is 60% but it's actually 52%, the true Kelly stake at even money is 4%, not 20%, so full Kelly on the wrong probability means staking five times too much. Calibration training is the practical fix: it shrinks the gap between your stated probabilities and reality, so the input to Kelly is honest.
2. Drawdowns are psychologically brutal
Full Kelly can produce drawdowns of 50-80% from peak. Most humans cannot stick to a strategy through such volatility. A smaller stake produces smoother growth, which means you're more likely to actually follow through.
3. Kelly assumes an infinite time horizon
If you need the money within a specific timeframe, the variance of full Kelly might be unacceptable. Half-Kelly gives you 75% of the growth rate with significantly reduced variance.
4. Real-world frictions
Transaction costs, taxes, liquidity constraints, and the inability to bet exactly the right fraction all erode the theoretical optimality of full Kelly.
The common recommendation among quantitative traders and professional gamblers is half-Kelly - it captures most of the growth rate at a fraction of the variance and survives realistic levels of probability-estimate error.
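For the 60% coin at even money (drawdown figures are indicative; P(halving) is the chance of ever falling to half your starting bankroll, using the continuous-time approximation 0.5^(2/c - 1) for Kelly fraction c):
Full Kelly (20%)
Max growth: 1.020/bet | Max drawdown: ~60% | P(halving): ~50%
3/4 Kelly (15%)
Growth: 1.019/bet | Max drawdown: ~45% | P(halving): ~31%
Half Kelly (10%)
Growth: 1.015/bet | Max drawdown: ~30% | P(halving): ~12%
Quarter Kelly (5%)
Growth: 1.009/bet | Max drawdown: ~15% | P(halving): <1%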
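To see how fractional Kelly interacts with a wrong probability estimate, the sketch below sizes the stake from an estimated p and then evaluates growth under a different true p. The function names are hypothetical, and the 52% "true" probability is an illustrative assumption, not a measured figure.

```python
import math


def growth_rate(f, p_true, b=1.0):
    """Expected log-growth per bet when the real win probability is p_true."""
    return p_true * math.log(1.0 + b * f) + (1.0 - p_true) * math.log(1.0 - f)


def fractional_stake(p_est, b=1.0, fraction=1.0):
    """Stake = fraction x Kelly, computed from the *estimated* probability."""
    return fraction * (b * p_est - (1.0 - p_est)) / b


p_est = 0.60
for p_true in (0.60, 0.52):  # second value: a hypothetical overestimated edge
    for fraction in (1.0, 0.5, 0.25):
        f = fractional_stake(p_est, fraction=fraction)
        g = growth_rate(f, p_true)
        print(f"true p={p_true:.2f}  {fraction:>4} Kelly  "
              f"stake={f:.2%}  growth/bet={g:+.4f}")
```

Under these assumptions half-Kelly keeps roughly three-quarters of the growth rate when the estimate is right. When the true probability is 52%, full Kelly loses money, half-Kelly is close to flat, and quarter-Kelly still grows, which is the practical case for erring on the small side.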
How does Kelly handle multiple simultaneous bets?
The multivariate version and why correlation matters
The single-bet Kelly formula assumes you make one wager, observe the result, then make the next. Real bettors and portfolio managers rarely have that luxury - you have positions open on five sports matches concurrently, or hold a dozen stocks at once.
Independent simultaneous bets. When the outcomes are statistically independent, you can apply Kelly to each bet in isolation - but only if the combined stake doesn't exceed your bankroll. A practical adjustment is to compute the unconstrained Kelly fraction for each bet, sum them, and if the total exceeds 1.0, scale every position proportionally. This preserves the ratio of allocations while keeping you solvent.
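The proportional scaling step might look like the following sketch. The function name scale_to_bankroll and the example fractions are hypothetical.

```python
def scale_to_bankroll(fractions, cap=1.0):
    """Shrink stake fractions proportionally so they sum to at most cap."""
    total = sum(fractions)
    if total <= cap:
        return list(fractions)  # already affordable, leave unchanged
    return [f * cap / total for f in fractions]


# Hypothetical unconstrained Kelly fractions that sum to 1.2.
stakes = scale_to_bankroll([0.5, 0.4, 0.3])
print([round(s, 4) for s in stakes])
```

The ratios between positions are unchanged; only the total is brought back to the bankroll. A cap below 1.0 can be passed if you want to keep some capital in reserve.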
Correlated bets. Correlation is where naive Kelly goes badly wrong. Suppose you've identified value on three Premier League matches all played on the same wet weekend - rain helps underdogs, so a wet weekend means your three bets are positively correlated. The single-bet Kelly formula treats them as independent and over-allocates because it ignores the compounding downside of all three losing together.
The correct fix is the multivariate Kelly formulation, which in its continuous form looks identical to Markowitz portfolio optimisation:
f* = Σ⁻¹ × μ
Where Σ is the covariance matrix of returns and μ is the vector of expected excess returns. The matrix inverse punishes correlation: two highly correlated bets get a smaller combined allocation than the sum of their independent Kelly fractions.
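For two positions the matrix inverse has a closed form, so the effect of correlation can be sketched in plain Python. The function name and the return and volatility figures below are hypothetical, picked only to show the direction of the effect.

```python
def two_asset_kelly(mu1, mu2, sigma1, sigma2, rho):
    """Solve f = inverse(Sigma) * mu for two assets via the 2x2 inverse."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must be strictly between -1 and 1")
    cov = rho * sigma1 * sigma2                # off-diagonal covariance term
    det = sigma1**2 * sigma2**2 - cov**2       # determinant of Sigma
    f1 = (sigma2**2 * mu1 - cov * mu2) / det
    f2 = (sigma1**2 * mu2 - cov * mu1) / det
    return f1, f2


# Two identical hypothetical bets: 5% expected excess return, 20% volatility.
for rho in (0.0, 0.5, 0.9):
    f1, f2 = two_asset_kelly(0.05, 0.05, 0.20, 0.20, rho)
    print(f"rho={rho:.1f}  f1={f1:.3f}  f2={f2:.3f}  combined={f1 + f2:.3f}")
```

With identical bets each allocation reduces to μ / (σ²(1 + ρ)), so the combined stake falls steadily as the correlation rises.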
Practical heuristic. If you don't want to estimate a full covariance matrix, a useful rule of thumb is: estimate the correlation between each pair of bets, multiply each pair's combined Kelly stake by (1 - ρ), and shrink the total until the worst-case drawdown is acceptable. Half-Kelly with this correlation haircut survives almost every realistic correlated-bet scenario.
The practical lesson is the same in finance and sports: always ask what correlates these positions before sizing them. A perfectly Kelly-sized portfolio of correlated bets is no longer Kelly-sized at all.
How do you use Kelly for investing?
Position sizing for stocks and continuous returns
The Kelly framework extends beyond simple win/lose bets to continuous outcomes like stock returns. For an investment with expected return μ and variance σ², the Kelly fraction is approximately:
f* = μ / σ²
This is sometimes called the Merton share in finance. It says you should invest more aggressively when:
Expected returns are higher
Volatility is lower
For example, if a stock index has expected excess return of 6% per year and volatility of 16%:
f* = 0.06 / (0.16)² = 0.06 / 0.0256 ≈ 2.34
Full Kelly suggests 234% allocation - i.e., heavy leverage! This illustrates why fractional Kelly is essential in practice. Half-Kelly gives ~117%, and quarter-Kelly gives ~59%, which aligns much better with conventional investment wisdom for equity allocation.
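The same arithmetic as a small sketch, with a Kelly-fraction multiplier added. The function name continuous_kelly is hypothetical, and the inputs are the illustrative 6% excess return and 16% volatility used in this section.

```python
def continuous_kelly(mu, sigma, fraction=1.0):
    """Approximate Kelly allocation f* = mu / sigma**2, scaled by a fraction."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    return fraction * mu / sigma**2


for fraction in (1.0, 0.5, 0.25):
    allocation = continuous_kelly(0.06, 0.16, fraction)
    print(f"{fraction:.2f} x Kelly -> {allocation:.0%} of bankroll")
```

Note that sigma is the volatility (standard deviation), not the variance; the function squares it internally.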
Warren Buffett, though he likely doesn't use the formula explicitly, describes his approach in Kelly-like terms, often paraphrased as: "When you have a big edge, bet big. When you don't, don't bet at all." His concentrated portfolio reflects Kelly thinking - large positions where he has high conviction.
How did Edward Thorp apply Kelly in the real world?
From counting cards to running one of the first quant funds
If Kelly is the formula, Edward Thorp is the practitioner who proved it worked. Thorp, who held a mathematics PhD from UCLA and was teaching at MIT, worked out around 1960 that blackjack could be beaten by tracking the count of high cards remaining in the deck. The card-counting edge was small - typically 0.5 to 1.5% on individual hands - but it was real, and Kelly was the natural tool for sizing bets given that small edge.
Thorp's 1962 book Beat the Dealer documented the strategy, ran the math on a $10,000 bankroll played at Reno and Lake Tahoe, and made the Kelly Criterion famous outside Bell Labs. The book triggered a wave of casino countermeasures - more decks, faster shuffles, surveillance cameras - that still shape blackjack today.
Thorp then translated the same framework to financial markets. In 1969 he co-founded Princeton-Newport Partners, widely considered one of the first quantitative hedge funds. The fund traded convertible-bond arbitrage and warrant pricing using statistical edges and Kelly-sized positions. Between 1969 and 1988 it returned roughly 19% annually with very low correlation to the S&P 500 - a profile that became the template for the modern quant industry.
Thorp's practical lessons from those years are worth repeating:
Use fractional Kelly, because your estimate of p is always softer than the formula assumes.
Track correlation aggressively - the worst losses come when multiple positions go bad together.
Recompute your bankroll continuously rather than locking in a fraction of an initial number. Kelly is a fraction of current wealth.
Never bet what you can't afford to lose psychologically, regardless of what the math says. The strategy only works if you stick to it through drawdowns.
Thorp's autobiography A Man for All Markets remains the best primary-source account of using Kelly in practice across two completely different markets.
Worked example: betting Premier League football
Suppose you've built a model that identifies value in football match outcomes. Your model rates Team A to win at 55% probability, but the bookmaker is offering odds of 2.10 (implying only ~48% probability).
Decimal odds of 2.10 mean net odds of b = 1.10, so with p = 0.55 and q = 0.45: f* = (1.10 × 0.55 - 0.45) / 1.10 = 0.155 / 1.10 ≈ 0.141.
Full Kelly says bet 14.1% of your bankroll. Using half-Kelly (a more sensible practical approach), you'd bet 7.05%.
With a £5,000 bankroll, that's:
Full Kelly: £705
Half Kelly: £352
Quarter Kelly: £176
If your model is well-calibrated and you can find several such bets per week, even quarter-Kelly produces substantial long-term growth. The discipline is in never exceeding your Kelly fraction, even when you feel especially confident about a particular bet.
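The worked example can be reproduced with a short sketch that converts decimal odds to net odds and then to a cash stake. The function name stake_from_decimal_odds is hypothetical, and returning no stake when the model sees no edge is an assumption of the sketch.

```python
def stake_from_decimal_odds(p, decimal_odds, bankroll, fraction=1.0):
    """Cash stake for a bet quoted in decimal odds, at a chosen Kelly fraction."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    b = decimal_odds - 1.0               # net odds: profit per unit staked
    q = 1.0 - p
    kelly = max((b * p - q) / b, 0.0)    # assumption: no bet without an edge
    return fraction * kelly * bankroll


for label, fraction in (("Full", 1.0), ("Half", 0.5), ("Quarter", 0.25)):
    stake = stake_from_decimal_odds(0.55, 2.10, 5000, fraction)
    print(f"{label} Kelly: £{stake:.2f}")
```

The results differ from the rounded figures in the text by less than a pound, because the text rounds the Kelly fraction to 14.1% before multiplying.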
Why can't most people actually run Kelly?
The behavioural reality of long drawdowns
The math says full Kelly is optimal for long-run growth. The behaviour of real bettors says it's almost never the right choice. The gap between the two is psychology.
Drawdown tolerance is the binding constraint. Full Kelly with a 60% coin produces an expected peak-to-trough drawdown of around 60% over a 1000-bet horizon. That's not an unlucky tail outcome - it's the baseline. Few humans can watch their bankroll halve and then halve again and still bet the same fraction on the next flip. The behavioural response is almost always to cut the stake fraction after a drawdown, which means under-betting exactly when the bankroll needs to recover and abandons the constant-fraction rule that the formula's optimality depends on.
Loss aversion biases the recovery. Behavioural finance has documented for fifty years that humans feel losses roughly twice as keenly as equivalent gains (prospect theory, Kahneman and Tversky, 1979). A bettor running full Kelly during a 50% drawdown isn't comparing the current state to the long-run growth path - they're comparing it to the peak, and the emotional pain is dominating the formula. The practical consequence: full Kelly looks fine in simulations and is unsustainable in real accounts.
The half-Kelly compromise. Half-Kelly captures three-quarters of the long-run growth rate at roughly half the drawdown depth. That trade is almost always worth it because the strategy only delivers its long-run growth if you stick to it - and the probability that you stick to a strategy is a strong inverse function of how big the drawdowns get.
The discipline problem applies to professional traders too. Renaissance Technologies, the most successful quant fund of the modern era, reportedly stakes well below full Kelly on individual signals despite having vastly better probability estimates than anyone else. The reason is the same: even institutional capital has a behavioural ceiling on drawdown tolerance, and that ceiling binds the stake-sizing decision more tightly than the math does.
When doesn't Kelly apply?
Kelly is powerful but not universal. It breaks down or requires modification in several scenarios:
Correlated bets - If you're making multiple simultaneous bets that are correlated (e.g., several tech stocks), you need a multivariate version of Kelly. Treating them independently will over-allocate.
Uncertain edge - If you're not confident in your probability estimates, Kelly will recommend stakes that are too large. This is the strongest argument for fractional Kelly.
Non-binary outcomes - Real investments have a distribution of returns, not just "win" or "lose". The formula adapts to continuous distributions, but the calculation is more complex.
Finite bankroll concerns - If losing 50% of your bankroll has non-financial consequences (stress, missed rent, relationship strain), you should use a smaller fraction regardless of what Kelly recommends.
Liquidity constraints - You might not be able to actually deploy the Kelly-optimal amount due to position limits, market depth, or capital lock-up periods.
Frequently asked questions
Q01. What happens if I bet more than Kelly suggests?
Over-betting reduces your long-term growth rate. At roughly double the Kelly fraction (exactly double in the continuous approximation), your expected growth rate drops to zero - your bankroll is expected to stay flat despite having an edge. Beyond that point, your bankroll is expected to shrink over time. This is one of the most counterintuitive results in probability: having a genuine edge and still losing money because of reckless sizing.
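The break-even stake for a discrete bet can be located numerically. The sketch below bisects between the Kelly fraction and a stake of just under 100%; the function names are hypothetical.

```python
import math


def growth_rate(f, p, b):
    """Expected log-growth per bet at stake fraction f."""
    q = 1.0 - p
    return p * math.log(1.0 + b * f) + q * math.log(1.0 - f)


def break_even_fraction(p, b, tol=1e-10):
    """Bisect for the stake above Kelly where expected log-growth hits zero."""
    kelly = (b * p - (1.0 - p)) / b
    if kelly <= 0.0:
        raise ValueError("no edge: growth is never positive")
    lo, hi = kelly, 1.0 - 1e-12  # growth > 0 at Kelly, very negative near f = 1
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if growth_rate(mid, p, b) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


be = break_even_fraction(0.60, 1.0)
print(f"break-even stake: {be:.4f} ({be / 0.20:.2f} x Kelly)")
```

For the 60% coin at even money the break-even stake comes out slightly below twice the Kelly fraction, which is why the "double Kelly" rule is best read as an approximation.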
Q02. Does the Kelly Criterion guarantee I won't go broke?
Technically yes - since Kelly bets a fraction of your current bankroll, your bankroll never hits exactly zero (you can always take a fraction of a positive number). However, in practice, your bankroll can shrink to the point where it's effectively zero. And if your probability estimates are wrong, you may be inadvertently over-betting, which can lead to ruin.
Q03. How is Kelly different from the Sharpe ratio?
The Sharpe ratio measures risk-adjusted return (reward per unit of volatility) but doesn't tell you how much to invest. Kelly tells you the optimal allocation size. They're complementary: Sharpe helps you rank opportunities, Kelly tells you how much capital to deploy on each one.
Q04. Can I use Kelly for crypto or highly volatile assets?
You can, but the high variance (σ²) means Kelly recommends smaller positions than the headline return might suggest. For an asset with 80% annual volatility and 20% expected return, Kelly gives f* = 0.20 / 0.64 = 0.31, or about 31% allocation at full Kelly. But given the fat tails in crypto returns and model uncertainty, quarter-Kelly (about 8%) would be far more prudent.
Q05. Who actually uses the Kelly Criterion?
Professional poker players, sports bettors, blackjack teams (the MIT Blackjack Team used it), quantitative hedge funds (Edward Thorp's Princeton-Newport Partners was an early example), and many institutional traders. The standard popular account is William Poundstone's Fortune's Formula (2005), which traces the formula from Bell Labs to Vegas blackjack and on to Wall Street.
Q06. What's the difference between Kelly and the Merton share?
They're the same idea applied in different settings. Kelly was derived for discrete bets (win or lose at known odds). The Merton share, derived by Robert Merton in 1969 for continuous-time portfolio choice, gives the optimal stock allocation under log-utility and produces the identical f* = μ / σ² result Kelly produces in the continuous limit. Both maximise expected log-wealth and both punish high variance.
Q07. How does Kelly handle estimation uncertainty?
Imperfectly. The basic formula treats p as known with certainty, which is almost never true. The standard practical response is fractional Kelly - typically half or quarter - which can be read as assuming your probability estimate is biased upward and discounting accordingly. A more rigorous response is Bayesian Kelly, which integrates over the posterior distribution of p rather than using a point estimate. In practice, half-Kelly is a common stand-in for the Bayesian answer in retail use cases.
Q08. Why does Kelly recommend leverage for stock investing?
Because at typical equity risk-premium and volatility numbers (μ ≈ 6%, σ ≈ 16%), the formula f* = μ / σ² returns values above 1.0 - which mathematically means 'invest more than 100% of your bankroll'. This is leverage. The formula is correct under its assumptions (known μ and σ, log-utility, infinite horizon), but those assumptions are wrong enough in practice that nobody recommends running full Kelly on equities. Half- or quarter-Kelly with conservative μ estimates produces allocations close to standard 60/40 portfolio advice.
Q09. Is Kelly the same as the geometric mean?
Closely related. The geometric mean of a bet's outcomes is the multiplicative growth factor per bet, and Kelly is the stake fraction that maximises that geometric mean. So Kelly bettors are explicitly geometric-mean maximisers, while arithmetic-mean maximisers (expected-value bettors) would bet everything on a positive-EV proposition and go broke in finite time.