Decision Trees: A Visual Framework for Complex Choices
Decision trees turn tangled choices into a diagram you can actually solve. Build one in five steps, work through three examples, and avoid common traps.
What a Decision Tree Actually Is
Three node types, one repeatable algorithm
A decision tree is a diagram of a choice that branches forward through time. You sketch every decision you control, every uncertain event you don't, and the outcomes at the end. Then you collapse the tree back into a single number — an expected value for each option — by rolling the maths backwards from right to left.
The technique is older than most readers assume. It was formalised by Howard Raiffa at Harvard in the 1960s and has been standard practice in medical decision analysis, oil-and-gas exploration, and operations research ever since. The reason it never went away is simple: it forces you to write down the things you'd otherwise leave vague — the probability of each scenario, the value of each outcome, the order in which information arrives. Most bad decisions are bad because those numbers stayed in someone's head.
There are exactly three node types. Master these and you can model almost any decision:
- Decision nodes (drawn as squares) represent a choice you control. Each branch leaving the square is an option — "buy", "don't buy", "wait six months".
- Chance nodes (drawn as circles) represent an uncertainty you don't control. Each branch leaving the circle is a possible state of the world, labelled with a probability. The probabilities on the branches must sum to 1.0.
- Terminal nodes (drawn as triangles) sit at the end of every path. Each one carries the value of that complete sequence of choices and chance outcomes — usually money, but it could be life-years gained, utility, or any consistent unit.
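The three node types map directly onto a small data structure. A minimal sketch in Python (the class names and encoding are illustrative, not from any standard library):

```python
from dataclasses import dataclass

@dataclass
class Terminal:
    """End of a path: carries the value of that complete sequence."""
    value: float

@dataclass
class Chance:
    """Uncertainty you don't control: (probability, child) pairs summing to 1.0."""
    branches: list  # [(probability, node), ...]

@dataclass
class Decision:
    """A choice you control: one labelled branch per option."""
    options: dict  # {option label: node}

# "Buy" leads to an uncertain outcome; "don't buy" is a sure £0.
tree = Decision(options={
    "buy": Chance(branches=[(0.6, Terminal(150.0)), (0.4, Terminal(-50.0))]),
    "don't buy": Terminal(0.0),
})
```

Any tree you can draw on paper fits this shape: decisions contain options, options lead to chances or outcomes, and every path bottoms out in a terminal value.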
Why a Tree Beats a Pros-and-Cons List
Structure forces honesty about probability and magnitude
A pros-and-cons list treats every consideration as equal in weight and certain to occur. That's why it so often produces the wrong answer: a single rare-but-catastrophic con can dominate ten mild pros, but the list has no way to show it.
Decision trees fix three specific failures of unstructured deliberation:
They force you to multiply, not just add. A 5% chance of a £200,000 loss is not the same kind of thing as a 70% chance of a £10,000 gain — but on a pros-and-cons list they're indistinguishable bullet points. The tree makes the multiplication visible.
They force you to be explicit about probabilities. Once you have to put a number on each branch of a chance node, you can no longer hide behind vague language like "there's a real risk that...". You have to commit to a probability. If you can't — if your honest answer is "somewhere between 10% and 60%" — that's useful information too, and you can run the tree at both endpoints.
They make sequential structure visible. Most real decisions aren't one-shot; they're a chain. You choose, an outcome happens, you choose again. A tree captures that ordering, and the value of information shows up automatically — branches that let you wait for a signal before committing usually beat branches that lock you in early.
A related framework that complements decision trees is the pre-mortem — imagining the decision has already failed and reasoning backwards from the failure mode. Pre-mortems generate the chance-node branches; the tree quantifies them.
Building a Decision Tree in Five Steps
The algorithm, then a worked example
Every decision tree follows the same construction recipe. The mechanics are simple; the discipline of doing each step honestly is where the work lives.
Frame the decision
Write the question in one sentence: "Should we expand into the German market in 2026?" Be specific about the time horizon and the alternative — "should we do X" is incomplete without "...instead of what?".
List the options
Draw a decision node and a branch for each option you'll seriously consider. Three to five branches is the sweet spot — fewer and you're probably anchoring on a default; more and you're not really deciding, you're brainstorming.
Identify the key uncertainties
For each option, what's the one or two uncertain things that determine whether it turns out well? Don't list every possible factor — pick the ones that change the answer if they flip. Add a chance node for each, with two to four branches.
Assign probabilities and payoffs
Put a probability on each branch leaving a chance node (they must sum to 1.0) and a monetary value at every terminal node. Start with base rates if you have them — see our guide to [base rate neglect](/blog/base-rate-neglect/) for why this matters — then adjust for your specific circumstances.
Roll back to find the expected value
Start at the terminal nodes on the right. At each chance node, calculate the EV — sum of (probability × value) across its branches — and write that number on the chance node itself. At each decision node, pick the branch with the highest EV and write that number on the decision node. Keep working leftward until you reach the original decision. The branch with the highest rolled-back EV is the EV-maximising choice.
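The roll-back step is mechanical enough to write in a dozen lines. A hedged sketch in Python, encoding nodes as nested tuples (an illustration of the algorithm, not a library API):

```python
def rollback(node):
    """Return the expected value of a node by working right to left.

    Node encoding (illustrative):
      ("terminal", value)
      ("chance", [(prob, child), ...])   # probabilities must sum to 1.0
      ("decision", {label: child, ...})  # pick the highest-EV branch
    """
    kind = node[0]
    if kind == "terminal":
        return node[1]
    if kind == "chance":
        return sum(p * rollback(child) for p, child in node[1])
    if kind == "decision":
        return max(rollback(child) for child in node[1].values())
    raise ValueError(f"unknown node type: {kind}")

# A decision between a sure £0 and a 60/40 gamble:
tree = ("decision", {
    "gamble": ("chance", [(0.6, ("terminal", 150.0)),
                          (0.4, ("terminal", -50.0))]),
    "pass":   ("terminal", 0.0),
})
print(rollback(tree))  # 0.6*150 + 0.4*(-50) = 70.0, which beats passing
```

Chance nodes average, decision nodes maximise: that one-line distinction is the whole algorithm.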
Worked Example: The Counteroffer
Three branches, two chance nodes, one clear answer
Imagine Maya, a senior product manager on £85,000. She's been offered a role at a competitor on £105,000. Her current employer has hinted that they'll counter if she resigns. She has three options:
- Accept the new offer and leave.
- Resign and use the new offer to extract a counter — risky, because the counter might not materialise or might be insulting.
- Stay put and not raise the conversation at all.
Let's build the tree.
Counteroffer Tree — Probabilities, Payoffs, and Rolled-Back EV (3-Year Horizon)
| Branch | Probability and 3-year payoff |
|---|---|
| Branch 1: Take the new offer | Certain: £105k × 3 years = £315,000 |
| Branch 2a: Resign → strong counter (£100k) | P = 0.40 → £100k × 3 = £300,000 |
| Branch 2b: Resign → weak counter (£90k) | P = 0.35 → £90k × 3 = £270,000 |
| Branch 2c: Resign → no counter, take new offer anyway | P = 0.25 → £315,000 (but reputational cost) |
| Branch 2 rolled-back EV | 0.40 × 300k + 0.35 × 270k + 0.25 × 315k = £293,250 |
| Branch 3: Stay put | Certain: £85k × 3 with standard 4% rises = £265,336 |
Rolling back: Branch 1 has EV £315,000. Branch 2 has EV £293,250. Branch 3 has EV £265,336. The EV-maximising choice is Branch 1: take the offer and leave.
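Those rolled-back figures take only a few lines to check. A quick sketch (compounding the 4% rises puts the stay-put total at £265,336):

```python
def ev(branches):
    """Expected value of a chance node: sum of probability * payoff."""
    return sum(p * v for p, v in branches)

branch_1 = 315_000                        # certain: take the offer
branch_2 = ev([(0.40, 300_000),           # strong counter (£100k x 3)
               (0.35, 270_000),           # weak counter (£90k x 3)
               (0.25, 315_000)])          # no counter, leave anyway
branch_3 = 85_000 * (1 + 1.04 + 1.04**2)  # stay, with 4% annual rises

print(round(branch_2), round(branch_3))  # 293250 265336
```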
Notice what the tree reveals that a list wouldn't. Branch 2 looks superficially attractive ("I'll get a counter"), but once you multiply, it can never beat Branch 1 with these payoffs: even a strong counter, at £100k, pays less over three years than simply taking the £105k offer, and the weak-counter branch drags the EV down further. At the stated probabilities, the strong counter would need to reach roughly £118k a year before the counter-game catches up, and most people overestimate both the size and the likelihood of any counter.
It also exposes a hidden cost the framing usually hides: in Branch 2c, you've burned political capital by resigning, even though you end up at the new employer anyway. The £315,000 figure there is the same as Branch 1's, but the path was worse. Some practitioners assign a soft penalty here — say -£10,000 in reputational damage — which pushes Branch 2's EV down further.
This kind of structured reasoning is the same engine behind Bayesian thinking in everyday decisions — making the prior probabilities explicit instead of leaving them as intuitions you can't argue with.
Worked Example: Surgery vs Watchful Waiting
Where decision trees became standard practice
Medical decision analysis is where the technique earned its keep. The framework lets a doctor and patient quantify the trade-off between an aggressive intervention and a conservative one, given imperfect diagnostic information.
Consider a hypothetical: a 60-year-old patient has a borderline scan result for a slow-growing tumour. The clinician estimates a 30% probability it's malignant. The options are immediate surgery, or watchful waiting with a follow-up scan in six months.
We'll measure outcomes in quality-adjusted life years (QALYs) — one year of full health is one QALY, one year at 50% quality is 0.5 QALYs. Numbers below are illustrative rather than clinical:
Surgery vs Watchful Waiting — QALY Tree
| Branch | Probability and outcome (QALYs) |
|---|---|
| Surgery: P(success) = 0.92 | 20 QALYs gained |
| Surgery: P(major complication) = 0.05 | 12 QALYs gained |
| Surgery: P(mortality) = 0.03 | 0 QALYs |
| Surgery rolled-back EV | 0.92 × 20 + 0.05 × 12 + 0.03 × 0 = 19.0 QALYs |
| Watch: P(benign as suspected) = 0.70 | 20 QALYs (no intervention needed) |
| Watch: P(malignant, caught next scan) = 0.20 | 0.92 × 18 + 0.05 × 10 + 0.03 × 0 = 17.06 QALYs (delayed surgery) |
| Watch: P(malignant, late discovery) = 0.10 | 8 QALYs (worse prognosis after delay) |
| Watch rolled-back EV | 0.70 × 20 + 0.20 × 17.06 + 0.10 × 8 = 18.21 QALYs |
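The "caught at next scan" outcome is itself a chance node (the delayed-surgery sub-tree) nested inside the watchful-waiting branch; the roll-back handles the nesting naturally. A sketch with the table's illustrative figures:

```python
def ev(branches):
    """Expected value of a chance node: sum of probability * QALYs."""
    return sum(p * q for p, q in branches)

surgery = ev([(0.92, 20), (0.05, 12), (0.03, 0)])

# The "caught at next scan" branch rolls back a delayed-surgery chance node first.
delayed_surgery = ev([(0.92, 18), (0.05, 10), (0.03, 0)])
watch = ev([(0.70, 20),               # benign, no intervention needed
            (0.20, delayed_surgery),  # malignant, caught at the next scan
            (0.10, 8)])               # malignant, found late

print(round(surgery, 2), round(watch, 2))  # 19.0 18.21
```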
Surgery wins on raw EV: 19.0 vs 18.21 QALYs. But the gap is small (less than one QALY) and depends entirely on the 30% malignancy estimate. Push that estimate down far enough and watchful waiting overtakes surgery; push it up and surgery wins decisively. The tree doesn't tell you the answer; it tells you which input you should care about most.
This is the single most useful output of decision-tree analysis in real life. The point is rarely to compute a single number and act on it. The point is to identify the sensitivity — which input would have to change, and by how much, before your conclusion flips. In medical decision analysis, this is called a one-way sensitivity analysis. In business, it's called a tornado chart. Either way, you're asking: what would have to be true for me to change my mind?
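A one-way sensitivity analysis is just the roll-back repeated while a single input sweeps through a range. Applied to the earlier counteroffer example: holding the probabilities fixed, how large would the strong counter need to be before resigning for a counter beats taking the offer outright? (A sketch using that example's figures; the threshold is a property of those illustrative numbers, nothing more.)

```python
def branch_2_ev(strong_counter_salary):
    """EV of the resign-for-a-counter branch over three years."""
    return (0.40 * strong_counter_salary * 3  # strong counter materialises
            + 0.35 * 90_000 * 3               # weak counter at £90k
            + 0.25 * 315_000)                 # no counter: leave anyway

TAKE_OFFER = 315_000  # the certain value of simply accepting the new role

# Sweep the strong-counter salary in £1k steps to find where the answer flips.
for salary in range(100_000, 130_001, 1_000):
    if branch_2_ev(salary) > TAKE_OFFER:
        print(f"counter-game wins once the strong counter tops £{salary:,}")
        break  # prints £119,000 with these figures
```

The same loop, pointed at whichever input you trust least, is the poor man's tornado chart.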
Worked Example: Entering a New Market
Sequential decisions and the value of waiting
Business expansion decisions illustrate the most powerful feature of decision trees — modelling decisions in sequence, where information arrives over time and lets you avoid commitment.
A UK fintech is considering launching in Germany. Two options: launch immediately at £4M cost, or run a six-month pilot in Berlin at £600k cost first. The full launch's payoff depends on the local regulatory environment, which is mid-consultation.
The pilot effectively buys information. The tree captures this by adding a second decision node after the pilot's chance node — once the regulatory signal arrives, you can choose again.
Market Entry Tree — Sequential Decisions
| Path | Probability and net payoff |
|---|---|
| Path A: Full launch now, favourable regs (P=0.5) | Net £18M over 5 years |
| Path A: Full launch now, unfavourable regs (P=0.5) | Net -£2M over 5 years |
| Path A rolled-back EV | 0.5 × 18M + 0.5 × (-2M) = £8.0M |
| Path B: Pilot → favourable signal → launch (P=0.45) | £18M - £600k pilot = £17.4M |
| Path B: Pilot → mixed signal → launch anyway (P=0.20) | EV £6M - £600k = £5.4M |
| Path B: Pilot → unfavourable signal → walk away (P=0.35) | -£600k pilot cost only |
| Path B rolled-back EV | 0.45 × 17.4M + 0.20 × 5.4M + 0.35 × (-0.6M) = £8.7M |
The pilot path wins by £700k in expected value, even though it spends £600k on the pilot. The reason: it lets you walk away from a £2M loss 35% of the time. That optionality is worth more than the pilot costs.
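The option value comes from that second decision node: after the signal arrives you choose again, and the walk-away choice floors each post-signal outcome at zero (less the sunk pilot cost). A sketch of the roll-back with the table's figures; the -£2M on the unfavourable branch is an assumed post-signal launch EV for illustration, since the table simply says you walk away:

```python
def market_entry_evs():
    """Roll back both paths of the market-entry tree; figures in £M."""
    # Path A: commit the full £4M now, before the regulatory signal.
    launch_now = 0.5 * 18 + 0.5 * (-2)

    # Path B: run the pilot, then decide again once the signal arrives.
    # max(launch_ev, 0.0) models the walk-away option at each signal.
    post_signal = [
        (0.45, max(18.0, 0.0)),  # favourable signal: launch
        (0.20, max(6.0, 0.0)),   # mixed signal: launch anyway
        (0.35, max(-2.0, 0.0)),  # unfavourable: walking away beats -£2M
    ]
    pilot = sum(p * v for p, v in post_signal) - 0.6  # less pilot cost

    return launch_now, pilot

launch_now, pilot = market_entry_evs()
print(launch_now, round(pilot, 2))  # 8.0 8.7
```

Delete the `max(..., 0.0)` calls and the pilot path loses its edge: the £700k advantage is entirely the value of being allowed to quit.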
This is the textbook "value of information" calculation. Whenever you can spend a known amount to learn something material before committing a larger amount, the tree will usually favour learning first — provided the information actually reduces uncertainty (a pilot that doesn't tell you anything useful is just a tax). This logic underpins much of second-order thinking and is the discipline that separates good operators from bold ones.
When Decision Trees Mislead
Five failure modes to watch for
The technique is powerful, which means the mistakes are correspondingly costly. Five common failure modes:
The tree gets too big to think about. Beyond roughly fifteen terminal nodes, the diagram stops aiding intuition. Either collapse low-impact branches into a single "other" bucket or split the problem into a sequence of smaller trees. If you can't draw it on a single sheet of A4, you've overspecified the problem.
The probabilities are guesses dressed as data. A decision tree run with made-up probabilities is just a list of opinions wearing a maths costume. Always start with base rates from real reference classes — historical success rates of comparable launches, published clinical outcomes, observed conversion rates from prior experiments. Adjust from there, but commit to the base rate as your prior.
Base rates get ignored entirely. The classic decision-tree failure is letting an inside view dominate. Maya in the counteroffer example might feel like her counter would be strong because her relationship with the boss is good, but the base rate for "counters that match a 23% external uplift" is well under 40%. Anchor on the outside view first; adjust modestly for the specifics.
Real options are missed. Many decisions look like "do it / don't do it" but actually have a third branch: "do a smaller version, then decide". Pilots, prototypes, MVPs, two-week trials — all are real options that branches on the tree can capture. Skip them and you'll systematically over-commit. This is one of the highest-leverage corrections you can make to your own thinking; thinking probabilistically is essentially the habit of always asking what the next-cheapest piece of information would be.
Terminal values are computed too narrowly. The terminal node at the end of every path needs to include everything — direct payoff, opportunity cost, reputational impact, time spent, tax, and second-order effects. A spreadsheet that only counts the headline number will systematically over-value high-risk, high-headline branches. For long-horizon decisions, also account for the timing of cash flows: £1 in five years is not the same as £1 today, and the conditional probability of each branch can shift if the underlying conditions change between now and then.
Tools for Building Decision Trees
What to reach for at each level of complexity
Pen and paper. For everything under about ten terminal nodes, this is the right tool. The discipline of sketching the structure by hand is half the value of the exercise.
Diagramming tools. Free or freemium diagramming tools handle trees up to roughly thirty nodes cleanly and are best for shareable diagrams you want a team to review. Whimsical is the fastest way to get a clean diagram in under five minutes.
A spreadsheet. Once you need sensitivity analysis or scenario tables, a spreadsheet beats a diagram. Put each path on a row, each node's probability and value in named cells, and the rolled-back EV in a totals column.
Dedicated decision-analysis software. The professional choice for medical decision analysts and operations researchers: handles trees with thousands of nodes, runs Monte Carlo sensitivity automatically, and exports clinical-quality reports. Pricey; only worth it for full-time decision analysts.
Python. For analysts comfortable with code: scikit-learn for machine-learning trees, dtreeviz for visualisation, and PyMC or Stan for trees with Bayesian-updated probabilities. Overkill for personal decisions; essential for production decision systems.
Further Reading
Books and tools to deepen the practice
Two books are worth owning. Decisive by Chip and Dan Heath is the most readable introduction to structured decision-making — it doesn't dwell on the maths but it captures the spirit of the discipline better than any pure decision-analysis text. Thinking in Bets by Annie Duke is the practitioner's companion, written by a former professional poker player who teaches Fortune 500 executives. We've covered the latter in detail in our Thinking in Bets summary — start there if you want the key ideas before buying the book.
For calibrating the probabilities you'll feed into your trees, Superforecasting by Philip Tetlock is the canonical text on training yourself to produce numerical estimates that actually correspond to reality. If you've never measured your own calibration, you almost certainly believe you're better than you are — see our guide to probability calibration training for a way to fix that with twenty minutes a week.
And if you're building decision trees for financial choices specifically, you'll get more from them once you've internalised the Kelly criterion for bet sizing — the maths that tells you not just whether a bet is positive-EV but how much of your bankroll to commit.
Frequently Asked Questions
What's the difference between a decision tree and a flowchart?
How many branches should a chance node have?
Should I include tiny-probability branches?
What if I don't know the probabilities?
Do decision trees work for one-off decisions?
How do decision trees relate to Bayesian thinking?
What's the most common mistake when first using decision trees?
Need a hand running the numbers?
Our expected value calculator works out the EV of any tree's chance node — paste in probabilities and payoffs, get the rolled-back number.