// learn.shawon.ch / research-methods / correlation-vs-causation STUDY GUIDE

Research methods · Foundations

Correlation vs causation

Try this first

Across the year, ice-cream sales and drowning deaths rise and fall together — almost perfectly. Nobody thinks ice cream drowns people. So what is making both move at once? Name it before reading on.

Two numbers move together. That's all "correlation" means: when one is high, the other tends to be high too (or, in a negative correlation, reliably low). It's a real, measurable pattern; the ice-cream and drowning lines really do track each other. The trap is the leap your brain makes for free: one must be causing the other. But a pattern in the data is mute. It tells you the two things are linked; it says nothing about why. And there isn't just one "why" hiding behind it. There are four, and three of them are boring.

The one idea

When A and B move together, there are exactly four explanations: A causes B, B causes A (reverse), a hidden third thing C drives both (confounding), or it's chance. Only the first is the headline. Your job is to rule out the other three before believing it.

The four readings of any link

The ice-cream answer is the third one: summer heat drives both. Hot days sell ice cream and push people into the water, where some drown. Heat is the hidden C sitting above both — a confounder. Neither variable touches the other; they just share a common cause. Once you see one confounder, you start seeing them everywhere, because the world is full of things that move on the same seasons, the same incomes, the same kinds of careful people.
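To watch a confounder at work in numbers, here is a small simulation sketch. Everything in it is made up for illustration: the temperatures, the coefficients, and the noise levels are assumptions, not real sales or drowning data. Heat feeds into both series, and neither series feeds into the other. The code measures the raw correlation, then measures it again after subtracting out the part of each series that heat explains.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, xs):
    """What is left of ys after removing its best straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Invented daily data: heat (C) drives both series; A and B never touch.
heat = [rng.uniform(0, 35) for _ in range(365)]
ice_cream = [20 + 3.0 * h + rng.gauss(0, 10) for h in heat]  # A
drownings = [1 + 0.2 * h + rng.gauss(0, 1) for h in heat]    # B

raw = pearson(ice_cream, drownings)
controlled = pearson(residuals(ice_cream, heat), residuals(drownings, heat))

print(f"raw correlation:     {raw:.2f}")
print(f"after removing heat: {controlled:.2f}")
```

Under these assumptions the raw correlation comes out strong and the heat-adjusted one sits near zero. That is the signature of confounding: account for C and the A-B link has nothing left to stand on. The catch in real studies is that you can only adjust for confounders you thought to measure.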

[Figure: a hidden third factor C sits above A and B and drives both. Other paths shown: A causes B; B causes A (reverse); or chance, no link at all. Only the teal path is the headline.]
One observed link, four explanations — three of them boring.
The same link, four ways
Reading  | What it means                                    | Ice cream & drowning
A → B    | The exciting one: A truly causes B               | Ice cream causes drowning (absurd)
B → A    | Reverse: B causes A, you had the arrow backwards | Drownings cause ice-cream sales (absurd)
C → both | Confounding: a third factor drives both          | Summer heat drives both (the real answer)
chance   | A fluke; the link vanishes with more data        | Possible in tiny samples, not here
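
The chance reading is easy to underrate, so here is a rough sketch of how often pure noise produces a strong-looking link. It assumes two series of random numbers generated independently of each other, and it calls a correlation "strong" when its absolute value is above 0.8. Both the threshold and the sample sizes are arbitrary choices for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def share_of_strong_flukes(n_points, trials, rng, threshold=0.8):
    """Fraction of unrelated random pairs whose |r| exceeds the threshold."""
    hits = 0
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(n_points)]
        b = [rng.gauss(0, 1) for _ in range(n_points)]  # independent of a
        if abs(pearson(a, b)) > threshold:
            hits += 1
    return hits / trials

rng = random.Random(1)  # fixed seed so the run is repeatable
small = share_of_strong_flukes(n_points=5, trials=2000, rng=rng)
large = share_of_strong_flukes(n_points=200, trials=2000, rng=rng)

print(f"5 data points:   {small:.1%} of unrelated pairs look strongly linked")
print(f"200 data points: {large:.1%} of unrelated pairs look strongly linked")
```

With only five points, a noticeable share of completely unrelated pairs clear the bar; with two hundred, essentially none do. That is why "the link vanishes with more data" is the test for a fluke, and why a full year of daily ice-cream and drowning figures is not one.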

Work one, then finish one

Worked: "Diet soda is linked to weight gain." The headline implies A→B — the sweetener makes you fat. But run the four readings. Reverse causation fits far better: people who are already heavier, or trying to lose weight, are the ones who switch to diet soda. The weight came first and caused the diet-soda choice, not the other way round. The arrow points B→A. (Confounding plays a part too — the same people may diet, snack, and weigh in differently.) The drink looks guilty only because we read the arrow in the flattering direction.

Your turn: "People who meditate are calmer." Give all four readings before you credit meditation. (One: meditation→calm, the headline. Two, reverse: calm, low-stress people are the ones who keep up a meditation habit. Three, confounding: a third factor like more leisure time, higher income, or fewer money worries makes someone both calmer and free to meditate. Four: chance, a fluke in a small or noisy sample. Only the first earns the "meditation works" caption.)

Why this matters

Almost every nutrition and supplement headline you'll ever read is a correlation from an observational study — researchers watched what people already chose to eat or take and tracked who got sick — dressed up as cause. "People who take fish oil have healthier hearts." "Vitamin-D levels are linked to lower risk of everything." Maybe. But fish-oil takers also tend to be wealthier, more health-conscious, and better-doctored; low vitamin D can be a result of being sick and indoors, not a cause of it. Before you spend money on a bottle, ask the one question the marketing never answers: was this a controlled trial where they actually gave people the supplement and watched what happened — or did they just notice that the kind of person who already takes it tends to be healthier anyway? If it's the second, you're looking at a correlation wearing a cause's clothes.
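
The difference between "they gave people the supplement" and "they noticed who already takes it" can be sketched with a toy simulation. This is a hypothetical world, not data from any real study: the supplement is assumed to do nothing, and health is assumed to depend only on how health-conscious a person is. The names `health_conscious`, `health_score`, and `health_gap` are invented for this example.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def health_gap(people):
    """Average health of takers minus average health of non-takers."""
    takers = [score for takes, score in people if takes]
    others = [score for takes, score in people if not takes]
    return mean(takers) - mean(others)

rng = random.Random(2)  # fixed seed so the run is repeatable
N = 20000

# The hidden C: how health-conscious each person is.
health_conscious = [rng.gauss(0, 1) for _ in range(N)]

def health_score(c):
    # Assumption: the supplement has zero effect, so it does not appear here.
    return 50 + 5 * c + rng.gauss(0, 5)

# Observational study: health-conscious people are more likely to choose it.
observed = [(c + rng.gauss(0, 1) > 0, health_score(c)) for c in health_conscious]

# Controlled trial: a coin flip decides who gets it, so C cannot steer the choice.
randomized = [(rng.random() < 0.5, health_score(c)) for c in health_conscious]

obs_gap = health_gap(observed)
rct_gap = health_gap(randomized)

print(f"observational gap: {obs_gap:+.2f}")
print(f"randomized gap:    {rct_gap:+.2f}")
```

In this made-up world the observational comparison shows takers clearly healthier, even though the supplement is inert, while the coin-flip comparison shows a gap close to zero. Random assignment works because it cuts the arrow from C to the choice of taking the supplement, which no amount of watching people's own choices can do.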

Recall check · no peeking

  1. Name the four explanations for any observed correlation between A and B.
  2. What is reverse causation, in one sentence, with an example of your own?
  3. Which hedge words in a headline signal "this is only a correlation"?

Explain it back

In one plain sentence, tell a friend why a study saying a supplement is "linked to" better health does not mean the supplement causes better health.

Learn · Shawon Chowdhury · a study guide, kept rough on purpose