Observational studies

Try this first

You cannot ethically assign people at random to smoke a pack a day for thirty years and wait to see who gets lung cancer. So how was smoking ever shown to cause cancer at all? Before reading on, sketch how you'd build the evidence without ever forcing anyone to smoke.

A nutrition lab wants to know whether eating fish protects the heart. The clean way would be to take 20,000 people, flip a coin for each, and order half of them to eat fish for a decade. Nobody can run that study: you can't command people's diets for ten years, and it would be unethical to try. So instead the researchers just watch: they record who already eats fish, follow everyone forward, and count heart attacks. They change nothing. That is the whole idea of an observational study: you measure the world as it is and never reach in to turn the dial yourself. It is the kind of evidence behind most "X is linked to Y" headlines you've ever read.

The one idea

Observational studies watch without intervening. The three workhorse designs (cohort, case-control, cross-sectional) differ mainly in how well they pin down time order: did the exposure clearly come before the outcome? Because nobody was randomized, the groups can differ in many ways nobody controlled, so a small association is more likely to reflect confounding than cause.

Three ways to watch

All three observe rather than intervene. What separates them is where they stand in time relative to the outcome.

A cohort study starts with exposure and runs forward: take people who do and don't eat fish today, follow them for years, see who has heart attacks. Time order is clear — the exposure was recorded before the outcome happened — but you wait a long time and need huge numbers if the outcome is rare.
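To get a feel for "huge numbers if the outcome is rare," here is a back-of-envelope sketch in Python. The rates are invented for illustration and come from no real study, and `expected_events` is a hypothetical helper, not part of any library.

```python
def expected_events(n_people: int, events_per_10k: int) -> float:
    """Expected number of outcomes in a cohort over the whole follow-up.

    events_per_10k is an assumed risk, written as events per 10,000 people.
    """
    return n_people * events_per_10k / 10_000


# Hypothetical risks over the follow-up period (illustrative only).
COMMON = 500   # 5% of people have the outcome
RARE = 5       # 0.05% of people have the outcome

for n in (1_000, 20_000, 200_000):
    print(f"{n:>7} people: common outcome ~{expected_events(n, COMMON):g} events, "
          f"rare outcome ~{expected_events(n, RARE):g} events")
```

Under these assumed rates, a 1,000-person cohort expects less than one rare event, far too few to compare fish eaters with everyone else. At 200,000 people it expects about 100.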

A case-control study starts from the outcome and looks backward: gather people who already have lung cancer (cases) plus similar people who don't (controls), then ask both groups about their past smoking. It's fast and excellent for rare diseases — but it leans on memory, which invites recall bias: sick people search their past harder for a culprit.
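Recall bias can be sketched with simple arithmetic. The toy model below assumes cases and controls were truly exposed at the same rate and that cases simply remember more of it. Every percentage is invented, and `reported_exposed` is a hypothetical helper.

```python
def reported_exposed(n_people: int, true_exposed_pct: int, recall_pct: int) -> float:
    """Number of people who report a past exposure.

    true_exposed_pct: share who truly had the exposure (assumed).
    recall_pct: share of truly exposed people who remember and report it (assumed).
    """
    return n_people * true_exposed_pct * recall_pct / 10_000


N = 1_000               # people per group
TRUE_EXPOSED_PCT = 40   # identical in both groups, so there is no real association

# Assumption: cases search their past harder, so they recall more of it.
cases_reporting = reported_exposed(N, TRUE_EXPOSED_PCT, recall_pct=95)
controls_reporting = reported_exposed(N, TRUE_EXPOSED_PCT, recall_pct=70)

print(f"cases reporting exposure:    {cases_reporting:g} of {N}")
print(f"controls reporting exposure: {controls_reporting:g} of {N}")
print(f"gap created by memory alone: {cases_reporting - controls_reporting:g}")
```

In this made-up setup the exposure looks more common among cases (380 reports against 280) even though both groups were truly exposed at the same rate. The whole gap comes from who remembers.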

A cross-sectional study takes a single snapshot: measure exposure and outcome in everyone at one moment — say, survey 5,000 people today for both vitamin D and body weight. It's cheap and great for "how common is this," but because both are measured at once, it usually can't tell you which came first.

[Figure: where each design stands in time, and which way it points.]

The three designs, side by side
Design	Direction in time	Good at	Weak at
Cohort	Forward (exposure → outcome)	Clear time order; multiple outcomes	Slow, costly; rare outcomes need huge N
Case-control	Backward (outcome → exposure)	Rare diseases; fast and cheap	Recall bias; choosing fair controls
Cross-sectional	One snapshot (both at once)	Prevalence — how common things are	Can't show which came first

So how was smoking pinned to cancer without a trial? With observational studies, stacked. Case-control studies showed lung-cancer patients had smoked far more than matched non-patients. Then a long cohort study followed tens of thousands of doctors forward for years and watched lung cancer pile up among the smokers. The effect was enormous: many times the baseline risk, dose-dependent (the more people smoked, the higher their risk), and consistent across designs and countries. That size is the point: a huge, repeatable effect is hard to explain away as confounding. A 10% bump is easy to explain away.

Work one, then finish one

Worked: A cross-sectional survey reports that people with low vitamin D are more likely to be obese. Which came first? A snapshot can't say. Three stories fit the same photo: low vitamin D could promote weight gain; obesity could lower vitamin D (the vitamin is fat-soluble and gets sequestered in body fat, and heavier people may spend less time outdoors); or some third thing, such as little sun exposure, could drive both. Measured at one instant, the data cannot rank these. So "low D linked to obesity" is a fine hypothesis and a terrible reason to start taking vitamin D to lose weight.
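The three stories can be simulated to see that a snapshot looks much the same under each. This is a rough sketch, not an analysis of real data: the effect sizes, the standardized units, and the names `simulate` and `pearson` are all assumptions chosen for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(story, n=5000, seed=0):
    """Return (vitamin_d, weight) lists generated under one causal story."""
    rng = random.Random(seed)
    vitamin_d, weight = [], []
    for _ in range(n):
        if story == "d_causes_weight":
            d = rng.gauss(0, 1)
            w = -0.5 * d + rng.gauss(0, 1)    # low D pushes weight up
        elif story == "weight_causes_d":
            w = rng.gauss(0, 1)
            d = -0.5 * w + rng.gauss(0, 1)    # high weight pushes D down
        elif story == "sun_causes_both":
            sun = rng.gauss(0, 1)             # the unmeasured third thing
            d = 0.9 * sun + rng.gauss(0, 1)
            w = -0.9 * sun + rng.gauss(0, 1)
        else:
            raise ValueError(f"unknown story: {story}")
        vitamin_d.append(d)
        weight.append(w)
    return vitamin_d, weight


STORIES = ("d_causes_weight", "weight_causes_d", "sun_causes_both")
correlations = {
    story: pearson(*simulate(story, seed=i)) for i, story in enumerate(STORIES, start=1)
}
for story, r in correlations.items():
    print(f"{story:18s} r = {r:+.2f}")
```

The coefficients were picked so that all three stories imply roughly the same negative correlation between vitamin D and weight. A survey that only sees the two measured columns has nothing to tell the stories apart.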

Your turn: Researchers recruit people who already have a cancer diagnosis and ask them to recall what they ate over the past twenty years, comparing their answers to healthy people's. Which design is this, and what bias should you flag first? (Case-control — start from the outcome, look back for exposure. The first bias to flag is recall bias: a diagnosed person reconstructs their past diet differently — usually harder, hunting for blame — than a healthy person does.)

Why this matters

Nearly every "X is linked to longer life" or "Y raises your risk of Z" claim that reaches you (on a supplement label, a wellness podcast, a brand's landing page, a thread of people swearing by some compound) traces back to a cohort or cross-sectional study. That's not a knock; those designs are how good hypotheses are born. But they share a fatal limit for the decision you actually face, which is "should I take this?" The people who already eat the "healthy" thing differ from those who don't in dozens of unmeasured ways: wealth, exercise, sleep, baseline health, how much they care about their bodies. Any of those can manufacture a modest association out of nothing. So when you see a small effect from an observational study, your default should be: probably confounding, until a randomized trial says otherwise. Treat it as a reason to be curious, not a reason to buy.
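How an unmeasured difference manufactures an association can be shown with invented counts. In the sketch below, fish has no effect at all inside either group of people; the only assumptions are that health-conscious people have lower risk and are more likely to eat fish. The numbers and the `Group`, `risk_ratio`, and `pool` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    people: int
    heart_attacks: int


def risk_ratio(exposed: Group, unexposed: Group) -> float:
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (exposed.heart_attacks * unexposed.people) / (
        unexposed.heart_attacks * exposed.people
    )


def pool(a: Group, b: Group) -> Group:
    """Merge two groups, ignoring which stratum each person came from."""
    return Group(a.people + b.people, a.heart_attacks + b.heart_attacks)


# Invented counts. Within each stratum, fish makes no difference to risk.
# Health-conscious people: 5% risk, and most of them eat fish.
hc_fish, hc_no_fish = Group(8_000, 400), Group(2_000, 100)
# Everyone else: 10% risk, and few of them eat fish.
other_fish, other_no_fish = Group(2_000, 200), Group(8_000, 800)

within_hc = risk_ratio(hc_fish, hc_no_fish)
within_other = risk_ratio(other_fish, other_no_fish)
crude = risk_ratio(pool(hc_fish, other_fish), pool(hc_no_fish, other_no_fish))

print(f"risk ratio, health-conscious only: {within_hc:.2f}")
print(f"risk ratio, everyone else only:    {within_other:.2f}")
print(f"risk ratio, all pooled (crude):    {crude:.2f}")
```

Inside each group the risk ratio is exactly 1, yet the pooled comparison makes fish eaters look like they have about a third fewer heart attacks. A study that never measured health-consciousness would only ever see the pooled number.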

Recall check · no peeking

Name the three observational designs, and for each, one thing it's good at and one thing it's bad at.
Which design best establishes the time order of exposure and outcome, and why?
Why should a small effect from an observational study make you suspicious rather than convinced?

Explain it back

In one plain sentence, tell a friend why a single snapshot of people — measuring two things at the same instant — can't tell you which of the two caused the other.

Learn · Shawon Chowdhury · a study guide, kept rough on purpose