
Randomized controlled trials

Try this first

You want to test whether a supplement works. You have a thousand willing people. How do you split them into two groups that are alike, not just on age and weight, but on the things you never thought to measure, like how stressed they are or which genes they carry? Take ten seconds before reading on.

Say a supplement brand runs a study like this: take 200 people, give everyone the pill, measure a blood marker before and after, and report that it improved. Sounds like evidence. But the marker might drift on its own, the season might have changed, people might have eaten better just because they signed up for a health study. There's no one to compare against, so you can't tell the pill apart from everything else that happened. Now picture a different design: split the 200 by a coin flip, give half the pill and half an identical dummy, and compare the two halves. If the pill half improves more than the dummy half, by more than chance alone would explain, the pill did something. That coin flip is the whole difference between a story and a result.

The one idea

Assigning people to treatment or control at random makes the two groups alike on average on everything at once, including the confounders nobody thought to measure. So when the outcome differs by more than chance can explain, the treatment is the only thing left that could have caused it. The coin flip is what earns a study the word caused.

Why random beats clever

You could try to build two matched groups by hand — pair people on age, sex, weight, diet. But you can only match on things you measured, and the things that wreck health studies are usually the things you didn't: motivation, undiagnosed illness, genetics, how carefully someone follows instructions. Randomization doesn't need to know what those things are. By splitting on a coin flip, it scatters all of them — measured and unmeasured, known and unknown — roughly evenly into both arms. That's the trick statistics can't reproduce after the fact: you can only adjust for variables you wrote down, but a coin flip balances the ones you'll never even name.

The coin flip scatters every trait, named or not, evenly into both arms.
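
A rough simulation of that claim, in Python. Everything here is made up for illustration: the hidden trait is a hypothetical `stress` score drawn from an invented normal distribution, and the arms are formed by shuffling the list and cutting it in half, which stands in for per-person coin flips while keeping the arms the same size.

```python
import random
import statistics

def hidden_trait_gap(n_people, rng):
    """Randomly split n_people into two arms; return the gap in a hidden trait."""
    # Hypothetical unmeasured trait: a stress score, mean 50, SD 10 (invented numbers).
    stress = [rng.gauss(50, 10) for _ in range(n_people)]
    # Random assignment: shuffle, then cut the list in half.
    rng.shuffle(stress)
    half = n_people // 2
    treated, control = stress[:half], stress[half:]
    return abs(statistics.mean(treated) - statistics.mean(control))

rng = random.Random(0)  # fixed seed so the run is repeatable
average_gap = {}
for n_people in (20, 200, 2000):
    # Repeat the random split 200 times to see the typical gap at this size.
    gaps = [hidden_trait_gap(n_people, rng) for _ in range(200)]
    average_gap[n_people] = statistics.mean(gaps)
    print(f"{n_people:>5} people: typical gap in hidden stress = {average_gap[n_people]:.2f}")
```

The gap never hits exactly zero. It shrinks as the groups grow, which is why "roughly evenly" is the honest wording, and why the trait never had to be measured for the balancing to happen.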

What "controlled" adds on top of "randomized"

Two separate words, two separate jobs. Controlled means there's a comparison group living through the same time, season, and attention as the treatment group, so anything that happens to everyone cancels out. Randomized means people land in those groups by chance, not by choice, so the groups start out comparable. You need both. A controlled-but-not-random study lets sicker people sort themselves into one arm; a study with no control arm has no one to compare against, however its participants were picked. Miss either and the word "caused" slips out of reach.
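
Here is a small Python sketch of the "cancels out" part. It assumes a pill that does nothing at all (`true_effect=0.0`) and a marker that drifts by 5 units for everyone over the study. Both numbers, and the function name `run_study`, are invented for illustration.

```python
import random
import statistics

def run_study(n_people=200, true_effect=0.0, shared_drift=5.0, seed=1):
    """Return (before/after estimate, controlled estimate) for one simulated study."""
    rng = random.Random(seed)
    # Randomized: shuffle the participants and put the first half in the pill arm.
    people = list(range(n_people))
    rng.shuffle(people)
    pill_arm = set(people[: n_people // 2])

    pill_changes, dummy_changes = [], []
    for person in people:
        # Everyone's marker drifts (season, attention, better eating), plus noise.
        change = shared_drift + rng.gauss(0, 2)
        if person in pill_arm:
            pill_changes.append(change + true_effect)
        else:
            dummy_changes.append(change)

    # A one-group before/after study only ever sees this number.
    before_after = statistics.mean(pill_changes)
    # Controlled: subtract the dummy arm, which drifted just as much.
    controlled = before_after - statistics.mean(dummy_changes)
    return before_after, controlled

before_after, controlled = run_study()
print(f"before/after 'improvement': {before_after:.2f}")
print(f"pill arm minus dummy arm:   {controlled:.2f}")
```

The before/after number looks like a solid improvement even though the pill is inert. Subtracting the dummy arm removes the shared drift and leaves something close to zero.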

What each design can honestly claim
Design	Comparison group?	Strongest honest claim
Before / after, one group	No	"Something changed over time"
Observational (who already chose it)	Yes, but self-selected	"X is associated with Y"
Randomized controlled trial	Yes, by coin flip	"X caused Y, here"

Work one, then finish one

Worked: Why does an RCT of a supplement beat its matched observational cousin? Picture the observational version: researchers track people who already take the supplement and find they're healthier. But people who take supplements also tend to exercise, eat well, and see doctors. This is the healthy-user effect. The team "adjusts" for diet and exercise in the stats. The catch: you can only adjust for what you measured, and you never measured the full bundle of habits and motivation that makes someone a supplement-taker. So some of that healthy-user advantage always leaks through and gets credited to the pill. An RCT sidesteps the problem: because the supplement is assigned at random, the takers and non-takers are equally health-conscious on average, by construction. There is nothing systematic to adjust for, because nothing was systematically unbalanced to begin with.
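
The worked example can be sketched as a simulation. This is a toy model with assumed numbers, not data from any study. An unmeasured `habits` score drives health, the pill does nothing (`TRUE_EFFECT = 0.0`), and health-conscious people are assumed to be four times as likely to choose the pill.

```python
import random
import statistics

def mean_gap(takers, non_takers):
    """Average health of takers minus average health of non-takers."""
    return statistics.mean(takers) - statistics.mean(non_takers)

rng = random.Random(2)  # fixed seed so the run is repeatable
TRUE_EFFECT = 0.0       # assumption: the pill is inert
obs = {True: [], False: []}  # observational world: people choose for themselves
rct = {True: [], False: []}  # trial world: a coin flip chooses

for _ in range(10000):
    habits = rng.gauss(0, 1)                 # never measured by the researchers
    health = 2.0 * habits + rng.gauss(0, 1)  # health is driven by habits, not the pill
    # Observational: health-conscious people are far more likely to take the pill.
    chose_pill = rng.random() < (0.8 if habits > 0 else 0.2)
    obs[chose_pill].append(health + (TRUE_EFFECT if chose_pill else 0.0))
    # RCT: the same people, but assignment ignores habits entirely.
    assigned_pill = rng.random() < 0.5
    rct[assigned_pill].append(health + (TRUE_EFFECT if assigned_pill else 0.0))

observational_gap = mean_gap(obs[True], obs[False])
rct_gap = mean_gap(rct[True], rct[False])
print(f"observational: takers look better by {observational_gap:.2f}")
print(f"randomized:    takers look better by {rct_gap:.2f}")
```

In the observational world the inert pill appears to help, purely because of who chose it. Under random assignment the apparent benefit falls to roughly zero, with no adjustment needed.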

Your turn: A brand cites "a 14-person, 3-week RCT" showing their product moved a blood marker. It was randomized and controlled — so what's still weak about it? Name three problems. (Tiny sample of 14 — far too few to rule out chance, i.e. underpowered; only 3 weeks — far too short to show a lasting or meaningful effect; and a blood marker is a surrogate outcome, a stand-in for health, not a real outcome like fewer heart attacks or living longer.)
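
To see why 14 people is too few, here is a hedged Python sketch. It assumes a product with no effect at all and a marker measured in standard-deviation units, then counts how often chance alone produces a gap of at least half a standard deviation between arms. The threshold, trial counts, and function names are illustrative choices, not taken from any real trial.

```python
import random
import statistics

def chance_gap(n_per_arm, rng):
    """Gap between two arms when the product truly does nothing."""
    pill = [rng.gauss(0, 1) for _ in range(n_per_arm)]
    dummy = [rng.gauss(0, 1) for _ in range(n_per_arm)]
    return abs(statistics.mean(pill) - statistics.mean(dummy))

def share_of_big_gaps(n_per_arm, n_trials=500, threshold=0.5, seed=3):
    """Fraction of simulated null trials whose arms differ by at least `threshold`."""
    rng = random.Random(seed)
    big = sum(chance_gap(n_per_arm, rng) >= threshold for _ in range(n_trials))
    return big / n_trials

tiny = share_of_big_gaps(7)      # 14 people total, 7 per arm
larger = share_of_big_gaps(200)  # 400 people total, 200 per arm
print(f"14-person trials showing a 'big' gap by luck:  {tiny:.0%}")
print(f"400-person trials showing a 'big' gap by luck: {larger:.0%}")
```

With 7 people per arm, a sizeable share of do-nothing products still show a large-looking gap. With 200 per arm, that almost never happens.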

Why this matters

This is the check to run the moment a label or a podcaster says clinically studied. Those two words are doing a lot of quiet work, because "studied" doesn't mean "tested against a comparison group." Pick up the supplement promising better sleep or sharper focus and ask two things: was it actually randomized (people assigned by chance, not volunteers who already believed in it), and was it actually controlled (a real placebo group, not just the same people measured before and after)? Many "clinically studied" claims rest on an uncontrolled before-and-after with no comparison group at all, exactly the design that can't separate the product from the season, the placebo effect, or the simple fact that someone starting a supplement often cleans up the rest of their life too. If there's no control arm and no coin flip, the most the study can honestly say is "something changed." That is not a reason to buy.

Recall check · no peeking

What does randomization balance that statistical adjustment never fully can — and why can't adjustment catch it?
What two things must a study have before it earns the word "caused"?
Name three ways an RCT can still mislead you even when it really was randomized and controlled.

Explain it back

In one plain sentence, tell a friend why the coin flip — not the fancy statistics — is the whole trick that lets a trial claim the treatment caused the result.

Learn · Shawon Chowdhury · a study guide, kept rough on purpose