// learn.shawon.ch / math-for-ai / powers-and-roots STUDY GUIDE

Mathematics · Arithmetic

Powers stack multiplication; roots take it back

Try this first

A rumour starts with one person, and every hour each person who knows it tells one new person. After 10 hours, roughly how many people know — closer to 20, or closer to 1,000? Guess before you compute.

Most people guess low, because our intuition adds when the world multiplies. Each hour doubles the count: 1, 2, 4, 8, 16,… and after 10 hours it's 2×2×… ten times over — 1,024. We write that pile-up of multiplications as a power: 2¹⁰. Just as multiplication was a shortcut for repeated addition, an exponent is a shortcut for repeated multiplication.
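If you'd rather watch the doubling happen than take it on faith, here is a minimal Python sketch. The function name people_after is made up for this illustration, and it assumes the rumour starts with exactly one person and doubles cleanly every hour.

```python
def people_after(hours: int) -> int:
    """People who know the rumour after `hours` doublings, starting from 1."""
    count = 1
    for _ in range(hours):
        count *= 2  # everyone who knows tells one new person
    return count


print(people_after(10))             # 1024
print(people_after(10) == 2 ** 10)  # True
```

The loop is the "pile-up of multiplications"; 2 ** 10 is the shortcut for it.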

The notation has two parts, a base and an exponent: 2⁷ means "multiply exponent-many copies of the base 2 together". Here that is 7 twos: 2×2×2×2×2×2×2 = 128. The reason it feels surprising is in the picture: doubling barely moves at first, then erupts:

[Figure: bar chart with bars of height 1, 2, 4, 8, 16, 32; the bars for 2⁰, 2⁴ and 2⁵ are labelled]
Each bar is twice the last. Flat, flat, flat — then it runs away. That's exponential growth.
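A rough way to redraw that picture in text, as a sketch only: the helper name power is hypothetical, and it handles whole-number exponents from 0 upward.

```python
def power(base: int, exponent: int) -> int:
    """Multiply `exponent` copies of `base` together (exponent >= 0)."""
    result = 1
    for _ in range(exponent):
        result *= base
    return result


# One row per power of two, with a bar as long as the value.
for n in range(6):
    value = power(2, n)
    print(f"2^{n} = {value:>2} " + "#" * value)
```

The first few bars are barely visible; the last one is as long as all the earlier ones put together, plus one.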

The one idea

A power bⁿ is n copies of the base b multiplied together. A root undoes it: the square root √9 asks "what number, squared, gives 9?" It is the same inverse move that division was for multiplication.

Roots are the question powers answer

Squaring a number means raising it to the power 2: 3² = 9. The square root runs the film backward: √9 = 3, because 3² = 9. (It's called "square" because 3² is the area of a 3-by-3 square, tying straight back to the grid from lesson 3.) Most roots aren't whole: √2 ≈ 1.414…, an honest number whose decimals never settle. You rarely need it by hand, only to know what it asks.
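To see the "film backward" idea on a machine, here is a small sketch using Python's standard math module. The variable name root_two is just for illustration.

```python
import math

print(3 ** 2)         # 9
print(math.sqrt(9))   # 3.0
print(math.isqrt(9))  # 3, the whole-number square root

# sqrt(2) is not whole: the computer stores only an approximation,
# so squaring it lands very close to 2 rather than exactly on it.
root_two = math.sqrt(2)
print(math.isclose(root_two ** 2, 2))  # True
```

math.sqrt always returns a float; math.isqrt returns the integer part of the root, which is handy when you know the answer should be whole.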

Four facts about exponents earn their keep early. They're worth recognising more than memorising:

Exponents, the useful corners
Expression   Value      Why
b¹           b          one copy of itself
b⁰           1          no copies: the empty product
b⁻¹          1 / b      a negative power means "one over"
2³ × 2²      2⁵ = 32    multiplying powers adds the exponents
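These rows can be spot-checked in a few lines. This is a sketch, not a proof: it tests one base at a time, the function name exponent_facts is made up, and it uses Fraction from the standard library so that "one over b" stays exact instead of becoming a rounded decimal.

```python
from fractions import Fraction


def exponent_facts(b: int) -> dict:
    """Check the first three table rows for one nonzero base `b`."""
    base = Fraction(b)
    return {
        "b^1 == b": base ** 1 == base,
        "b^0 == 1": base ** 0 == 1,
        "b^-1 == 1/b": base ** -1 == 1 / base,
    }


print(exponent_facts(5))
# Last row: multiplying powers of the same base adds the exponents.
print(2 ** 3 * 2 ** 2, 2 ** (3 + 2))  # 32 32
```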

That last row is the quiet bridge to the next part of the course: a logarithm (lesson 24) reads the exponent back off a power, and because multiplying powers adds their exponents, working with logarithms turns multiplication into addition.
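As a small preview, and only a sketch, Python's math.log2 answers "2 to the what gives this number?". Base 2 is chosen here purely to match the doubling examples.

```python
import math

print(math.log2(8))  # 3.0, because 2**3 == 8
print(math.log2(4))  # 2.0, because 2**2 == 4

# Multiplying the numbers corresponds to adding their exponents.
print(math.log2(8 * 4))               # 5.0
print(math.log2(8) + math.log2(4))    # 5.0
```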

Work one, then finish one

Worked: compute 2⁴. Don't reach for a calculator — just double from 1, four times: 1 → 2 → 4 → 8 → 16. So 2⁴ = 16.

Your turn: a "deep" network stacks layers, and a toy one doubles its neuron count each layer starting from 1. How many neurons in the 8th layer, and what is √64? (Answer: 2⁷ = 128 in the 8th layer; √64 = 8.)
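Once you've had a go by hand, a sketch like this can check the answer. The function name neurons_in_layer is hypothetical, and it assumes layers are counted from 1, with a single neuron in layer 1.

```python
import math


def neurons_in_layer(layer: int) -> int:
    """Neurons in a 1-indexed layer when layer 1 has 1 neuron and each layer doubles."""
    return 2 ** (layer - 1)  # layer 8 is only 7 doublings away from layer 1


print(neurons_in_layer(8))  # 128
print(math.isqrt(64))       # 8
```

The exponent is one less than the layer number because the first layer has had no doublings yet.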

Why this earns a place in your toolkit

Squares and square roots are how machines measure distance. The gap between a prediction and the truth is very often squared before it's added up. That's the "squared error" many models spend their whole life shrinking: squaring makes every miss count as positive and punishes big misses hardest. The length of a vector, the distance between two data points, the norm that decides whether two things are "close": all are a sum of squares with a square root on top (the Pythagorean theorem, scaled up). Meanwhile exponential growth, that runaway curve above, is how the cost of some algorithms explodes as data grows, and one reason "it worked on 100 rows" can die on a million. Powers are where size and distance enter the math.
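Both ideas fit in a few lines of plain Python. This is a minimal sketch with made-up function names and made-up numbers, not how any particular library does it.

```python
import math


def squared_error(predictions, targets):
    """Sum of squared gaps between predictions and true values."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets))


def euclidean_distance(a, b):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Misses of -1 and +4 both count as positive: 1 + 16.
print(squared_error([2.0, 5.0], [3.0, 1.0]))  # 17.0
# The 3-4-5 right triangle: sqrt(9 + 16).
print(euclidean_distance([0, 0], [3, 4]))     # 5.0
```

Notice how the miss of 4 contributes 16 while the miss of 1 contributes only 1: that is squaring punishing the big miss hardest.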

Recall check · no peeking

  1. What does mean as a repeated multiplication, and what is it?
  2. What question does √49 ask, and what's the answer?
  3. Why is b⁰ = 1 for any base — what's being multiplied?
  4. Where does squaring show up when a model measures how wrong it was?

Explain it back

In one sentence, explain to a friend why doubling ten times lands near a thousand and not near twenty.

Learn · Shawon Chowdhury · a study guide, kept rough on purpose