Multiplication is a grid; division undoes it
Try this first
A training run feeds the model 6 batches, each holding 32 examples. How many examples is that in total — and what did you do to find out?
You could add 32 + 32 + 32 + 32 + 32 + 32, and you'd be right: 192. Multiplication is simply a shortcut for that repeated addition. 6 × 32 means "add up 6 copies of 32." Whenever you find yourself adding the same number over and over, multiplication is the tool that collapses the whole job into one step.
But there's a second picture that turns out to matter far more for what's coming. Arrange the things in a grid, 6 rows of 32 or 32 rows of 6, and the total number of cells is the product. Multiplication counts a rectangle. Here it is in miniature, three rows of four:
● ● ● ●
● ● ● ●
● ● ● ●
The one idea
Multiplication is repeated addition — and, more usefully, it counts a grid: r rows of c is r × c. Division is its inverse: 12 ÷ 4 asks "4 times what makes 12?"
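Both pictures can be checked in a few lines of Python. This is a minimal sketch using the batch numbers from the opening question; the function names `repeated_addition` and `count_grid` are invented here for illustration.

```python
def repeated_addition(copies, value):
    """Add `value` to a running total, once per copy."""
    total = 0
    for _ in range(copies):
        total += value
    return total


def count_grid(rows, cols):
    """Count the cells of a rows-by-cols grid one cell at a time."""
    cells = 0
    for _row in range(rows):
        for _col in range(cols):
            cells += 1
    return cells


print(repeated_addition(6, 32))  # 192
print(count_grid(6, 32))         # 192
print(count_grid(32, 6))         # 192, the same grid rotated
print(6 * 32)                    # 192
```

All four lines print the same total, which is the point: the shortcut, the long addition, and the cell count are three views of one number.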
Division is the question multiplication answers
Just as subtraction undoes addition, division undoes multiplication — and it shows up in two flavours of the same question. "Split 12 into 4 equal groups: how big is each?" and "How many groups of 4 fit into 12?" both land on 3. Either way, you're solving 4 × ? = 12.
Two snags are worth meeting now. First, division doesn't always come out even: 13 ÷ 4 gives 3 with 1 left over — a remainder. (Splitting that last piece is what fractions, two lessons ahead, are for.) Second, you can never divide by zero. 12 ÷ 0 would ask "0 times what makes 12?" — and nothing times zero is ever 12, so the question has no answer. That's not a rule someone imposed; it's just a question with no solution.
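The missing-factor question and both snags can be tried out directly. The sketch below assumes whole numbers only; `missing_factor` is a hypothetical helper written for this lesson, while `divmod` is Python's built-in for quotient and remainder.

```python
def missing_factor(total, group):
    """Answer 'group times what makes total?' by trying whole numbers.

    Returns None when no whole number works.
    """
    for candidate in range(total + 1):
        if group * candidate == total:
            return candidate
    return None


print(missing_factor(12, 4))  # 3, because 4 * 3 == 12
print(missing_factor(13, 4))  # None: 13 does not split evenly into 4s
print(missing_factor(12, 0))  # None: 0 times anything is 0, never 12

# divmod returns the quotient and the remainder together.
quotient, remainder = divmod(13, 4)
print(quotient, remainder)       # 3 1
print(4 * quotient + remainder)  # 13, so nothing was lost

# Python itself refuses to divide by zero.
try:
    12 // 0
except ZeroDivisionError as err:
    print("no answer:", err)
```

Note that the search for 12 ÷ 0 does not crash; it simply runs out of candidates, which matches the idea that the question has no solution.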
| Rule | What it says | Why it'll matter |
|---|---|---|
| Order (commutative) | 3 × 4 = 4 × 3 | The same grid, rotated |
| Identity | n × 1 = n | Scaling by 1 changes nothing |
| Zero | n × 0 = 0 | Empty grid, no cells |
| Distributive | a × (b + c) = a × b + a × c | The rule that powers matrix math |
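One way to build trust in the four rules is to test them on many small numbers. The sketch below does that for whole numbers from 0 to 12 (an arbitrary range chosen for illustration); passing the checks is evidence, not a proof.

```python
# Check each rule in the table for small whole numbers.
numbers = range(0, 13)

for a in numbers:
    assert a * 1 == a          # identity
    assert a * 0 == 0          # zero
    for b in numbers:
        assert a * b == b * a  # order (commutative)
        for c in numbers:
            assert a * (b + c) == a * b + a * c  # distributive

# The distributive rule doubles as a mental shortcut:
# 6 * 32 = 6 * (30 + 2) = 6 * 30 + 6 * 2 = 180 + 12
shortcut = 6 * 30 + 6 * 2
print(shortcut)  # 192
```

If any rule failed for any combination, the matching `assert` would stop the program with an error; reaching the final `print` means every check passed.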
Work one, then finish one
Worked: 12 ÷ 4. Don't reach for a procedure — ask the inverse: "4 times what makes 12?" You know 4 × 3 = 12, so the answer is 3.
Your turn: You have 50,000 training images and want to feed them in batches of 64. How many full batches do you get, and how many images are left over for a final partial batch? (Answer: 50000 ÷ 64 = 781 with remainder 16 — 781 full batches and 16 images left over. Deciding whether to keep or drop that last 16 is a real choice you'll make when training models.)
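To check the batch arithmetic, here is a small sketch. The helper `batch_sizes` and its `drop_last` flag are hypothetical names chosen for this example, not taken from any particular library.

```python
def batch_sizes(num_items, batch_size, drop_last=False):
    """Return the size of each batch when items are fed in order."""
    full_batches, leftover = divmod(num_items, batch_size)
    sizes = [batch_size] * full_batches
    if leftover and not drop_last:
        sizes.append(leftover)  # final partial batch
    return sizes


full_batches, leftover = divmod(50_000, 64)
print(full_batches, leftover)      # 781 16

kept = batch_sizes(50_000, 64)
dropped = batch_sizes(50_000, 64, drop_last=True)
print(len(kept), sum(kept))        # 782 50000
print(len(dropped), sum(dropped))  # 781 49984
```

Keeping the partial batch uses every image but makes the last batch a different size; dropping it keeps every batch at 64 and leaves 16 images unused in that pass.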
Why this earns a place in your toolkit
That grid of dots is not a teaching gimmick. It is the shape of the central object in machine learning: the matrix, a rectangle of numbers. The operation a GPU spends most of its time on when running these models is multiply-and-add: take rows and columns of these grids, multiply matching numbers, and sum the results. The layers of a neural network and the attention steps in a language model are built largely from it. And the unassuming distributive law in that table is what keeps matrix multiplication consistent when inputs are added together, and it is part of why a large multiplication can be split into pieces, computed on thousands of processors at once, and summed. You're looking at the seed of the whole engine.
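Multiply-and-add fits in a few lines of plain Python. This is a teaching sketch with made-up numbers and illustrative function names (`dot`, `matmul`, `add`); real systems use optimized libraries and hardware rather than loops like these.

```python
def dot(row, col):
    """Multiply matching numbers and add up the results."""
    total = 0
    for x, y in zip(row, col):
        total += x * y
    return total


def matmul(a, b):
    """Multiply matrix a by matrix b (each a list of rows)."""
    cols_of_b = list(zip(*b))  # read b column by column
    return [[dot(row, col) for col in cols_of_b] for row in a]


def add(m, n):
    """Add two same-shaped matrices cell by cell."""
    return [[x + y for x, y in zip(rm, rn)] for rm, rn in zip(m, n)]


a = [[1, 2, 3],
     [4, 5, 6]]   # 2 rows of 3
b = [[1, 0],
     [0, 1],
     [2, 2]]      # 3 rows of 2
c = [[1, 1],
     [1, 1],
     [1, 1]]      # 3 rows of 2

print(matmul(a, b))  # [[7, 8], [16, 17]]

# The distributive law, now for grids: a × (b + c) = a × b + a × c
print(matmul(a, add(b, c)) == add(matmul(a, b), matmul(a, c)))  # True
```

The last line is the table's distributive rule applied to whole grids: the two products on the right could be computed separately and added afterwards, giving the same result.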
Recall check · no peeking
- Give the two pictures of multiplication in your own words.
- Rewrite 56 ÷ 8 as a multiplication (missing-factor) question.
- Why is dividing by zero not allowed? What goes wrong with the question?
- What is the remainder of 30 ÷ 7, and what will eventually let you split it?
Explain it back
In one sentence, explain why 3 × 4 and 4 × 3 must give the same answer — using the grid.