By Ben Blum-Smith, Contributing Editor
This post discusses three very familiar facts from grade-school mathematics. In spite of their familiarity, I believe they tend to go under-appreciated, at every level of math education. In the elementary grades, my experience is that if they do get explicit attention, we generally treat them as tools students should learn to use, rather than as the subject of their own inquiry. Meanwhile, in the later grades—middle school, high school, college, and beyond—we are already used to them, so they tend to be seen as trivialities, not worthy of further reflection.
I think this might be a missed opportunity. We lose the chance to delectate with our students in these facts’ surprisingness, their non-obviousness. And we pass up the occasion to ask why they hold.
Therefore, I submit to you three foundational theorems of elementary school math.
I use the word “theorem” with great deliberateness. Calling something a theorem is an invitation to take stock of the fact that something wonderful has happened—a truth hidden below the surface has been revealed. This word punctuates our mathematical narratives with moments of celebration. I’m making the case that the word “theorem” gives us a useful framework to appreciate the importance, depth, and richness, of this elementary content.
Another virtue of the word “theorem” in this context is that it calls attention to the need to justify. Slowing down to ask how we know these facts are true leads not only to a delightful insight in each case, but also to the revelation that the three are actually very closely related. This is something I think I myself didn’t fully appreciate until well into my mathematical adulthood—I’m inviting us to consider it as something we might want to share with students.
I view this offering as located in at least two intellectual lineages. One is that of Felix Klein’s Elementary Mathematics from an Advanced Standpoint: revisiting the content of the early grades of math instruction from a point of view informed by research training. Another is the trend, in curriculum and pedagogical design, to look to the practice of research mathematics to inform the design of student experience. We find vertiginous heights of mathematical beauty and power in the search for and study of theorems and their justifications—why not view the truths revealed by the elementary curriculum in that same paradigm of knowledge?
So, without further ado:
1. The commutativity of multiplication of natural numbers
My elementary school had combined first- and second-grade classes. (3rd and 4th, 5th and 6th, and 7th and 8th were similarly paired up. Only Kindergarten got its own classes.) I remember being intensely curious about multiplication before I had learned what it was, because the second graders were doing it. “What does ‘4 times 6’ mean?” It conjured a vision of an anthropomorphic numeral 4, standing with a stopwatch, clocking an anthropomorphic numeral 6 as it ran a race.
At some point, I prevailed on my dad to answer. “‘4 times 6’ means ‘4, 6 times,'” he said. “4 + 4 + 4 + 4 + 4 + 4 = 24, so 4 times 6 is 24.” Such a satisfying answer! “Times,” not like a clock, but like “how many times!”
Later in the year, my teacher (the brilliant Judy Lazrus, who I think was more or less singlehandedly responsible for my love of school) gave a different definition. “‘4 times 6’ means four sixes,” she said. “6 + 6 + 6 + 6 = 24, so 4 times 6 is 24.” According to my dad’s definition, this would’ve been 6 times 4, not 4 times 6. I was thus obliged to choose between believing my father and believing my teacher.
The fact I’d learned my dad’s definition first, plus the fact that it explained the presence of the word “times”, along possibly with personal loyalty to the one who put me to bed at night, added up to a decision to stick with the “4, 6 times” definition. (As it happens, my father was “wrong” and my teacher was “right.” Etymologically, “4 times 6” is an alternative way of saying “6, 4 times”, not “4, 6 times”.)
Fortunately for all involved, the two definitions always yield the same answer.
What I want to bring to your attention is that this is a theorem.
Furthermore, it’s a very, very non-obvious theorem, it has major implications, and the main idea of the proof is also of fundamental importance. Sitting right here in the standard third grade curriculum is an opportunity for students to experience a major theorem and its proof.
[Digression: what is a theorem?
In the introduction I made the case that the word “theorem” is appropriate and useful in this context because it highlights surprising-ness, invokes a celebration of something wonderful, and occasions a search for justification. But let me take a moment now to answer a possible objection: doesn’t framing something as a theorem require a commitment to a particular axiomatic system in which it is a theorem? How do we know it’s not just a definition? (Multiplication is commutative, axiomatically, in any field, for example.)
First of all, I don’t see a reason to stand on ceremony about this when discussing elementary mathematics: it seems to me it would entail an awkward double standard. It has frequently been observed that in the actual practice of research mathematics, unless we are explicitly working on foundations or with an automated proof assistant, we tend to be unconcerned with our underlying formal system. Even if we think of ourselves as “working in ZFC” (or whatever), we do not generally prove our theorems by tracing them back to this bedrock.
But when it comes to a crisis of rigorous argument, the open secret is that, for the most part, mathematicians who are not focussed on the architecture of formal systems per se, mathematicians who are consumers rather than providers, somehow achieve a sense of utterly firm conviction in their mathematical doings, without actually going through the exercise of translating their particular argumentation into a brand-name formal system. – Barry Mazur, emphasis in original
In practice, we call it a theorem if (i) it’s not obvious, (ii) it’s at least somewhat important, and (iii) we have to engage in some deductive reasoning in order to know it. Why not apply the same standard to elementary mathematics?
But even if you don’t buy this, and insist on a formal system as context for any theorem: the commutativity of multiplication of naturals is indeed a theorem in any formal system sufficiently rich to support the content we teach in the elementary grades. (Are the Peano axioms rich enough? Perhaps, and it’s certainly a theorem there.) Sure, it’s an axiom for fields, but this kicks the can down the road: how do we know the natural numbers embed in a field?]
Because we are so accustomed to the fact that 6 4’s and 4 6’s have equal totals (and similarly for any pair of natural numbers), I think it’s worth it to take a moment and appreciate just how non-obvious this is.
In the first place, 4+4+4+4+4+4 and 6+6+6+6 are facially different calculations. It is not at all clear to students encountering multiplication for the first time that they should be expected to have equal outcomes. This is reason enough to treat it as something that has to be justified.
Second, if we put on an abstract algebra hat and view multiplication of natural numbers as a special case of something more general, the generalization is often not symmetric in the inputs. For example, one generalization of multiplication is function composition. For the purpose of seeing this as a generalization, we identify each natural number $n$ with the semigroup endomorphism $(\mathbb{N}, +) \to (\mathbb{N}, +)$ that sends $1 \mapsto n$, and then ordinary multiplication of naturals becomes composition of these endomorphisms. Composition is commutative in this special case, but not in general. For another example, my teacher’s definition of multiplication generalizes to a map $\mathbb{N} \times S \to S$ for any semigroup $S$, sending $(n, s) \mapsto s + s + \dots + s$ ($n$ summands). (Note that this is not, in general, a homomorphism of semigroups: it is linear in the first factor, and bilinear if $S$ is commutative.) Multiplication of natural numbers is the case that $S$ is itself the additive semigroup $(\mathbb{N}, +)$, but in general, the inputs aren’t even of the same type.
So the commutativity of multiplication of natural numbers is something quite special indeed.
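To make the asymmetry concrete, here is a minimal Python sketch of the second generalization. The function name `times` and its `op` parameter are hypothetical, chosen for illustration only; the sketch assumes `op` is associative and that `n` is a positive integer.

```python
from functools import reduce
from operator import add, mul

def times(n, s, op=add):
    """Combine n copies of s with the associative operation op (n >= 1).

    This mirrors the "n copies of s" reading of multiplication: n is
    always a natural number, while s may live in any semigroup.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return reduce(op, [s] * n)

print(times(4, 6))          # 6 + 6 + 6 + 6
print(times(6, 4))          # 4 + 4 + 4 + 4 + 4 + 4
print(times(3, 2, op=mul))  # 2 * 2 * 2, i.e. 2 ** 3
print(times(2, 3, op=mul))  # 3 * 3, i.e. 3 ** 2
print(times(3, "ab"))       # "ab" + "ab" + "ab"
```

With addition of naturals, swapping the inputs happens to give the same total. With multiplication as the operation (exponentiation) it does not, and with strings the swapped call `times("ab", 3)` does not even make sense, since the first input must be a count.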
In terms of major implications, some of these will be discussed below. When we look beyond the horizon to collegiate and research mathematics, they become overwhelming. What grows from the soil of the commutativity of multiplication of natural numbers? All of commutative algebra, at the very least?
And then there’s the matter of the proof. It comes down to double counting, i.e., exchanging the roles of rows and columns. Four 6’s is four rows, each with 6 items; alternatively, it can be seen as six columns, each with 4 items, i.e., six 4’s.
This trick is ubiquitous throughout mathematics, whether we are reindexing a sum, proving the Burnside counting lemma, or changing the order of integration via Fubini’s theorem. This idea is important.
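The row-and-column regrouping can be sketched in a few lines of Python. This is an illustration for one pair of numbers, not a proof; the names `ROWS`, `COLS`, `by_row`, and `by_col` are hypothetical. The point is that the same collection of items is grouped in two ways without any item being added or removed.

```python
ROWS, COLS = 4, 6

# Every item in the array, labelled by its (row, column) position.
items = [(r, c) for r in range(ROWS) for c in range(COLS)]

# Group the items by row: 4 groups, each containing 6 items.
by_row = [[item for item in items if item[0] == r] for r in range(ROWS)]

# Group the very same items by column: 6 groups, each containing 4 items.
by_col = [[item for item in items if item[1] == c] for c in range(COLS)]

# Both groupings exhaust the same set, so the two totals must agree.
row_total = sum(len(group) for group in by_row)  # 6 + 6 + 6 + 6
col_total = sum(len(group) for group in by_col)  # 4 + 4 + 4 + 4 + 4 + 4
print(row_total, col_total)
```

Nothing here depends on the particular values 4 and 6, which is what makes the array picture a general argument rather than a single computation.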
In case any readers are thinking that perhaps these ideas are too sophisticated for young grade-schoolers: I beg to differ. Probably my proudest moment of pedagogical improvisation to date involved inducing a 3rd grader to find this proof, very nearly on her own. The whole episode is described here. The key steps were: (i) getting her interested in the question; (ii) having her look for six groups of 4 in a picture she herself had drawn of 4 groups of 6 (she found them, but it introduced some asymmetry into the picture that she didn’t like); (iii) asking her if she could come up with a way to draw a different picture of 4 groups of 6 that would do a better job of revealing that these were also six 4’s (she came up with the array herself, although she didn’t initially notice that it elegantly solved the problem); and (iv) getting her to notice that it did. This lesson took place in a 1-to-1 setting and depended on particular choices the student made; it wouldn’t translate directly as a classroom activity. That said, I think it does illustrate that a young student has everything needed to appreciate, and even to generate, a proof of the commutativity of multiplication of natural numbers.
2. Equality of quotative and partitive division
I expect that most mathematicians, and perhaps some teachers, will not even be familiar with the distinction between quotative and partitive division models. I had never heard of this distinction until my sixth year in the classroom, when I read the book Young Mathematicians at Work: Constructing Multiplication and Division, by Catherine Twomey-Fosnot and Maarten Dolk.
The quotative model of division is the one I remember being taught in school: “15 divided by 3” means “how many 3’s fit into 15?”
The partitive model is equally central to my understanding of division. Nonetheless, while others doubtless had a different experience, and both models are mentioned in the Common Core State Standards (though not by name), I myself do not remember it from the period in my schooling where division was the explicit object of instruction. In the partitive model, “15 divided by 3” means “if 15 is split equally into 3 groups, what is the size of each group?”
These are not the same question!
Again, we are so accustomed to viewing division in either light that we can fail to notice the distinction! For me personally, learning the words “quotative” and “partitive” was revelatory.
Given that they are not the same question, why is it valid to interpret division either way? Because there is an underlying theorem that they have the same answer!
What’s the proof?
If you’ve never thought about this before, I invite you to take a moment to contemplate it before reading on.
The quotative interpretation of “15 divided by 3” asks “how many 3’s make 15?” Going with my teacher’s definition of multiplication (over my dad’s), this is asking for the solution to the equation
$$? \times 3 = 15.$$
Meanwhile, the partitive interpretation asks “three groups of what make 15?” I.e., it asks for the solution to
$$3 \times\, ? = 15.$$
Thus, the equality of quotative and partitive division is actually a corollary of the commutativity of multiplication!
Isn’t it worth considering to give students the opportunity to make or at least see this connection?
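One way to feel that the two questions really are different is to write each as its own procedure. The sketch below is an illustration only, restricted to natural numbers that divide evenly; the function names `quotative` and `partitive` are hypothetical. One procedure removes groups of a fixed size and counts them; the other deals items into a fixed number of piles and reports the pile size.

```python
def quotative(total, group_size):
    """How many groups of size group_size fit into total?"""
    if group_size < 1:
        raise ValueError("group_size must be a positive integer")
    groups = 0
    while total >= group_size:
        total -= group_size  # remove one full group
        groups += 1
    if total != 0:
        raise ValueError("does not divide evenly")
    return groups

def partitive(total, num_groups):
    """Deal total items one at a time into num_groups piles; return the pile size."""
    if num_groups < 1:
        raise ValueError("num_groups must be a positive integer")
    piles = [0] * num_groups
    for i in range(total):
        piles[i % num_groups] += 1  # next item goes to the next pile in turn
    if len(set(piles)) != 1:
        raise ValueError("does not divide evenly")
    return piles[0]

print(quotative(15, 3), partitive(15, 3))
```

The two loops do visibly different work, yet they return the same number whenever both succeed. That agreement is the content of the theorem.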
3. The identification of fractions with division
In elementary school in the United States, we use the obelus ($\div$) to signify division. Sometime afterward, this symbol falls away, to be replaced uniformly by the slash (as in $a/b$) and the (horizontal) fraction bar (as in $\frac{a}{b}$). Although we introduce it to children as the symbol for one of the four fundamental operations, the obelus is more or less completely absent from textbooks on undergraduate and graduate mathematics.
This notational quirk of our education system is built on (though it elides!) a profound mathematical insight. The quantity represented by the fraction $\frac{a}{b}$ is the same quantity that results from the division $a \div b$!
In 2013, I attended a conference entitled Mathematicians in Mathematics Education (MIME) at the University of Arizona, organized by Bill McCallum and Yvonne Lai. I have a vivid memory that one of the very first questions Bill asked us was how we would explain this equality to ourselves, let alone to students.
The equality is mentioned in the Common Core State Standards as an interpretation of division that students should be able to access. I have heard the suggestion that it is true by definition. I disagree, and so did the organizers of the MIME conference. As long as division and fractions are not defined as the same thing—and how could they be? division is a binary operation, and a fraction is a type of number!—there is a theorem here.
Or perhaps several theorems! Per the conversation above, there are (at least) two reasonable ways to give meaning to a division expression like $a \div b$. By the same token, there are several reasonable ways to give meaning to a fraction like $\frac{a}{b}$. So the exact nature of the theorem, and its proof, will depend on exactly what meanings for division, and fractions, the students have access to.
Nonetheless, on any reasonable choice of meanings for the two expressions that yields single-quantity outputs for both, we will have the equality $a \div b = \frac{a}{b}$. This is shocking.
Both to illustrate the substantiveness, i.e., the non-tautological nature, of the equality $a \div b = \frac{a}{b}$, and to indicate one possible way a proof could go, let me return us to my sixth year in the classroom. I was teaching at a middle school in Manhattan. In addition to 8th grade math, I taught an experimental course called “Math Lab”, intended as an auxiliary course for 6th and 7th graders who needed additional work with fractions, decimals and percents to support their work in their main math class. Following ideas in the book Young Mathematicians at Work: Constructing Fractions, Decimals, and Percents (this is the sequel to the above-mentioned work of Fosnot and Dolk), one challenge I posed to students was: “You’ve got 5 Hershey bars and 2 friends (and yourself) to share them with. How do you do it fairly?”
Students did not all approach this task the same way, but a typical response would be to distribute 1 whole bar to each of the 3 friends, leaving 2 whole bars untouched; then distribute 1/2 a bar to each; and finally to split the remaining 1/2 bar in 3 equal pieces. Thus each friend gets 1 + 1/2 + (1/3 of 1/2) of a Hershey bar. (A nontrivial part of our work would then be to recognize (1/3 of 1/2) as 1/6.) If I repeated the question except with different numbers, e.g., 6 bars and 7 friends, students experienced this as a new challenge, and proceeded in a way tailored to the new numbers.
Eventually, one 7th grader, M, hit upon a clever idea that solved all such problems at once: if given 5 Hershey bars to split among 3 friends, he would divide each bar in equal thirds, and then distribute one piece from each bar to each of the friends. There being 5 bars, it follows that each friend gets five thirds. If he was then presented with 6 bars and 7 friends, he would proceed identically: divide each bar into equal sevenths, and distribute one piece from each bar to each friend, yielding six sevenths to each friend in total. Once he found this solution method, he used it every time and these problems were no longer interesting to him. No other student approached the problems in this way.
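M’s procedure can be sketched in Python using the standard library’s `Fraction` type for exact arithmetic. The function name `share_m_method` is hypothetical. Note one caveat: `Fraction` already builds in the identification of fractions with division, so the sketch only records the bookkeeping of who receives which pieces; it is not an independent proof.

```python
from fractions import Fraction

def share_m_method(bars, friends):
    """Split every bar into `friends` equal pieces; give each friend one piece per bar."""
    piece = Fraction(1, friends)          # size of one piece of one bar
    shares = [Fraction(0)] * friends      # running total for each friend
    for _ in range(bars):                 # take the bars one at a time
        for f in range(friends):          # hand one piece of this bar to each friend
            shares[f] += piece
    return shares

# The "typical response" to 5 bars and 3 friends: 1 + 1/2 + (1/3 of 1/2).
typical = 1 + Fraction(1, 2) + Fraction(1, 3) * Fraction(1, 2)

print(share_m_method(5, 3))
print(typical)
print(share_m_method(6, 7))
```

The typical response and M’s method give each friend the same amount for 5 bars and 3 friends, but only M’s method runs unchanged when the numbers change.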
The point of this story is that M’s solution is a way to prove that $a \div b = \frac{a}{b}$, at least in the situation that $a$ and $b$ are natural numbers. Sharing Hershey bars among friends is modeled by the partitive interpretation of division. In particular, the total amount of chocolate each student receives is whatever $a \div b$ is. What does not follow from the definition, but is revealed by M’s approach, is that the answer is always $\frac{a}{b}$, the fraction. Indeed, M splits each of the $a$ candy bars into $b$ equal pieces, which are therefore of size $\frac{1}{b}$. He then hands each friend one of these pieces from each of the $a$ bars, for a total of $a \times \frac{1}{b}$, or $\frac{a}{b}$.
I’ve drawn this illustration of M’s method in a way intended to echo the illustration of the proof of commutativity for multiplication of natural numbers given above. This indicates a connection between commutativity and the present theorem. And indeed, there is an alternative route to the present theorem that uses commutativity as a lemma:
The partitive division problem $5 \div 3$ means the result of splitting 5 into 3 equal groups. Each group is thus 1/3 of the total. Therefore,
$$5 \div 3 = \frac{1}{3} \times 5,$$
by comparing the partitive interpretation of the division on the left, with the definition of multiplication of a fraction by a whole number (as “1/3 of a 5”) on the right. (This is the definition given in the Common Core State Standards.)
On the other hand, the fraction $\frac{5}{3}$ is $5$ times $\frac{1}{3}$. (This is the definition of non-unit fractions given in the Common Core State Standards, and also the one given by the etymology of the words “numerator” and “denominator”: the $3$ expresses the denomination we are working in, thirds in this case, and the $5$ enumerates how many of them we have.) Therefore,
$$\frac{5}{3} = 5 \times \frac{1}{3}.$$
So the equality follows again from commutativity of multiplication! If memory serves, the approach taken by the participants at the 2013 MIME conference in response to Bill’s question was roughly of this kind.
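Written as a single chain, the argument for 5 and 3 reads as follows. This is a sketch only: the middle step applies commutativity to a whole number and a unit fraction, which goes beyond the theorem for natural numbers, so that extension is being assumed here.

```latex
\begin{align*}
5 \div 3 &= \tfrac{1}{3} \times 5 && \text{(partitive division: each of 3 equal groups is $\tfrac{1}{3}$ of 5)} \\
         &= 5 \times \tfrac{1}{3} && \text{(commutativity of multiplication, assumed for this pair)} \\
         &= \tfrac{5}{3}          && \text{(a non-unit fraction is a count of unit fractions)}
\end{align*}
```

Only the middle step is a theorem-level claim; the first and last lines are unpackings of definitions.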
I hope that the story of M’s solution to the Hershey bar problems has given you the feeling that “something really happened here”: it can’t be tautological that $a \div b = \frac{a}{b}$, because M’s solution, which is equivalent, was a breakthrough.
Notes and references
 This assertion bears three important disclaimers:
First, I’ve qualified my description above with the phrases, “my experience”, “generally”, and “tend to”. I have no intention to downplay the very exciting work that folks have done and are doing to draw out the richness in the topics below. Indeed, some of what I discuss below is informed by such efforts—my interactions with the work of Bill McCallum and Catherine Twomey-Fosnot, for example, will be mentioned explicitly. If you are an educator who treats any of these topics with your students, I’d love to hear about it in the comments.
Secondly, I want to be clear that I have no intention to blame or criticize. I think there are systematic reasons why elementary education historically elides the “theoremful” character of its theorems. In any case, it’s not at all clear to me what the prescription should be. I’m interested in exploring the possibility that there’s something rich and wonderful available in revisiting this content from the point of view that “there be theorems here”—and maybe sharing this point of view with children—but I don’t presume to know better than anyone else if or how this should be done.
Lastly, my characterization of a “missed opportunity” has the US education system in mind. Generalizing from my own experiences is unscientific, and becomes more so the further one travels.
 New York: Dover, 1945. I will not attempt an inescapably-incomplete bibliography of this genre, which is vast, but a recent-ish example is R. Howe (2014), “Three pillars of first grade mathematics, and beyond”, in Mathematics curriculum in school education pp. 183–207, Springer, Dordrecht (link).
 Again I make no attempt at a bibliography, but offer, by way of illustration, the Young Mathematicians at Work series of books by Catherine Twomey-Fosnot and her collaborators. (The four volumes, all published by Heinemann, are subtitled Constructing Number Sense, Addition, and Subtraction, Constructing Multiplication and Division, and Constructing Fractions, Decimals, and Percents, coauthored 2001–2002 with Maarten Dolk, and Constructing Algebra, coauthored 2010 with Bill Jacob.)
 B. Mazur (2007). “When is one thing equal to some other thing?” Proof and other dilemmas: Mathematics and philosophy, pp. 221–242 (link). See also the discussion with the philosophy student in Philip Davis and Reuben Hersh’s essay “The Ideal Mathematician”.
 These two points of view on multiplication were called to my attention by this MathOverflow answer. One other instance of the second construction is found in school math: exponentiation is the case where $S$ is the multiplicative semigroup $(\mathbb{N}, \times)$. This train of thought contextualizes the non-commutativity of exponentiation in a nice way: the base and the exponent are from two different semigroups! How could they possibly change places?
 I would like to draw out what I see as the critical idea in this proof, and to distinguish it from a common proof strategy I’ve seen discussed, which is rotating an $m \times n$ array (such as a muffin tin) 90 degrees, making it an $n \times m$ array. I see the argument I’ve illustrated above as very different from the muffin-tin argument, and more transparent.
To clarify. In my view, the real story is one of regrouping. Note what really happens in the diagram above: we construct a group of 4 by picking one element from each of our 4 groups of 6. Then we can construct a second group of 4 similarly, by selecting (a new) one element from each of our (original) groups of 6. Continuing in this way, we end up with 6 groups of 4. This method of rearrangement makes clear that the size of the new groups is equal to the number of the old groups (since each new group was constructed by taking one element from each of the old groups) and also that the number of the new groups is equal to the size of the old groups (because, fixing any one of the old groups, each new group meets exactly one of its elements, so the new groups are in bijective correspondence with the elements of this one old group).
Although it was with reference to the diagram above, the verbal description of the regrouping procedure in the previous paragraph is actually “geometry-free”. What the diagram shows is how the procedure can be carried out in a geometrically organized way. It is both the regrouping procedure itself, and the fact that it can be geometrically organized to make the idea clear—indexing the groups first by the rows, and then by the columns—that I think of as the content of the proof. And this is what I was referring to above, with the phrase “double counting, i.e., exchanging rows and columns.”
I am perhaps being overly pedantic with what follows—and could definitely be wrong, which is why I’m doing this in a footnote—but: In my view, the “muffin tin” proof can easily land for students as handwaving / sleight of hand. I think it tends both to “happen too fast” to really see what’s going on, and also, it introduces some details that are extrinsic to the real story. The heart of the matter, per above, is breaking apart the original groups and re-organizing the total so that you get new groups (whose size is the number of old groups, and whose number is the size of the old groups). I think rotating the array can obscure that this is happening. It seems to me that it can easily fall as, “abra-cadabra, 6 rows and 4 columns became 4 rows and 6 columns!” But the way that the actual elements reorganize themselves “happens all at once,” so it’s hard to see under the hood. This is partly because the transformation “rotate 90 degrees” is doing too much: it’s not just interposing rows and columns—which would just be taking the transpose, like a matrix, i.e. reflecting across the main diagonal—it’s also reversing either the rows or the columns (depending on which way you rotate). This aspect of rotation is extrinsic to the real story, so it’s a distraction. It would be better to reflect across the main diagonal—but even this, I think, can hide the point. In the above picture, the proof idea is displayed in a single rectangular array, without transforming it at all: the key idea is to group first by rows and then by columns, or vice versa. In this way, there’s no danger of obscuring what’s happening to the individual elements in a flurry of motion.
 A logical subtlety bears mention. The theorem on commutativity of multiplication discussed in the last section presumed we were working with natural numbers as factors. Thus, for division problems with natural number answers, the equality of quotative and partitive division problems is assured by the reasoning in the previous section. To view the more general theorem, that quotative and partitive division are equal even when the answer is not whole, as a consequence of commutativity, we would need a more general version of commutativity that extends it at least to rational numbers. Writing the present post made me appreciate that this is actually subtle. Asserting that it follows from commutativity for naturals because $\frac{a}{b} \times \frac{c}{d} = \frac{ac}{bd} = \frac{ca}{db} = \frac{c}{d} \times \frac{a}{b}$ is sweeping something significant under the rug: why do we think $\frac{a}{b} \times \frac{c}{d} = \frac{ac}{bd}$? In the fraction field of an integral domain, this equation is the definition of multiplication for fractions, but in the present context, that’s passing the buck. This equation isn’t how we can or should define fraction multiplication to children: as Klein himself (op. cit., pp. 29–30) noted, by itself it has the character of an arbitrary convention, whereas multiplication of rationals ought to be able to be given a meaning. The Common Core defines $\frac{a}{b} \times q$ as $a \times (q \div b)$, with the division interpreted partitively; in other words, “$\frac{a}{b} \times q$ is derived from $q$ in the same way $\frac{a}{b}$ is derived from $1$.” (This latter sentence is something I’m sure I read somewhere; I thought it was the Common Core, but I can’t seem to find it now.) Then $\frac{a}{b} \times \frac{c}{d} = a \times \left(\frac{c}{d} \div b\right)$. We need to convince ourselves that this latter is $\frac{ac}{bd}$. The trickiest part is a good rationale for $\frac{c}{d} \div b = \frac{c}{bd}$, but even granting this, we still have to argue that
$$a \times \left(c \times \frac{1}{bd}\right) = (a \times c) \times \frac{1}{bd}.$$
In other words, in getting to the bottom of commutativity for rationals, associativity of multiplication is also implicated!
 Here, I need the hedge about rendering a single quantity as output because one reasonable definition of division for whole numbers yields a whole number quotient and a remainder as output, whereupon the question of equality between $a \div b$ and $\frac{a}{b}$ becomes meaningless; the two sides are not even the same kind of thing. Similarly, if the fraction is interpreted as a ratio rather than a single quantity, we have the same problem.