An algebraist, a finitist, and a determinist walk into a statistics classroom. They are all the same person and worse: the teacher, so the joke is on the students.
For reasons still partly obscure to me, my department has given me the opportunity to teach an introductory probability and statistics course for a second time. People often speak of impostor syndrome in mathematics, but this is something more like double agency. I feel like an embedded resistance fighter, my mind at intervals crafting subtle acts of sabotage, constantly wary that I might be found out.
I won’t deny the usefulness of adopting a probabilistic perspective, but its utility is also my chief complaint. It is attractive to view the world probabilistically because it allows the simplification of complicated processes into rules whose efficiency outweighs the sacrifice of accuracy. This preference is very human, but it can also lead to convenient fictions which are destructive. In the scariest cases, these hazards become amplified through algorithm and automation, as described in O’Neil‘s Weapons of Math Destruction, for instance. We are nobler, and maybe more human, when we take the time to get the whole story right.
So here I am, in front of a fairly large class of largely business majors, charged with teaching them a mathematical framework for the shorthands they will need to operate in an economy that pressures them (us?) to, above all else, produce, to our collective peril. How can I convey my concern to the students without giving myself away, without demeaning our purposes? Where can we find space for reflection on the apparatus that brings us together when there are eleven chapters to get through and everyone is like freaking out about how to use Bayes’ theorem before the first exam?
It’s not clear to me that this is possible, or at least that it’s not mutually exclusive to my having reasonable expectation of further opportunities to win my bread teaching mathematics to college students (see: economic pressures). So, one resorts to seeking nourishment from the mathematical substance of the course (it helps that we start with a good dose of combinatorics). So let me turn to the fun which motivates this post, with apologies to probabilists and statisticians everywhere for any misrepresentation of which I may be guilty.
Discussion Question:* What’s your favorite infinite discrete probability space?
*Do not attempt use in real classroom situations. The author denies any first-hand experience.
Here’s, I think, the simplest natural example: toss a coin until a heads appears, and count the number of tosses it takes. The points of the sample space can be identified with the positive integers, and so we have a discrete random variable $X$. Supposing the coin is fair, our probability function says that the chances it takes $n$ tosses to get a heads is $P(X = n) = \frac{1}{2^n}$, just by the multiplication rule for probabilities of independent events.
I shied from going further into this example with my class because it occurred to me that showing that this all makes sense requires one to know a thing or two about convergent series (they don’t, mostly). That is, you should have to prove that the sum of all probabilities in this experiment,
\[ \sum_{n=1}^\infty \frac{1}{2^n} =1. \]
But what if we could flip the script . . . Does the fact that this is a sum over the values of a valid probability function for an actual experiment mean that the sum does converge to 1? More broadly, can we prove things in mathematics by reference to real-world (albeit, probabailistic) phenomena?
This all reminded me of a bit of combinatorics folklore:
Theorem. If $k$ and $n$ are natural numbers with $k\leq n$, then $\frac{n!}{k! (n-k)!}$ is a natural number.
Proof. This is $\binom{n}{k}$, which counts something. $\square$
This is actually a common technique in combinatorics: one proves the integrality of some rational expression by giving it a combinatorial interpretation, which is to say that it counts something. In the case of the binomial coefficients, this thing is size $k$ subsets of an $n$ element set, but one might also consider rational expressions that count set partitions, lattice paths, or domino tilings, to name a few. The fact that all of these examples can be rendered mundanely (“How many ice cream combinations can you make choosing $k$ flavors from a menu of $n$?”, etc.) and are thus finite makes this sort of proof-by-reference-to-phenomenon untroubling. But something spooky must happen when we consider the infinite.
The philosophical problems known as Zeno’s paradoxes were contrived to support an argument that motion is an illusion, and they rely on the unwieldiness of the infinite. In the “dichotomy paradox,” Zeno says (through Hofstadter):
‘. . . in getting from A to B, one has to go halfway first — and of that stretch one also has to go halfway, and so on and so forth.’
It’s ordinary that each number has a successor and that one can always split an interval in two, but, in extrapolation, we find ourselves with the tricky business of completing an infinite number of tasks in order to get from A to B. How can we possibly arrive?
It is said that, upon hearing Zeno’s paradoxes, Diogenes the Cynic2 said nothing, stood up, and simply walked (presumably away from whomever was explaining) to demonstrate the absurdity of the argument. At some point in the last few millennia, this incident became entwined with the phrase solvitur ambulando, “it is solved by walking,” now a motto for peripatetics everywhere. But this phrase also captures the spirit of the kind of proof-by-(probabilistic)-phenomenon wished for above. Diogenes’ demonstration is more compelling if we regard it as alternative proof-by-phenomenon of the convergence of the sum $\sum_{n=1}^\infty \frac{1}{2^n}$, which fact actually resolves Zeno’s paradox. For the walker, the time it takes to complete each task necessary to get from A to B diminishes by a constant proportion at each stage, just as the distance does. This is obvious if one already knows from experience that it only takes a finite amount of time to get from A to B, which is I guess what the Cynic was getting at.
In any case, because we can get from A to B, and because the fair coin tossing is an honest experiment, we have $\sum_{n=1}^\infty \frac{1}{2^n} =1$. But there’s more! We can soup up either of these proofs-by-phenomenon to get ourselves convergence any positive geometric series2, without the advanced technology of limits. Let’s just do the probability phenomenon proof, leaving the walking proof as an exercise. : )
Theorem. For any $0<r<1$,
\[\sum_{n=0}^\infty r^n = \frac{1}{1-r}.\]
Proof. Think of a coin which is biased to give heads with probability $1-r$ and tails with probability $r$ when tossed. We toss until we get a heads and count how many tries it takes. Then the probabilities associated with each outcome are given in the table below.
$n$ | 1 | 2 | 3 | 4 | … | |
---|---|---|---|---|---|---|
$P(X=n)$ | $(1-r)$ | $r(1-r)$ | $r^2(1-r)$ | $r^3(1-r)$ | … |
Summing over all outcomes, we obtain
\[ (1-r) \sum_{n=0}^\infty r^n =1. \]
Divide by $(1-r)$. $\square$
And it could go further: If you have a convergent series of positive terms, you could always view (normalized) partial sums as giving the cumulative distribution function for an infinite discrete probability space. I wonder which series among these have “easy” interpretations as real experiments.
In algebraic combinatorics, there are famous collections of numbers (e. g. Kronecker coefficients, Somos sequences) that are known to be integral but lack a combinatorial interpretation. Other collections which do admit combinatorial interpretation, like the binomial coefficients, can be used used to give the information of a (finite) probability space. It seems there should be a brand of “phenomenal calculus” to bridge the gap to the countably infinite. Say, has any one cooked up probability experiments that can be interpreted to compute values of the Riemann zeta function?3
And now I’ve wandered far enough with this that my original dilemma is no longer pressing, solvitur ambulando. The probabilistic perspective has proved itself more than just useful. On a good day, it might even be interesting.
1 Winner: Best Epithet Category, Panhellenic Philosphers’ Games, 383 BCE.
2 Indeed, the probability distribution involved is called a “geometric distribution.”
3 Apparently $\frac{1}{\zeta(2)}=\frac{6}{\pi^2}= 0.6079\dots$ can be interpreted as the probability that two randomly chosen integers are relatively prime. You could “phenomenally” deduce the convergence of $\sum_{n=1}^\infty \frac{1}{n^2}$ from the fact that any two integers must have a GCD, but as far as I can tell you can’t really get at the actual value through this game. See Section 3.5 of Kalman’s recently discussed article and references for more.
Disclaimer: The opinions expressed on this blog are the views of the writer(s) and do not necessarily reflect the views and opinions of the American Mathematical Society.
Comments Guidelines: The AMS encourages your comments, and hopes you will join the discussions. We review comments before they are posted, and those that are offensive, abusive, off-topic or promoting a commercial product, person or website will not be posted. Expressing disagreement is fine, but mutual respect is required.