**A great statement of purpose will make your application.** And while a not-so-great statement of purpose might not break your application, it would be a lost opportunity: the statement of purpose is your chance to convince the admissions committee that you are a good fit for the graduate program they oversee.

The trouble? Writing a convincing statement of purpose is tricky, and it comes naturally to almost no one. When I first tried to write mine, I spent a great deal of time staring at a blank page – writing a few words, only to delete them immediately.

The opening “when I was ten years old…” felt cliché. Plus, my interest in math was less a revelation than it was a snowball effect, made more difficult to ignore with each new, tantalizing piece of information I absorbed (the Fourier transform can decompose sound waves into their constituent parts?!). So pretending that my fifth-grade teacher or a childhood science fair was the singular impetus for my impending commitment to a lifelong career in mathematics seemed dishonest. On the other hand, starting the statement with some variation on “I am excited to apply for the PhD program in math at University X” felt too generic.

In the end, I decided to begin my statement of purpose with, well, a statement of purpose. Without preamble, I laid out my professional goals (they were specific, and somewhat unique) and explained how they had come to be. This opening allowed me to segue into my reasons for applying to each program, and from there, into my research background – two integral components of any statement of purpose.

I say this not to argue that this is the best or only way to structure a statement of purpose – it’s not – but to emphasize the following point: a great essay is always genuine, thoughtful, and specific. Providing unique details about your motivations (think: a story about a memorable encounter with mathematics, rather than a generic “I enjoy problem-solving”) will make for an honest, compelling essay. For more specific advice on crafting a statement of purpose, read this.

If time allows, share your essay with the professors who are writing your recommendation letters – it will allow them to write letters that reflect your strengths as relevant to the programs you’re applying to. And don’t forget to have friends and/or professors edit your essay.

If the prospect of crafting a statement of purpose is overwhelming, remember: at the end of the day, your goal in a grad school application is to communicate that you are prepared, both academically and personally, to do research in the program you’re applying to. That’s it. If you successfully communicate why you’re prepared for a research career in your statement of purpose, you’re well on your way to making a convincing argument for why you should be admitted. And once you are, remember:

**You are your own best advocate.** You may feel lucky to get into grad school when it happens – and you should! – but remember, too, that whichever graduate program you choose is lucky to have you. Advocate for yourself accordingly, and stick to your boundaries when it comes to work environment, hours, pay, health care, teaching load, and the like. While grad school requires a certain amount of sacrifice and compromise, on the whole, it should support, rather than hinder, your personal life – just like any other job.

Here’s a scenario I hear all the time: “My partner and I applied to all (or many) of the same grad schools, but we were accepted to different ones (on different sides of the country).” Sometimes, I’ll ask whether they communicated this fact to the relevant universities, and more often than not, I get a look of confusion in reply. Yes, you can – and should! If you’re accepted to a particular program and your partner isn’t, you can write a polite note to the department informing them of the situation – delicately, of course.

Graduate school is a significant, long-term commitment that people undertake as fully-fledged adults, often with partners, dependents, and major life considerations (starting a family, caring for parents) in tow. In this respect, it is a far cry from undergrad, and should be approached accordingly. When making the decision about which programs to attend, remember that finding a supportive program and advisor, and communicating your personal and professional goals to them as appropriate, is key. Because ultimately, grad school should be a means to pursuing a fulfilling life and career.

For advice on how to survive – and thrive! – in grad school once you’re in, read my previous post here.


You and nine other friends have been trapped by an evil hat-maker (who is a recurring character in these sorts of riddles). As part of his evil plan, the hat-maker has assigned each one of you a distinct hat color. These ten colors and their assignments are public knowledge, in the sense that they are known to both you and all of your friends. In order to test your affinity for your assigned color, the hat-maker has hidden ten hats of the prescribed colors in ten different boxes (one hat per box). The boxes are also colored with the ten different colors, but the hat contained inside a box may or may not correspond to the color of the box. Your group is now offered the chance to recover your hats by participating in the following game.

One by one, each of you will be allowed to look inside up to five of the ten boxes. If you successfully find the box containing your hat, then the hat-maker will make a note of this and move on to the next person. (The hat itself is not yet removed from the box in this situation.) If all ten of you succeed, then (as a group) you win the game and are rewarded with the hats! However, if even one person fails to find their hat, then all of you are sentenced to bareheadedness. You are allowed to strategize with your friends before beginning the game, but no communication is allowed once the game starts. (Thus you cannot communicate what you find in your five boxes to your friends, and you can’t be sure which boxes your friends have looked inside unless you agree on this beforehand.) Note, however, that the boxes are inspected one after the other, so that an individual may decide which boxes to inspect based on the results of boxes that he or she has already inspected.

The simplest strategy is, of course, to have each member of the group inspect five random boxes. This results in a $1/2^{10}$ chance of the group winning, since each person independently finds their hat with probability $1/2$. Can you come up with a better strategy to stave off bareheadedness? What if instead of ten people (and hat colors), there are twenty people (and hat colors)? Can you come up with a strategy that gives a non-negligible chance of winning, even as the number of people grows to infinity?
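To get a feel for the baseline, here is a quick Monte Carlo sketch of the naive random strategy in Python (the function name and trial count are my own choices, not part of the puzzle):

```python
import random

def naive_trial(n=10, peeks=5):
    """One round of the naive strategy: hide the hats in a random
    permutation of the boxes, and let each person open `peeks` boxes
    chosen uniformly at random."""
    hats = list(range(n))
    random.shuffle(hats)  # hats[box] = color of the hat hidden in that box
    for person in range(n):
        opened = random.sample(range(n), peeks)
        if not any(hats[box] == person for box in opened):
            return False  # a single failure dooms the whole group
    return True

random.seed(0)
trials = 100_000
wins = sum(naive_trial() for _ in range(trials))
print(wins / trials)  # hovers around 1/2**10, roughly 0.001
```

Simulating the puzzle like this is also a handy way to check any cleverer strategy you come up with.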

This month’s puzzle was communicated by H. Dai.

“The rapid advance of computers has helped dramatize this point, because computers and people are very different. For instance, when Appel and Haken completed a proof of the 4-color map theorem using a massive automatic computation, it evoked much controversy. I interpret the controversy as having little to do with doubt people had as to the veracity of the theorem or the correctness of the proof. Rather, it reflected a continuing desire for human understanding of a proof, in addition to knowledge that the theorem is true” – Bill Thurston (from [6])

A machine-checked proof is a proof written in a piece of software called a ‘proof assistant’, which checks that the proof complies with the axioms of mathematics and the rules of logic. The question of the significance of computers in proving theorems can be polarizing (and for good reason). The quotation above represents one point of view on this topic. In this post we will try to answer three questions:

- What exactly is a computer-assisted proof?
- What are the advantages and the drawbacks of using computers to prove theorems?
- What should an interested person do to start learning to use proof assistants?

**The motivating problem**

Computer-assisted proof is a technique. Mathematicians care about new techniques when they solve some problem insoluble by old techniques. This is our motivating problem:

Above we see two solved problems. The first is a solved Rubik’s cube, while the second is Andrew Wiles’ proof of Fermat’s Last Theorem. A big difference between proving facts about Diophantine equations and solving Rubik’s cubes is that when a person solves a Rubik’s cube they know immediately it is solved, whereas Wiles’ proof (for example) took many months to properly referee.

As we progress in our mathematical education, our ability to check answers decreases. For example, my first lesson in mathematics was my mother teaching me to count. I was taught the numbers from one to ten, *along with the fact that for small sums I could check my answer by counting on my fingers.* When one is taught to solve algebraic equations, one is also taught that answers can be checked by substitution. Checking calculus is trickier, yet we can rely with good confidence on Wolfram Alpha. Upon reaching real analysis and abstract algebra, students check their work ultimately by handing it in and seeing if the professor buys their argument.

**What is a proof assistant?**

A computer proof assistant allows for more systematic checking of mathematical arguments. A user writes their proof in a *semi-formal language* (meaning not as formal as formal logic and not as informal as ordinary mathematics). The proof assistant checks the proof against some foundation of mathematics. Normally when we think of foundations, we think of set theory. Yet for technical reasons, many proof assistants are implemented with ‘type-theoretic’ foundations. Type theory is another foundation of mathematics, proposed by Russell and Whitehead around the same time that others proposed ZFC set theory. Although there are philosophical implications of varying foundations across pieces of mathematics, for us this is a moot point and we will say no more on the matter.

Below is a picture of a correct proof and an incorrect proof in the proof assistant Lean.

The proof on top is clearly wrong because of the error message showing up in red. The proof on the bottom is quickly seen to be correct since no error occurs. Just like with a Rubik’s cube, it is immediately clear if a proof is correct or not.
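For a flavor of what such screenshots show, here is a minimal pair in Lean (a sketch; the exact error text varies by version):

```lean
-- a correct proof: Lean accepts this silently, with no error message
example (a b : ℕ) : a + b = b + a := add_comm a b

-- an incorrect proof: uncommenting the line below produces a red error,
-- since `a + b` and `b + a` are not definitionally equal
-- example (a b : ℕ) : a + b = b + a := rfl
```

The feedback loop is immediate: either the file checks, or the offending line is flagged.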

This seems a bit complicated. Is it ever necessary? It can be. For example, Hales’ proof of the Kepler conjecture was too complicated to be checked by journal referees. That proof was eventually verified with a proof assistant called HOL Light. The project to formally verify the Kepler conjecture was called the Flyspeck project. Flyspeck took several years to complete and 5000 processor-hours on the Microsoft Azure cloud [5]. Some people had hoped for a less computer-heavy proof that mathematicians could read.

Georges Gonthier, along with his colleagues at Microsoft Research, produced the first formally verified proof of the four color theorem [2]. This is different from the original computer-based proof of the four color theorem, which was essentially a standard mathematical argument that involved an absolutely massive computer calculation. Gonthier’s work certified that the algorithm purported to prove the four color theorem actually did what we believed it did.

Gonthier’s team also formally proved the Feit-Thompson odd order theorem, a cornerstone of the classification of finite simple groups, using the proof assistant Coq [3]. The original Feit-Thompson paper was 255 pages. Other high-profile projects include formal proofs of the prime number theorem, Gödel’s incompleteness theorems, and the central limit theorem.

**How to get started?**

These tools are not used exclusively for massive proofs that take years. There exist formal libraries containing theorems and definitions in real analysis, general topology, representation theory, and abstract algebra. Proof assistants are also used in industry to verify software and algorithms. This is quite powerful: as soon as one can state “this program P has no bugs” in a mathematically rigorous way, one can try to prove it. And with a formal proof, software developers can be confident their programs behave exactly as specified.
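As a toy illustration of this workflow (my own example, not drawn from any industrial project), one can state and prove a specification in Lean, assuming mathlib’s `two_mul` lemma:

```lean
-- a tiny "program"
def double (n : ℕ) : ℕ := n + n

-- its specification: `double` really multiplies by two
theorem double_spec (n : ℕ) : double n = 2 * n :=
by { unfold double, rw two_mul }
```

Real verification targets are vastly larger, but the shape is the same: state the property, then convince the proof assistant.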

A number of proof assistants are available. They are all free.

Isabelle/HOL is a proof assistant created by Lawrence Paulson. It is based on higher-order logic. Isabelle has massive libraries already in place as well as some of the most powerful automation available. Here, automation means the proof assistant can find short proofs for you. HOL Light is a similar program, with a smaller kernel, written by John Harrison.

Coq and Lean are both based on dependent type theory. They were developed by teams led, respectively, by Thierry Coquand and Leo de Moura. Dependent types mean that data types can depend on other data types – a feature we want. For example, we’d like a type Fin$(k)$, depending on a number $k$, whose inhabitants are the natural numbers less than $k$, for arbitrary $k$. But these proof assistants, despite being newer, have weaker automation because it is, for technical reasons, harder to implement automation in dependent type theories (at least at the moment).
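In Lean, for instance, this dependent type can be used directly (a sketch in Lean 3 syntax; the definition names are mine):

```lean
-- `fin 5` is a type depending on the value 5: its inhabitants
-- are the natural numbers less than 5
def two_of_five : fin 5 := ⟨2, dec_trivial⟩  -- the proof that 2 < 5 is found automatically

-- this would be rejected at type-checking time, since 7 is not less than 5:
-- def seven_of_five : fin 5 := ⟨7, dec_trivial⟩
```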

Online manuals exist for all the proof assistants mentioned above. Lean 2 has an interactive web tutorial. The current version of Lean is Lean 3, but the author found this tutorial a good way to get a flavor for proof assistants in general.

**Problems and outlook**

Proof assistants are not yet at the point where they can reasonably be used by working mathematicians. Hales did use HOL Light for his work on the Kepler problem, but this project was not the sort of thing mathematicians would do unless they absolutely had to. The current libraries are not large enough to include everyday arguments about everyday theorems. For example, we may go to prove a standard result about compact Lie groups only to discover that the Haar theorem (establishing the existence of Haar measures) is not in our library. This theorem is quoted all the time, but its proof is lengthy, and with present technology we should expect a formal proof of the Haar theorem to take a few years.

A more fundamental objection says that mathematics is, to use Thurston’s language, ultimately about the *human understanding* of mathematical objects, and that proofs are there only secondarily, to keep our understanding from wandering off. In the words of the great combinatorialist G.-C. Rota, “saying a mathematician ‘proves theorems’ is like saying an author ‘writes words’.” It would follow that algorithmic ‘proof search’ type arguments are undesirable.

Yet there is no reason why one cannot both understand mathematical objects and use computer proof assistants. Machine checking is not synonymous with proof search. Presently, we have human journal referees check detailed technical arguments. If it were possible to use a computer to check these, we would lose nothing and gain more accountability.

Stephen Wolfram has recently become interested in formal proof. People from Wolfram have worked on getting Mathematica and formal proof to work together [1]. The relevant linguists, computer scientists, and mathematicians are constantly considering ways to make computer code look more like ordinary mathematics. An ultimate goal is streamlining the refereeing process.

All journals require us to write papers in LaTeX – this was not always so. Perhaps in the future, some journals will require proofs written in a quasi-formal language for computer checking. Then, journal referees could focus more energy on the clarity and the overall presentation of mathematical articles – thus *furthering human understanding of mathematics.*

**Acknowledgements.** I am most indebted to Jeremy Avigad at Carnegie Mellon University and Tom Hales at the University of Pittsburgh for teaching me what I know about proof assistants. I thank also Emily Riehl at Johns Hopkins University for introducing me to Bill Thurston’s brilliant article “On Proof and Progress in Mathematics”, which I referenced several times in this post. And lastly, I dedicate this note to the late Vladimir Voevodsky. I never met him personally, but his influence on my many teachers, his papers, and his recorded lectures were highlights of my undergraduate education.

**References**

[1] Geuvers, H., England, M., Hasan, O., Rabe, F., Teschke, O. Intelligent Computer Mathematics, 10th International Conference, CICM 2017, Edinburgh, UK, July 17-21, 2017, Proceedings.

[2] Gonthier, G.: Formal proof—the Four Color Theorem. Notices of the AMS 55(11), 1382–1393 (2008)

[3] Georges Gonthier, Andrea Asperti, Jeremy Avigad, Yves Bertot, Cyril Cohen, François Garillot, Stéphane Le Roux, Assia Mahboubi, Russell O’Connor, Sidi Ould Biha, Ioana Pasca, Laurence Rideau, Alexey Solovyev, Enrico Tassi, and Laurent Théry, A machine-checked proof of the odd order theorem, Interactive Theorem Proving – 4th International Conference, ITP 2013, Rennes, France, July 22-26, 2013. Proceedings (Sandrine Blazy, Christine Paulin-Mohring, and David Pichardie, eds.), Lecture Notes in Computer Science, vol. 7998, Springer, 2013, pp. 163–179.

[4] Hales, T., & Hales, T. C. (2012). *Dense sphere packings: a blueprint for formal proofs* (Vol. 400). Cambridge University Press.

[5] Hales, T.C. Introduction to the Flyspeck project. In Thierry Coquand, Henri Lombardi, and Marie-Françoise Roy, editors, Mathematics, Algorithms, Proofs, number 05021 in Dagstuhl Seminar Proceedings, Dagstuhl, Germany, 2006. Internationales Begegnungs- und Forschungszentrum für Informatik (IBFI), Schloss Dagstuhl, Germany. http://drops.dagstuhl.de/opus/volltexte/2006/432.

[6] Thurston, W.: 1994, ‘On Proof and Progress in Mathematics’, *Bulletin of the American Mathematical Society* **30**(2), 161–177.

The spotlight this month discusses an article entitled “Gerrymandering, Sandwiches and Topology,” written by Pablo Soberón. This article is particularly timely, as just last week the Supreme Court heard a case regarding gerrymandering in Wisconsin. Recall that gerrymandering refers to the process of drawing congressional districts in such a way that the “dominant” political party will win in a majority of the districts.

There are many mathematical articles about gerrymandering that discuss ways in which mathematical formulas can be used to create a “fair” political map. The article in the Notices has a different take: it shows that there are also mathematical theorems that can be used to create an “unfair” map that satisfies all the criteria in place to help avoid gerrymandering. The article discusses three well-known results from algebraic topology, including the so-called ham sandwich theorem. The author presents proofs of these results in a very accessible manner, so that even a graduate student with little background in algebraic topology can follow them. In addition, the author shows how, using these three theorems, one can still create a division of a state in which the dominant political party wins a majority of the districts.

We encourage you to take a moment out of the busy time of the semester to read this article – or, if not this one, to find an article that strikes your fancy. Until next time, enjoy your semester and tune in in a couple of weeks for the next AMS Notices Spotlight.


The phrase “receptive learning” conjures up a vision of a lecture hall filled with students, eyes glazed over, staring forward as the instructor, with their back to the class, writes on a chalkboard. Students in this classroom become, as the radical educator Paulo Freire contended in his book *Pedagogy of the Oppressed*, “containers… receptacles to be filled by the teacher.”

Last October, I presented a paper at the Bergamo Conference on Curriculum Theory and Classroom Practice, entitled “The Pedagogy of the Student: Reclaiming Agency in Receptive Subject-Positions.” In this presentation, which went on to be published in the Journal of Curriculum Theorizing, I discuss the active/passive dichotomy and the way in which being active has become masculinized and being passive has become feminized. I discuss work by feminist scholars who seek to reclaim the idea of receptivity. Zelia Gregoriou, in her chapter, “Does speaking of others involve receiving the ‘other’?” (in the 2005 volume, Derrida & Education), argues for an alternate conception of receptivity, one that involves choosing to invite in, welcome, and host the other.

What happens if we reconceive listening not as a process of passivity, but as an active process of making sense of ideas? This is where things get tricky: do we want to define listening as an action (much like talking or raising one’s hand)? Or is listening, like receptivity, its own category of not-quite-passive but still-not-active actions? I suggest in my article that listening can be something that gives students agency without necessarily being active.

What does this mean for math educators? For one, I think the terminology “active learning” is misleading, insofar as it implies that “passive” or “receptive” learning is undesirable and that receptivity is therefore a negative quality. I think that we need to consider the ways in which we want students to interact, and to find ways to value and respect the act of listening rather than only considering talking to be the important part of learning. We can also think about the ways in which we use discursive moves when we teach: do we direct students to consider and think about each other’s ideas, or do we merely judge the ideas that students present to us?

I am a strong believer in group work and participation in the classroom. I am not calling here for a return to the days of boring lectures and disengaged students. But I do think that we need to be careful about the terminology we choose and the conceptions we develop in our ideas of teaching and learning.

And I did. The circumstances of my application-writing meant that I was efficient: I got things out of the way as early as I could, and as a result, I wasn’t racing to meet deadlines come December – instead, I was out and about, exploring my temporary home. On the other hand, I was not as thoughtful as I might have been in putting together my applications, and in considering what my grad school experience might look like.

While I relish the memories of my whirlwind semester abroad, here’s some advice to make your grad school application process a little less hectic – and a little more organized – than mine. This is the first part of a two-part post; Part II will be posted next month.

**First things first: know why you’re applying to grad school and what it’s all about.** To a college senior, grad school can seem like the logical next step in life, the most natural sequel to a successful college career. If I’m being completely honest, I applied to grad school in part because it was a well-defined, familiar path: more schooling (a known quantity), followed by a career in academia, which I imagined would be full of fulfilling teaching interactions and, importantly, blissfully free of business-wear and rigid 9 am start times.

Ultimately, though, a PhD is a research degree. It’s not an extension of college, or an easy career path for people who excel in school. What it is is a fulfilling path for those who love striking out on their own intellectual journeys, who relish both the uncertainty and the thrill of exploring the unknown.

The question that I should have stopped to ask myself at some point during my semester in St. Petersburg was: Why do I want to get a PhD in math? If I could give myself one piece of advice regarding the grad school application process, it would be to consider that question carefully – not least because it would have made sitting down to write my statement of purpose a smoother process.

So, particularly if you’re currently an undergraduate, I suggest that you take out a sheet of paper and write down all the reasons why you want to go to grad school, from the obvious (“I want to pursue a career in academia”) to the personal (“Ever since I took that trip to NASA when I was ten, I haven’t been able to imagine doing anything else”). Be as specific as possible. When you’re done, do some research (including chatting with your college professors) to make sure that your motivations match up with the reality of going to grad school. This exercise will ensure that you know you’re going to grad school for the right reasons and will also set the tone for your statement of purpose and the remainder of your application.

**Once you’ve explored your motivations for applying to grad school, begin gathering your application materials – starting now.** One downside of studying abroad during my senior year was that I didn’t have time to take the GRE subject test, which significantly limited my options for where I could apply. I’m happy with where I ended up, but my experience underscores the point that it’s important to plan as far ahead as possible.

Though individual program requirements vary, most math graduate programs require the following: the GRE (and often, a GRE subject test), three recommendation letters, undergraduate transcripts, a CV/resume, and a statement of purpose. Scheduling the GRE and requesting recommendation letters are two simple but time-critical steps that can save you a huge amount of hassle down the road. GRE spots fill up quickly, so to avoid having to take it at an inconvenient time or location, sign up as soon as possible. Similarly, professors get busy, so requesting letters well ahead of time ensures that no one is left scrambling at the last minute. Knock these items off your to-do list now, and you’ll be on your way to submitting your applications on-time, and with minimal hassle.

Next, update your CV and sketch out a timeline for preparing for the GRE. Everyone has a different approach to studying for the GRE, but what worked well for me was to spend a few days with a prep book right before the exam, memorizing the various formulae (particularly for the essay section and the math section). Any more than that would have been overkill; any less, and I would have put myself at an unnecessary disadvantage. The subject GRE is, by many accounts, harder, and will probably require more preparation.

Another important application material is, well, money. Graduate school applications can be expensive: GRE registration fees, travel to GRE centers, and application fees add up quickly. Be sure to budget for the cost of applying to grad school, and if that’s not feasible, look into resources at your institution or elsewhere that can help with these costs. Many graduate programs have application fee waivers available for eligible students.

**Next, do some research to decide which programs provide the best fit with your personal and professional goals.** If you’ve planned ahead then, unlike me, you won’t make your decision about where to apply based on which schools don’t require subject GRE scores. Hopefully, you’ll instead consider the focus of the program, the professors’ research interests and their similarity to your own, the culture and diversity in the department, and other factors, like location. Much of this information can be determined by looking at the programs’ websites.

Some, however, can’t. One thing that I *did *do right in deciding where to apply was talking to my college professors – particularly those who wrote my recommendation letters. Because they’re plugged-in to the academic network, they knew professors whose work was similar to my interests, and departments that had a particular focus or culture, information that I wouldn’t have been able to find out online. Their insider knowledge helped steer me toward a program that was a good academic fit.

The other helpful step that I took was to contact the professors I was interested in working with. I sent each of them a quick email indicating interest in their research and their graduate program, and asking if they were accepting students. If they were, I set up a time to talk with them. Much like a job interview, it gave me a chance to determine whether I would enjoy working for them, and it also allowed the professor to assess whether I would be a good fit for their group. I suspect that having direct communication with professors also gave me an edge in the application process, as I was more of a known quantity after a simple phone call.

At the same time as I was deciding which programs to apply to, I also decided that I would apply for the NSF fellowship. It seemed overwhelming to add another application (with an additional research statement, to boot) into the mix, but I’m glad I did. The benefits of having extra funding in grad school are well worth the time it takes to put together an application.

After learning about the programs and fellowships that you’re applying to – and maybe even making an Excel spreadsheet with their deadlines and requirements – it’s time to return to your application materials. Begin by requesting transcripts and submitting GRE scores.

At this point, you’ll have a solid head start on your grad school applications, leaving you free time to enjoy the last days of summer. Check back next month for part II of this post, in which I’ll focus on crafting a statement of purpose and advocating for yourself throughout the application process.

**Proposition 1: Every real number $x$ can be written as**

$ \hspace{4cm} x=x_1 +\frac{x_2}{2!}+\frac{x_3}{3!} + \cdots + \frac{x_n}{n!} + \cdots, \hspace{2cm} (*) $

**where $x_1$ can be any integer, but for $n \geq 2$, $x_n \in \{ 0, 1, \ldots, n-1 \}$. Furthermore, if we require that the partial sums be strictly smaller than $x$, then such a representation is unique.**

**Remark:** One cannot help recalling the decimal or binary expansion of numbers. Notice that $\frac{n}{n!}=\frac{1}{(n-1)!}$ (a carry into the previous digit), so the bound on $x_n$ is natural.

**Proof:**

Choose the largest integer $x_1$ *strictly* smaller than $x$. If $x_1+\frac{1}{2!}$ is strictly less than $x$, then choose $x_2=1$; otherwise, choose $x_2=0$. Assuming we have picked $x_1, x_2, \ldots, x_n$, we choose $x_{n+1}$ to be the largest element of $\{ 0, 1, 2, \ldots, n \}$ so that

$ \hspace{6cm} x_1+\frac{x_2}{2!}+ \cdots +\frac {x_n}{n!} + \frac{x_{n+1}}{(n+1)!} < x .$

We’ll prove that this inductive choice of $ \{ x_n \}_{n=1}^\infty $ satisfies the expansion $(*)$.

**Claim**: For every $n \geq 1$,

$ \hspace{6cm} 0 < x- (x_1+\frac{x_2}{2!}+ \cdots +\frac{x_n}{n!}) \leq \frac{1}{n!} \hspace{2cm} (ES) $

Proof of the claim: If $x_n \neq n-1$, this is an immediate consequence of the choice of $x_n$: by optimality of $x_n$ we have

$ \hspace{5.5cm} x_1+\frac{x_2}{2!}+ \cdots +\frac {x_n}{n!} < x \leq x_1+\frac{x_2}{2!}+ \cdots +\frac {x_n + 1}{n!} $

Subtracting $ x_1+\frac{x_2}{2!}+ \cdots +\frac {x_n}{n!} $ from each term yields (ES).

If $x_n=n-1$, the maximum possible, then we have the identity

$ \hspace{1cm} x_1+ \cdots +\frac {x_n}{n!} + \frac{1}{n!} = x_1+\cdots + \frac{x_{n-1}}{(n-1)!}+\frac {n-1}{n!} + \frac{1}{n!} = x_1+\cdots +\frac {x_{n-1}}{(n-1)!} + \frac{1}{(n-1)!} $

To obtain (ES) it suffices to show that this quantity is greater than or equal to $x$. Comparing the first and last expressions, we see that we have reduced case $n$ to case $n-1$. Thus, we either work backwards until we reach $n=1$, considering the cases for $x_{n-1}$ at each step, or we switch to an inductive proof, to obtain (ES). The case $n=1$ is obviously true.

Proof of uniqueness: Assume that some $x$ has two different representations:

$\hspace{3cm} x_1 +\frac{x_2}{2!}+\frac{x_3}{3!} + \cdots + \frac{x_n}{n!} + \cdots = y_1 +\frac{y_2}{2!}+\frac{y_3}{3!} + \cdots + \frac{y_n}{n!} + \cdots $

We’ll prove that one of them is a finite sum. Assume that $k$ is the first index where $x_k \neq y_k$, and, without loss of generality, that $x_k > y_k$. Then,

$ \hspace{6cm} \frac{x_k – y_k}{k!} = \displaystyle\sum _{n=k+1}^\infty \frac{y_n – x_n}{n!} \hspace{2cm} (EQ)$

Notice that while $\frac{x_k - y_k}{k!} \geq \frac{1}{k!}$,

$\hspace{2cm} \left|\displaystyle\sum _{n=k+1}^\infty \frac{y_n - x_n}{n!}\right| \leq\displaystyle\sum _{n=k+1}^\infty \frac{|y_n - x_n|}{n!} \leq\displaystyle\sum_{n=k+1}^\infty \frac{n-1}{n!} =\displaystyle\sum_{n=k+1}^\infty \left(\frac{1}{(n-1)!}-\frac{1}{n!}\right) = \frac{1}{k!} .$

Thus, the only way for (EQ) to hold is to have $x_k=y_k+1$ and, for all $n>k$, $x_n= 0$ and $y_n=n-1$; a situation analogous to $1.73=1.729999\ldots$ in decimal notation.

Note that of the two representations only one has partial sums strictly increasing to the limit, proving the uniqueness claim.
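The heart of the uniqueness argument is the telescoping identity $\sum_{n=k+1}^{\infty} \frac{n-1}{n!} = \frac{1}{k!}$: a tail running at "full speed" exactly compensates a single carry of $1/k!$. A quick numerical check (a sketch of mine, not part of the original argument):

```python
import math

# Verify that sum_{n=k+1}^{N} (n-1)/n! approaches 1/k! as N grows,
# for several values of k.
def full_speed_tail(k, terms=30):
    return sum((n - 1) / math.factorial(n) for n in range(k + 1, k + 1 + terms))

gaps = [abs(full_speed_tail(k) - 1 / math.factorial(k)) for k in range(1, 6)]
```

The truncation error after thirty terms is astronomically small, since the factorials in the denominators grow so fast.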

**The Backstory:** My independent discovery of this expansion was triggered by my search for a compact set in $\mathbb{R}$ with no isolated points (every neighborhood of every point contains other points of the set as well) and *no* rationals. This was a question my favorite analysis professor, Dr. Rezaee, had asked me to think about.

For months my approach had been to start with $[0,1]$ and remove successively more and more subsets, but I had failed to land on the right set. Then, one morning, sitting in a different class, I suddenly recalled an exercise from Tom M. Apostol’s *Mathematical Analysis*:

**Exercise:** The number $x=x_1 +\frac{x_2}{2!}+\frac{x_3}{3!} + \cdots + \frac{x_n}{n!} + \cdots $ (as in Proposition 1) is rational if and only if there exists an $N \in \mathbb{N}$ such that

$\hspace{6cm} n>N \implies x_n=n-1.$

“I know so many irrational numbers!” I said to myself, and there I was with a set.

Solution to the exercise: Suppose the condition holds. Then, as shown in the uniqueness proof above, the sum equals a finite sum, each of whose terms is rational.

For the other direction, suppose $x$ is rational. Then, for relatively prime $p \in \mathbb{Z}$ and $q\in \mathbb{N}$, we have

$\hspace{6cm} x=\frac{p}{q}=\displaystyle\sum_{i=1}^{\infty} \frac{x_i}{i!}$

Multiplying the sides by $q!$ yields

$ \hspace{6cm} q!\displaystyle\sum_{i=q+1}^{\infty} \frac{x_i}{i!} = p(q-1)!-q!\displaystyle\sum_{i=1}^{q} \frac{x_i}{i!}. $

The right-hand side is an integer. If for even one index $i>q$ the equality $x_i = i-1$ failed to hold, then we would have

$ \hspace{6cm} 0 < q!\displaystyle\sum_{i=q+1}^{\infty} \frac{x_i}{i!} < q!\displaystyle\sum_{i=q+1}^{\infty} \frac{i-1}{i!} = 1, $

which contradicts its being an integer. (The strict positivity is due to the strictly increasing assumption on the partial sums of the series.)

**A Perfect Set**

**Proposition 2: The set**

$\hspace{6cm} S=\left\{\left.\displaystyle\sum_{i =4}^\infty \ \frac{x_i}{i!} \right| \ x_i \in \{1,3\}\right\}$

**is a “perfect set” (closed, and each point is a limit point) without rationals.**

Proof: The idea is hidden in the arguments we have already made. The point is that when we change one digit by 1, the tail has to go at full speed to catch up. Since here we have restricted the digits to 1 and 3, changing a digit leaves the new number a significant distance away from every member of $S$.

Take $y \in \mathbb{R} \setminus S$. Let’s restrict to $y$’s of the form

$\hspace{6cm} y=\displaystyle\sum_{i=4}^{\infty} \frac{y_i}{i!} .$

Other cases, where $y$ has earlier digits, are just as easy, but we want to avoid complications in notation! Since $y$ is not in $S$, there is an index $j$ such that $y_j \notin \{1,3\}$. For any given $x \in S$, the representations of $x$ and $y$ will differ somewhere no later than $j$, say at the $k$th component, $4 \leq k\leq j$. Therefore,

$\hspace{2cm} \left|x-y\right|=\left|\displaystyle\sum_{i=k}^{\infty} \frac{x_i-y_i}{i!}\right|\geq \frac{\left|x_k-y_k\right|}{k!}-\displaystyle\sum_{i=k+1}^{\infty} \frac{\left|x_i-y_i\right|}{i!} \geq \frac{1}{k!}-\left|\displaystyle\sum_{i=k+1}^{\infty} \frac{i-2}{i!}\right|. $

It follows that

$\hspace{6cm} |x-y| \geq \frac{1}{(k+1)!}\geq \frac{1}{(j+1)!}$

The index $j$ depends on $y$ only; thus we have proved that within radius $1/(j+1)!$ of $y$ there are no points of $S$. Equivalently, $S^c$ is open, i.e., $S$ is closed.

Now pick any $x\in S$. Given $\epsilon >0$, to find another point of $S$ within $\epsilon$ of $x$, move far enough into the representation of $x$ and switch $x_N$ to $1$ if it is $3$, or to $3$ if it is $1$, keeping all other digits the same. The new number is again in $S$, and it is within $\epsilon$ of $x$ provided that $\frac{2}{N!}<\epsilon$. Thus, every point of $S$ is a limit point.
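The limit-point step is concrete enough to simulate. This small sketch (my addition, with a hypothetical helper `point_of_S`) builds finite truncations of points of $S$ and confirms that flipping the digit at position $N$ moves the point by exactly $2/N!$:

```python
import math

def point_of_S(digits):
    """Truncated point of S: digits[i] is the digit x_{i+4}, each 1 or 3."""
    return sum(d / math.factorial(i + 4) for i, d in enumerate(digits))

x = point_of_S([1, 3, 1, 3, 1, 3, 1, 3])    # digits x_4, ..., x_11
y = point_of_S([1, 3, 1, 3, 1, 3, 1, 1])    # switch x_11 from 3 to 1
shift = abs(x - y)                           # should equal 2/11!
```

Taking $N$ larger makes the shift smaller than any given $\epsilon$, which is exactly the limit-point argument above.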

**Questions:**

Do you know of other perfect sets without rationals (in the usual topology of $\mathbb{R}$)?

What else can be done with the factorials representation of numbers?!

We have proved that $e$ is an irrational number! Do you see where?

The Bad:

- Keeping track of the papers was a nightmare. I had 30 new papers to go through every day, and I spent several minutes of my precious class time handing them back to students. I also think the students struggled to keep all of these short quizzes organized. On my end, the fix is to be more organized (always the dream) and to admit when my system isn’t working so that I can try something else. To help the students stay organized with so much paper floating around, I now try to communicate more openly about how I organize my resources, and that has seemed to help.
- In the last post, I said that grading was a “Good” but it was also a “Bad.” It felt better when (as planned) I could grade the same day as I gave the quiz. However, that wasn’t always possible. The weeks when I got behind on grading were horrible. I stared at the ever increasing pile of quizzes with dread until I got the courage to tackle it. I don’t really have a fix for this part of grading (it is nearly guaranteed that I get behind in grading at least once during any semester) so perhaps this issue is more of a warning label in case you try this yourself: getting behind in grading is rough.

The Ugly:

- I really didn’t enjoy giving something called a quiz every day. I think my students felt unnecessarily pressured to master material quickly. A bit of that pressure is good, but some students felt palpably apprehensive coming to class each day for several weeks at the beginning. They felt better after several reassurances that it wasn’t as high-stakes as the word “quiz” would imply. One possible way to fix this is to stop calling it a quiz, but that would take away all of the pressure, which doesn’t seem ideal either.
- Giving a quiz every day took class time. Sure, the quiz was written to take five minutes, but between passing them out, collecting them, and returning the previous quiz, I lost nearly ten minutes every day. You could fix this by giving these quizzes online before class, but then you lose the opportunity for feedback on the process and some students will use resources other than their brain (regardless of the rules or what is beneficial).

In trying to hold on to the benefits of daily quizzes while addressing some of the issues, I tried something different last spring. Instead of a quiz at the start of each day, I would write the same question that I would have given as a quiz on the board for the students to work on as they came to class and we would discuss paths to solutions together. On Friday, I would give a quiz consisting of questions nearly identical to those we had worked on for the past week. There was incentive to arrive on time since the students got a preview for the quiz, but I didn’t have to keep track of so many papers. The students didn’t have a strong reason to study ahead of time (since no part of this was graded) but we struggled through the problems together, resulting in some really good conversations about the previous material.

This routine came with its own challenges just like any other experiment with teaching style, but overall, I liked the vibe of my classroom with the “warm-up question” structure better than the daily quizzes. Neither of these options is a “one-size-fits-all” solution but both added a lot of richness to my classroom that I couldn’t have predicted.

But one article in particular, “The AI detectives,” captured my attention. Rather than highlighting a specific application of AI, as the other articles do, this piece draws attention to the lack of transparency in certain machine learning algorithms, particularly neural networks. The inner workings of such algorithms remain almost entirely opaque, and they are accordingly termed “black boxes”: though they may generate accurate results, it’s still unclear how and why they make the decisions they do.

Researchers have recently turned their attention to this problem, seeking to understand the way these algorithms operate. “The AI detectives” introduces us to these researchers, and to their approaches to unlocking AI’s black boxes.

One such “AI detective,” Rich Caruana, is using mathematics to bring greater transparency to artificial intelligence. He and his colleagues employed a rigorous statistical approach, based on a generalized additive model, to produce a predictive model for evaluating pneumonia risk. Importantly, this model is intelligible; that is, the factors that the model weighs to make its decisions are known. Intelligibility is crucial in this setting, as previous, more opaque models conflated overall outcomes with inherent risk factors. For example, though asthmatics have a high risk for pneumonia, they typically receive immediate, effective care, which leads to better health outcomes—but which also led early models to flag them, naively, as a low-risk group. Caruana et al.’s model is also modular, meaning that any faulty causal links made by the algorithm can be easily removed from its decision-making process. But while it is powerful, this approach is not well suited to complex signals, like images—and it circumvents the problem of intelligibility in artificial intelligence, rather than addressing it head-on.

Gregoire Montavon and his colleagues, by contrast, have developed a method that uses Taylor decompositions to study the most opaque of machine learning algorithms, Deep Neural Networks. Their approach (which was not mentioned in the *Science* article) has the advantage of explaining the decisions made by Deep Neural Networks in easily interpretable terms. By treating each node of the neural network as a function, Taylor decompositions can be used to propagate the function value backward onto the input variables, such as pixels of an image. What results, in the case of image categorization, is an image with the output label redistributed onto input pixels—a visual map of the input pixels that contributed to the algorithm’s final decision. A fantastic step-by-step explanation of the paper can be found here.
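To make the idea of redistribution concrete, here is a minimal, hypothetical sketch (not Montavon et al.’s actual code, and with made-up data) of relevance redistribution through a single linear-plus-ReLU layer, using the epsilon-rule that appears in the layer-wise relevance propagation literature: each input receives a share of an output unit’s relevance proportional to its contribution $a_i w_{ij}$.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random(5)                  # input activations (think: pixels), made-up data
W = rng.standard_normal((5, 3))    # weights of one layer, made-up values
z = a @ W                          # pre-activations of the 3 output units
R_out = np.maximum(z, 0.0)         # treat the ReLU outputs as the relevance to explain

# epsilon-rule: R_i = sum_j a_i * W[i, j] / (z_j + eps * sign(z_j)) * R_j
eps = 1e-9
R_in = a * (W @ (R_out / (z + eps * np.sign(z))))

# Relevance is (approximately) conserved: what the outputs carry is
# redistributed onto the inputs, yielding a "heat map" over the input.
```

Stacking this rule layer by layer, from the output back to the pixels, is what produces the visual maps described above.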

Of course, none of these artificial intelligence techniques would be possible without mathematics. Nevertheless, it is interesting to see the role that math is now playing in furthering artificial intelligence by helping us understand how it works. And as AI is brought to bear on more and more important decisions in society, understanding its inner workings is not just a matter of academic interest: introducing transparency affords more control over the AI decision-making process and prevents bias from masquerading as logic.


In order to make sure that we are all on the same page, let’s briefly review the difference between topological and smooth manifolds. Recall (see for example previous posts on this blog!) that an $n$-dimensional topological manifold $M$ is defined using the data of local charts, each of which may be identified with an open subset of $\mathbb{R}^n$, together with continuous transition functions between them. If these transition functions are in addition smooth (in the usual sense of smooth maps on $\mathbb{R}^n$), then we say that $M$ has been given a smooth structure and is a smooth manifold. In order to talk about a diffeomorphism between two manifolds, we of course require that the manifolds themselves are smooth.

To those who have not thought about low-dimensional topology (and even for those who have), it is often difficult to get a feel for the difference between topological and smooth manifolds. (I certainly do not have any such intuition.) This is partly due to the fact that in low dimensions (less than or equal to three), every topological manifold admits a unique smooth structure. Thus, in order to think of a topological manifold that (for example) has no smooth structure, or two different smooth structures, we are already forced to consider examples in more dimensions than most of us are comfortable visualizing. Even setting this aside, it is difficult to see how one would go about distinguishing two smooth structures anyway, or how to prove from the definitions that a given topological manifold does or does not admit a smooth structure. Indeed, the first construction of a pair of distinct smooth structures on the same topological manifold (given for $S^7$, by John Milnor in 1956) came as a shock to many mathematicians.

Part of the difficulty in studying smooth structures on manifolds is that many of the introductory invariants in algebraic topology are either formulated purely in terms of the continuous structure (as is the case for homotopy or homology), or turn out to depend only on the continuous structure (as is the case for de Rham cohomology). Thus, slightly fancier tools are needed if one wants to systematically study smooth manifolds. We shall see later that, in dimension four, the primary (and in many cases, the only) strategy for studying smooth topology turns out to be afforded by gauge theory.

Before we continue, let’s recall our discussion of Freedman’s theorem. Associated to any simply-connected, topological four-manifold $M$, we described a unimodular bilinear pairing, called the intersection form, which could be viewed as a symmetric integer matrix with determinant $\pm 1$. Freedman showed that for any unimodular pairing $Q$, one could find a simply-connected, topological four-manifold $M$ with intersection form $Q$. Even more surprisingly, he proved that if $Q$ was even, such an $M$ was unique up to homeomorphism (among simply-connected, topological four-manifolds), while for odd $Q$ there were exactly two possible $M$. We might thus hope for a similar relationship to hold in the smooth category; or, failing this, to understand how the behavior of smooth and topological manifolds differ.

The first question to ask is whether given a unimodular pairing $Q$, it is always possible to construct a *smooth* simply-connected manifold with intersection form $Q$. It turns out that the answer is a resounding *no*. The first result towards this end was established by Simon Donaldson in 1983 using gauge-theoretic methods:

Let $M$ be a smooth, simply-connected four-manifold. Suppose that the intersection form $Q$ of $M$ is positive (or negative) definite. Then $Q$ must be diagonalizable.

Here, an intersection form $Q$ is said to be *positive* (or *negative*) *definite* if $Q(x, x) > 0$ (or $Q(x, x) < 0$) for all $x$, and is said to be *diagonalizable* if it is equivalent to a diagonal matrix over $\mathbb{Z}$ (which in this case must be plus or minus the identity). Although being definite is certainly a restriction on the lattice, it turns out that most unimodular lattices in high dimensions are in fact either positive or negative definite—for example, there are over a billion distinct definite lattices of rank 32. According to Freedman’s theorem, each one of these arises as the intersection form of a simply-connected topological four-manifold. But by Donaldson’s theorem, almost all of these are *not* smooth manifolds—the only definite unimodular pairings that arise as the intersection forms of smooth four-manifolds are (equivalent to) plus or minus the identity!

Donaldson’s theorem immediately implies the existence of a vast class of topological four-manifolds without any smooth structure. Moreover, it showed that the relationship between smooth four-manifolds and their intersection forms is rather more subtle than in the topological category. Since Donaldson’s theorem, much work has been done on investigating exactly which pairings $Q$ can arise as the intersection forms of smooth four-manifolds. So far, what is known is the following:

Let $M$ be a simply-connected, smooth four-manifold with intersection form $Q$. Then:

1) If $Q$ is definite, then $Q$ must be diagonalizable by Donaldson’s theorem. Conversely, all definite, diagonalizable $Q$ indeed arise as the intersection forms of smooth manifolds (namely, $m\mathbb{C}P^2$ or $m\overline{\mathbb{C}P}{}^2$).

2a) If $Q$ is indefinite and odd, then it is an algebraic fact (due to the classification of unimodular lattices) that $Q$ is necessarily equivalent to a diagonal matrix with $\pm 1$ on the diagonal. All such $Q$ indeed arise as the intersection forms of smooth manifolds (namely, $m\mathbb{C}P^2 \# n\overline{\mathbb{C}P}{}^2$).

2b) If $Q$ is indefinite and even, then it is an algebraic fact (due to the classification of unimodular lattices) that $Q$ is necessarily equivalent to a direct sum $aH \oplus b E_8$, where

\[H = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\]

and

\[E_8 = \begin{bmatrix}
2 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 2 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 2 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 2 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 2 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 2 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 2 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 2
\end{bmatrix}\]

If $M$ is smooth, then it is known that $b$ must be even and $|a| > |b|$. If $|a| \geq \frac{3}{2}|b|$, then one can explicitly realize $Q$ as the intersection form of a connected sum $mK3 \# nS^2\times S^2$. It is conjectured that for a smooth four-manifold the inequality $|a| \geq \frac{3}{2}|b|$ must hold in general; the strengthening of the condition $|a| > |b|$ to the condition $|a| \geq \frac{3}{2}|b|$ is referred to as the “11/8-conjecture”.
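The two building blocks above can be sanity-checked numerically. This quick sketch (my addition, not part of the post) confirms that $H$ is even and unimodular with determinant $-1$, and that $E_8$ is even, unimodular, and positive definite:

```python
import numpy as np

H = np.array([[0, 1],
              [1, 0]])

E8 = np.array([[2, 1, 0, 0, 0, 0, 0, 0],
               [1, 2, 1, 0, 0, 0, 0, 0],
               [0, 1, 2, 1, 0, 0, 0, 0],
               [0, 0, 1, 2, 1, 0, 0, 0],
               [0, 0, 0, 1, 2, 1, 0, 1],
               [0, 0, 0, 0, 1, 2, 1, 0],
               [0, 0, 0, 0, 0, 1, 2, 0],
               [0, 0, 0, 0, 1, 0, 0, 2]])

det_H = round(np.linalg.det(H))     # unimodular: determinant -1
det_E8 = round(np.linalg.det(E8))   # unimodular: determinant +1
eigs = np.linalg.eigvalsh(E8)       # all positive <=> positive definite
```

The even diagonal entries are what make both pairings even, and positive definiteness of $E_8$ is what puts it in the scope of Donaldson’s theorem.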

For those unfamiliar with the classification of unimodular lattices, the exact casework above is unimportant—the point is that, unlike in the case of topological manifolds, the question of which lattices arise as the intersection forms of smooth four-manifolds is rather more complicated and involves some peculiar numerology. These results indicate that the theory of smooth four-manifolds is radically different from the study of topological four-manifolds, a divide that has colored the field to the present day. We should note also that the uniqueness analogue of Freedman’s theorem is significantly less well understood than the existence part—there is not a single four-manifold for which an exhaustive list of smooth structures has been established, and in many examples infinitely many distinct smooth structures are known on the same topological four-manifold!

We have now come to the end of our brief history of the divide between continuous and smooth topology in dimension four. In the next post, we will begin introducing some basic ideas from gauge theory itself, but I will give an (extremely vague) overview here. Earlier, we alluded to the difficulty of studying smooth structures on manifolds using classical invariants due to their dependence only on the continuous structure. What was needed was thus a new set of tools which were formulated in such a way so as to explicitly see the smooth structure present on a manifold. The idea of mathematical gauge theory was to take certain partial differential equations from physics and study the moduli space of solutions to these PDEs when defined over the target manifold. These moduli spaces see both the topology of the manifold (in constraining the global solutions) and also, implicitly, the smooth structure (in defining the PDEs). The miracle of this approach is that, for the right PDEs, not only does the moduli space succeed in capturing the smooth structure, but also that it is tractable enough to condense into an invariant with good functorial properties!
