Machine-Checked Proof

“In my view, the choice between the conventional process by a human referee and computer verification is as evident as the choice between a sundial and an atomic clock in science.” – Tom Hales (from [4])

“The rapid advance of computers has helped dramatize this point, because computers and people are very different. For instance, when Appel and Haken completed a proof of the 4-color map theorem using a massive automatic computation, it evoked much controversy. I interpret the controversy as having little to do with doubt people had as to the veracity of the theorem or the correctness of the proof. Rather, it reflected a continuing desire for human understanding of a proof, in addition to knowledge that the theorem is true” – Bill Thurston (from [6])

A machine-checked proof is a proof written in a piece of software called a ‘proof assistant’ which ensures the proof complies with the ‘axioms of mathematics’ and the rules of logic.  The question of the significance of computers in proving theorems can be polarizing (and for good reason). The quotations above represent some points of view relevant to this topic. In this post we will try answer three questions:

  1. What exactly is a computer-assisted proof?
  2. What are the advantages and the drawbacks of using computers to prove theorems?
  3. What should an interested person do to start learning to use proof assistants?

The motivating problem

Computer-assisted proof is a technique. Mathematicians care about new techniques when they solve some problem insoluble by old techniques. This is our motivating problem:

Above we see two solved problems. The first is a solved Rubik’s cube, while the second is Andrew Wiles’ proof of Fermat’s Last Theorem. A big difference between proving facts about Diophantine equations and solving Rubik’s cubes is that when a person solves a Rubik’s cube they know immediately it is solved, whereas Wiles’ proof (for example) took many months to properly referee.

As we progress in our mathematical education, our ability to check answers decreases. For example, my first lesson in mathematics was my mother teaching me to count. I was taught the numbers from one to ten along with the fact that for small sums I could check my answer by counting with my fingers. When one is taught to solve algebraic equations, one is also taught that answers can be checked by substitution. Checking Calculus is trickier, yet we can rely with good confidence on Wolfram Alpha. Upon reaching real analysis and abstract algebra, students check their work ultimately by handing it in and seeing if the professor buys their argument.

What is a proof assistant?

A computer proof assistant allows for more systematic checking of mathematical arguments. A user writes their proof up in a semi-formal language (meaning not as formal as formal logic and not as informal as ordinary mathematics). The proof assistant checks the proof against some foundation of mathematics. Normally when we think of foundations, we think of set theory. Yet for technical reasons proof assistants are implemented with ‘type theoretic’ foundations. Type theory is another foundation of mathematics which was actually proposed by Russell and Whitehead around the same time that others proposed ZFC set theory. Although there are philosophical implications of varying foundations across pieces of mathematics, for us, this is a moot point and we will say no more on the matter.

Below is a picture of a correct proof and an incorrect proof in the proof assistant Lean.

The proof on top is clearly wrong because of the error message showing up in red. The proof on the bottom is quickly seen to be correct since no error occurs. Just like with a Rubik’s cube, it is immediately clear if a proof is correct or not.

This seems a bit complicated. Is it ever necessary? It can be. For example, Hales’ proof of the Kepler conjecture was too complicated to be checked by journal referees. That proof was eventually verified with a proof assistant called HOL-light. The project to formally verify Kepler’s conjecture was called the Flyspeck project. Flyspeck took several years to complete and 5000 processor hours on the Microsoft Azure Cloud [5]. Some people hoped for a less computer-heavy proof so that mathematicians could read the proof.

Georges Gonthier along with his colleagues at Microsoft Research produced the first formally verified proof of the four color theorem [2]. This is different than the original computer-based proof of the four color theorem, which was essentially a standard mathematical argument that involved an absolutely massive computer calculation. Gonthier’s work certified that the algorithm purported to proof the four color theorem actually did what we believed it did.

Gonthier’s team also formally proved the Feit-Thompson odd order theorem, a cornerstone of the classification of finite simple groups, using the proof assistant Coq [3]. The original Feit-Thompson paper was 255 pages. Other high-profile projects include formal proofs of the prime number theorem, Gödel’s incompleteness theorems, and the central limit theorem.

How to get started?

These tools are not used exclusively for massive proofs that take years. There exist formal libraries containing theorems and definitions in real analysis, general topology, representation theory, and abstract algebra. Proof assistants are also used in industry to verify software and algorithms. This is quite powerful. As soon as one can state “this program P has no bugs” in a mathematically rigorous way, one can try to prove it. And with a formal proof software developers can be sure their programs are error-free.

A number of proof assistants are available. They are all free.

Isabelle-HOL is a proof assistant created by Lawrence Paulson. It is based on higher-order logic. Isabelle has massive libraries already in place as well as some of the most powerful automation available. Here, automation means the proof assistant can find short proofs for you. HOL-Light is a similar program, with a smaller kernel, written by John Harrison.

Coq and Lean are all based on dependent type theory. They were developed by teams led, respectively, by Thierry Coquand and Leo de Moura. That means data types can depend on other data types–we want that. For example, we’d like a type Fin$(k)$ whose inhabitants are finite sets of cardinality $k$, for arbitrary $k$. But these proof assistants, despite being newer, have weaker automation because it is, for technical reasons, harder to implement automation in dependent type theories (at least at the moment).

Online manuals exist for all the proof assistants mentioned above. Lean 2 has an interactive web tutorial. The current version of Lean is Lean 3, but the author found this tutorial a good way to get a flavor for proof assistants in general.

Problems and outlook

Proof assistants are not yet at the point where they can reasonably be used by working mathematicians. Hales did use HOL-light for his work on the Kepler problem, but this project was not the sort of thing mathematicians would do unless they absolutely had to. The current libraries are not large enough to include everyday arguments about everyday theorems. For example, we may go to prove a standard result about compact Lie groups only to discover the Haar theorem (proving the existence of Haar measures) is not in our library. This theorem is quoted all the time but its proof is lengthy and with present technology we should expect a formal proof of the Haar theorem to take a few years.

A more fundamental objection says that mathematics is, to use Thurston’s language, ultimately about the human understanding of mathematical objects and that proofs are there only secondarily to prevent our understanding from wandering off. In the words of the great combinatoricist G.-C. Rota, “saying a mathematician ‘proves theorems’ is like saying an author ‘writes words’.” It would follow that algorithmic ‘proof search’ type arguments are undesirable.

Yet there is no reason why one cannot both understand mathematical objects and use computer proof assistants. Machine checking is not synonymous with proof search. Presently, we have human journal referees check detailed technical arguments. If it were possible to use a computer to check these, we would lose nothing and gain more accountability.

Steven Wolfram has recently gotten interested in formal proof. People from Wolfram have worked on getting Mathematica and formal proof to coalesce [1]. The relevant linguists, computer scientists, and mathematicians are constantly considering ways to get the computer code to look more like ordinary mathematics. An ultimate goal is streamlining the refereeing process.

All journals require us to write papers in LaTeX–this was not always so. Perhaps in the future, some journals will require proofs written in a quasi-formal language for computer checking. Then, journal referees could focus more energy on the clarity and the overall presentation of mathematical articles–thus furthering human understanding of mathematics.

Acknowledgements. I am most indebted to Jeremy Avigad at Carnegie Mellon University and Tom Hales at the University of Pittsburgh for teaching me what I know about proof assistants. I thank also Emily Riehl at Johns Hopkins University for introducing me to Bill Thurston’s brilliant article “On Proof and Progress in Mathematics”, which I referenced several times in this post. And lastly, I dedicate this note to the late Vladimir Voevodsky. I never met him personally but his influence on my many teachers, his papers, and his recorded lectures were highlights of my undergraduate education.


[1] Geuvers, H., England, M., Hasan, O., Rabe, F., Teschke, O. Intelligent Computer Mathematics, 10th International Conference, CICM 2017, Edinburgh, UK, July 17-21, 2017, Proceedings.

[2] Gonthier, G.: Formal proof—the Four Color Theorem. Notices of the AMS 55(11), 1382–1393 (2008)

[3] Georges Gonthier, Andrea Asperti, Jeremy Avigad, Yves Bertot, Cyril Cohen, Fran¸cois Garillot, Stéphane Le Roux, Assia Mahboubi, Russell O’Connor, Sidi Ould Biha, Ioana Pasca, Laurence Rideau, Alexey Solovyev, Enrico Tassi, and Laurent Théry, A machine-checked proof of the odd order theorem, Interactive Theorem Proving – 4th International Conference, ITP 2013, Rennes, France, July 22-26, 2013. Proceedings (Sandrine Blazy, Christine Paulin-Mohring, and David Pichardie, eds.), Lecture Notes in Computer Science, vol. 7998, Springer, 2013, pp. 163–179.

[4] Hales, T., & Hales, T. C. (2012). Dense sphere packings: a blueprint for formal proofs (Vol. 400). Cambridge University Press.

[5] Hales, T.C. Introduction to the Flyspeck project. In Thierry Coquand, Henri Lombardi, and Marie-Franc¸oise Roy, editors, Mathematics, Algorithms, Proofs, number 05021 in Dagstuhl Seminar Proceedings, Dagstuhl, Germany, 2006. Internationales Begegnungs- und Forschungszentrum f¨ur Informatik (IBFI), Schloss Dagstuhl, Germany.

[6] Thurston, W.: 1994, ‘On Proof and Progress in Mathematics’, Bulletin of the American Mathematical Society 30(2), 161–177.

About Jacob Gross

Jacob Gross is a graduate student in Geometry at Oxford University, working under the supervision of Dominic Joyce FRS, and supported by the Simons Collaboration on Special Holonomy in Geometry, Analysis, and Physics.
This entry was posted in Math, Technology & Math and tagged , , , . Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *

HTML tags are not allowed.