As technology and AI continue to advance, it is crucial that we familiarize ourselves with technological advancements and incorporate them into our studies. In particular, I would like to discuss Lean 4, an open-source theorem prover and programming language that is based on dependent type theory. Its main purpose is to formally verify mathematical proofs and algorithms, and it has gained popularity among mathematicians worldwide, including prominent figures such as Prof. Terence Tao. Lean 4 has pushed boundaries as to how mathematics theorems should be proven, and has promoted rigor and accuracy in modern mathematical proofs.
Due to the time and labor-intensive nature of traditional mathematical theorem proving, the use of machine learning algorithms has become crucial in automating the process and improving the efficiency of research, making it easier and more accessible for mathematicians to verify proofs. However, this has raised ethical debates as to whether such proofs can be trusted entirely, such as the four color theorem which was proven completely by computer.
The above talk by Terence Tao will heavily expand on our discussion today, thus it is well worth a read if you would like to learn more about machine proving. I’d also like to express my gratitude to the community managing leanprover on GitHub for their assistance in this article, which has been instrumental in facilitating the learning process of Lean. They are found here – https://leanprover-community.github.io/learn.html.
Before we begin we must discuss more about the history of machine-assisted proving languages and how Lean came about. During the 1950s and 1960s, as the concept of artificial intelligence began to take shape, researchers embarked on investigations into automated theorem proving with the goal of creating computer programs capable of formally verifying mathematical theorems. This era saw the foundational Dartmouth Conference in 1955, where pioneers like John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon laid the groundwork for AI. Notably, in 1956, the Logic Theorist, created by Allen Newell, Herbert A. Simon, and J.C. Shaw, showcased early success by proving theorems from Principia Mathematica, a significant work in mathematical logic by Whitehead and Russell.
Advancements continued in to the 1960s and 1970s, fueled by the development of programming languages like LISP, which facilitated further progress in automated theorem proving. Researchers such as Robert Boyer and J Strother Moore contributed with systems like NQTHM and ACL2, enhancing the capabilities of automated deduction. Additionally, in 1965, logician Alan Robinson introduced the resolution principle, a fundamental concept that provided a sound and complete method for automated deduction, laying the groundwork for subsequent theorem proving algorithms.
The exploration of higher-order logic (HOL) in the 1970s and 1980s marked another significant development, with systems like HOL and Isabelle/HOL enabling formal verification of complex mathematical proofs and software systems. As technology progressed, SMT solvers emerged in the 2000s as powerful tools for automated reasoning, capable of handling theories beyond simple propositional logic. We should also mention here some of the mathematical calculators presently utilized – namely Wolfram Alpha and FriCAS among others which utilizes algorithms and numerical approximations to arrive at a solution. For example, the Risch algorithm, which is an algorithm designed to find indefinite integrals of functions, has been utilized in various calculators. There are several limitations to this algorithm that are worthy of consideration, although they fall outside the scope of this article – to learn more about this visit here for an example.
Currently, there exist two primary categories of algorithmic machines: automated theorem proving algorithms and interactive theorem proving systems. The former makes uses of methods such as brute-forcing and propositional logic to solve problems, whilst the latter makes use of verification to ensure proofs generated by humans are correct. Lean 4, our main focus for today is an interactive theorem proving systems, meaning it requires human input and intervention in the proof construction process.
Lean 4, as an interactive theorem prover, makes use of only the basics of mathematics such as Peano axioms and ZFC (Zermelo–Fraenkel set theory with Axiom of Choice) as its base point for constructing proofs. For example, if we would like to define natural numbers, we can make use of successors. Below is a snippet :
inductive Nat : Type
| zero : Nat
| succ : Nat → Nat
We see that this is very similar to Peano axioms, which defines as
and so on. In fact, we can use this to prove
in Lean.
We also define addition :
def Nat.add : Nat → Nat → Nat
| n, Nat.zero => n
| n, Nat.succ m => Nat.succ (Nat.add n m)
We can now define our addition as follows :
def one := succ zero
def two := succ one
Nat.one_eq_succ_zero : 1 = Nat.succ 0
Nat.two_eq_succ_one : 2 = Nat.succ 1
Nat.add_succ (a d : ℕ) : a + Nat.succ d = MyNat.succ (a + d)
repeat rw [one_eq_succ_zero]
rw [two_eq_succ_one, one_eq_succ_zero]
repeat rw [add_succ]
rw [add]
rfl
This is very similar to what one would expect deriving from Peano axioms – this is shown below (credit to proofwiki.org).

By using this, we can define an addition recursively. However, this is complicated as for every number we need to build a function, in this case one_eq_succ_zero and two_eq_succ_one to establish our equality. This leads us to our second point we need to know – induction. Induction is given in the Peano axioms as follows :
If K is a set such that:
1. 0 ∈ K, and
2. for every natural number n, n ∈ K implies that S(n) ∈ K,
then K contains every natural number.
It would not be an overstatement to say that induction is in essence the basis of all theorems. Quoting directly from leanprover, ‘In fact, in Lean’s library, every concrete type other than the universes and every type constructor other than dependent arrows is an instance of a general family of type constructions known as inductive types. It is remarkable that it is possible to construct a substantial edifice of mathematics based on nothing more than the type universes, dependent arrow types, and inductive types; everything else follows from those.’ Thus, it is of utmost importance that we understand how to utilize such constructs to prove more advanced theorems.
To define an inductive type in Lean, we have the following general construct (however note that this is for finite elements) :
inductive Foo where
| constructor₁ : ... → Foo
| constructor₂ : ... → Foo
...
| constructorₙ : ... → Foo
From this construction we can then progress into actually using induction. In this case we would like to prove that $latex \sum_{i=0}^{n-1} i^2=\frac{n(n-1)(2n-1)}6. Below is a construction as shown here.
-- sum from i = 0 to n - 1 of i^2 is n(n-1)(2n-1)/6
example (n : ℕ) : ∑ i in range n, (i : ℚ)^2 = n * (n - 1) * (2 * n - 1) / 6 :=
begin
induction n with d hd,
simp, },
{ -- inductive step
rw finset.sum_range_succ,
rw hd, -- the inductive hypothesis
simp, -- change `d.succ` into `d + 1`
ring,
}
end
With this as a simple introduction, we can quickly take a glimpse at how Lean4 functions. As a user-interactive language, it takes axioms assumed by the user and builds on from that. Thus, Lean is not a generative algorithm and is thus trusted highly in the mathematical community. For example, one of Peter Scholze’s proofs in condensed mathematics was formatted in Lean and proven to be true after an arduous amount of work. By utilizing Lean in such tasks, we can verify proofs with certainty, and as Scholze himself says here, ‘I now think it’s sensible in principle to formalize whatever you want in Lean. There’s no real obstruction.’
Another example where Lean is used for example is in Terence Tao’s formalization project in which he formats his previous papers in Lean, in this case for PFR as linked below.
In this case Lean has also helped in locating errors in proofs. For Tao’s paper on Maclaurin series ‘a small (but non-trivial) bug’ in his paper was located after Lean failed to compile. By formalizing proofs we can safely put more trust into papers – for example Zhang Yitang’s paper on Siegel Zeros and Inter-Universal Teichmüller Theory (IUT) which I do not pretend to understand which has been ‘in limbo’, if formalized could put an end to this constant debate.
There has also been significant development within the AI realm recently on mathematics as well. Open AI1 and Meta AI2 have both used Lean in AI problem solving, in this case AIME (American Invitational Mathematics Examination) and IMO (International Mathematics Olympiad problems. In the case of Meta, HyperTree Proof Search (HTPS) was utilized, and the algorithm was trained on a dataset of successful mathematical proofs and learned to ‘generalize to new, very different kinds of problems’. In particular, it was able to deduce a correct proof for an IMO problem that involved ‘some arithmetic reduction to a finite number of cases’. With proper training, it does seem inevitable that AI will replace a number of jobs that mathematicians nowadays do – thus we must assess whether AI brings about a net benefit, and if so how we utilize that to our advantage.
We can discuss firstly the effects of interactive theorem proving systems such as the likes of Coq, Isabelle, and Lean. Since its rise to prominence in the 1980s, this methodology has facilitated a partnership between individuals and automated systems to systematize mathematical concepts and validate the accuracy of theorems. Lean has been immensely helpful for mathematicians by enhancing (or forcing one could argue) rigor on proofs, leading to a more concise proof which is guaranteed accuracy.
However, this is not the case for automated machine provers. Even though machine learning models can analyze vast amounts of mathematical literature and data, it is not guaranteed that the solutions provided are accurate much less sensible. The use of machine learning also raises ethical questions about authorship and credit. Who gets credit for a theorem or proof discovered with the assistance of an algorithm? There have been numerous occasions in which I have utilized Chat-GPT for mathematical problem solving. Although the solutions provided are often incorrect, it has introduced new approaches to problems that have proven to be effective when implemented. In research, how we decide to credit this as such, or whether this is even valid in edge cases remains to be seen.
Another issue we face currently is whether we can place trust in such proofs generated. We know that solutions from Lean for example can be trusted to a very high degree, but this may very much not hold for other AI languages. In cutting edge research, proofs become exponentially harder to verify to the point where only a handful of researches may throughly understand the topic – can we trust proofs by AI languages in such cases? While Lean has earned a reputation for producing trustworthy and rigorously verified proofs, the same level of confidence may not necessarily extend to other AI languages and proof systems. Factors such as the sophistication of the proof techniques, the quality of the underlying logic, and the robustness of the implementation can all impact the reliability of the generated proofs.
As machine learning models increasingly contribute to mathematical research, ensuring transparency and explainability becomes paramount. For modern mathematics research, mathematicians will have to be able to comprehend the algorithms’ decision-making processes as proofs grow exceedingly intricate. Thus, we must aim to learn more about how such proofs are derived and be wary of such downfalls of these algorithms.
As we promote better mathematical education for all, one thing I also find extremely important for aspiring mathematicians is to familiarize ourselves with such algorithms. In the coming years, it is clear that AI will continue to play an increasingly prominent role in the field of mathematics. Integrating machine learning-driven theorem proving tools into our arsenal will offer significant benefits and opportunities for advancement in mathematical research and problem-solving.
As mathematicians engage in collaborative endeavors with machines, our roles may eventually shift from mere proof generators to collaborative partners. I look forward to seeing how these algorithmic machines will affect our study of mathematics – and as always, thank you so much for reading and see you next month!


Leave a Reply