FLT3 at LftCM2024
This week, I have been at CIRM, the Centre International de Rencontres Mathematiques in France, attending Lean for the Curious Mathematician 2024. The focus of this conference was to introduce the Lean theorem prover and mathematical formalization to curious people.
I had not had much experience with Lean (or any other automated theorem prover or language or mathematical formalization) before this week. I have heard some talks about Lean, and some time ago I went through parts of the natural number game, a "game" that introduces mathematical theorem proving in Lean by proving fundamental properties of $\mathbb{N}$.
Although called a conference, this was really a workshop. Instead of merely listening/giving talks, we split into project groups on the first day and spent several hours each day working on these projects in groups. This has the added benefit that it's often much more efficient to learn something new while trying to work with it, as opposed to merely studying it.
The full list of proposed projects is available on this thread on the Lean zulip server. I chose to work on a project proposed by Riccardo Brasca to prove the $n = 3$ case of Fermat's Last Theorem. That is, we want to show that
\begin{equation*} X^3 + Y^3 = Z^3 \end{equation*}
has only trivial solutions over $\mathbb{Z}$ (i.e. $XYZ = 0$). This statement already has a proof in Lean, proved in a mostly elementary way.
I say mostly because the proof relies on the following nonobvious fact.
If an odd integer $s$ satisfies an equation $s^3 = u^2 + 3 v^2$ with $\gcd(u, v) = 1$, then $s = e^2 + 3f^2$ for some integers $e$ and $f$.
In fact, every odd factor of a number of the form $u^2 + 3v^2$ has the same form.
Aside
This theorem might look surprising, because it is surprising. But it's not alone.
Here are three statements that are all true.
- If a prime $p$ divides $u^2 + v^2$ and $\gcd(u, v) = 1$, then $p = e^2 + f^2$ for some $e$ and $f$.
- If a prime $p$ divides $u^2 + 2v^2$ and $\gcd(u, v) = 1$, then $p = e^2 + 2f^2$ for some $e$ and $f$.
- If a prime $p$ divides $u^2 + 3v^2$ and $\gcd(u, v) = 1$, then $p = e^2 + 3f^2$ for some $e$ and $f$.
The case $u^2 + 4v^2$ should perhaps be thought of as $u^2 + (2v)^2$, and is annoying to disentangle.
The next case, $p \mid u^2 + 5v^2$ with $\gcd(u, v) = 1$ does not have the property that $p = e^2 + 5f^2$. This is almost true — one can show that either $p = e^2 + 5f^2$ or $2p = e^2 + 5f^2$ (depending on $p \bmod 20$).
These sorts of questions are worth investigating. When Euler looked into these patterns, it led him to his ideas on quadratic reciprocity. Understanding why this pattern fails naturally leads to full reciprocity, genus theory, and even class field theory.
Proving FLT3 after Kummer
The previous proof (and the nonobvious fact) are very finicky and require a lot of casework. There are many branching statements where one reduces to a finite set of shapes of gcds and prime decompositions, and then proves them all by direct enumeration.
For our approach, we decided to formalize a different proof due to Kummer. Let $\zeta_3 = e^{2 \pi i / 3}$ denote a primitive $3$rd root of unity, and define $\lambda = \zeta_3 - 1$.
The number $\lambda$ is prime in $\mathbb{Z}[\zeta_3]$.
There are general ways to go about this, but here I note directly that \begin{equation}\label{eq:threediv} -\lambda (1 - \zeta_3^2) = 3, \end{equation} and so $\lambda$ divides $3$.
For number theorists, I obtained this by considering $N(\lambda)$, and the fact that this is prime in $\mathbb{Z}$ shows that $\lambda$ is prime.
Ignoring that, we can compute directly that \begin{equation} \big\lvert \mathbb{Z}[\zeta_3] / 3 \mathbb{Z}[\zeta_3] \big\rvert = 9, \end{equation} represented by the elements \begin{equation} \{ a + b \zeta_3 : 0 \leq a, b < 3 \}. \end{equation} The ideal $(\lambda)$ properly contains $3$ by \eqref{eq:threediv} (and the observation that $1 - \zeta_3^2$ is not a unit). Hence $\lvert \mathbb{Z}[\zeta_3] / \lambda \mathbb{Z}[\zeta_3] \rvert$ must be a proper divisor of $9$. As it's not trivial, the only possibility is $3$.
Hence $\mathbb{Z}[\zeta_3] / \lambda \mathbb{Z}[\zeta_3]$ is a field, which gives directly that $\lambda$ is prime. $\diamondsuit$
Kummer realized that it is easier to prove that there are no nontrivial solutions to the $n = 3$ case of Fermat's Last Theorem in the \emph{bigger} ring $\mathbb{Z}[\zeta_3]$. And the key new idea is that $X^3 + Y^3$ factors in $\mathbb{Z}[\zeta_3]$ as
\begin{equation}\label{eq:factor} X^3 + Y^3 = (X + Y)(X + \zeta_3 Y)(X + \zeta_3^2 Y). \end{equation}
(The second key aspect is that $\mathbb{Z}[\zeta_3]$ is a unique factorization domain. This is also nontrivial, but it follows from important and commonly regarded algebraic number theory facts: in this specific case, it follows immediately from Minkowski's theorem once we know $\mathbb{Z}[\zeta_3]$ is the ring of integers of $\mathcal{O}[\zeta_3].$ Fortunately, this fact is already in Lean's mathlib!)
The proof of FLT3 here again proceeds by descent. But the key factorization ideas are straightforward if we believe that $\mathbb{Z}[\zeta_3]$ is a unique factorization domain.
In this bigger field, we consider
\begin{equation}\label{eq:factorfull} X^3 + Y^3 = (X + Y)(X + \zeta_3 Y)(X + \zeta_3^2 Y) = u Z^3 \end{equation}
for some unit $u$. (i.e. we look for a solution $(X, Y, Z, u)$). The hard case is when $\lambda \mid Z$ and there are some annoying details about extracting powers of $\lambda$ from each term on the left, but after that we have unique factorization and can consider each term independently. (The final step of the descent occurs on the powers of $\lambda$, showing that any solution yields another solution with fewer powers of $\lambda$ appearing).
Lean Plan
For this project, I worked with Riccardo Brasca, Sanyam Gupta, Omar Haddad, Lorenzo Luccioli, Pietro Monticone, Alexis Saurin, and Florent Schaffhauser. This was a project organized by (and largely due to) Riccardo Brasca.
Riccardo did extremely valuable setup before the conference began. He already knew what proof he wanted to formalize and proved perhaps the most complicated, technical sublemmas (and some cyclotomic field lemmas). He also went through the proof and made a scaffold of what results to prove (without their proofs).
In Lean, if you don't have a proof of something, you can use the keyword
sorry
to say that Lean should just believe you and you're sorry to make it do
this. (Linters and editors then complain and remind you that you didn't
actually formally prove what you claimed to have proved).
With Riccardo's setup, there were initially $37$ sorry
s to handle.
Over the week, we added some additional lemmas, removed some lemmas, and added
new sorry
s, but really we didn't deviate from Riccardo's setup much.
I think one benefit that I can't stress enough is that this meant the project was structured in a good way.1 1This is very reminiscent of the initial work that goes into making a project for undergraduates or other young scientists. It's nice to be back on the receiving end of this work! Another benefit is that this enabled the 7 of us that were newer to Lean to split up and work on different aspects.
In practice we split into two teams and handled separate proofs. Actually, my team (Omar, Florent, Alexis, and I) almost didn't talk to the other team at all, except for occasional short messages on our zulip chat instance.
Initially we thought that we would take every other lemma. But after a couple we deviated, approximately going towards a you-claim-it-and-then-you-prove-it model. (And sometimes we had partial proofs and tossed those back and forth).
The first lemma that I participated in proving was the following:
/– Given `(S : Solution)`, we have that `λ ^ 2` does not divide `S.a + η * S.b`. -/
lemma lambda_sq_not_a_add_eta_mul_b : ¬ λ ^ 2 ∣ (S.a + η * S.b) := by
simp_rw [a_add_eta_b, dvd_add_right S.hab, pow_two,
mul_dvd_mul_iff_left lambda_ne_zero, S.hb, not_false_eq_true]
In terms of \eqref{eq:factorfull}, this task was to prove the following:2 2For the very observant, I'll note that $\lambda$ will divide one of the factors in \eqref{eq:factorfull} with higher multiplicity, and we decided WLOG to call that factor $(x + y)$. Riccardo set that up too.
For a quadruple $(x, y, z, u)$ solving \eqref{eq:factorfull}, $\lambda^2 \nmid (x + \zeta_3 b)$.
In slightly verbose math, the proof goes like this:
We are assuming elsewhere that $\lambda^2 \mid (x + y)$ and that $\lambda \mid z$. Then $\lambda \nmid y$, as this would imply $\lambda \mid x$ and we assume $\gcd(x, y, z) = 1$.
Rewrite $x + \zeta_3 y = x + y + \lambda y$ and note that $\lambda^2 \mid (x + y)$, but $\lambda^2 \nmid \lambda y$ as $\lambda \nmid y$.
What the lean proof does is apply a sequence of earlier lemmas and rewrites. It does the following.
- Rewrite $x + \zeta_3 y = x + y + \lambda y$.
- Use that $a \mid b$ and $a \mid (b + c)$ implies $a \mid c$, called
dvd_add_right
. We use this with the hypothesis that $\lambda^2 \mid (x + y)$. This has shown that $\lambda^2 \mid (x + y + \lambda y) \iff \lambda^2 \mid \lambda y$. - Use that $\lambda^2 = \lambda * \lambda$, called
pow_two
. - Use the result that $ab \mid ac$ and $a \neq 0$ implies that $b \mid c$, and the result (proved earlier) that $\lambda \neq 0$. This has shown that $\lambda^2 \mid \lambda y \iff \lambda \mid y$.
- Use that $\lambda \nmid y$ (proved earlier).
- Use that
not False
meansTrue
.
There are 5 commas in the Lean proof, and the 6 arguments are exactly the ones above. Some line up well with the math proof. But steps (3) and (6) really demonstrate how pedantic interacting with Lean can be.
I'll also note that the enclosing simp_rw
indicates that Lean is trying to
automatically simplify expressions it sees, and to rewrite terms based on
the provided lemmas, in a smart way. This is a Tactic in Lean, and includes
some metaprogramming and metalogic. This saves us from even deeper levels of
pedantry.We can hope that as Lean develops more, better and better
tactics and metalogic will be developed. But it seems very hard to
make big jumps.
The second proof I was involved in sounded substantially similar: prove the analogous statement for $\lambda^2 \nmid (x + \zeta_3^2 y)$. But the analogous proof wants to use that $1 + \zeta_3 + \zeta_3^2 = 0$ and this took considerably more effort. We ended up writing a 17 line Lean proof by contradiction, and this was very nontrivial.3 3Looking now, I suspect it would be possible to simplify our proof. It's remarkable how much you can learn in one week.
Regardless, these were very concrete, largely computational proofs, and for that reason they were excellent first proofs to work through.
By the end of the first day, we had actually added 3 new sorry
s, and I was
relying very strongly on Lean help from Riccardo and Florent.
On the second day, I proved my first lemma in Lean on my own.
I showed that, after handling $\lambda$ appropriately, $(x + y)$ and $(x +
\zeta_3^2 y)$ are coprime.
More precisely, if we call $(x + y) = \lambda^{3n - 2} X$ and $(x + \zeta_3^2 y)
= \lambda Z$, then I showed that $X$ and $Z$ are coprime.
(My Lean proof was strongly modelled on Florent's proof of the analogous
statement for $(x + y)$ and $(x + \zeta_3 y)$).
On the second day, I proved five lemmas completely on my own. I was starting to get the hang of things.
It's interesting to think about what it means to be a lemma in Lean. Really, a lemma is any mathematical statement that we claim. In standard math, a proof consists of a sequence of statements, organized roughly into sentences and paragraphs — which are then more broadly organized into lemmas, propositions, and theorems. But each lemma in Lean is more closely related to each statement in a proof.
To a first approximation, each sentence in a proof might correspond to one mathematical statement. So proving five Lean lemmas is sort of like saying that I understood five sentences in a proof. Actually, the five Lean lemmas I showed on Wednesday corresponds to one sentence, as well as two implicit facts used in the proof we were following.
There are a lot of implicit things taken for granted in typical matheamtical writing.
On Wednesday evening, we had 6 sorry
s left. The other team recognized that a
structural change would help and so by Thursday morning, there were only 3
left.
Two of these were mostly straightforward, and I filled in most of the details. But one thing that kept coming up concerned exponents.
We had statements that included the following simple fact about exponents:
\begin{equation} \lambda(\lambda^{n - 1})^3 = \lambda^{3n - 2}. \end{equation}
Making Lean work with this was really annoying. The problem was that the actual type of $\lambda$ that we were using was that it was an element in $\mathbb{Z}[\zeta_3]$, and not necessarily $(\mathbb{Z}[\zeta_3])^\times$. Of course it is invertible, and we proved that. But Lean uses dependent type theory and at its type core, this is inconvenient.
One thing that you can't do if you don't know that you're a unit is to assume that an inverse exists. In practice, this means that the default domain of exponents for $\lambda$ consists of natural numbers (including $0$). As we are in the midst of standard divisibility statements, I would even argue that this is a reasonable restriction to make.
The problem is that if, for example, $n = 0$, then $\lambda^{n-1}$ is confusing
when exponents are nonnegative.
We can't expect Lean to reason about nonsensical objects.
Similarly $3n - 2$ looks confusing.
Somewhere else we had shown that $n \geq 2$ (in the notation above).
But combining these pieces of information was surprisingly painful.
Pietro, Lorenzo, and Riccardo sat down and figured out the process to follow,
and we morally copied that same logic 3 times. Their approach was to isolate only the identity $\lambda
(\lambda^{(n-1)})^3 = \lambda^{3n - 2}$ and use congr
repeatedly to get
there. Eventually Lean would recognize that this reduced to the equation over
the naturals that $3(n-1) + 1 = 3n-2$. We could feed in the lemma that $n \geq
2$ and pass this to the omega
tactic. Very painful.
I now wonder if it would be possible to coerce $\lambda$ to a multiplicative
unit, and then hope that Lean's knowledge of basic ring theory (by which I mean
the ring
and group
tactics) would suffice?
I don't have a good mental model of how that works yet.
Nonetheless, we finished proving everything on Thursday afternoon. Now we begin the process of cleaning this up and getting it into the main Lean library.
There it will serve as a proof of perhaps the simplest and most trivial component of the ongoing effort to formalize a proof of Fermat's Last Theorem (for every exponent).
Source
Everything we did is currently available on github. We're beginning the process of putting it into mathlib — but I don't know anything about that yet. Certainly I'll need to rewrite my proofs in a shorter way, and in a more lean way.
Leave a comment
Info on how to comment
To make a comment, please send an email using the button below. Your email address won't be shared (unless you include it in the body of your comment). If you don't want your real name to be used next to your comment, please specify the name you would like to use. If you want your name to link to a particular url, include that as well.
bold, italics, and plain text are allowed in
comments. A reasonable subset of markdown is supported, including lists,
links, and fenced code blocks. In addition, math can be formatted using
$(inline math)$
or $$(your display equation)$$
.
Please use plaintext email when commenting. See Plaintext Email and Comments on this site for more. Note also that comments are expected to be open, considerate, and respectful.
Comments (2)
2024-04-02 DLD
I attended Lean for the Curious Mathematician 2024 at CIRM. Along with Riccardo Brasca, Sanyam Gupta, Omar Haddad, Lorenzo Luccioli, Pietro Monticone, Alexis Saurin, and Florent Schaffhauser, we proved the $n=3$ case of Fermat's Last Theorem in a good way in Lean.
I wrote a bit more about this experience.
2024-04-02 DLD
Thank you to Edward van de Meent and Riccardo Brasca for comments and pointing out errors in earlier versions of this post.