# FLT3 at LftCM2024

This week, I have been at CIRM, the Centre International de Rencontres Mathematiques in France, attending Lean for the Curious Mathematician 2024. The focus of this conference was to introduce the Lean theorem prover and mathematical formalization to curious people.

I had not had much experience with Lean (or any other automated theorem prover or language or mathematical formalization) before this week. I have heard some talks about Lean, and some time ago I went through parts of the natural number game, a "game" that introduces mathematical theorem proving in Lean by proving fundamental properties of $\mathbb{N}$.

Although called a *conference*, this was really a *workshop.*
Instead of merely listening/giving talks, we split into project groups on the
first day and spent several hours each day working on these projects in groups.
This has the added benefit that it's often much more efficient to learn
something new while trying to work with it, as opposed to merely studying it.

The full list of proposed projects is available on this thread on the Lean zulip server. I chose to work on a project proposed by Riccardo Brasca to prove the $n = 3$ case of Fermat's Last Theorem. That is, we want to show that

\begin{equation*} X^3 + Y^3 = Z^3 \end{equation*}

has only trivial solutions over $\mathbb{Z}$ (i.e. $XYZ = 0$). This statement already has a proof in Lean, proved in a mostly elementary way.

I say *mostly* because the proof relies on the following nonobvious fact.

If an odd integer $s$ satisfies an equation $s^3 = u^2 + 3 v^2$ with $\gcd(u, v) = 1$, then $s = e^2 + 3f^2$ for some integers $e$ and $f$.

In fact, every odd factor of a number of the form $u^2 + 3v^2$ has the same form.

## Aside

This theorem might look surprising, because it *is* surprising.
But it's not alone.

Here are three statements that are all true.

- If a prime $p$ divides $u^2 + v^2$ and $\gcd(u, v) = 1$, then $p = e^2 + f^2$ for some $e$ and $f$.
- If a prime $p$ divides $u^2 + 2v^2$ and $\gcd(u, v) = 1$, then $p = e^2 + 2f^2$ for some $e$ and $f$.
- If a prime $p$ divides $u^2 + 3v^2$ and $\gcd(u, v) = 1$, then $p = e^2 + 3f^2$ for some $e$ and $f$.

The case $u^2 + 4v^2$ should perhaps be thought of as $u^2 + (2v)^2$, and is annoying to disentangle.

The next case, $p \mid u^2 + 5v^2$ with $\gcd(u, v) = 1$ does *not* have the
property that $p = e^2 + 5f^2$.
This is *almost* true — one can show that either $p = e^2 + 5f^2$ or $2p =
e^2 + 5f^2$ (depending on $p \bmod 20$).

These sorts of questions are worth investigating. When Euler looked into these patterns, it led him to his ideas on quadratic reciprocity. Understanding why this pattern fails naturally leads to full reciprocity, genus theory, and even class field theory.

## Proving FLT3 after Kummer

The previous proof (and the nonobvious fact) are very finicky and require a lot of casework. There are many branching statements where one reduces to a finite set of shapes of gcds and prime decompositions, and then proves them all by direct enumeration.

For our approach, we decided to formalize a different proof due to Kummer. Let $\zeta_3 = e^{2 \pi i / 3}$ denote a primitive $3$rd root of unity, and define $\lambda = \zeta_3 - 1$.

The number $\lambda$ is prime in $\mathbb{Z}[\zeta_3]$.

There are general ways to go about this, but here I note directly that \begin{equation}\label{eq:threediv} -\lambda (1 - \zeta_3^2) = 3, \end{equation} and so $\lambda$ divides $3$.

For number theorists, I obtained this by considering $N(\lambda)$, and the fact that this is prime in $\mathbb{Z}$ shows that $\lambda$ is prime.

Ignoring that, we can compute directly that \begin{equation} \big\lvert \mathbb{Z}[\zeta_3] / 3 \mathbb{Z}[\zeta_3] \big\rvert = 9, \end{equation} represented by the elements \begin{equation} \{ a + b \zeta_3 : 0 \leq a, b < 3 \}. \end{equation} The ideal $(\lambda)$ properly contains $3$ by \eqref{eq:threediv} (and the observation that $1 - \zeta_3^2$ is not a unit). Hence $\lvert \mathbb{Z}[\zeta_3] / \lambda \mathbb{Z}[\zeta_3] \rvert$ must be a proper divisor of $9$. As it's not trivial, the only possibility is $3$.

Hence $\mathbb{Z}[\zeta_3] / \lambda \mathbb{Z}[\zeta_3]$ is a field, which gives directly that $\lambda$ is prime. $\diamondsuit$

Kummer realized that it is easier to prove that there are no nontrivial solutions to the $n = 3$ case of Fermat's Last Theorem in the \emph{bigger} ring $\mathbb{Z}[\zeta_3]$. And the key new idea is that $X^3 + Y^3$ factors in $\mathbb{Z}[\zeta_3]$ as

\begin{equation}\label{eq:factor} X^3 + Y^3 = (X + Y)(X + \zeta_3 Y)(X + \zeta_3^2 Y). \end{equation}

(The second key aspect is that $\mathbb{Z}[\zeta_3]$ is a unique factorization domain. This is also nontrivial, but it follows from important and commonly regarded algebraic number theory facts: in this specific case, it follows immediately from Minkowski's theorem once we know $\mathbb{Z}[\zeta_3]$ is the ring of integers of $\mathcal{O}[\zeta_3].$ Fortunately, this fact is already in Lean's mathlib!)

The proof of FLT3 here again proceeds by descent. But the key factorization ideas are straightforward if we believe that $\mathbb{Z}[\zeta_3]$ is a unique factorization domain.

In this bigger field, we consider

\begin{equation}\label{eq:factorfull} X^3 + Y^3 = (X + Y)(X + \zeta_3 Y)(X + \zeta_3^2 Y) = u Z^3 \end{equation}

for some unit $u$. (i.e. we look for a solution $(X, Y, Z, u)$). The hard case is when $\lambda \mid Z$ and there are some annoying details about extracting powers of $\lambda$ from each term on the left, but after that we have unique factorization and can consider each term independently. (The final step of the descent occurs on the powers of $\lambda$, showing that any solution yields another solution with fewer powers of $\lambda$ appearing).

# Lean Plan

For this project, I worked with Riccardo Brasca, Sanyam Gupta, Omar Haddad, Lorenzo Luccioli, Pietro Monticone, Alexis Saurin, and Florent Schaffhauser. This was a project organized by (and largely due to) Riccardo Brasca.

Riccardo did extremely valuable setup before the conference began. He already knew what proof he wanted to formalize and proved perhaps the most complicated, technical sublemmas (and some cyclotomic field lemmas). He also went through the proof and made a scaffold of what results to prove (without their proofs).

In Lean, if you don't have a proof of something, you can use the keyword
`sorry`

to say that Lean should just believe you and you're sorry to make it do
this. (Linters and editors then complain and remind you that you didn't
*actually* formally prove what you claimed to have proved).
With Riccardo's setup, there were initially $37$ `sorry`

s to handle.

Over the week, we added some additional lemmas, removed some lemmas, and added
new `sorry`

s, but really we didn't deviate from Riccardo's setup much.

I think one benefit that I can't stress enough is that this meant the project
was structured in a good way.^{1}
^{1}This is very reminiscent of the initial work
that goes into making a project for undergraduates or other young scientists.
It's nice to be back on the receiving end of this work!
Another benefit
is that this enabled the 7 of us that were newer to Lean to split up and work
on different aspects.

In practice we split into two teams and handled separate proofs. Actually, my team (Omar, Florent, Alexis, and I) almost didn't talk to the other team at all, except for occasional short messages on our zulip chat instance.

Initially we thought that we would take every other lemma. But after a couple we deviated, approximately going towards a you-claim-it-and-then-you-prove-it model. (And sometimes we had partial proofs and tossed those back and forth).

The first lemma that I participated in proving was the following:

```
/– Given `(S : Solution)`, we have that `λ ^ 2` does not divide `S.a + η * S.b`. -/
lemma lambda_sq_not_a_add_eta_mul_b : ¬ λ ^ 2 ∣ (S.a + η * S.b) := by
simp_rw [a_add_eta_b, dvd_add_right S.hab, pow_two,
mul_dvd_mul_iff_left lambda_ne_zero, S.hb, not_false_eq_true]
```

In terms of \eqref{eq:factorfull}, this task was to prove the
following:^{2}
^{2}For the very observant, I'll note that $\lambda$ will divide
one of the factors in \eqref{eq:factorfull} with higher multiplicity, and we
decided WLOG to call that factor $(x + y)$. Riccardo set that up too.

For a quadruple $(x, y, z, u)$ solving \eqref{eq:factorfull}, $\lambda^2 \nmid (x + \zeta_3 b)$.

In slightly verbose math, the proof goes like this:

We are assuming elsewhere that $\lambda^2 \mid (x + y)$ and that $\lambda \mid z$. Then $\lambda \nmid y$, as this would imply $\lambda \mid x$ and we assume $\gcd(x, y, z) = 1$.

Rewrite $x + \zeta_3 y = x + y + \lambda y$ and note that $\lambda^2 \mid (x + y)$, but $\lambda^2 \nmid \lambda y$ as $\lambda \nmid y$.

What the lean proof does is apply a sequence of earlier lemmas and rewrites. It does the following.

- Rewrite $x + \zeta_3 y = x + y + \lambda y$.
- Use that $a \mid b$ and $a \mid (b + c)$ implies $a \mid c$, called
`dvd_add_right`

. We use this with the hypothesis that $\lambda^2 \mid (x + y)$. This has shown that $\lambda^2 \mid (x + y + \lambda y) \iff \lambda^2 \mid \lambda y$. - Use that $\lambda^2 = \lambda * \lambda$, called
`pow_two`

. - Use the result that $ab \mid ac$ and $a \neq 0$ implies that $b \mid c$, and the result (proved earlier) that $\lambda \neq 0$. This has shown that $\lambda^2 \mid \lambda y \iff \lambda \mid y$.
- Use that $\lambda \nmid y$ (proved earlier).
- Use that
`not False`

means`True`

.

There are 5 commas in the Lean proof, and the 6 arguments are exactly the ones above. Some line up well with the math proof. But steps (3) and (6) really demonstrate how pedantic interacting with Lean can be.

I'll also note that the enclosing `simp_rw`

indicates that Lean is trying to
automatically *simplify* expressions it sees, and to *rewrite* terms based on
the provided lemmas, in a smart way. This is a **Tactic** in Lean, and includes
some metaprogramming and metalogic. This saves us from even deeper levels of
pedantry.We can hope that as Lean develops more, better and better
tactics and metalogic will be developed. But it seems **very hard** to
make big jumps.

The second proof I was involved in sounded substantially similar: prove the
analogous statement for $\lambda^2 \nmid (x + \zeta_3^2 y)$.
But the analogous proof wants to use that $1 + \zeta_3 + \zeta_3^2 = 0$ and
this took considerably more effort.
We ended up writing a 17 line Lean proof by contradiction, and this was very
nontrivial.^{3}
^{3}Looking now, I suspect it would be possible to simplify our
proof. It's remarkable how much you can learn in one week.

Regardless, these were very concrete, largely computational proofs, and for that reason they were excellent first proofs to work through.

By the end of the first day, we had actually added 3 new `sorry`

s, and I was
relying very strongly on Lean help from Riccardo and Florent.
On the second day, I proved my first lemma in Lean on my own.
I showed that, after handling $\lambda$ appropriately, $(x + y)$ and $(x +
\zeta_3^2 y)$ are coprime.
More precisely, if we call $(x + y) = \lambda^{3n - 2} X$ and $(x + \zeta_3^2 y)
= \lambda Z$, then I showed that $X$ and $Z$ are coprime.
(My Lean proof was strongly modelled on Florent's proof of the analogous
statement for $(x + y)$ and $(x + \zeta_3 y)$).

On the second day, I proved five lemmas completely on my own. I was starting to get the hang of things.

It's interesting to think about what it means to be a *lemma* in Lean.
Really, a *lemma* is *any mathematical statement that we claim*.
In standard math, a proof consists of a sequence of statements, organized
roughly into sentences and paragraphs — which are then
more broadly organized into lemmas, propositions, and theorems.
But each *lemma* in Lean is more closely related to each statement in a proof.

To a first approximation, each sentence in a proof might correspond to one mathematical statement. So proving five Lean lemmas is sort of like saying that I understood five sentences in a proof. Actually, the give Lean lemmas I showed on Wednesday corresponds to one sentence, as well as two implicit facts used in the proof we were following.

There are **a lot** of implicit things taken for granted in typical
matheamtical writing.

On Wednesday evening, we had 6 `sorry`

s left. The other team recognized that a
structural change would help and so by Thursday morning, there were only `3`

left.

Two of these were *mostly* straightforward, and I filled in most of the
details. But one thing that kept coming up concerned exponents.

We had statements that included the following simple fact about exponents:

\begin{equation} \lambda(\lambda^{n - 1})^3 = \lambda^{3n - 2}. \end{equation}

Making Lean work with this was really annoying. The problem was that the actual type of $\lambda$ that we were using was that it was an element in $\mathbb{Z}[\zeta_3]$, and not necessarily $(\mathbb{Z}[\zeta_3])^\times$. Of course it is invertible, and we proved that. But Lean uses dependent type theory and at its type core, this is inconvenient.

One thing that you can't do if you don't know that you're a unit is to assume that an inverse exists. In practice, this means that the default domain of exponents for $\lambda$ consists of natural numbers (including $0$). As we are in the midst of standard divisibility statements, I would even argue that this is a reasonable restriction to make.

The problem is that if, for example, $n = 0$, then $\lambda^{n-1}$ is confusing
when exponents are nonnegative.
We can't expect Lean to reason about nonsensical objects.
Similarly $3n - 2$ looks confusing.
Somewhere else we had shown that $n \geq 2$ (in the notation above).
But combining these pieces of information was surprisingly painful.
Pietro, Lorenzo, and Riccardo sat down and figured out the process to follow,
and we morally copied that same logic 3 times. Their approach was to isolate only the identity $\lambda
(\lambda^{(n-1)})^3 = \lambda^{3n - 2}$ and use `congr`

repeatedly to get
there. Eventually Lean would recognize that this reduced to the equation over
the naturals that $3(n-1) + 1 = 3n-2$. We could feed in the lemma that $n \geq
2$ and pass this to the `omega`

tactic. Very painful.

I now wonder if it would be possible to coerce $\lambda$ to a multiplicative
unit, and then hope that Lean's knowledge of basic ring theory (by which I mean
the `ring`

and `group`

tactics) would suffice?
I don't have a good mental model of how that works yet.

Nonetheless, we finished proving everything on Thursday afternoon. Now we begin the process of cleaning this up and getting it into the main Lean library.

There it will serve as a proof of perhaps the simplest and most trivial component of the ongoing effort to formalize a proof of Fermat's Last Theorem (for every exponent).

## Source

Everything we did is currently available on
github.
We're beginning the process of putting it into mathlib — but I don't know
anything about that yet.
Certainly I'll need to rewrite my proofs in a shorter way, and in a more *lean*
way.

### Leave a comment

## Info on how to comment

To make a comment, please send an email using the button below. Your email
address **won't be shared** (unless you include it in the body
of your comment). If you don't want your real name to be used next to your
comment, please specify the name you would like to use. If you want your name
to link to a particular url, include that as well.

**bold**, *italics*, and plain text are allowed in
comments. A reasonable subset of markdown is supported, including lists,
links, and fenced code blocks. In addition, math can be formatted using
`$(inline math)$`

or `$$(your display equation)$$`

.

**Please use plaintext email** when commenting. See
Plaintext Email and
Comments on this site for more. Note also that
**comments are expected to be open, considerate, and
respectful.**

Comments (2)2024-04-02 DLDI attended Lean for the Curious Mathematician 2024 at CIRM. Along with Riccardo Brasca, Sanyam Gupta, Omar Haddad, Lorenzo Luccioli, Pietro Monticone, Alexis Saurin, and Florent Schaffhauser, we proved the $n=3$ case of Fermat's Last Theorem in a

good wayin Lean.I wrote a bit more about this experience.

2024-04-02 DLDThank you to Edward van de Meent and Riccardo Brasca for comments and pointing out errors in earlier versions of this post.