Elliptic curves and modular forms are the same thing, plus Fermat's Last Theorem

The TLS handshake that fires when you open an HTTPS site is, in practice, almost certainly using elliptic curves for key exchange today, and often for signing as well. The signatures behind Bitcoin and Ethereum (secp256k1), the message encryption in Signal and WhatsApp (Curve25519), SSH public keys (Ed25519), TLS 1.3 key exchange (X25519, P-256): all of these run on top of curves from a family called “elliptic curves.” They give security comparable to RSA, or better, with much shorter keys, which is why they have become the standard choice over the last decade.

The security rests on a deceptively simple piece of algebra: the “point addition” on the curve is easy to run forward and, as far as anyone knows, infeasible to invert. Computing $nP$ from $n$ and $P$ is cheap; finding $n$ given $P$ and $Q = nP$ (the elliptic curve discrete logarithm problem) takes time exponential in the key size with the best known classical algorithms. That asymmetry is what keeps secret keys secret.

I had seen the phrase “elliptic curve” countless times without ever sitting down and asking how exactly you can build crypto out of a curve, what these curves actually look like, or how they differ from an ellipse. When I did, it turned out that the very same “point addition” structure used in crypto is also the tool that finally proved Fermat’s Last Theorem — the two are continuous, not separate stories.

This article first chases what an elliptic curve actually is, with figures and equations, and shows why point addition can power a crypto algorithm. It then carries the same addition structure across to modular forms (the Taniyama–Shimura conjecture), where, together with the Frey curve, it leads in a single trail to the proof of Fermat’s Last Theorem. The prerequisites are roughly “have brushed up against group theory and complex analysis once.” Fully digesting it needs a textbook.

A quick refresher on groups, rings, and fields

The terms “the rational field $\mathbb{Q}$,” “abelian group,” and “$\mathbb{F}_p$” show up below as-is, so here is a minimal cheat sheet. Skip it if you already know them.

Structure	What you can do	Main properties	Examples
Group	One operation (addition or multiplication)	Associativity, identity, inverse	$(\mathbb{Z}, +)$ , the points on an elliptic curve
Abelian group	Group + you can swap the order	$a+b = b+a$	$(\mathbb{Z}, +)$ , $(\mathbb{Q}, +)$
Ring	Addition and multiplication	Addition is an abelian group; multiplication has associativity and distributivity	$\mathbb{Z}$ , $\mathbb{R}[x]$
Field	Addition, subtraction, multiplication, division (except by $0$ )	Ring + every non-zero element has a multiplicative inverse	$\mathbb{Q}$ , $\mathbb{R}$ , $\mathbb{C}$ , $\mathbb{F}_p$

$\mathbb{F}_p$ is the world of remainders modulo a prime $p$ (the set $\{0, 1, \dots, p-1\}$), closed under addition, multiplication, and division by non-zero elements mod $p$. This shows up later when we “reduce” elliptic curves modulo a prime.
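To make “division mod $p$” concrete, here is a minimal sketch of arithmetic in $\mathbb{F}_7$. The helper names (`f_add`, `f_mul`, `f_inv`, `f_div`) and the toy prime 7 are my own choices for illustration; the only library feature used is Python's built-in three-argument `pow`, which accepts an exponent of `-1` from Python 3.8 on.

```python
# Toy prime for illustration; real curves use primes of roughly 256 bits.
p = 7


def f_add(a, b):
    return (a + b) % p


def f_mul(a, b):
    return (a * b) % p


def f_inv(a):
    """Multiplicative inverse of a non-zero element of F_p."""
    if a % p == 0:
        raise ZeroDivisionError("0 has no inverse in F_p")
    return pow(a, -1, p)  # modular inverse, Python 3.8+


def f_div(a, b):
    return f_mul(a, f_inv(b))


# Every non-zero element has an inverse, which is what makes F_p a field.
inverses = {a: f_inv(a) for a in range(1, p)}
print(inverses)
print(f_div(3, 5))  # the element x with 5 * x = 3 (mod 7)
```

Dividing is just multiplying by an inverse, and the inverse exists for every non-zero remainder precisely because $p$ is prime.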

What an elliptic curve actually is

An elliptic curve over the rational field $\mathbb{Q}$ (or some other field) is a curve of the form

y^2 = x^3 + ax + b

where the right-hand side has no repeated roots, i.e.

4a^3 + 27b^2 \neq 0

(the discriminant is non-zero). If this fails, the curve develops a cusp or self-intersection and becomes “singular.” Elliptic curves avoid that.
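The non-singularity condition is easy to check mechanically. The following is a small sketch; the function name `is_nonsingular` and the two singular examples are my own additions, chosen because $x^3$ has a triple root (a cusp) and $x^3 - 3x + 2 = (x-1)^2(x+2)$ has a double root (a self-intersection).

```python
def is_nonsingular(a, b):
    """True when y^2 = x^3 + a*x + b has no repeated root on the right-hand side."""
    return 4 * a**3 + 27 * b**2 != 0


examples = {
    "y^2 = x^3 - x + 1": (-1, 1),   # used in the figures of this article
    "y^2 = x^3 - 3x + 1": (-3, 1),  # used in the figures of this article
    "y^2 = x^3": (0, 0),            # cusp at the origin
    "y^2 = x^3 - 3x + 2": (-3, 2),  # node at x = 1
}

for name, (a, b) in examples.items():
    status = "elliptic curve" if is_nonsingular(a, b) else "singular"
    print(f"{name}: 4a^3 + 27b^2 = {4 * a**3 + 27 * b**2} -> {status}")
```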

It’s easier to just look at one.

y² = x³ − x + 1 (one connected component)

This is the case where $4a^3 + 27b^2 > 0$ (here $4(-1)^3 + 27 \cdot 1^2 = 23$): the right-hand cubic has a single real root, so the curve is one continuous piece. Changing $a, b$ changes the shape.

y² = x³ − 3x + 1 (the right-hand cubic has three distinct real roots, so the curve splits into a bounded "egg" component and an unbounded branch)

The name “elliptic” does not mean the curve is shaped like an ellipse. It comes from a historical connection to elliptic integrals (the integrals that arise when computing the arc length of an ellipse).

Elliptic curves come with an addition

You can put an “addition” on the points of an elliptic curve, and that turns the set of points into a group.

Take two points $P, Q$ and draw the line $PQ$ . Since the curve is cubic, the line meets the curve at exactly three points counted with multiplicity. Call the third intersection $R$ and reflect it across the $x$ -axis. The result is defined to be $P + Q$ .

This is not adding the coordinates. Even though it is called “addition,” the coordinates of $P + Q$ are not the coordinate-wise sum of those of $P$ and $Q$ . It is a different point, determined by the geometric construction.

Point addition on y² = x³ − x + 1 (the generic case). The line y = 1 through P = (−1, 1) and Q = (1, 1) meets the curve at three points; reflecting the third intersection R = (0, 1) across the x-axis gives P + Q = (0, −1). You can check it does not match the coordinate-wise sum.

This is the simplest case. The formula changes depending on how $P, Q$ are chosen.

When $P = Q$ , the line $PQ$ isn’t uniquely defined, so we use the tangent to the curve at $P$ instead. Implicit differentiation gives the slope $m = (3p_x^2 + a) / (2p_y)$ .

Doubling. At P = (1, 1) the tangent has slope 1 (line y = x). It meets the curve at x = −1; reflecting R = (−1, −1) gives 2P = (−1, 1).

When $P$ and $Q$ share the same $x$ -coordinate with opposite $y$ -signs ( $Q = -P$ ), the line is vertical. The “third” intersection only exists at infinity in the $y$ -direction.

The case that produces the point at infinity. The line through P = (1, 1) and −P = (1, −1) is vertical and meets the curve at only two finite points. The third intersection is placed at infinity, and we define P + (−P) = 𝒪 (the point at infinity, which serves as the identity of the group).
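The three cases (chord, tangent, vertical line) fit in one short function. This is a minimal sketch over $\mathbb{Q}$ using exact fractions; the names `add`, `point`, and the use of `None` for the point at infinity are my own conventions, not anything standard. It reproduces the three worked examples on $y^2 = x^3 - x + 1$.

```python
from fractions import Fraction

O = None  # the point at infinity, used as the group identity


def add(P, Q, a):
    """Add two points on y^2 = x^3 + a*x + b (b does not appear in the formulas)."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return O  # vertical line: Q = -P (also covers doubling a point with y = 0)
    if P == Q:
        m = (3 * x1 * x1 + a) / (2 * y1)  # tangent slope
    else:
        m = (y2 - y1) / (x2 - x1)  # chord slope
    x3 = m * m - x1 - x2
    y3 = m * (x1 - x3) - y1  # third intersection, already reflected across the x-axis
    return (x3, y3)


def point(x, y):
    """Build a point with exact rational coordinates."""
    return (Fraction(x), Fraction(y))


a = -1  # the curve y^2 = x^3 - x + 1
print(add(point(-1, 1), point(1, 1), a))   # chord case
print(add(point(1, 1), point(1, 1), a))    # tangent case (doubling)
print(add(point(1, 1), point(1, -1), a))   # vertical case
```

The formula for `x3` comes from the fact that the three intersection $x$-coordinates of a line with slope $m$ sum to $m^2$.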

Now to see that this addition really does form a group. For any two points $P, Q$ on the curve, the line $PQ$ is uniquely determined (the tangent when $P = Q$), the third intersection $R$ with the cubic is uniquely determined (with multiplicity), and the reflection $P + Q$ is uniquely determined too. The operation “two points in, one curve point out” is well-defined. A curve $E$ can have infinitely many rational points, and this single addition is defined across all of them.

If you fix $P$ and vary $Q$ , then $P + Q$ also moves across the curve with no gaps and no overlaps (adding $-P$ takes you back, so $Q \mapsto P + Q$ is a bijection).

The group axioms hold, one by one, under this addition:

Closure. If $P, Q \in E$ then $P + Q \in E$ . The third intersection is on the curve by definition, and reflecting across the $x$ -axis stays on the curve because the curve is symmetric about the $x$ -axis.
Commutativity. $P + Q = Q + P$ . The line $PQ$ and the line $QP$ are the same line, so the third intersection is the same.
Identity $\mathcal{O}$ . The “line” through $P$ and $\mathcal{O}$ is the vertical line through $P$ , which meets the curve at $P, -P, \mathcal{O}$ . Reflecting the third intersection $-P$ gives back $P$ , so $P + \mathcal{O} = P$ .
Inverse. For $P = (x, y)$ , the point $-P = (x, -y)$ is its inverse. As shown in figure 3, $P + (-P) = \mathcal{O}$ .

What’s left is associativity, $(P+Q) + R = P + (Q+R)$. This is the only axiom that is not visually obvious, and the proof requires projective-plane algebraic geometry (the classical argument rests on Bézout’s theorem, via the Cayley–Bacharach theorem).

This addition follows the geometry of the curve and branches by case depending on $P, Q$, so it is genuinely different from coordinate-wise addition. The forward computation (find $nP$ given $P$ and $n$) only needs about $O(\log n)$ additions using a double-and-add binary expansion. The inverse problem (find $n$ such that $nP = Q$) takes on the order of $n$ additions under a naïve search, which is exponential in the bit length of $n$.
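Double-and-add can be sketched in a few lines. The code below is an illustration under my own naming (`add`, `scalar_mult`, `naive_mult`), working over $\mathbb{Q}$ with exact fractions and assuming $n \geq 0$; it compares the fast method against plain repeated addition on $y^2 = x^3 - x + 1$.

```python
from fractions import Fraction

O = None  # point at infinity


def add(P, Q, a):
    """Chord-and-tangent addition on y^2 = x^3 + a*x + b."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return O
    m = (3 * x1 * x1 + a) / (2 * y1) if P == Q else (y2 - y1) / (x2 - x1)
    x3 = m * m - x1 - x2
    return (x3, m * (x1 - x3) - y1)


def scalar_mult(n, P, a):
    """Compute nP with about log2(n) doublings, reading n bit by bit."""
    result, addend = O, P
    while n > 0:
        if n & 1:              # this bit of n is set: include the current multiple
            result = add(result, addend, a)
        addend = add(addend, addend, a)  # P, 2P, 4P, 8P, ...
        n >>= 1
    return result


def naive_mult(n, P, a):
    """Compute nP with n additions, for comparison only."""
    result = O
    for _ in range(n):
        result = add(result, P, a)
    return result


P = (Fraction(1), Fraction(1))
for n in range(1, 6):
    print(n, scalar_mult(n, P, -1))
```

For a 256-bit $n$ the loop runs 256 times, whereas the naive version would need about $2^{256}$ steps.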

Under this addition, the set of rational points (those with coordinates in $\mathbb{Q}$ ), $E(\mathbb{Q})$ , is an abelian group. The Mordell–Weil theorem says it is finitely generated, i.e.

E(\mathbb{Q}) \cong \mathbb{Z}^r \oplus T

with rank $r$ and torsion part $T$ . Understanding the rational points on an elliptic curve thereby becomes a problem in group theory.

How elliptic curves run as a cryptosystem

So far we have “an addition that makes the points into a group” and “forward $nP$ is fast, the inverse problem is exponentially heavy.” Using that asymmetry directly, you can build a crypto algorithm out of the curve itself.

The actual crypto is done not over $E/\mathbb{Q}$ but over $E/\mathbb{F}_p$, the same equation $y^2 = x^3 + ax + b$ with coordinates evaluated mod $p$ for a large prime $p$. The set of points $E(\mathbb{F}_p)$ is a finite abelian group (with roughly $p$ points; by Hasse’s theorem the count is within $2\sqrt{p}$ of $p + 1$), and the addition formulas described above carry over unchanged, with division replaced by multiplication by an inverse mod $p$.
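Over a small prime you can simply list every point. Here is a brute-force sketch (the function name `curve_points` is mine, and the $O(p^2)$ loop is only sensible for toy primes) that enumerates $E(\mathbb{F}_p)$ for $y^2 = x^3 - x + 1$ and checks that the count stays within $2\sqrt{p}$ of $p + 1$, which is Hasse's theorem.

```python
def curve_points(a, b, p):
    """All points of y^2 = x^3 + a*x + b over F_p, with None as the point at infinity."""
    points = [None]
    for x in range(p):
        rhs = (x**3 + a * x + b) % p
        for y in range(p):
            if (y * y) % p == rhs:
                points.append((x, y))
    return points


a, b = -1, 1
for p in (5, 7, 11, 13, 101):  # primes where this curve stays non-singular
    n = len(curve_points(a, b, p))
    # |n - (p + 1)| <= 2*sqrt(p), written without floating point
    within_hasse = (n - (p + 1)) ** 2 <= 4 * p
    print(f"p = {p}: {n} points, Hasse bound holds: {within_hasse}")
```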

Elliptic Curve Diffie–Hellman (ECDH) works like this.

sequenceDiagram
  participant A as Alice
  participant B as Bob
  Note over A,B: Public parameters = curve E/F_p and base point G
  A->>A: Generate private key a, compute aG
  B->>B: Generate private key b, compute bG
  A->>B: Send public key aG
  B->>A: Send public key bG
  A->>A: a × bG = abG
  B->>B: b × aG = abG
  Note over A,B: Both end up with the shared secret abG

The public parameters — the curve $E/\mathbb{F}_p$ and a base point $G \in E(\mathbb{F}_p)$ — are shared with everyone. Alice picks a private key $a$ (an integer), computes $aG$ , and sends it to Bob. Bob picks $b$ and sends $bG$ back. When each multiplies the received public key by their own private key, both arrive at the same point $abG = a(bG) = b(aG)$ , and that becomes the shared secret.
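The exchange can be played out on a toy curve. This sketch uses $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$ with base point $G = (5, 1)$, a small textbook-style example of my own choosing on which $G$ has order 19; all names (`add`, `mul`, `ORDER`) are illustrative. It is wildly insecure at this size and is only meant to show that both sides land on the same point.

```python
import secrets

P_MOD, A = 17, 2   # toy curve y^2 = x^3 + 2x + 2 over F_17 (illustration only)
G = (5, 1)         # base point; it has order 19 on this curve
ORDER = 19
O = None           # point at infinity


def add(P, Q):
    """Point addition on the toy curve, with all arithmetic mod P_MOD."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O
    if P == Q:
        m = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        m = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    m %= P_MOD
    x3 = (m * m - x1 - x2) % P_MOD
    return (x3, (m * (x1 - x3) - y1) % P_MOD)


def mul(n, P):
    """Double-and-add scalar multiplication."""
    result = O
    while n > 0:
        if n & 1:
            result = add(result, P)
        P = add(P, P)
        n >>= 1
    return result


# Each side picks a private key in 1..ORDER-1 and publishes key * G.
alice_private = secrets.randbelow(ORDER - 1) + 1
bob_private = secrets.randbelow(ORDER - 1) + 1
alice_public = mul(alice_private, G)
bob_public = mul(bob_private, G)

# Each side multiplies the other's public key by its own private key.
alice_shared = mul(alice_private, bob_public)
bob_shared = mul(bob_private, alice_public)
print(alice_shared, bob_shared)
```

Real deployments derive a symmetric key from the shared point with a key derivation function rather than using the coordinates directly.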

An eavesdropper sees $G$, $aG$, $bG$, and to recover $a$ or $b$ they would need to solve “how many times do you have to add $G$ to itself to get $aG$.” That is the elliptic curve discrete logarithm problem (ECDLP), and the best known generic algorithms (Pollard’s rho, etc.) still cost on the order of $\sqrt{p}$ group operations. With $p$ around 256 bits, that is roughly $2^{128}$ operations, out of reach for any realistic computing budget.
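To get a feel for the size of that square root, here is a back-of-the-envelope calculation. The attacker speed of $10^{12}$ group operations per second is a purely hypothetical figure I picked for illustration, not a measured one.

```python
import math

bits = 256
steps = math.isqrt(2**bits)          # square root of the group size: 2**128

ops_per_second = 10**12              # hypothetical attacker throughput (assumption)
seconds_per_year = 365 * 24 * 3600
years = steps // (ops_per_second * seconds_per_year)

print(f"group operations needed: about 2^{steps.bit_length() - 1}")
print(f"years at 10^12 ops/s:    about 10^{len(str(years)) - 1}")
```

Even under that generous assumption, the running time dwarfs the age of the universe (about $1.4 \times 10^{10}$ years) by many orders of magnitude.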

For comparable security (roughly the 128-bit level), RSA needs about a 3072-bit key while ECC gets by with 256 bits; the common 2048-bit RSA key corresponds to roughly a 224-bit curve. Shorter keys mean lighter computation and smaller payloads. That performance difference is part of why TLS 1.3 relies on ECDHE for key exchange and why the curves used by Signal, Bitcoin, and SSH (Ed25519) became standard choices.

Signatures (ECDSA / Ed25519) are built on the same group structure. The flow itself mirrors ECDH: derive a public key from the private key, verify by group operations.

That covers the connection to crypto. Next we follow the same addition structure in a very different direction.

What modular forms are

We’re moving into complex analysis here. Everything downstream (from Taniyama–Shimura onward) only uses the “Fourier coefficient sequence $\{a_n\}$ ” that this section ends on, so if the definitional details feel heavy, it’s fine to skim down to the line where that series appears.

We introduce an object that at first looks completely unrelated to elliptic curves. A modular form is a holomorphic function $f(\tau)$ on the upper half plane

\mathbb{H} = \{ \tau \in \mathbb{C} : \mathrm{Im}(\tau) > 0 \}

with a strong symmetry property. Specifically, for any integer matrix

\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z})

acting on $\tau$ by

\tau \mapsto \frac{a\tau + b}{c\tau + d}

the function satisfies

f\left(\frac{a\tau + b}{c\tau + d}\right) = (c\tau + d)^k \, f(\tau)

(a modular form of weight $k$). It is also required to stay bounded as $\tau \to i\infty$. Forms that vanish at infinity (constant Fourier coefficient $a_0 = 0$ in the expansion below) are called cusp forms.

“Strong symmetry” means $f$ is invariant under $\tau \mapsto \tau + 1$ and changes only by a factor of $\tau^k$ under $\tau \mapsto -1/\tau$, simultaneously (these two maps generate the whole $SL_2(\mathbb{Z})$ action). Imposing both is so restrictive that the space of functions meeting them is finite-dimensional, with a dimension determined by the weight $k$ and the level (replace $SL_2(\mathbb{Z})$ by a subgroup $\Gamma_0(N)$).

The “Fourier expansion” is what we need next. Since $f$ is invariant under $\tau \mapsto \tau + 1$, setting $q = e^{2\pi i \tau}$ gives

f(\tau) = \sum_{n=0}^{\infty} a_n q^n

(with $a_0 = 0$ for a cusp form, so the sum then effectively starts at $n = 1$). This coefficient sequence $\{a_n\}$ is the object that lines up with the elliptic-curve side in the next section.

Elliptic curves and modular forms produce the same L-function (the Taniyama–Shimura conjecture)

For an elliptic curve $E/\mathbb{Q}$ , define for each prime $p$

a_p(E) = p + 1 - \#E(\mathbb{F}_p)

You reduce $E$ mod $p$ (coordinates mod $p$) and count the points over $\mathbb{F}_p$, including the point at infinity. Since $E(\mathbb{F}_p)$ is finite, the count is an actual integer, and $a_p(E)$ records how far it deviates from the “expected” value $p + 1$.
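Computing $a_p$ for small primes is a few lines of brute force. The sketch below (function names are mine) does it two ways for $y^2 = x^3 - x + 1$: by literally counting points, and by summing Legendre symbols, using the fact that $y^2 = r$ has $1 + \left(\frac{r}{p}\right)$ solutions for an odd prime $p$. It assumes $p$ is an odd prime of good reduction.

```python
def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    n %= p
    if n == 0:
        return 0
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1


def a_p_by_counting(a, b, p):
    """a_p = p + 1 - #E(F_p), counting solutions of y^2 = x^3 + a*x + b directly."""
    count = 1  # the point at infinity
    for x in range(p):
        rhs = (x**3 + a * x + b) % p
        count += sum(1 for y in range(p) if (y * y) % p == rhs)
    return p + 1 - count


def a_p_by_character_sum(a, b, p):
    """Same value as minus the sum of Legendre symbols of the right-hand side."""
    return -sum(legendre(x**3 + a * x + b, p) for x in range(p))


for p in (3, 5, 7, 11, 13):
    print(p, a_p_by_counting(-1, 1, p), a_p_by_character_sum(-1, 1, p))
```

Both methods are linear or worse in $p$, so this is for experimentation only; real point counting on cryptographic curves uses much faster algorithms.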

Use this to build an $L$ -function.

L(E, s) = \prod_p \frac{1}{1 - a_p(E) p^{-s} + p^{1 - 2s}} = \sum_{n=1}^{\infty} \frac{a_n(E)}{n^s}

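The arithmetic data of the elliptic curve (the count of points at each prime) lines up as the coefficients $\{a_n(E)\}$ of a Dirichlet series. (The Euler factor written above is the one for primes of good reduction; the finitely many bad primes get a simpler factor.)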
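Expanding the Euler product shows how the $a_p$ determine every $a_n$: coefficients are multiplicative on coprime indices, and on prime powers they satisfy $a_{p^{k+1}} = a_p a_{p^k} - p\, a_{p^{k-1}}$. The sketch below applies those two rules. The function name is my own, and the input values $a_3 = -3$, $a_5 = -2$, $a_7 = -4$ are what I get by counting points on $y^2 = x^3 - x + 1$; treat them as illustrative inputs.

```python
def dirichlet_coefficients(ap, limit):
    """Expand prod_p 1 / (1 - a_p X + p X^2) over the given good primes.

    ap maps each prime p to a_p. Returns {n: a_n} for every n <= limit
    whose prime factors all appear in ap.
    """
    coeffs = {1: 1}
    for p, a in sorted(ap.items()):
        # a_{p^0}, a_{p^1}, a_{p^2}, ... from the two-term recursion
        powers = [1, a]
        pk = p * p
        while pk <= limit:
            powers.append(a * powers[-1] - p * powers[-2])
            pk *= p
        # multiplicativity: a_{n * p^k} = a_n * a_{p^k} when p does not divide n
        extended = {}
        for n, an in coeffs.items():
            q = 1
            for a_pk in powers:
                if n * q > limit:
                    break
                extended[n * q] = an * a_pk
                q *= p
        coeffs = extended
    return coeffs


a_n = dirichlet_coefficients({3: -3, 5: -2, 7: -4}, limit=30)
for n in sorted(a_n):
    print(n, a_n[n])
```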

On the other side, a modular form $f(\tau) = \sum a_n q^n$ also gives an $L$ -function.

L(f, s) = \sum_{n=1}^{\infty} \frac{a_n}{n^s}

The Taniyama–Shimura conjecture (now a theorem; the full proof was completed by Wiles–Taylor in 1995 plus Breuil–Conrad–Diamond–Taylor in 2001) makes the following claim.

For every elliptic curve $E/\mathbb{Q}$, there is a weight-2 cusp form $f \in S_2(\Gamma_0(N))$ such that $L(E, s) = L(f, s)$. The integer $N$ is called the conductor of $E$; its prime factors are exactly the primes of bad reduction, with exponents that depend on how bad the reduction is.

The sequence $\{a_n(E)\}$ built from the elliptic curve and the Fourier coefficients $\{a_n(f)\}$ of some modular form agree completely. The phrase “they’re the same object” refers, concretely, to this coincidence of coefficient sequences, mediated by the $L$ -function.
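You can watch this coincidence happen numerically. The example below is not from this article: it uses the classical curve $y^2 + y = x^3 - x^2$ of conductor 11, whose matching cusp form is known to be the product $q \prod_{n \ge 1} (1 - q^n)^2 (1 - q^{11n})^2$. The sketch (function names are mine) counts points on the curve mod $p$ on one side, multiplies out the product on the other, and compares the two at primes other than the bad prime 11.

```python
def a_p_curve(p):
    """a_p = p + 1 - #E(F_p) for y^2 + y = x^3 - x^2, by brute-force counting."""
    count = 1  # the point at infinity
    for x in range(p):
        rhs = (x**3 - x**2) % p
        count += sum(1 for y in range(p) if (y * y + y) % p == rhs)
    return p + 1 - count


def eta_product_coefficients(limit):
    """Coefficients a_1..a_limit of q * prod (1 - q^n)^2 * (1 - q^(11n))^2."""
    c = [0] * limit  # c[i] = coefficient of q^i in the product, before the leading q
    c[0] = 1
    for n in range(1, limit):
        for step in (n, 11 * n):
            for _ in range(2):  # each factor appears squared
                # multiply in place by (1 - q^step), truncating at degree limit - 1
                for i in range(limit - 1, step - 1, -1):
                    c[i] -= c[i - step]
    return {n: c[n - 1] for n in range(1, limit + 1)}


form = eta_product_coefficients(20)
for p in (2, 3, 5, 7, 13, 17, 19):
    print(p, a_p_curve(p), form[p])
```

The left column comes from counting solutions of a cubic equation in finite fields, the right column from expanding an infinite product of a complex variable, and they agree prime by prime.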

A geometric object (the elliptic curve) and an analytic object (the modular form) overlap at the level of numerical sequences. The deep reason this happens runs through Galois representations, but for this article we just accept it as fact.

The Frey curve forces a Fermat solution into an elliptic curve

Fermat’s Last Theorem: for any integer $n \geq 3$ , the equation

a^n + b^n = c^n

has no solution in pairwise coprime positive integers $a, b, c$ .

Composite exponents reduce to their prime factors: if $n = pm$, then $a^n + b^n = c^n$ gives $(a^m)^p + (b^m)^p = (c^m)^p$. Every $n \geq 3$ is divisible by an odd prime or by 4, so it suffices to prove the cases $n = p$ (an odd prime) and $n = 4$. The cases $n = 3, 4, 5, 7$ had already been proved individually by the 19th century using various tools from algebraic number theory, but there was no single instrument that took care of all primes $p \geq 5$ at once. That instrument is the identification of elliptic curves with modular forms, and the Frey curve is the entry point that activates it.

Frey’s idea was: if a solution $(a, b, c, p)$ existed, use that solution as material to build one specific elliptic curve.

E_F : y^2 = x(x - a^p)(x + b^p)

The right-hand side is a product of three linear factors, and $y = 0$ at $x = 0, a^p, -b^p$ .

y² = x(x − 2)(x + 2) (an analogue showing the Frey curve's "right side factors into three linear pieces with one root at 0" structure; for actual (aᵖ, bᵖ) the values are too large to visualize)

What makes this curve special shows up in its discriminant.

\Delta(E_F) = 16 \cdot (abc)^{2p}

Apart from the factor 16, the discriminant is a perfect $2p$-th power, which is not a configuration that ordinary elliptic curves exhibit. Every prime factor of $abc$ appears with an exponent divisible by $p$, which forces strong constraints on the Galois representation side.
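The discriminant formula is an algebraic identity that can be checked without a Fermat solution (which is fortunate, since none exists). Writing $A = a^p$ and $B = b^p$, the curve is $y^2 = x(x - A)(x + B)$, and the claim is that its discriminant equals $16\,(AB(A+B))^2$; with $A + B = c^p$ this is $16(abc)^{2p}$. The sketch below tests that identity on arbitrary small integers. The function names are mine, and it uses the convention that the curve's discriminant is 16 times the discriminant of the right-hand cubic.

```python
def cubic_discriminant(b, c, d):
    """Discriminant of the monic cubic x^3 + b*x^2 + c*x + d."""
    return b * b * c * c - 4 * c**3 - 4 * b**3 * d - 27 * d * d + 18 * b * c * d


def frey_type_discriminant(A, B):
    """Discriminant of y^2 = x(x - A)(x + B) = x^3 + (B - A)x^2 - A*B*x."""
    return 16 * cubic_discriminant(B - A, -A * B, 0)


# Arbitrary integers standing in for a^p and b^p (not actual Fermat solutions).
for A, B in [(2, 3), (8, 27), (32, 243)]:
    lhs = frey_type_discriminant(A, B)
    rhs = 16 * (A * B * (A + B)) ** 2
    print(A, B, lhs, lhs == rhs)
```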

Ribet’s theorem closes the contradiction

Assuming a Fermat solution existed, we now have the Frey curve $E_F$ . By the Taniyama–Shimura conjecture (assumed proved), $E_F$ corresponds to some modular form $f$ .

Here we use Ribet’s level-lowering theorem (published in 1990). Ribet showed, from properties of the mod $p$ Galois representation attached to $E_F$ (the Galois action on its $p$-torsion points), that the level (the $N$ in $\Gamma_0(N)$) can be lowered all the way down to 2: there must be a weight-2 cusp form of level 2 giving the same mod $p$ representation as $f$. Intuitively, because the discriminant is almost a perfect $p$-th power, every odd prime of bad reduction becomes invisible to the mod $p$ representation, and only 2 is left.

That means if a Fermat solution existed, there would have to be a cusp form

f \in S_2(\Gamma_0(2))

of weight 2 and level 2.

But the dimension of $S_2(\Gamma_0(2))$ is

\dim S_2(\Gamma_0(2)) = 0

so no such cusp form exists. The contradiction is “you are forced to correspond to a modular form that does not exist.”
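The dimension claim can be checked with the standard genus formula: $\dim S_2(\Gamma_0(N))$ equals the genus of the modular curve $X_0(N)$, which is $1 + \mu/12 - \nu_2/4 - \nu_3/3 - \nu_\infty/2$, where $\mu$ is the index of $\Gamma_0(N)$, $\nu_2$ and $\nu_3$ count elliptic points, and $\nu_\infty$ counts cusps. The sketch below implements that formula; the function names are my own.

```python
from fractions import Fraction
from math import gcd, prod


def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def euler_phi(n):
    result = n
    for p in prime_factors(n):
        result = result // p * (p - 1)
    return result


def dim_s2_gamma0(N):
    """Dimension of S_2(Gamma_0(N)), computed as the genus of X_0(N)."""
    primes = prime_factors(N)
    index = Fraction(N) * prod(Fraction(p + 1, p) for p in primes)
    chi4 = lambda p: 0 if p == 2 else (1 if p % 4 == 1 else -1)
    chi3 = lambda p: 0 if p == 3 else (1 if p % 3 == 1 else -1)
    nu2 = 0 if N % 4 == 0 else prod(1 + chi4(p) for p in primes)
    nu3 = 0 if N % 9 == 0 else prod(1 + chi3(p) for p in primes)
    cusps = sum(euler_phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)
    genus = 1 + index / 12 - Fraction(nu2, 4) - Fraction(nu3, 3) - Fraction(cusps, 2)
    assert genus.denominator == 1  # the formula always yields an integer
    return int(genus)


for N in (1, 2, 11, 23, 37):
    print(N, dim_s2_gamma0(N))
```

Level 2 comes out as 0, while level 11 is the smallest level with a non-zero weight-2 cusp form.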

The full logical chain.

graph TD
  A["Assume Fermat solution a^p + b^p = c^p exists"]
  B["Frey curve E_F<br/>y^2 = x(x - a^p)(x + b^p)"]
  C["Taniyama-Shimura<br/>E_F corresponds to a weight-2 cusp form f"]
  D["Ribet's level lowering<br/>f's level can be lowered to 2"]
  E["So we need a weight-2 level-2 cusp form f"]
  F["S_2(Γ_0(2)) is 0-dimensional"]
  G["Contradiction<br/>therefore no Fermat solution exists"]

  A --> B
  B --> C
  C --> D
  D --> E
  E --> F
  F --> G

Wiles’s actual work was to prove Taniyama–Shimura for the class of “semistable” elliptic curves, which includes the Frey curve (Ribet had already set the stage in 1986 by showing that this case would imply FLT). The remaining cases (non-semistable elliptic curves) were completed by Breuil–Conrad–Diamond–Taylor in 1999–2001.

Two objects with different shapes and different homes — “elliptic curves” and “modular forms” — can be identified via the $L$ -function. That identification lets you take the bogus elliptic curve built from a hypothetical Fermat equation, pair it with a non-existent modular form, and squeeze a contradiction out. The reason the two are the same is precisely the reason Fermat’s Last Theorem falls, and the same addition structure is still running inside the TLS session in your browser right now.