Fun with y^2=x^p-x

Here’s a basic example that comes up if you work with elliptic curves: Let p be a prime which is 3 \mod 4. Let E be the elliptic curve v^2=u^3-u over a field of characteristic p. Then E has an endomorphism F(u,v) = (u^p, v^p). It turns out that, in the group law on E, we have F^2 = [-p]. That is to say, F(F(u,v)) plus p copies of (u,v) is trivial.
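Before going further, here is a quick computational sanity check (my own sketch, not part of the argument). On a point of E with coordinates in \mathbb{F}_p itself, F acts as the identity, so F^2 = [-p] predicts that (p+1) \cdot P = 0 for every \mathbb{F}_p-point P:

```python
# Sketch: check a consequence of F^2 = [-p] on E: v^2 = u^3 - u over F_p.
# On points with coordinates in F_p, F(u, v) = (u^p, v^p) = (u, v), so
# F^2 = [-p] predicts P + p*P = (p+1)*P = O for every F_p-point P.

def ec_add(P, Q, p):
    """Chord-tangent addition on v^2 = u^3 - u over F_p; None is the identity O."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) = O (also handles doubling a 2-torsion point)
    if P == Q:
        lam = (3 * x1 * x1 - 1) * pow(2 * y1, -1, p) % p  # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p         # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def ec_mult(n, P, p):
    """n*P by repeated addition (fine for tiny n)."""
    R = None
    for _ in range(n):
        R = ec_add(R, P, p)
    return R

def check(p):
    pts = [(u, v) for u in range(p) for v in range(p)
           if (v * v - u ** 3 + u) % p == 0]
    return all(ec_mult(p + 1, P, p) is None for P in pts)

for p in (3, 7, 11):   # primes that are 3 mod 4
    print(p, check(p))
```

Of course, this only tests the statement on \mathbb{F}_p-points; seeing it on other points would require \mathbb{F}_{p^2} arithmetic.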

I remember, when I learned this, trying to check it by hand, and being astonished at how out of reach the computation was. There are nice proofs using higher theory, but shouldn’t you just be able to write down a rational function which vanished at F(F(u,v)) and vanished to order p at (u,v)?

There is a nice way to check the prime 3 by hand. I’ll use \equiv for equivalence in the group law of E. Remember that the group law on E has -(u,v) \equiv (u,-v) and has (u_1,v_1)+(u_2,v_2)+(u_3,v_3) \equiv 0 whenever (u_1, v_1), (u_2, v_2) and (u_3, v_3) are collinear.

We first show that

\displaystyle{ F(u,v) \equiv (u-1, v) - (u+1, v) \quad (\dagger)}

Proof of (\dagger): We want to show that F(u,v), (u-1,-v) and (u+1,v) add up to zero in the group law of E. In other words, we want to show that these points are collinear. We just check:

\displaystyle{ \det \begin{pmatrix} 1 & u^3 & v^3 \\ 1 & u-1 & -v \\ 1 & u+1 & v \end{pmatrix} =   2 v (v^2-u^3+u) = 0}

as desired. \square.
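If you don’t feel like expanding the determinant by hand, sympy will do it (a quick check, not part of the argument):

```python
import sympy as sp

u, v = sp.symbols('u v')

# The three supposedly-collinear points, as rows (1, u-coord, v-coord).
M = sp.Matrix([
    [1, u**3,  v**3],
    [1, u - 1, -v],
    [1, u + 1,  v],
])

# The determinant should equal 2*v*(v^2 - u^3 + u),
# which vanishes identically on E: v^2 = u^3 - u.
det = sp.expand(M.det())
print(sp.expand(det - 2 * v * (v**2 - u**3 + u)) == 0)  # True
```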

Use of (\dagger): Let (u_0, v_0) be a point on E. Applying F twice, we get

\displaystyle{ F^2(u_0,v_0) \equiv F \left( (u_0-1,v_0) - (u_0+1,v_0) \right)}

\displaystyle{  \equiv (u_0-2,v_0) - 2 (u_0,v_0) + (u_0+2,v_0)}.

Now, the horizontal line v=v_0 crosses E at three points: (u_0, v_0), (u_0-2, v_0) and (u_0+2, v_0). (Of course, u_0 -2 =u_0+1 and u_0+2 = u_0 -1, since we are in characteristic three.) So (u_0-2, v_0) + (u_0, v_0) + (u_0+2, v_0)  \equiv 0 and we have

\displaystyle{F^2(u_0, v_0) \equiv -3 (u_0, v_0)}

as desired. \square.

I was reminded of this last year when Jared Weinstein visited Michigan and told me a stronger statement: In the Jacobian of y^2 = x^p-x, we have F^2 = [(-1)^{(p-1)/2} p], where F is once again the endomorphism F(x,y) = (x^p, y^p).

Let me first note why this is related to the discussion of the elliptic curve above. (Please don’t run away just because that sentence contained the word Jacobian! It’s really a very concrete thing. I’ll explain more below.) Letting C be the curve y^2 = x^p-x, and letting p be 3 \mod 4, we have a map C \to E sending (x,y) \mapsto (x^{(p-1)/2}, y x^{(p-3)/4}), and this map commutes with F. I’m going to gloss over why checking F^2 = [(-1)^{(p-1)/2} p] on C will also check it on E, because I want to get on to playing with the curve C, but it does.
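One can at least check symbolically that this map lands on E, assuming the coordinate order (u,v) = (x^{(p-1)/2}, y x^{(p-3)/4}); a quick sympy sketch for p = 7:

```python
import sympy as sp

x, y = sp.symbols('x y')
p = 7  # any prime that is 3 mod 4 works the same way

# The map C -> E: (x, y) |-> (u, v) = (x^((p-1)/2), y*x^((p-3)/4)).
u = x ** ((p - 1) // 2)
v = y * x ** ((p - 3) // 4)

# On C we may replace y^2 by x^p - x; after that substitution,
# v^2 - (u^3 - u) should vanish identically.
expr = sp.expand(v**2 - u**3 + u).subs(y**2, x**p - x)
print(sp.expand(expr))  # 0

# Commuting with F is formal: u(x^p, y^p) = u(x, y)^p and
# v(x^p, y^p) = v(x, y)^p, since u and v are monomials.
```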

So, after talking to Jared, I was really curious why F^2 acted so nicely on the Jacobian of C. There are some nice conceptual proofs but, again, I wanted to actually see it. Now I do.

Let p be any odd prime. Let C be the curve y^2=x^p-x, over a field of characteristic p. We’ll be working in the Jacobian of C. This is a group J, generated by the points of C, and subject to the relation that, if there is a polynomial f(x,y) vanishing precisely at the points (x_1, y_1), (x_2, y_2), …, (x_n, y_n), then \sum (x_i, y_i)=0. If you’ve seen this theory laid out for projective curves, then you use rational functions rather than polynomials, and you have to keep track of poles as well as zeroes. Because I’m using the affine curve y^2=x^p-x, I get to just work with polynomials and their zeroes.

When we run this construction for a curve of the form y^2=\mathrm{cubic}, then the elements of the group are just the points on the curve, plus the additive identity. Let’s see why: (x_0, y_0) = - (x_0, - y_0), as these are the two points on the line x=x_0, so we don’t need formal inverses. And for any two points (x_1, y_1) and (x_2, y_2) with x_1 \neq x_2, the line through these points will meet the curve once more, say at (x_3, y_3), so we have (x_1, y_1) + (x_2, y_2) = - (x_3, y_3)= (x_3, -y_3). We can repeatedly use this trick to reduce sums of many points to sums of fewer points.

I am now asking you to consider y^2=x^p-x, which (except when p=3) is not of the form
y^2=\mathrm{cubic}. This means that not every element of J will be equivalent to a single point of the curve C. Other than that, the theory really isn’t harder.

The curve C has an endomorphism F(x,y) = (x^p, y^p), just as before. And, just as before, what we are going to be showing is that F^2 = (-1)^{(p-1)/2} p as endomorphisms of the group J. In particular, when p \equiv 3 \mod 4, we have F^2 =-p.

The key identity

We’re going to figure out how to generalize equation (\dagger) to all odd primes. The equation we want is

\displaystyle{ F(x_0,y_0) \equiv - \sum_{j=1}^{p-1} \left( \frac{j}{p} \right) (x_0+j,y_0)}.

Here \left( \frac{j}{p} \right) is the quadratic residue symbol: it equals 1 if j is a quadratic residue modulo p; it equals -1 if j is a non-QR and 0 if j=0.
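In code, Euler’s criterion computes the symbol in one line; here is a small helper (the name is mine):

```python
def legendre(j, p):
    """Quadratic residue symbol (j/p) for an odd prime p, via Euler's
    criterion: (j/p) is congruent to j^((p-1)/2) mod p, i.e. 0, 1, or p-1."""
    r = pow(j % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

# Example: the nonzero squares mod 7 are {1, 2, 4}.
print([legendre(j, 7) for j in range(1, 7)])  # [1, 1, -1, 1, -1, -1]
```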

Just as in the p=3 case, we’ll want to rewrite this to have fewer minus signs. For any (x_0, y_0) on C, the vertical line x=x_0 meets C at two points: (x_0, y_0) and (x_0, -y_0). So -(x_0, y_0) = (x_0, -y_0). So we can rewrite our desired equation as

\displaystyle{F(x_0, y_0) + \sum_{j=1}^{p-1} (x_0+j, \left( \frac{j}{p}\right) y_0)=0. \quad (\dagger \dagger)}.

So we need to find a polynomial which passes through the points (x_0+j, \left( \frac{j}{p}\right) y_0). Set

\displaystyle{ g(x,y) = y- (x-x_0)^{(p-1)/2} y_0 }.

Since \left( \frac{j}{p} \right) \equiv j^{(p-1)/2} \bmod p (Euler’s criterion), the polynomial g will pass through all of the points (x_0+j, \left( \frac{j}{p}\right) y_0). Where else does g=0 meet C?

Well, if g=0 then y = (x-x_0)^{(p-1)/2} y_0. Plugging this into the equation for C, we have

\displaystyle{ (x-x_0)^{p-1} y_0^2 = x^p -x}.

Since (x_0, y_0) lies on C, we have y_0^2 = x_0^p - x_0, so this becomes

\displaystyle{ (x-x_0)^{p-1} (x_0^p-x_0) = x^p -x \quad (\ast)}.

This is a degree p polynomial in x. We already know p-1 of the roots — they are at x_0+j for j a nonzero element of \mathbb{F}_p.

Equation (\ast) has leading term x^p, and constant term - (-x_0)^{p-1} (x_0^p-x_0). So we can conclude that the product of the roots of (\ast) is x_0^{p-1} (x_0^p-x_0). Since j and -j run over the same nonzero residues, the product of the known roots is \prod_{j=1}^{p-1} (x_0+j) = \prod_{j=1}^{p-1} (x_0-j) = x_0^{p-1}-1. So the last root of (\ast) is at \frac{x_0^{p-1} (x_0^p-x_0)}{x_0^{p-1} - 1} = x_0^p. Moreover, plugging x = x_0^p into y = (x-x_0)^{(p-1)/2} y_0 gives y = (x_0^p-x_0)^{(p-1)/2} y_0 = (y_0^2)^{(p-1)/2} y_0 = y_0^p, so the last intersection of g=0 with C is at (x_0^p, y_0^p), not (x_0^p, - y_0^p).
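All of this root bookkeeping amounts to a single polynomial identity mod p, namely x^p - x - (x-x_0)^{p-1}(x_0^p - x_0) = (x - x_0^p) \prod_{j=1}^{p-1} (x - x_0 - j), which sympy can confirm directly (a sketch for p = 5, keeping x_0 symbolic):

```python
import sympy as sp

x, x0 = sp.symbols('x x0')
p = 5

# (*) rearranged into a monic degree-p polynomial in x ...
lhs = x**p - x - (x - x0)**(p - 1) * (x0**p - x0)
# ... should factor, mod p, with roots x0 + j (j = 1, ..., p-1) and x0^p.
rhs = (x - x0**p) * sp.prod([x - x0 - j for j in range(1, p)])

# Every coefficient of the difference should be divisible by p.
diff = sp.Poly(sp.expand(lhs - rhs), x, x0)
print(all(c % p == 0 for c in diff.coeffs()))  # True
```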

So we have found a polynomial which vanishes precisely at the points (x_0+j, \left( \frac{j}{p} \right) y_0) and at F(x_0,y_0), and we have proved equation (\dagger \dagger).

Why we’ve won

Let \zeta be the automorphism (x,y) \mapsto (x+1,y) of the curve C. Clearly, \zeta^p=\mathrm{Id}. A little less obviously, I claim that 1+\zeta+\zeta^2+\cdots + \zeta^{p-1} =0 in the endomorphism ring of the group J. In other words, I am claiming that, for any (x_0, y_0) on C, we have

\displaystyle{ \sum_{j=0}^{p-1} (x_0+j, y_0) \equiv 0}

This is because the points (x_0+j, y_0) are precisely the intersections of C with the horizontal line y=y_0.
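This, too, comes down to a polynomial identity mod p: substituting y_0^2 = x_0^p - x_0, the line y = y_0 meets C where x^p - x - (x_0^p - x_0) = 0, and that polynomial factors as \prod_{j=0}^{p-1}(x - x_0 - j). A sympy sketch for p = 5:

```python
import sympy as sp

x, x0 = sp.symbols('x x0')
p = 5

# Substituting y = y0 (with y0^2 = x0^p - x0) into y^2 = x^p - x gives
# x^p - x - (x0^p - x0) = 0, whose roots mod p are exactly x0 + j, j in F_p.
lhs = x**p - x - (x0**p - x0)
rhs = sp.prod([x - x0 - j for j in range(p)])

diff = sp.Poly(sp.expand(lhs - rhs), x, x0)
print(all(c % p == 0 for c in diff.coeffs()))  # True
```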

So J is a module for the ring R = \mathbb{Z}[\zeta]/(1+\zeta+\cdots+\zeta^{p-1}). This is better known as the ring of cyclotomic integers. And identity (\dagger \dagger) tells us that

\displaystyle{ F = - \sum_{j=0}^{p-1} \left( \frac{j}{p} \right) \zeta^j}

in the ring \mathrm{End}(J).

The right hand side of the above equation is, up to sign, a very famous element of R: the Gauss sum. And its most famous property is that its square is (-1)^{(p-1)/2} p, so F^2 = (-1)^{(p-1)/2} p — exactly what we wanted to show.

Let’s review why the Gauss sum has the desired square. (The overall sign doesn’t matter here, since we are squaring.)

\displaystyle{ F^2 =\sum_j \sum_k \left( \frac{j}{p} \right) \left( \frac{k}{p} \right) \zeta^{j+k} = \sum_j \sum_k \left( \frac{jk}{p} \right)  \zeta^{j+k}}

Group together terms with the same power of \zeta, setting a = j+k and r = j, to get

\displaystyle{ \sum_a \sum_{r} \left( \frac{r(a-r)}{p} \right) \zeta^a = \sum_a \sum_{r \neq 0} \left( \frac{ar^{-1}-1}{p} \right) \zeta^a}.

(Here we wrote r(a-r) = r^2 (ar^{-1}-1) and used \left( \frac{r^2}{p} \right) = 1 for r \neq 0; the r=0 terms vanish.)

If a \neq 0, then a r^{-1}-1 takes on every value in \mathbb{F}_p once, except that it misses -1. If a=0, then a r^{-1}-1 takes on the value -1 over and over, p-1 times in total. Using \sum_j \left(\frac{j}{p} \right)=0, our sum is

\displaystyle{ - \sum_{a \neq 0} \left( \frac{-1}{p} \right) \zeta^a + (p-1) \left( \frac{-1}{p} \right) = p \left( \frac{-1}{p} \right) = (-1)^{(p-1)/2} p}

as desired. \square
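One can also sanity-check the square of the Gauss sum numerically, taking \zeta to be a complex p-th root of unity (a floating-point sketch, not a proof):

```python
import cmath

def legendre(j, p):
    # Euler's criterion: (j/p) = j^((p-1)/2) mod p, read as 0, 1, or -1.
    r = pow(j % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gauss_sum_squared(p):
    zeta = cmath.exp(2j * cmath.pi / p)   # a primitive p-th root of unity
    G = sum(legendre(j, p) * zeta**j for j in range(1, p))
    return G * G

for p in (3, 5, 7, 11, 13):
    target = (-1) ** ((p - 1) // 2) * p   # the claimed value of the square
    print(p, abs(gauss_sum_squared(p) - target) < 1e-9)  # True for each p
```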

What I want to emphasize is that every equality of this proof corresponds to writing down a rational function on C with the corresponding poles and zeroes. For example, the second to last equality above replaced \sum_{a \neq 0} \zeta^a by -1. There is a corresponding rational function which has zeroes at the points \zeta^a(x_0, y_0) for a \neq 0 and has a pole at (x_0, -y_0): namely, (y-y_0)/(x-x_0). It would be painful, but completely doable, to actually use this proof to write down a rational function on C with a zero at (x_0^p, y_0^p), and a p-fold pole at (x_0, (-1)^{(p-1)/2} y_0). In fact, I cheated a bit in the beginning of this post. I didn’t just cleverly guess the formula (\dagger); I instead wrote down (\dagger \dagger) and plugged in p=3.

Descending the resulting rational function to the curve v^2 = u^3-u would probably leave an unmotivated mess. I understand how to do it: you need to take the rational function in (x,y) and rewrite it in the variables (u,v); Galois theory guarantees that you will succeed. But I suspect the result would be unenlightening.

So, finally, I understand why F^2=-p without going through the theory of supersingular curves, or the Weil bounds, or anything deeper than Gauss sums. Hope you enjoyed!

3 thoughts on “Fun with y^2=x^p-x”

  1. You have a little mixup in notation at the beginning: you define E as u^2=v^3-v, but then you prove things about v^2=u^3-u.

    Looking forward to reading the rest of this!

  2. Thanks Allison!

    Jared also e-mailed and mentioned that a similar trick should work for y^{p-1} = x^p-x and indeed it does. For j \in \mathbb{F}_p^{\times}, let \tau_j be the automorphism (x,y) \mapsto (x, jy). Let \sigma be (x,y) \mapsto (x+1,y). Then looking at the intersections of y = (x-x_0) y_0 with y^{p-1} = x^p-x shows that F = \sum_{j \neq 0} \sigma^j \tau_j. The group generated by the \tau's is isomorphic to \langle \tau \rangle/\tau^{p-1}; the ring of endomorphisms generated by \sigma and the \tau_j is isomorphic to

    \mathbb{Z}[\sigma, \tau] /\langle \sigma^{p-1}+\cdots +\sigma+1, \tau^{p-2}+\cdots+\tau+1 \rangle.

    By mapping \sigma and \tau to various nontrivial p-th and (p-1)-st roots of unity, the element F = \sum_{j \neq 0} \sigma^j \tau_j gets sent to all of the various Gauss sums. I haven’t checked this carefully, but I believe the right statement should be that each of the (p-1)(p-2) nontrivial Gauss sums appears once in the cohomology of y^{p-1} = x^p-x.

Comments are closed.