Divergent sums and the class number formula

Over on MathOverflow, we’ve had a bunch of discussions about the class number formula. In particular, Keith Conrad pointed out a paper of Orde which gives a beautiful nonsense proof of the class number formula. This post is my attempt to understand why Orde’s argument works. Specifically, I am going to use an idea which I learned from Terry Tao’s blog: Arguments about divergent sums are often really arguments about the constant term in asymptotic expressions for smoothed sums.

This post assumes familiarity with the concepts and notations of a first course in algebraic number theory.

The class number formula

Let D be a positive integer which is either (a) squarefree and congruent to 3 \mod 4 or (b) of the form 4D', where D' is squarefree and congruent to 1 or 2 \mod 4. Let R be \mathbb{Z}[\frac{1+\sqrt{-D}}{2}] or \mathbb{Z}[\sqrt{-D'}] respectively. Let h and w respectively be the class number and the number of units of R. Let \chi(k) be the Kronecker symbol \left( \frac{-D}{k} \right). (This is the same as the Jacobi symbol when k is odd and positive, but it is defined for all integers. See the link for details.)

There are two formulas which I wish to discuss, and both of which I’ve heard called the class number formula.

\displaystyle{ \sum_{k=1}^{\infty} \frac{\chi(k)}{k} = \frac{2 \pi h}{\sqrt{D} w} }. \quad (1)

\displaystyle{ \mathrm{Cesaro} \! \sum_{k=1}^{\infty} \chi(k) = \frac{2 h}{w} }. \quad (2)

The Cesaro sum in (2) can be, and often is, rewritten into a finite sum. I’ll discuss this below, but I’ve become convinced that the Cesaro sum is the most natural formulation.

Until this week, all the proofs I knew obtained the equation (1) in a reasonably motivated way. They then proved that \sum_{k=1}^{\infty} \chi(k) / k = (\pi/\sqrt{D}) \mathrm{Cesaro} \! \sum_{k=1}^{\infty} \chi(k) in some mysterious way: either by proving the functional equation for L(s, \chi), or by manipulations with Gauss sums. This is particularly odd when one considers that equation (2) is simpler than (1), and (after a little rewriting) can be made into a purely rational expression, with no limits or transcendental quantities at all. The first thing that fascinated me about Orde’s paper is that he gets (2) directly, without passing through (1). In this post, we will prove both (1) and (2) directly.


If D=4, then R=\mathbb{Z}[i], h=1 and w=4. The character \chi periodically repeats

\displaystyle{1,\ 0,\ -1,\ 0,\ 1,\ 0,\ -1,\ 0,\ \ldots}.

Sure enough

\displaystyle{ 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \frac{2 \cdot \pi}{4 \sqrt{4}} },

confirming (1).
For (2), the partial sums of \sum \chi(k) are periodic, repeating

\displaystyle{ 1,\ 1,\ 0,\ 0,\ 1,\ 1,\ 0,\ 0,\ 1,\ 1,\ 0,\ 0,\ \ldots}

By definition, the Cesaro sum is the limit of the average of these partial sums. If we average over 4k terms, the average is exactly 1/2, and in general the averages approach 1/2. Sure enough, this is (2 \cdot 1)/4.

In general, the sum in (1) is painful to compute directly. There is an algorithm to do it: write \sum \chi(k) x^k = p_D(x)/(1-x^D) and evaluate the integral \int_0^1 \frac{p_D(x)/x}{1-x^D} \, dx by the standard algorithm for integrating rational functions. (Note that p_D(1)=p_D(0)=0, so the integral converges.) For example, when D=3, we have h=1, w=6 and

\displaystyle{  \frac{1}{1}-\frac{1}{2}+\frac{1}{4}-\frac{1}{5}+\frac{1}{7}-\frac{1}{8}+\cdots = \int_0^1 \frac{1-x}{1-x^3} dx = \frac{2 \pi}{6 \sqrt{3}} }

The numerators of these rational functions are the Fekete polynomials.
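As a numerical sanity check of (1), here is a minimal Python sketch that sums \chi(k)/k over many full periods for D=3 and D=4. The character tables are hard-coded from the two examples above; the helper names `l_value` and `rhs` and the number of periods are my own illustrative choices. Convergence is slow (the error is roughly 1/N after N terms), so this is a check, not a practical way to compute anything.

```python
import math

# chi(k) indexed by k mod D, hard-coded for the two examples in the text.
CHI = {
    3: (0, 1, -1),      # D = 3: chi repeats 1, -1, 0
    4: (0, 1, 0, -1),   # D = 4: chi repeats 1, 0, -1, 0
}

def l_value(D, periods=100_000):
    """Partial sum of chi(k)/k over `periods` full periods of chi."""
    chi = CHI[D]
    return math.fsum(chi[k % D] / k for k in range(1, periods * D + 1))

def rhs(D, h, w):
    """Right-hand side of (1): 2*pi*h / (sqrt(D) * w)."""
    return 2 * math.pi * h / (math.sqrt(D) * w)

if __name__ == "__main__":
    print(l_value(4), rhs(4, h=1, w=4))  # both should be close to pi/4
    print(l_value(3), rhs(3, h=1, w=6))  # both should be close to pi/(3*sqrt(3))
```

Summing over whole periods keeps the partial sums from oscillating, which is why the truncation point is a multiple of D.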

The first sum quickly gets out of hand for naive hand computation. The second sum, on the other hand, is very tractable. Let’s take D=23. Clearly, w=2 and it turns out that h=3. The character \chi periodically repeats

+ + + + - + - + + - - + + - - + - + - - - - 0

with partial sums

1 2 3 4 3 4 3 4 5 4 3 4 5 4 3 4 3 4 3 2 1 0 0.

The average of these 23 numbers is 3=(2 \cdot 3)/2, as desired.
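The computation above is easy to mechanize. The following Python sketch computes the Kronecker symbol with the usual binary Jacobi-style algorithm and then averages the partial sums of \chi over one period, exactly as in the D=23 example. The function names `kronecker` and `cesaro_sum` are mine, and averaging over a single period is only legitimate because the partial sums are periodic (the sum of \chi over a period is zero, which the code asserts).

```python
from fractions import Fraction
from itertools import accumulate

def kronecker(a, n):
    """Kronecker symbol (a/n) for integers a and n."""
    if n == 0:
        return 1 if abs(a) == 1 else 0
    result = 1
    if n < 0:
        n = -n
        if a < 0:
            result = -result
    # Strip factors of 2 from n, using the rule for (a/2).
    while n % 2 == 0:
        n //= 2
        if a % 2 == 0:
            return 0
        if a % 8 in (3, 5):
            result = -result
    # n is now odd and positive: ordinary Jacobi symbol.
    a %= n
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def cesaro_sum(D):
    """Average of the partial sums of chi(k) = (-D/k) over one period."""
    chi = [kronecker(-D, k) for k in range(1, D + 1)]
    partial = list(accumulate(chi))
    assert partial[-1] == 0  # partial sums are periodic
    return Fraction(sum(partial), D)

if __name__ == "__main__":
    print(cesaro_sum(23))  # 3, matching 2h/w = (2*3)/2
    print(cesaro_sum(4))   # 1/2, matching 2h/w = (2*1)/4
```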

As should be clear, the Cesaro sum in (2) can be easily turned into a finite sum. The most direct translation of the above argument gives (1/D) \sum_{k=1}^{D} (D-k) \chi(k), since \chi(k) contributes to exactly D-k of the first D-1 partial sums. This can be simplified in various ways, to give formulas like the one discussed here. However, doing so tends to introduce cases based on what D is modulo 8 which, to my mind, obscure the main result.
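Here is a small Python sketch of the finite sum, restricted for simplicity to the case where D is a prime congruent to 3 \mod 4. In that case \chi(k) agrees with the Legendre symbol of k modulo D, which can be computed by Euler's criterion; that restriction, and the names `chi_prime` and `finite_sum`, are assumptions of this sketch rather than anything needed by the formula itself.

```python
from fractions import Fraction

def chi_prime(D, k):
    """chi(k) for prime D = 3 (mod 4), via Euler's criterion for (k/D)."""
    r = pow(k, (D - 1) // 2, D)   # r is 0, 1, or D - 1
    return -1 if r == D - 1 else r

def finite_sum(D):
    """(1/D) * sum_{k=1}^{D} (D - k) * chi(k), as an exact rational."""
    total = sum((D - k) * chi_prime(D, k) for k in range(1, D + 1))
    return Fraction(total, D)

if __name__ == "__main__":
    print(finite_sum(23))  # 3, i.e. 2h/w with h = 3, w = 2
    print(finite_sum(3))   # 1/3, i.e. 2h/w with h = 1, w = 6
```

Everything here is exact rational arithmetic, which is the sense in which (2) involves no limits or transcendental quantities.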

Ideals and Quadratic forms

For N a positive integer, let S(N) be the number of ideals of R with norm N. By unique factorization into prime ideals, S(N) is a multiplicative function. A little thought about the prime power case reveals that

\displaystyle{ S(N) = \sum_{d|N} \chi(d) }.

We now give a second description of S(N). Let I_1, I_2, \ldots, I_h be representatives for the classes in the class group of R. Let’s consider how many ideals J of R have norm N and have class [I_j]. A fractional ideal has class [I_j] if and only if it is of the form \alpha I_j for some nonzero \alpha \in \mathrm{Frac}(R). The number of ways to express a given fractional ideal in this form is w. The fractional ideal \alpha I_j is an honest ideal of R if and only if \alpha \in I_j^{-1}. And |\alpha I_j| = |\alpha| |I_j|, where | \cdot | denotes the norm. In short,

\displaystyle{ S(N) = (1/w) \sum_{j=1}^{h} \# \{ \alpha \in I_j^{-1} : |\alpha| |I_j| = N \} }.

Choosing an identification of I_j^{-1} with \mathbb{Z}^2, let Q_j be the quadratic form Q_j(\alpha) = |\alpha| |I_j|. So

\displaystyle{ S(N) = (1/w) \sum_{j=1}^{h} \# \{ (u,v) \in \mathbb{Z}^2 : Q_j(u,v) = N \} } \quad (3).

For future reference, note that Q_j has determinant D/4.
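To see the two descriptions of S(N) agree in practice, here is a brute-force Python sketch for the two class number one examples from earlier, D=4 and D=3. With h=1 we can take I_1=R, and I am assuming the standard norm forms u^2+v^2 for \mathbb{Z}[i] and u^2+uv+v^2 for \mathbb{Z}[\frac{1+\sqrt{-3}}{2}] (with determinants 1 and 3/4, consistent with D/4). The table `CASES` and the function names are illustrative.

```python
from fractions import Fraction
from math import isqrt

# Class number one examples: character table (indexed by d mod D),
# number of units w, and the norm form Q on Z^2.
CASES = {
    4: dict(chi=(0, 1, 0, -1), w=4, Q=lambda u, v: u * u + v * v),
    3: dict(chi=(0, 1, -1), w=6, Q=lambda u, v: u * u + u * v + v * v),
}

def ideals_by_divisors(D, N):
    """S(N) computed as the divisor sum of chi."""
    chi = CASES[D]["chi"]
    return sum(chi[d % D] for d in range(1, N + 1) if N % d == 0)

def ideals_by_form(D, N):
    """S(N) computed as (1/w) * #{(u, v) : Q(u, v) = N}, as in (3)."""
    Q, w = CASES[D]["Q"], CASES[D]["w"]
    B = 2 * (isqrt(N) + 1)  # generous bound on |u|, |v| for both forms
    count = sum(1 for u in range(-B, B + 1)
                  for v in range(-B, B + 1) if Q(u, v) == N)
    return Fraction(count, w)

if __name__ == "__main__":
    for D in (4, 3):
        print(D, [ideals_by_divisors(D, N) for N in range(1, 13)])
        print(D, [int(ideals_by_form(D, N)) for N in range(1, 13)])
```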

Equation (3) is valid for N>0. Inspired by it, we define

\displaystyle{ S(0):=h/w}.

To summarize, for N>0, we have

\displaystyle{ S(N) = (1/w) \sum_{j=1}^{h} \# \{ (u,v) \in \mathbb{Z}^2 : Q_j(u,v) = N \}  =  \sum_{d|N} \chi(d) }

and we want to prove that

\displaystyle{ S(0) := (1/w) \sum_{j=1}^{h} \# \{ (u,v) \in \mathbb{Z}^2 : Q_j(u,v) = 0 \}  =  (1/2) \mathrm{Cesaro} \! \sum_{d>0} \chi(d) }.

Orde has an intuition for that 1/2 term involving negative definite quadratic forms, but it doesn’t help me. I’ll just leave it as a mystery for now.

The divergent sum “argument”

I should say at the outset that, since I am going to give a correct proof later, I have felt free to abuse divergent sums even more blatantly than Orde does. I gave a more faithful write up of his argument here.

Consider \sum_{N=0}^{\infty} S(N). On the one hand, this is

\displaystyle{  (1/w) \sum_{j=1}^{h} \sum_{(u,v) \in \mathbb{Z}^2} 1 }.

As any student of divergent sums knows, \sum_{u \in \mathbb{Z}} 1 =0, so this is 0.

On the other hand, \sum_{N=0}^{\infty} S(N) is

\displaystyle{ S(0) + \sum_{N=1}^{\infty}  \sum_{d|N} \chi(d). }

Interchanging summation, we have

\displaystyle{ S(0) + \sum_{d=1}^{\infty} \chi(d) \sum_{k=1}^{\infty} 1 }

Now, \sum_{k=1}^{\infty} 1=-1/2. (Notice that \sum_{k \in \mathbb{Z}} 1 = \sum_{k=-\infty}^{-1} 1 + 1 + \sum_{k=1}^{\infty} 1 = -1/2+1-1/2=0. Consistency!) So we obtain

\displaystyle{ S(0) + (-1/2) \sum_{d=1}^{\infty} \chi(d) =0 .}

Exactly what we wanted!

Smoothing sums

In order to turn this into an actual proof, we will need to introduce a cutoff function, which we will call \eta. Examples of \eta's which we might use are e^{-x} or \max(1-x, 0). We should have \eta(0)=1, we should make sure that \eta “dies fast enough near \infty”, that \eta is “differentiable enough” and that \eta' has bounded variation. The terms in quotes are intended to be sloppy.

For any sequence a(n), the \eta-summation of the sum \sum a(n) is \lim_{\delta \to 0^{+}} \sum a(n) \eta(\delta n). Note that, if we take \eta(x) to be 1 for x \in [0,1] and 0 otherwise, then \eta-summation is ordinary summation. However, this choice of \eta will not be smooth enough for me. Note also that taking \max(1-x, 0) gives Cesaro summation; this choice is smooth enough for my argument.

Orde does something like this. He computes bounds for the sum \sum_{n \leq M} S(n) q^n, where M \to \infty and q \to 1^{-} at certain linked rates, and q is rational. The reason he chooses this smoothing is that he wants to work only with finite sums and rational numbers, so that he can claim to have a fully rational proof of the rational relation (2). In pursuit of this goal, Orde also rewrites all limits into their (\delta, \epsilon) definitions, and uses some clever tricks so that he only has to do this a few times. I, on the other hand, will be profligate with my limits and integrals. If we had some sort of formal theory of constant terms of asymptotic series, comparable to the theory of formal power series, then one might be able to turn this post into a rational proof.

The proof

So, let us consider \sum_{N=0}^{\infty} \eta(\delta N) S(N).

On the one hand, this is

\displaystyle{ (1/w) \sum_{j=1}^{h} \sum_{(u,v) \in \mathbb{Z}^2} \eta( \delta Q_j(u,v)) . }

Using the Euler-Maclaurin formula, we can approximate this sum by the integral \int_{(u,v) \in \mathbb{R}^2}\eta(\delta Q_j(u,v)) du dv. Ordinarily, the main source of error in such an approximation comes from boundary terms, but in this case there are none. The other source of error is lack of smoothness of \eta. Our (unstated) assumptions on \eta get this error down to at least O(\delta). If \eta is smooth, we get all the way down to O(\delta^C), for any positive constant C! (We will feel free to omit the phrase “as \delta \to 0^+” when giving bounds like this.)

We can change variables in the integral so that Q_j becomes the standard norm on \mathbb{R}^2. The determinant of this change of variables is 2/\sqrt{D}, since \det Q_j = D/4. So

\displaystyle{  \int_{(u,v) \in \mathbb{R}^2}\eta(\delta Q_j(u,v)) du dv = \frac{2}{\sqrt{D}} \int_{(x,y) \in \mathbb{R}^2 } \eta(\delta (x^2+y^2)) dx dy. }

Switching to polar coordinates, this is

\displaystyle{ \frac{2}{\sqrt{D}} \int_{r=0}^{\infty} 2 \pi r \eta(\delta r^2) dr = \frac{2 \pi}{\delta \sqrt{D}} \int_{0}^{\infty} \eta(s) ds.}

In summary,

\displaystyle{ \sum_{N=0}^{\infty} S(N) \eta(\delta N) = \frac{h 2 \pi}{w \delta \sqrt{D}} \int_{0}^{\infty} \eta(s) ds + O(\delta). \quad (4)}
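Equation (4) can be tested numerically. The sketch below takes D=4 (so h=1, w=4, S(0)=1/4) and \eta(x)=e^{-x}, for which \int_0^\infty \eta(s) ds = 1 and the main term is \pi/(4\delta). It computes S(N) from the divisor sum, adds the constant S(0), and compares with the main term. The truncation `cutoff` and the function names are assumptions of the sketch. For this particular \eta the agreement appears to be much better than O(\delta), and dropping S(0) leaves a visible discrepancy of 1/4.

```python
import math

CHI4 = (0, 1, 0, -1)  # chi for D = 4, indexed by d mod 4

def smoothed_sum(delta, cutoff=40.0):
    """S(0) + sum_{N>=1} S(N) exp(-delta*N), truncated where delta*N > cutoff."""
    n_max = int(cutoff / delta)
    S = [0] * (n_max + 1)
    # Sieve: add chi(d) to S(N) for every multiple N of d.
    for d in range(1, n_max + 1):
        c = CHI4[d % 4]
        if c:
            for N in range(d, n_max + 1, d):
                S[N] += c
    tail = math.fsum(S[N] * math.exp(-delta * N) for N in range(1, n_max + 1))
    return 0.25 + tail  # S(0) = h/w = 1/4

def main_term(delta):
    """(2*pi*h) / (w * delta * sqrt(D)) with h = 1, w = 4, D = 4."""
    return math.pi / (4 * delta)

if __name__ == "__main__":
    for delta in (0.2, 0.1, 0.05):
        print(delta, smoothed_sum(delta), main_term(delta))
```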

Now, on the other hand, we have

\displaystyle{ \sum_{N=0}^{\infty} S(N) \eta(\delta N) = S(0) + \sum_{N=1}^{\infty} \eta(\delta N) \sum_{d|N} \chi(d).}

Interchanging summation, we get

\displaystyle{S(0) + \sum_{d=1}^{\infty} \chi(d) \sum_{k=1}^{\infty} \eta(\delta kd).}

Defining \psi(x) := \sum_{k=1}^{\infty} \eta(kx), we can rewrite this as

\displaystyle{S(0) + \sum_{d=1}^{\infty} \chi(d) \psi(\delta d).}

Using Euler-Maclaurin summation,

\displaystyle{  \psi(x) = \frac{1}{x} \int_{s=0}^{\infty} \eta(s) ds  - \frac{\eta(0)}{2} + O(x) \quad \mbox{as} \ x \to 0^+.}

Define the function \theta(x) by

\displaystyle{  \psi(x) = \frac{1}{x} \int_{s=0}^{\infty} \eta(s) ds - \frac{1}{2} \theta(x).}

So \theta(0)=\eta(0) = 1. If we choose \eta smooth enough and decaying fast enough, then \theta is a valid smoothing function. UPDATE: I earlier asserted that the choice \eta(x) = \max(1-x,0) worked; I’m not sure that was true.
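For a concrete feel for \psi and \theta, take \eta(x)=e^{-x}. Then \psi(x) is a geometric series equal to 1/(e^x-1), and the expansion above reads 1/x - 1/2 + O(x); the constant -1/2 is the rigorous shadow of the divergent-sum claim \sum_{k=1}^{\infty} 1 = -1/2. The Python sketch below sums \psi directly and checks this. The choice of \eta, the truncation `cutoff`, and the function names are assumptions made for illustration.

```python
import math

def psi(x, cutoff=60.0):
    """psi(x) = sum_{k>=1} eta(k*x) for eta(x) = exp(-x), truncated."""
    k_max = int(cutoff / x) + 1
    return math.fsum(math.exp(-k * x) for k in range(1, k_max + 1))

def theta(x):
    """theta(x) defined by psi(x) = (1/x) * 1 - theta(x)/2, since the integral of eta is 1."""
    return 2.0 * (1.0 / x - psi(x))

if __name__ == "__main__":
    for x in (0.1, 0.01, 0.001):
        # psi(x) - 1/x should approach -1/2, and theta(x) should approach 1.
        print(x, psi(x) - 1.0 / x, theta(x))
```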

Using our new notation, the \eta-summation of \sum S(N) is

\displaystyle{ S(0) + \sum_{d=1}^{\infty} \frac{\chi(d)}{d} \frac{1}{\delta} \int_{s=0}^{\infty} \eta(s) ds - \frac{1}{2} \sum_{d=1}^{\infty} \chi(d) \theta(d \delta). \quad (5)}

Comparing the \delta^{-1} terms of (4) and (5), we see that

\displaystyle{ \frac{2 \pi h}{w  \sqrt{D}}  \int_{s=0}^{\infty} \eta(s) ds = \sum_{d=1}^{\infty} \frac{\chi(d)}{d}  \int_{s=0}^{\infty} \eta(s) ds.}

Canceling the integral, we get equation (1).

Now, let’s compare the constant terms. We get

\displaystyle{ 0 = S(0) - (1/2) \lim_{\delta \to 0^{+}} \sum_{d=1}^{\infty} \chi(d) \theta(d \delta)}.

The second term is the \theta-summation of \sum \chi(d).

Key Fact: If a(d) is periodic with average 0, then the \theta-summation of \sum a(d) is always the Cesaro sum, independent of the choice of \theta.

Terry proves this for a(d)=(-1)^d (see the discussion above Exercise 1). It is a good exercise to adjust this proof for the general case. So we have shown that

\displaystyle{ S(0)= (1/2) \mathrm{Cesaro} \! \sum_{d=1}^{\infty} \chi(d) .}

This is equation (2).
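The Key Fact can be illustrated numerically. The sketch below returns to D=23, where the Cesaro sum of \chi was computed above to be 3, and evaluates \sum \chi(d) \eta(\delta d) for two different cutoffs, e^{-x} and e^{-x^2}. Both should come out close to 3 for small \delta. I am using the fact that for prime D congruent to 3 \mod 4 the character agrees with the Legendre symbol modulo D; the cutoffs, the truncation `cutoff`, and the function names are choices made for this sketch.

```python
import math

D = 23

def chi23(k):
    """chi(k) for D = 23, via Euler's criterion for the Legendre symbol (k/23)."""
    r = pow(k, (D - 1) // 2, D)   # r is 0, 1, or D - 1
    return -1 if r == D - 1 else r

def smoothed(eta, delta, cutoff=40.0):
    """sum_{d>=1} chi(d) * eta(delta*d), truncated where delta*d > cutoff."""
    d_max = int(cutoff / delta)
    return math.fsum(chi23(d) * eta(delta * d) for d in range(1, d_max + 1))

def exponential(x):
    return math.exp(-x)

def gaussian(x):
    return math.exp(-x * x)

if __name__ == "__main__":
    for delta in (0.01, 0.001):
        print(delta, smoothed(exponential, delta), smoothed(gaussian, delta))
```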


4 thoughts on “Divergent sums and the class number formula”

  1. “This can be simplified in various ways, to give formulas like the one discussed here LINK” ?

  2. What happens when you look at real quadratic fields? Is there anything similar? I know I shouldn’t expect too much, especially as we are going to have to involve the regulator, and that not that much is known about class numbers of real quadratic fields. Nice post though, I wasn’t aware of the Cesaro sum formulation, and was quite surprised!

  3. The trouble with real quadratic fields is that the unit group is infinite, so every ideal can be written in infinitely many ways as \alpha I. Thus, instead of just counting each ideal w times, you need to sum over I/R^{\times}. (That’s just a quotient of sets.) We can still approximate this by an integral over (I \otimes \mathbb{R})/R^{\times}. This is where the regulator comes in. I haven’t checked, but what you should get is that the Cesaro sum of \chi(k) is h times a term involving the regulator.
