Christol’s theorem and the Cartier operator

Let’s suppose that we want to compute 2^n \mod p, and we have already been given n written out in base p as n = \sum n_i p^i. Here p is a small prime and we want to do this conversation repeatedly for many n‘s.

Remember that 2^p \equiv 2 \mod p and thus 2^n \equiv 2^{n_0+n_1 p + n_2 p^2 + \cdots + n_{\ell} p^{\ell}} \equiv 2^{n_0} 2^{n_1} \cdots 2^{n_{\ell}}  \mod p. So, start with 1, multiply it by 2^{n_0}, then by 2^{n_1}, then by 2^{n_2} and so forth. When you get to the end, read off 2^n.

We can precompute the effect on \mathbb{F}_p of multiplying by 2^k, for 0 \leq k \leq p-1. Then we can compute 2^n just by scanning across the base p representation of n and applying these precomputed maps to the finite set \mathbb{F}_p.

The precise way to say that this is a simple, one-pass, process is that it is a computation which can be done by a finite-state automaton. Here is the definition: let I, S and O be finite sets (input, states and output), and s \in S (the start). For each i \in I, let A(i) be a map S \to S. We also have a map r:S \to O (readout). Given a string (i_0, i_1, \ldots, i_{\ell}) in I^{\ell}, we compute r( A(i_{\ell}) \circ \cdots A(i_1) \circ A(i_0) (s)). So our input is a string of characters from I, and our output is in O.

We can say that our example above shows that (n_{\ell}, \ldots, n_1, n_0) \mapsto 2^{\sum n_i p^i} \mod p is computable by a finite-state automaton. (In our example, the sets I, S and O all have cardinality p, but I do not want to identify them.)

This is a special case of an amazing result of Christol et al: Let f_n be a sequence of elements of \mathbb{F}_p. Then (n_{\ell}, \ldots, n_1, n_0) \mapsto f_{\sum n_i p^i} can be computed by a finite-state automaton if and only if the generating function \sum f_n x^n is algebraic over \mathbb{F}_p(x)!

We have just explained the case \sum f_n x^n = 1/(1-2x). The reader might enjoy working out the cases 1/(1-x-x^2) (the Fibonacci numbers) and (1 - \sqrt{1-4x})/2x (the Catalans).

In this post, I will use Christol’s theorem as an excuse to promote the Cartier operator, an amazing tool for working with differential forms in characteristic p.

Reformulating the problem

Let f be a power series in \mathbb{F}_p[[ x ]]. Define the power series s_0(f), s_1(f), …, s_{p-1}(f) by f(x) = \sum_{i=0}^{p-1} x^i s_i(f)(x^p). So s_i( 1/(1-2x)) = 2^i/(1-2^p x) = 2^i/(1-2x). Notice that g(x^p) = g(x)^p for g \in \mathbb{F}_p[[ x ]]. It will be convenient for future generalization to write the defining equation of the s_i‘s as

\displaystyle{f(x) = \sum_{i=0}^{p-1} x^i s_i(f)(x)^p}. \quad (*).

Let \mathfrak{S}(f) be the set of all power-series which can be obtained from f by applying the operators s_i finitely many times. So \mathfrak{S}(f) consists of f, s_0 \ f, …, s_{p-1} f , s_0  s_0  \ f,, s_1 s_0\ f, …, s_{p-1} s_{p-1} f, s_0 s_0 s_0 \ f and so forth.

Please stare at the following claim until it is obvious to you: The function (n_{\ell}, \ldots, n_1, n_0) \mapsto f_{\sum n_i p^i} can be computed by an FSA if and only if the set \mathfrak{S}(f) is finite. Indeed, in that case, we can take S = \mathfrak{S}(f).

Thus, Christol’s theorem is that \sum f_n x^n is algebraic if and only if \mathfrak{S}(f) is finite. It is convenient to generalize from \mathbb{F}_p to an arbitrary perfect field k of characteristic p. We continue to define the operators s_i by equation (*). So, for example, if a is some element of k, then s_i(1/(1-ax)) = a^{i/p}/(1-ax).

The right generalization of Christol’s theorem is then the following: \sum f_n x^n is algebraic over k(x) if and only if the k-vector subspace of k[[x]] spanned by \mathfrak{S}(f) is finite dimensional. Obviously, this reduces to Christol’s theorem when k=\mathbb{F}_p.

If k \cdot \mathfrak{S}(f) is finite dimensional, it is fairly easy to see that f, and in fact all of k \cdot \mathfrak{S}(f), must be algebraic. Taking y_1, …, y_s a basis for k \cdot \mathfrak{S}(f). we have s equations of the form

y_n = \sum x^i \left( \sum a_{imn} y_m \right)^p

for constants a_{imn} in k. These are s polynomial equations in s variables with coefficients in k(x). One can show that the solution (y_1, \ldots, y_s) is isolated and is therefore algebraic over k(x).

We will now focus on the other direction. Let K be a finite extension of k(x), contained in k (( x )), and let f \in K. We need to show that k \mathfrak{S}(f) is finite dimensional.

Surprising fact: s_i takes K to itself

Nothing like this is true in characteristic 0. Working over \mathbb{C}, if we extract the terms with exponent divisible by p from the power series f:=\sqrt{1+x}, we get g:=(1/p) \sum_{j=0}^{p-1} \sqrt{1+\zeta^j x}, where \zeta is a primitive p-th root of unity. Although g is algebraic over \mathbb{C}(x), it is not in \mathbb{C}(x)(f).

In orer to show s_i(K) \subseteq K, it is enough to do the case i=p-1, as s_i(f) =s_{p-1}(x^{p-1-i} f).
We have the formula

\displaystyle{s_{p-1}(f) = \sqrt[p]{\frac{1}{(p-1)!} \frac{\partial^{p-1} f}{(\partial x)^{p-1}}}}. \quad (**)

Since K \subset k((x)), the extension K/k(x) is separable (exercise!). We can use this to show that \partial/\partial x takes K to itself. The proof is simply implicit differentiation: Any f \in K satisfies a polynomial relation F(f,x)=0. So \partial f/\partial x = -F_2(f,x)/F_1(f,x), where the subscripts mean differentiation with respect to the first and second inputs of F. In particular, \partial f/\partial x is in K. We used that K/k(x) was separable to ensure that we didn’t divide by zero.

This shows that \partial^{p-1} f/(\partial x)^{p-1} is in K. Note also that \partial^{p} f/(\partial x)^{p} =0. I leave it as an exercise for the reader to check that, if g \in K satisfies \partial g/\partial x=0, then g \in K^p. Hint: Remember that k is perfect!

Putting the above together, the right hand side of (**) is in K. So the s_i carry K to itself.

Making the s‘s coordinate free

But K is infinite dimensional over k. Why is k \cdot \mathfrak{S}(f) finite dimensional? Thinking geometrically, K is the field of merormorphic functions on some complete curve X over k. To prove a finite dimensionality theorem, we will need to work globally on X. But the definition of the \sigma_i is very dependent on the choice of the coordinate x, which is a local choice. In this section, I will show how to get away from that choice.

Define a map \mathcal{C}: k((x)) dx \to k((x)) dx by f(x) dx \mapsto s_{p-1}(f) dx. Here is the miracle: The map \mathcal{C} does not depend on the choice of coordinate x.

Let’s see an example. Clearly, \mathcal{C}(x^{p-1} dx) = dx. Let x=u+u^2. So x^{p-1} dx = (u+u^2)^{p-1} (1+2u) du. Now, (u+u^2)^{p-1} (1+2u) = \left( u^{p-1} + \cdots + u^{2p-2} \right) (1+2u) = \left( u^{p-1} + \cdots + 2u^{2p-1} \right). So s_{p-1} \left( (u+u^2)^{p-1} (1+2u) \right) = 1 + 2u and \mathcal{C}\left( (u+u^2) (1+2u) du\right) = (1+2u) du = dx!

This miracle deserves a proof, but I won’t give it. I’ll just make an analogy: In characteristic zero, there is a residue map, which takes the differential form \sum f_i x^i dx to the coefficient of dx/x. A differential form can be locally written as dg if and only if its residue is zero. And, working purely in formal power series, it is somewhat challenging to show that the residue map is well defined. \mathcal{C} plays the same role in characteristic p: We have \mathcal{C}(\omega)=0 if and only if \omega can be written as dg, and it is true but subtle that \mathcal{C} does not depend on the choice of coordinates. But the residue is a mere number, while \mathcal{C} is a differential form, so \mathcal{C} is far more powerful! The operator \mathcal{C} is merely the simplest example of a Cartier operator, an operation on differential forms which exists only in characteristic p.

The fact that \mathcal{C} is independent of the choice of coordinate should be a hint that it sheafifies. Indeed, let \Omega be the Kahler-differentials of K over k. The elements of \Omega are precisely f dx for f \in K, but the formulation of \Omega does not depend on the choice of x. Then \mathcal{C} is a well-defined map from \Omega to itself. Moreover, if \omega does not have a pole at some point z \in X, then neither does \mathcal{C}(\omega).

Finishing the proof

We now restate our goal, changing all of our functions to differentials.

Let x be an element of K. Define the operations \sigma_0, \sigma_1, …., \sigma_{p-1} from \Omega to \Omega by \sigma_i(\omega) = \mathcal{C}(x^{p-i-1} \omega). Fix a differential form \omega \in \Omega. Let \Sigma(\omega) be the k-vector space spanned by those differential forms we get by applying the s_i to \omega. We want to show that \Sigma(\omega) is finite dimensional.

Let z be a point of X. As we have already mentioned, if x and \omega do not have poles at z, then all the forms in \Sigma(\omega) are likewise pole-free.

Suppose that z is one of the finitely many points where \omega or x has a pole. If \omega has a pole of order \leq a, and x has a pole of order \leq b at some point, then an easy computation shows that \sigma_i(\omega) has a pole of order \leq 1 + (a+(p-1-i)b)/p \leq a/p + 1 + (p-1)b/p. Feeding this bound into itself, \sigma_i \sigma_j(\omega) has a pole of order at most a/p^2 + (1 + p^{-1}) ( 1 + (p-1)b/p). Continuing in this manner, no element of \Sigma(\omega) will have a pole of order more than \max_n(a/p^n + (1+p^{-1} + \cdots + p^{-n+1})( 1 + (p-1)b/p)) \leq a + b + 1-p^{-1} at z.

The exact numbers don’t matter. The moral is that there are only finitely many points where the forms in \Sigma(\omega) can have poles; and the orders of the poles at those points are bounded. So the vector space \Sigma(\omega) is finite dimensional. QED.

Advertisements

3 thoughts on “Christol’s theorem and the Cartier operator

  1. Very nice post! I think I remember hearing about this from your MCF talk way back when.

    I think some of the sigmas under the paragraph “Reformulating the Problem” ought to be s’s, or maybe vice versa.

  2. Yes, very nice! A couple of small typos:

    “It is convenient to generalize from F_p to an arbitrary perfect field k”:
    but char. k = p!

    a_{imn} \in k (not =)

    df/dx = -F_2/F_1.

    C \omega = 0 (not \omega = 0)

    and I agree with Jared’s final comment.

Comments are closed.