# Bounds for sum free sets in prime power cyclic groups — three ways

A few weeks ago, I e-mailed Will Sawin excitedly to tell him that I could extend the new bounds on three-term arithmetic progression free subsets of $(\mathbb{Z}/p \mathbb{Z})^n$ to $(\mathbb{Z}/p^r \mathbb{Z})^n$. Will politely told me that I was the third team to get there — he and Eric Naslund already had the result, as did Fedor Petrov. But I think there might be some expository benefit in writing up the three arguments, to see how they are all really the same trick underneath.

Here is the result we are proving: Let $q=p^r$ be a prime power and let $C_q$ be the cyclic group of order $q$. Let $A \subset C_q^n$ be a set which does not contain any three term arithmetic progression, except for the trivial progressions $(a,a,a)$. Then

$\displaystyle{|A| \leq 3 | \{(m^1, m^2, \ldots, m^n) \in [0,q-1]^n : \sum m^i \leq n(q-1)/3 \}|.}$

The exciting thing about this bound is that it is exponentially better than the obvious bound of $q^n$. Until recently, all people could prove was bounds like $q^n/n^c$, and this is still the case if $q$ is not a prime power.

All of our bounds extend to the colored version: Let $(\bar{a}_i, \bar{b}_i, \bar{c}_i)$ be a list of $N$ triples in $((C_q)^n)^3$ such that $\bar{a}_i-2 \bar{b}_i+\bar{c}_i=0$, but $\bar{a}_i-2\bar{b}_j+\bar{c}_k \neq 0$ if $(i,j,k)$ are not all equal. Then the same bound applies to $N$. To see that this is a special case of the previous problem, take $\bar{a}_i = \bar{b}_i = \bar{c}_i$. Once the problem is cast this way, if $q$ is odd, one might as well define $(a_i, b_i, c_i) = (\bar{a}_i, -2 \bar{b}_i, \bar{c}_i)$, so our hypotheses are that $a_i+b_i+c_i = 0$ but $a_i+b_j+c_k \neq 0$ if $(i,j,k)$ are not all equal. We will prove our bounds in this setting.

## My argument — Witt vectors

I must admit, this is the least slick of the three arguments. The reader who wants to cut to the slick versions may want to scroll down to the other sections.

We will put an abelian group structure $\oplus$ on the set $\mathbb{F}_p^r$ which is isomorphic to $C_q$, using formulas found by Witt. I give an example first: Define an addition on $\mathbb{F}_3^2$ by

$\displaystyle{(x_0,x_1) \oplus (y_0, y_1) = (x_0+y_0, x_1+y_1 - x_0^2 y_0 - x_0 y_0^2)}$

The reader may enjoy verifying that this is an associative addition, and makes $\mathbb{F}_3^2$ into a group isomorphic to $C_9$. For example, $(1,0) \oplus (1,0) \oplus (1,0) = (2,1) \oplus (1,0) = (0,1)$ and $(0,1) \oplus (0,1) \oplus (0,1) = (0,0)$.

In general, Witt found formulas

$\displaystyle{(x_0,x_1, \ldots x_{r-1}) \oplus (y_0, y_1, \ldots, y_{r-1}) = (S_0(x,y), S_1(x,y), \ldots, S_{r-1}(x,y))}$

such that $\mathbb{F}_p^r$ becomes an abelian group isomorphic to $C_q$. If we define $x_e$ and $y_e$ to have degree $p^e$, then $S_e$ is homogenous of degree $p^e$. (Of course, Witt did much more: See Wikipedia or Rabinoff.)

Write

$\displaystyle{(x_0,x_1, \ldots x_{r-1}) \oplus (y_0, y_1, \ldots, y_{r-1}) \oplus (z_0, \ldots, z_{r-1}) = (S_0(x,y,z), S_1(x,y,z), \ldots, S_{r-1}(x,y,z))}$.

and set

$\displaystyle{\bar{F}(x,y,z) = \prod_{i=0}^{r-1} \left( 1-S_i(x,y,z)^{p-1} \right)}$.

For example, when $q=9$, we have

$\displaystyle{\bar{F} = \left( 1-(x_0+y_0+z_0)^2 \right) \left( 1-(x_1+y_1+z_1 - x_0^2 y_0 - x_0 y_0^2 - x_0^2 z_0 - x_0 z_0^2 - y_0^2 z_0 - y_0 z_0^2 + x_0 y_0 z_0)^2 \right)}$.

So $\bar{F}(x,y,z)=0$ if and only if $(x_0, \ldots, x_{r-1}) \oplus (y_0, \ldots, y_{r-1}) \oplus (z_0, \ldots, z_{r-1}) \neq 0$ in $C_q$.

We now work with $3kn$ variables, $x_e^f$, $y_e^f$ and $z_e^f$, where $0 \leq e \leq k-1$ and $1 \leq f \leq n$. Consider the polynomial

$\displaystyle{ \bar{G} = \prod_{f=1}^n \bar{F}(x^f, y^f, z^f) }$.

Here each $\bar{F}$ is a polynomial in $3k$ variables.

So $\bar{G}$ is a polynomial on $(\mathbb{F}_p^k)^{3n}$. We identify this domain with $C_q^{3n}$. Then $\bar{G}(x,y,z)=0$ if and only if $x+y+z \neq 0$ in the group $C_q^n$.

We define the degree of a monomial in the $x_e^f$, $y_e^f$ and $z_e^f$ by setting $\deg(x_e^f)=\deg(y_e^f)=\deg(z_e^f)=p^e$. In this section, “degree” always has this meaning, not the standard one. The degree of $S_e$ is $p^e$; the degree of $\bar{F}$ is $(p-1) + p(p-1) + p^2(p-1) + \cdots + p^{r-1}(p-1) = p^r-1 = q-1$ and the degree of $\bar{G}$ is $(q-1)n$.

From each monomial in $\bar{G}$, extract whichever of $\prod (x_e^f)^{i_e^f}$, $\prod (y_e^f)^{j_e^f}$ or $\prod (z_e^f)^{k_e^f}$ has lowest degree. We see that we can write

$\displaystyle{ \bar{G}(x,y,z) = \sum A_s(x) P_s(y,z) + \sum B_s(y) Q_s(x,z) + \sum C_s(z) R_s(x,y)}$

where $A_s$, $B_s$ and $C_s$ are monomials of degree $\leq (q-1)n/3$.

The now-standard argument (I like Terry Tao’s exposition) shows that $N$ is bounded by three times the number of monomials $\prod (x_e^f)^{m_e^f}$ of degree $\leq (q-1)n/3$. One needs to check that the argument works when the “degree” of a variable need not be $1$, but this is straightforward.

Except we have a problem! There are too many monomials. To solve this issue, let $F$ be the polynomial obtained from $\bar{F}$ by replacing every monomial $u^{\bar{k}}$ by $u^k$ where $k \equiv \bar{k} \bmod p-1$ with $k=0$ if $\bar{k}=0$ and $1 \leq k \leq p-1$ if $\bar{k}>0$. So $F$ coincides with $\bar{F}$ as a function on $\mathbb{F}_p^k$, but $F$ uses smaller monomials. For example, the reader who multiplies out the expression for $\bar{F}$ when $q=9$ will find a term $2 x_0^6 y_0 z_0$. In $F$, this is replaced by $2 x_0^2 y_0 z_0$. The polynomial $F$ does not have the nice factorization of $\bar{F}$, but it is much smaller. For example, when $q=9$, $\bar{F}$ has $137$ nonzero monomials and $F$ has $65$. Replacing $\bar{F}$ by $F$ can only lower degree, so $\deg(F) \leq q-1$. Now rerun the argument with $G(x,y,z) := \prod_{f=1}^n F(x^f,y^f,z^f)$. Our new bound is three times the number of monomials $\prod (x_e^f)^{m_e^f}$ of degree $\leq (q-1)n/3$, with the additional condition that all exponents $m_e^f$ are $\leq p-1$.

Now, the monomial $\prod_e x_e^{m_e}$ has degree $\sum_e p^e m_e$. Identify $[0,p-1]^r$ with $[0,q-1]$ by sending $(m_0, m_1, \ldots, m_{r-1})$ to $\sum m_e p^e$. We can thus think of $[0,p-1]^{rn}$ as $[0,q-1]^n$. We get the bound $N \leq 3 | \{(m^1, m^2, \ldots, m^n) \in [0,q-1]^n : \sum m^i \leq n(q-1)/3 \}|$, just as in the prime case.

## Naslund-Sawin — binomial coefficients

Let’s be much slicker. Here is how Naslund and Sawin do it (original here).

Notice that, by Lucas’s theorem, the function $x \mapsto \binom{x}{m}$ is a well defined function $C_q \to \mathbb{F}_p$ when $0 \leq m \leq q-1$. Moreover, using Lucas again,

$\displaystyle{\binom{x-1}{q-1}=\begin{cases} 1 & x \equiv 0 \bmod q \\ 0 & \mbox{otherwise} \end{cases}}.$

Define a function $F: C_q^3 \to \mathbb{F}_p$ by

$\displaystyle{F(x,y,z) = \binom{x+y+z-1}{q-1} = \sum_{i+j+k = q-1} \binom{x-1}{i} \binom{y}{i} \binom{z}{k}}$

$\displaystyle{=\sum_{i+j+k \leq q-1} (-1)^{q-1-i-j-k} \binom{x}{i} \binom{y}{i} \binom{z}{k}}$.

Here we have expanded by Vandermonde’s identity and used $\binom{x-1}{m} = \sum_{i \leq m} (-1)^{m-i} \binom{x}{i}$.

Define a function $G: C_q^{3n} \to \mathbb{F}_p$ by $G(x,y,z) = \prod_{f=1}^n F(x^f,y^f,z^f)$ just as before. So $G(x,y,z)=0$ if and only if $x+y+z \neq 0$ in the abelian group $C_q^n$. Expanding $G$ gives a sum of terms of the form $\prod_f \binom{x^f}{i^f} \binom{y^f}{j^f} \binom{z^f}{k^f}$. Considering such a term to have “degree” $\sum_f (i^f+j^f+k^f)$, we see that $G$ has degree $\leq (q-1)n$.

As in the standard proof, factor out whichever of $\prod_f \binom{x^f}{i^f}$, $\prod_f \binom{y^f}{j^f}$ or $\prod_f \binom{z^f}{k^f}$, has least degree. We obtain

$\displaystyle{ G(x,y,z) = \sum A_s(x) P_s(y,z) + \sum B_s(y) Q_s(x,z) + \sum C_s(z) R_s(x,y)}$

where $A_s$, $B$ and $C$ are products of binomial coefficients and, taking $\deg \binom{w}{m}=m$, we have $\deg A_s$, $\deg B_s$ and $\deg C_s \leq (q-1)n/3$.

We derive the bound $N \leq 3 | \{(m^1, m^2, \ldots, m^n) \in [0,q-1]^n : \sum m^f \leq n(q-1)/3 \}|$, exactly as before.

## Group rings — Petrov’s argument

I have taken the most liberties in rewriting this argument, to emphasize the similarity with the other arguments. The reader can see the original here.

Let $\Gamma = C_q^n$. Let $\mathbb{F}_p^{\Gamma}$ be the ring of functions $\Gamma \to \mathbb{F}_p$ with pointwise operations, and let $\mathbb{F}_p[\Gamma]$ be the group ring of $\Gamma$. We think of $\mathbb{F}_p[\Gamma]$ acting on $\mathbb{F}_p^{\Gamma}$ by $\left( \left( \sum a_{\sigma} \sigma \right) f \right)(w) = \sum a_{\sigma} f(w+\sigma)$.

Let $\tau_1$, $\tau_2$, …, $\tau_n$ be generators for $\Gamma = C_q^n$. Let $Y_{\leq d} \subset \mathbb{F}_p^{\Gamma}$ the functions annihilated by the operators $\prod_f (\tau_f-1)^{m_f}$ where $\sum m_f = d+1$. For example, $Y_{\leq 1}$ is the functions $f: \Gamma \to \mathbb{F}_p$ which obey $f(w+\tau_i+\tau_j) - f(w+\tau_i)-f(w+\tau_j)+f(w)=0$ for any $w$, $i$ and $j$. We think of $Y_{\leq d}$ as polynomials of degree $\leq d$, and the dimension of $Y_{\leq d}$ is the number of monomials in $n$ variables of total degree $\leq d$ where each variable has degree $\leq q-1$.

Define $H: \Gamma \to \mathbb{F}_p$ by $H(0)=1$ and $H(w)=0$ otherwise. Define $G: \Gamma^3 \to \mathbb{F}_p$ by $G(x,y,z) = H(x+y+z)$.

We write $\alpha_i$, $\beta_i$ and $\gamma_i$ for the generators of the three factors in $\Gamma \times \Gamma \times \Gamma$.
Then we have

$\displaystyle{ \left( \prod (\alpha_f-1)^{i_f} \prod (\beta_f-1)^{j_f} \prod (\gamma_f-1)^{k_f} G\right) (x,y,z) = }$

$\displaystyle{ \left( \prod (\tau_f-1)^{i_f+j_f+k_f} H\right) (x+y+z)}.$

So, if $i_f+j_f+k_f = q-1$, then $\prod (\alpha_f-1)^{i_f} \prod (\beta_f-1)^{j_f} \prod (\gamma_f-1)^{k_f} G=0$ as a function on $\Gamma^3$.

On the other hand, we can expand $G(x,y,z) = \sum A_s(x) B_s(y) C_s(z)$ for $A_s$, $B_s$ and $C_s$ in $\mathbb{F}_p^{\Gamma}$. We see that, if $i_f+j_f+k_f = q-1$, then

$\displaystyle{\sum_s \left( \prod (\alpha_f-1)^{i_f} A_s \right) (x) \left( \prod (\beta_f-1)^{j_f} B_s \right) (y) \left( \prod (\gamma_f-1)^{k_f} C \right) (z) =0}$.

We make the familiar deduction: We can write $G$ in the form

$\displaystyle{ G(x,y,z) = \sum A_s(x) P_s(y,z) + \sum B_s(y) Q_s(x,z) + \sum C_s(z) R_s(x,y)}$

where $A_s$, $B_s$ and $C_s$ run over a basis for $Y_{\leq (q-1)n/3}$.

Once more, we obtain the bound $N \leq 3 \dim Y_{(q-1)n/3}$.

Petrov’s method has an advantage not seen in the other proofs: It generalizes well to the case that $\Gamma$ is non-abelian. For any finite group $\Gamma$, let $I$ be a one-sided ideal in $\mathbb{F}_p[\Gamma]$ obeying $I^3=0$. In our case, this is the ideal generated by $\prod_f (\tau_f-1)^{m_f}$ with $\sum m_f = (q-1)n/3+1$. Then we obtain a bound $N \leq 3 \dim \mathbb{F}_p[\Gamma]/I$ for sum free sets in $\Gamma$.

## What’s going on?

I find Petrov’s proof immensely clarifying, because it explains why the arguments all give the same bound. We are all working with functions $C_q \to \mathbb{F}_p$. I write them as polynomials in $r$ variables $x_e$, Naslund and Sawin use binomial coefficients $\binom{x}{m}$. The formulas to translate between our variables are a mess: For example, my $x_1$ is their $\tfrac{1}{p} (x-x^p) \bmod p$. However, we both agree on what it means to be a polynomial of degree $\leq d$: It means to be annihilated by $(\tau-1)^{d+1}$.

In both cases, we take the indicator function of the identity and pull it back to $\Gamma^3$ along the addition map. The first two proofs use explicit identities to see that the result has degree $\leq (q-1)n$. The third proof points out this is an abstract property of functions pulled back along addition of groups, and has nothing to do with how we write the functions as explicit formulas.

I sometimes think that mathematical progress consists of first finding a dozen proofs of a result, then realizing there is only one proof. My mental image is settling a wilderness — first there are many trails through the dark woods, but later there is an open field where we can run wherever we like. But can we get anywhere beyond the current bounds with this understanding? All I can say is not yet…

2. Thanks! Yes, these papers are definitely using similar tricks to get between characteristic $p$ and $p$-th powers.