That trick where you embed the free group into a Lie group

The Banach-Tarski theorem states that a three dimensional ball can be chopped into finitely many pieces, which can then be rotated and translated to form two balls of the same volume as the first one. In the course of proving this theorem, one needs the following lemma:

The free group on two generators has an embedding into SO_3(\mathbb{R}).

In other words, there are two rotation matrices, A and B, so that the only way for a product of A‘s, B‘s, A^{-1}‘s and B^{-1}‘s to be the identity is if that product is formally forced to be the identity by cancelling elements with their inverses. To see how you prove the Banach-Tarski theorem from here, read any number of excellent expositions, such as this one (PDF). In this post, I’m going to explain how you might find such an A and a B. The proof I give isn’t the shorest, but I think it is the best motivated and it will lead into a follow up post where I talk about the Grand Picard Theorem, groups acting on trees, Schottky groups and Mumford curves.

The first thing one might think of when trying to prove this theorem is to remember places we’ve seen the free group on two generators before. We’ll write G for the free group on two generators. The canny reader might recall that G is isomorphic to \Gamma(2). Here \Gamma(2) is the subgroup of PSL_2(\mathbb{Z}) consisting of matrices \left( \begin{smallmatrix} a & b \\ c & d \end{smallmatrix} \right) where a and d are odd while b and c are even. So G is a subgroup of PSL_2(\mathbb{R}). The group \Gamma(2) even acts on the sphere by Mobius transformations: \left( \begin{smallmatrix} a & b \\ c & d \end{smallmatrix} \right) takes z to (az+b)/(cz+d). Using this, we can prove a Banach-Tarski theorem for the sphere — if we allow ourselves to change pieces by a Mobius transformation. Unfortunately, Mobius trasformations are not area preserving, so no one will be impressed. Hmmm. Back to the drawing board.

How different is PSL_2(\mathbb{R}) from SO_3(\mathbb{R}) anyway? Not very! The Lie group PSL_2(\mathbb{R}) is isomorphic to SO(2,1)(\mathbb{R}); the group of determinant one transformations of three-space preserving the quadratic form x_1^2+x_2^2-x_3^2. We are off from SO_3(\mathbb{R}) by just a sign. If only i were a real number, we’d be done.

Well, i isn’t in \mathbb{R}. But there are plenty of fields which do contain i. For example, the field \mathbb{Q}_5 of 5-adic numbers which Scott told us about. (Why does \mathbb{Q}_5 contain a square root of -1? Well, in the 5-adic absolute value, | 2^2- (-1)|=1/5. Now, start with x_1=2 and repeatedly apply the recurrence x_{n+1} \mapsto (x_n-1/x_n)/2. Basic estimates show that (x_n) is a Cauchy sequence, and its limit must be a square root of -1. This is a simple case of Hensel’s lemma.) So, here is the idea. We’ll try to embed G into PSL_2(\mathbb{Q}_5), which is isomorphic to the subgroup of SL_3(\mathbb{Q}_5) preserving x_1^2+x_2^2-x_3^2, which in turn is isomorphic to SO_3(\mathbb{Q}_5). If we get lucky, our matrices A and B in SO_3(\mathbb{Q}_5) will have rational entries, so we can also consider them as elements of SO_3(\mathbb{R}).

At this point, it will help a great deal to remember how we prove that G embeds into PSL_2(\mathbb{R}). I
started out by referring to the fact that G \cong \Gamma(2), but it will actually be easier to show that G is isomorphic to the subgroup H of PGL_2(\mathbb{R}) generated by

A=\begin{pmatrix} \sqrt{3} & 0 \\ 0 & 1/\sqrt{3} \end{pmatrix} and B=\begin{pmatrix} 2 & -3 \\ -1 & 2 \end{pmatrix}.

We’ll denote the upper half plane by \Delta. A fundamental domain for the action of H on \Delta is the region U bounded by the four semicircles with diameters (-3,3), (-3,-1), (-1,1) and (1,3). Let V_1, V_2, V_3 and V_4 be the regions of \Delta which are, respectively, outside the semicircle with diameter (-3,3), inside the semicircle with diameter (-3,-1), inside the semicircle with diameter (-1,1) and inside the semicircle with diameter (1,3). So (up to issues on the boundary, which won’t matter), \Delta is the disjoint union of U, V_1, V_2, V_3 and V_4.


The key computation is to check the following equations:

\begin{matrix} A(\Delta \setminus V_3)=V_1& A^{-1}(\Delta \setminus V_1)=V_3 \\ B(\Delta \setminus V_4)=V_2 & B^{-1}(\Delta \setminus V_2)=V_4 \end{matrix} \quad (*) .

Now, we show that H is the free group. Here is the point. Suppose that h_1 h_2 \ldots h_r is a reduced word in H. This means that each h_i is one of A, B, A^{-1}, B^{-1} and we do not have a generator and its inverse next to each other. Pick a point u in the interior of U. We will show that h_1 h_2 \ldots h_r(u) \neq u. More specifically, we will show that, if h_1 is (respectively) A, B, A^{-1}, B^{-1} then h_1 h_2 \ldots h_r(u) is in V_1, V_2, V_3 or V_4 respectively. The proof is just induction on r, using the equations above. Since u isn’t in any of V_1, V_2, V_3 or V_4, this shows that h_1 h_2 \ldots h_r(u) \neq u. So h_1 h_2 \ldots h_r is not the identity and we are done.

Now, let’s try to fit the free group on two generators into SO_3(\mathbb{Q}_5) and copy the above proof. Before, the action of PSL_2(\mathbb{R}) on the Riemman sphere \mathbb{C} \mathbb{P}^1 was extremely important. How do we see \mathbb{P}^1 in terms of the group SO(2,1)? Take the hypersurface x_1^2+x_2^2-x_3^2 in \mathbb{C}^3 \setminus (0,0,0) and take the quotient under scaling by \mathbb{C}^*. The resulting surface Q is isomorphic to \mathbb{C} \mathbb{P}^1, the action of SO(2,1)(\mathbb{C}) on Q corresponds to the action of PSL_2(\mathbb{C}) on \mathbb{C} \mathbb{P}^1 and the action of SO(2,1)(\mathbb{R}) corresponds to the action of PSL_2(R). (There is also a copy of SO(3)(\mathbb{R}) contained in SO(2,1)(\mathbb{C}), which gives us the ordinary rotational symmetry group of the sphere, but this is transverse to PSL_2(\mathbb{R}) and hence not helpful.)

Let’s try to mimic 5-adically the steps that worked in the complex world. Consider the hypersurface x_1^2+x_2^2+x_3^2 in \mathbb{Q}_5^3 \setminus (0,0,0), and mod out by the action of \mathbb{Q}_5^*; once again, I’ll call the result Q. While it is not strictly necessary for our argument, I find it extremely helpful to visualize Q as a toplogical space with the quotient topology. In this case, Q is isomorphic to \mathbb{Q}_5 \mathbb{P}^1, the space we get by attaching two copies of \mathbb{Q}_5 to each other, gluing u to u^{-1}. I fix the convention that i denotes the square root of -1 in \mathbb{Q}_5 such that |i-2|<1. We’ll take

A=\begin{pmatrix} 3/5 & 4/5 & 0 \\ -4/5 & 3/5 & 0 \\ 0 & 0 & 1 \end{pmatrix} and B=  \begin{pmatrix} 1 & 0 & 0 \\ 0 &  3/5 & 4/5 \\ 0 & -4/5 & 3/5 \end{pmatrix}.

Why did I choose these matrices? I wanted matrices whose action on Q had an attracting fixed point and a repelling fized point, just as with my previous choice that worked in PSL_2(\mathbb{R}) did. Both A and B are diagonalizable with eigenvalues ( (2+i)/(2-i), 1, (2-i)/(2+i) ). This means that the corresponding matrices in PSL_2(\mathbb{Q}_5) have eigenvalues ((2+i)/\sqrt{5},(2-i)/\sqrt{5}). Since, 5-adically, |(2+i)/\sqrt{5}| >1 and |(2-i)/\sqrt{5}|<1, this corresponds to an action with one repelling and one attracting fixed point. People who are used to Mobius transformations will call this a hyperbolic action on Q.

The above explains why we might think that A and B would work. Let’s actually show that they do. Let v_1 and v_3 be the attracting and repelling fixed points ofA on Q. Explicitly, v_1=(1,i,0) and v_3=(1,-i,0). Similarly, define v_2 and v_4 to be the attracting and repelling fixed points of B. We’ll now take V_1, V_2, V_3 and V_4 to be discs around v_1, v_2, v_3 and v_4. More specifically, let q=(x:y:z) be a point of Q. By scaling the homogenous coordinates (x:y:z), we can guarantee that x, y and z are all in \mathbb{Z}_5, and not all in 5 \mathbb{Z}_{5}. Let \overline{q} denote the reduction of q modulo 5; this is an element of \mathbb{F}_5^3, modulo scaling. Then we define q to be in V_i if \overline{q}=\overline{v_i}. (Explicitly, \overline{v_1}=(1:2:0) and we can give similar expressions for the other \overline{v_i}.) Now, check that the relations (*) still hold and mimic the above proof to show that the group generated inside SO(3)(\mathbb{Q}_5) by A and B is free. But, since the entries in A and B were rational, this also shows that the free group embeds into SO(3)(\mathbb{Q}) and into SO(3)(\mathbb{R}).

There are ways to write this proof purely in terms of the rational numbers and divisibility by 5, but I find them unnatural. Seeing an action on the 5-adic priojective line makes it clear to me what is happening.

In a comment on Tim Gowers’ blog, Terry Tao suggests that embedding the free group into SO(3)(\mathbb{R}) has a number of applications. I’m afraid that I don’t know any of them other than Banach-Tarski, although I imagine you could construct useful examples in dynamical systems by using this action. However, I know lots of reasons why you might want to embed the free group into PSL_2(\mathbb{C}) or PSL_2(\mathbb{Q}_p). This will be the subject of my follow up post.

In closing, here is an exercise for you. Many papers on the Banach-Tarski theorem state that the group generated by

\begin{pmatrix} 1/2 & \sqrt{3}/2 & 0 \\ - \sqrt{3}/2 & 1/2 & 0 \\ 0 & 0 & 1 \end{pmatrix} and \begin{pmatrix} 1 & 0 & 0 \\  0 & 1/2 & \sqrt{3}/2 \\ 0 & -\sqrt{3}/2 & 1/2 \end{pmatrix}

is free. Explain how to adapt the above proof to show this.

22 thoughts on “That trick where you embed the free group into a Lie group

  1. This can’t be an original idea, but my first thought would be very different from yours above: it would be to choose two random matrices A and B. For each expression in A and B that doesn’t cancel in the free group, one would have to show that it wasn’t identically zero in SO_3, which I presume can’t be that hard. And then a dimension argument ought to show that the probability that any one of these expressions equals zero is zero, in which case a random choice works with probability 1. Of course, it’s interesting to have an explicit example, and what you write is an interesting route to finding such an example.

  2. Dear David,

    This isn’t quite the same thing, but there is a variant of “A and B generate a free group inside a compact Lie group G” which has a number of applications, namely that “A and B enjoy a spectral gap inside G”. Indeed, if one views A and B as translation operators on L^2(G), then the operator \frac{1}{4}( A + A^{-1} + B + B^{-1} ) is clearly a self-adjoint operator on L^2(G) with an operator norm of 1. If one restricts to functions of mean zero, we say that there is a spectral gap if the operator norm now drops to 1 - \epsilon for some \epsilon > 0. Roughly speaking, this asserts that the only functions on G which are anywhere close to being invariant under both A and B are the constant functions (this implies, but is stronger than, the action of < A,B&\gt; on G being ergodic). Drinfeld showed that there exist pairs A, B in SU(2) (or SO(3), if you wish) which not only generate a free group, but enjoy a spectral gap; among other things, this implies that the only finitely additive rotation-invariant probability measure on the sphere is normalised Lebesgue measure (this is false for the unit circle!). There is a lot of recent work on this kind of thing (most recently by Bourgain and Gamburd); it has connections with Kazhdan’s property T and to some analytic number theory (such as the Ramanujan conjectures).

    One amusing way to show the existence of A, B that generate the free group is to show that for any non-trivial word w(A,B) of A and B, the variety { (A,B): w(A,B) = id} is a proper subvariety of SO(3) x SO(3), and thus has measure zero. Since the countable union of measure zero sets is measure zero, we see that if one selects A, B uniformly at random, one is going to get a free group with probability 1. (Alternatively, one could argue using the Baire category theorem.) To show that all non-trivial words define proper subvarieties, there are many ways; for instance, one can differentiate w(A,B) with respect to A near the identity and analyse what comes out.

  3. That’s pretty cool. I had been taught the "sufficiently generic rotations" argument and never bothered to think about explicit realizations.

    Minor correction: PSL_2(\mathbb{R}) is isomorphic to the isochronous subgroup SO(2,1)^+(\mathbb{R}), which is index two in SO(2,1)(\mathbb{R}).

  4. I just rescued the first two comments from the spam bucket. Maybe our filter detects Fields medalists.

    Incidentally, Tao’s last argument works over any uncountable field, if you replace measure zero with Zariski closed, positive codimension. Are there countably infinite fields (necessarily of positive characteristic, by the post) with no embeddings of F_2 in SO_3?

    Also, is there a word like proper that doesn’t also mean finite type and universally closed? I got very confused for a second, reading about proper subvarieties of an affine algebraic group, but I couldn’t think of another concise way to say the same thing.

  5. Ugh, sorry about that. I guess “proper” is right up there with “normal” and “regular” as a contender for the most overworked adjective in mathematics. Perhaps “strict subvariety” would have been better?

    The question about the countably infinite fields is interesting, but way outside of my own expertise; I had a random naive thought that the Lowenheim-Skolem theorem might be relevant, but that might be totally off-base.

  6. To answer my own question, SO_3(\overline{\mathbb{F}_q}) is a union of finite groups, so it can’t even have an embedding of \mathbb{Z}. It shouldn’t be hard to construct an embedding of F_2 for fields with transcendence degree at least two, but I wonder if it can be done for function fields of curves.

  7. Sorry to flood the thread. David’s proof above works almost verbatim over \mathbb{F}_q(t) (and any extension thereof) when q is odd, by substituting \{t, t^2-1, 2t, t^2+1\} for every appearance of \{2,3,4,5\}, although in the proof you will have to extend scalars on the symmetric space Q when q \equiv 3 \, (4). There are some subtleties when working with orthogonal groups in characteristic two that I don’ t know how to tackle.

    For some reason, people don’t seem to find t-adic Banach-Tarski particularly compelling…

  8. I’m afraid this is one of those places where physicists and mathematicians have to diverge. Far more interesting to me are finite groups embedded in Lie groups.

    The one I’m playing with at the moment is the “snuark” subset of SU(2) (say in the Pauli matrices) that is generated by the projection operators for spin in the +x, -x, +y, -y, +z, and -z directions.

    There are 6×6 = 36 products of these six elements, six of which are zero. The 30 non zero products provide a basis set for the group generated by the six. That is, any product of the six can be written as a complex multiple of the 30 nonzero products.

    The 30 non zero products are all themselves primitive projection operators, or complex multiples of projection operators. But only the six on the diagonal are Hermitian.

    More generally, any primitive projection operator in the Pauli algebra (Hermitian or not) can be written uniquely as the product of two primitive Hermitian projection operators.

  9. Carl-

    It’s not like we have to chose one or the other. Mathematicians are also very interested in finite subgroups of Lie groups. It just tends to go by the name of “the representation theory of finite groups.” That has a pretty long and storied history over the past century and a half or so.

  10. First of all, thank you for all the replies. I’m still digesting the notion of a spectral gap.

    Regarding the solution by choosing “random” rotations, suggested by both of the Field’s medalists above: I thought of this, but it was not obvious to me how to show that it worked. You can’t just differentiate near the identity: the equation

    e^A e^B e^{-A} e^{-B}

    is singular near (0,0). You could try expanding by Baker-Campbell-Hausdorff and showing that the lowest degree terms don’t vanish, but the combinatorics seems nasty. (Also, the free Lie algebra doesn’t embed in so_3, so terms that formally look like they are nonzero might not be.) I’m curious to see the details of the differentiation solution if anyone knows a reference.

  11. Dear David,

    Hmm, actually I retract my claim; I didn’t handle the non-vanishing issue correctly.

    Baker-Campbell-Hausdorff allows one to finish the job if one can embed the free Lie algebra over Q into so(3;R), though I am not sure if this implication is reversible. Since R has infinite transcendence degree, it suffices to show that, if X, Y, Z are the generators of so(3), that the generic elements aX+bY+cZ and dX+eY+fZ in so(3;Q(a,b,c,d,e,f)) generates a free Lie algebra over Q. But even with these reductions there is still some serious algebraic combinatorics to be done, unless there is some sort of lifting trick or something, or if one has a really good description of the free Lie algebra.

    Well, we can at least turn this around; you asked for a non-trivial application of the fact that the free group embeds into SO(3), and we found one, namely that no non-trivial word on SO(3) is identically equal to the identity matrix. :-)

  12. Sadly, one can not embed the free Lie algebra over Q into so(3,R) either. Any two vectors in so(3) have

    [ [a,b], [a,[a,[a,b]]] ]=0

    which is not true in the free Lie algebra.

    Actually, this brings up a few interesting questions, which I have only thought about very briefly:

    Any word in the free group produces a map from
    SO(3) x SO(3) –> SO(3). Is it always surjective?

    Any word w in the free group produces a subscheme of SO(r) \times SO(3) cut out by w=1. Is this always generically reduced?

  13. That’s a nice identity! It has a nice interpretation if you view the Lie bracket on so(3) as being isomorphic to the cross product on R^3.

    In retrospect, it’s clear that no finite-dimensional Lie algebra can contain the free Lie group, by doing a dimension count on the number of possible words of a sufficiently long length n; the dimension is polynomial in n in the former and exponential in the latter, and so Malthus forces us to have a non-trivial relation at some point. Thanks for clearing up my intuition there; for some reason I had the vague (but wrong) impression that semisimple Lie algebras behaved somewhat like free ones.

    Still, the fact that the words on SO(3) are never identically 1 does imply some very weird algebraic combinatorial statement about formal expansions in the Lie algebra of SO(3) generated by the Baker-Campbell-Hausdorff formula, namely that they do not lie in the ideal of the free Lie algebra generated by the relations that so(3) satisfies (is there a presentation of that ideal, I wonder?). It is still slightly puzzling to me that there is a purely algebraic statement which does not seem to be easily provable in a purely algebraic fashion, but now that this statement looks so contorted, I guess it doesn’t “offend” me as much.

    I can’t help you with the generically reduced problem, but for the surjectivity problem, one can observe at least that the image is a connected subvariety which is invariant under conjugation and contains the identity, which seems to cut down the possibilities a bit.
    This might possibly be enough to handle the SO(3) problem, but more general Lie groups a more powerful approach would be needed.

  14. Dear David,

    I asked around here at UCLA for applications of “embedding free groups into other groups”. I got two responses that you might find interesting:

    1. My colleague, Sorin Popa, points out to me a result of his student, Adrian Ioana,

    which asserts that if a countable discrete group G happens to contain a copy of the free group, then there are a uncountable number of “different” possible actions of that group on a standard probability space, where two actions are considered the “same” (or more precisely, “orbit equivalent”) if there is a measure isomorphism that converts orbits of one action into orbits of another. At the other extreme, if instead the group G is abelian or at least amenable, then all actions are orbit equivalent, thanks to work of Dye, Ornstein, and Weiss.

    2. More generally, there is some sort of philosophy in the theory of discrete group actions that “non-amenability” is like containing the free group as some sort of “virtual subgroup”, whatever that means.
    My colleague Dima Shylakhtenko mentions a number of deep conjectures in this direction. One of the shorter ones to state is that any non-amenable von Neumann algebra with a trace must contain a copy of the von Neumann algebra of the free group; I think the converse (that any tracial vNa which contains the vNa of the free group is non-amenable) is pretty easy.

    Incidentally, there are some deep connections between von Neumann algebras of groups and of orbit equivalence problems, though I’m really not an expert on these matters. Kazhdan’s property T also seems to play a big role; free groups have this property, amenable groups don’t, and it seems to make a decisive difference as to what actions of these groups look like.

  15. A small correction to the end of Comment 14: free groups (of rank >1) do not have Property T; indeed, the latter (for a discrete group) implies finite abelianization (because Property T goes to quotients by closed subgroups, and abelian groups have T if and only if they are compact; hence G/[G,G] is compact and discrete for a discrete G with Property T). In particular SL(2,Z) does not have T (it has a finite index subgroup which is free of rank 2); however, Kazdhan proved (among other things) that SL(n,Z), for n>2, has Property T.

    The recent work of Bourgain, Gamburd and Sarnak on points in orbits of discrete groups with almost prime coordinates uses embeddings of free groups for some results. There one useful property is that the Cayley graphs of free groups (with respect to the generators) is a tree, hence homogeneous and the harmonic analysis on it is pretty transparent. On the other hand, one can manage for the free subgroup to be Zariski dense, and a deep theorem implies that its reductions modulo large enough primes are the same as that of the ambient group, which is what is needed for sieving…

    Incidentally, for some purposes at least in their work, one can replace the use of combinatorial balls of growing radius in the free subgroup by considerations of random walks on the ambient group. I wonder if the same could be the case for other uses of free groups?

  16. Emmanuel,

    There one useful property is that the Cayley graphs of free groups (with respect to the generators) is a tree, hence homogeneous and the harmonic analysis on it is pretty transparent.

    All Cayley graphs are homogeneous. In fact, that’s pretty much the definition of a Cayley graph! (OK, it should also be connected.) They must be using some other properties of trees.

  17. Yes, I used “homogeneous” wrongly; the actual property which is important is that one can write down explicitly the eigenfunctions of the discrete laplacian and use harmonic analysis efficiently (this is also apparent in the discussion of Ramanujan graphs by Lubotzky, Phillips and Sarnak).

  18. I really like it when people get together and share opinions.
    Great website, stick with it!

Comments are closed.