A proof length challenge

Here is a question I thought of this morning, and have given no serious thought to — hence, perfect for a blog post.

In any commutative ring, if A and B are n \times n matrices, and AB = \mathrm{Id}, then BA=\mathrm{Id}.

For any fixed n, there should be a proof of this by direct algebraic manipulation: Start with the n^2 equations saying that AB=\mathrm{Id}, and add, multiply, expand and recombine them to produce the n^2 equations saying that BA=\mathrm{Id}. Our starting point is n^2 equations of size O(n), as is our destination. But every route I see involves writing down the adjugate (classical adjoint) of A or B, which is an expression of length \approx n!.
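For reference, here is the adjugate route spelled out (only standard facts about the adjugate are used): multiplying AB = \mathrm{Id} on the left by \mathrm{adj}(A) gives \mathrm{adj}(A) = (\mathrm{adj}(A) A) B = \det(A) \, B. Multiplying that on the right by A gives \det(A) \, BA = \mathrm{adj}(A) A = \det(A) \, \mathrm{Id}, and \det(A) is a unit because \det(A) \det(B) = \det(AB) = 1, so BA = \mathrm{Id}. The blowup is in \mathrm{adj}(A): each entry is a cofactor with (n-1)! terms.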

Can we show that a direct proof of this result must involve either an expression of exponential length, or an exponentially long proof?
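For concreteness, here is how one might check the n=2 case mechanically (a sketch using sympy's Gröbner routines; any computer algebra system would do the same job):

# Sketch (sympy): for n = 2, check that every entry of BA - Id lies in the
# ideal generated by the entries of AB - Id in Q[a_ij, b_ij].
from sympy import Matrix, eye, symbols, groebner

a = symbols('a11 a12 a21 a22')
b = symbols('b11 b12 b21 b22')
A, B = Matrix(2, 2, list(a)), Matrix(2, 2, list(b))

relations = list(A * B - eye(2))   # the n^2 polynomials saying AB = Id
G = groebner(relations, *a, *b, order='grevlex')

targets = list(B * A - eye(2))     # the n^2 polynomials saying BA = Id
print(all(G.contains(t) for t in targets))  # True: BA = Id follows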

15 thoughts on “A proof length challenge”

  1. More off-the-top-of-the-head speculation:

    It would seem that there should be a short proof by some variant of Gaussian elimination, but the problem is that Gaussian elimination depends on which Bruhat cell (B-orbit) you are in (i.e., the Gaussian elimination formula divides by 0 if you’re not in the appropriate cell/choose the wrong pivots), and there are n! of those.

  2. I asked over a ring, not a field, so you don’t have a global Bruhat decomposition. But, yeah, that’s an excellent point. I think you wind up with a proof that has n! branches, though each branch has length something like O(n^3) or O(n^4).

  3. Could you make your constraints on the proof more precise? This result, like several others in linear algebra, can be proved by reduction to the “universal” case: A and B are defined over a polynomial ring over Z whose variables are the entries of A and B. This ring embeds in a field, namely its fraction field. The conclusion over this field implies the conclusion over the ring, hence the conclusion over all rings. Over this field we have Gaussian elimination and all the other tools of linear algebra, etc. But presumably this is not the proof you are looking for.

  4. What I was thinking of was a proof by pure algebraic manipulation. So, for n=2, start with b_{11} a_{11} + b_{12} a_{21} and apply the commutative, associative and distributive laws, plus replacement of a_{i1} b_{1j}+a_{i2} b_{2j} by 1 or 0, according to whether i=j or i \neq j, until finally we get 1 out the other side.

    It occurs to me, though, that a lot of the difficulty in this challenge is going to be in writing out distributions of long formulas. So maybe it’s better to ask the question like this:

    By the Nullstellensatz, we know that there are n^4 polynomials P_{ij}^{rs} in the 2 n^2 variables a_{ij}, b_{ij} such that

    \displaystyle{\sum_{m} b_{rm} a_{ms} - \delta_{rs} = \sum_{i,j} P_{ij}^{rs} \cdot \left( \sum_{k} a_{ik} b_{kj} - \delta_{ij} \right)}

    where \delta_{ij} is 1 if i=j and 0 otherwise.

    How big do the P_{ij}^{rs} have to be? Measured either in number of nonzero terms or in degree?
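    For small n one can extract a certificate mechanically (a sympy sketch; the quotients below are one valid choice, not necessarily of minimal size, and they are taken against a Gröbner basis rather than against the original relations directly):

    # Sketch (sympy): write each entry of BA - Id as a combination of a
    # Groebner basis of the ideal generated by the entries of AB - Id.
    # Note: the quotients are against the basis elements; unwinding them
    # into combinations of the original relations gives the P_ij^rs.
    from sympy import Matrix, eye, symbols, groebner

    a = symbols('a11 a12 a21 a22')
    b = symbols('b11 b12 b21 b22')
    A, B = Matrix(2, 2, list(a)), Matrix(2, 2, list(b))

    G = groebner(list(A * B - eye(2)), *a, *b, order='grevlex')
    for target in (B * A - eye(2)):
        quotients, remainder = G.reduce(target)
        assert remainder == 0          # membership: BA = Id does follow
        print(target, '->', quotients)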

  5. Stated this way, I am tempted to change the problem to a related one. Replace all the 1’s by an appropriate polynomial of degree 2 (the first try would be t^2, but something involving traces might work out better). With this change the equations saying that AB = t^2 \mathrm{Id} define a projective variety, and one can try to figure out its Castelnuovo–Mumford regularity. Perhaps an inductive calculation with piles of exact sequences will work out.

  6. You’ll pick up some extra components at \infty. If A = \left( \begin{smallmatrix} 0 & 1 \\ 0 & 0 \end{smallmatrix} \right), B = \left( \begin{smallmatrix} 1 & 0 \\ 0 & 0 \end{smallmatrix} \right) and t=0, then AB=t^2 \mathrm{Id} but BA \neq t^2 \mathrm{Id}. I imagine that understanding those extra components will be important, but I’m not sure how.
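    (Checking the arithmetic: AB = \left( \begin{smallmatrix} 0 & 1 \\ 0 & 0 \end{smallmatrix} \right) \left( \begin{smallmatrix} 1 & 0 \\ 0 & 0 \end{smallmatrix} \right) = 0, which equals t^2 \mathrm{Id} at t=0, while BA = \left( \begin{smallmatrix} 1 & 0 \\ 0 & 0 \end{smallmatrix} \right) \left( \begin{smallmatrix} 0 & 1 \\ 0 & 0 \end{smallmatrix} \right) = \left( \begin{smallmatrix} 0 & 1 \\ 0 & 0 \end{smallmatrix} \right) \neq 0.)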

    Also, this is the sort of thing I probably should know, but I don’t: How do you convert between Castelnuovo–Mumford regularity and the sort of explicit bounds I’m asking about?

  7. Sorry for the contentless message, but in case David was expecting a reply:

    This is the sort of thing I superficially understood for a brief time but have forgotten, and I don’t have easy access to Eisenbud’s book on the Geometry of Syzygies, which is what I would use to stir my memories (possibly in the Platonic sense).

  8. Here’s a suggestion as to why it might need exponential time. For fixed n let our commutative ring R be the product of n! copies of a field k. A matrix over R is then given by specifying the n! matrices over k. So, take A to be the matrix over R corresponding to the n! permutation matrices over the field k. Now any attempt to use Gaussian elimination, say, requires you to visit each of the n! branches (see comments 1 and 2), since each branch fails for one of the permutation matrices.
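    To see the branching concretely, here is a small sketch (plain Python; “Gaussian elimination” means the naive version with fixed diagonal pivots and no row swaps, so “failure” just means hitting a zero pivot). Among the n! permutation matrices, only the identity survives a fixed pivot sequence:

    # Sketch: naive Gaussian elimination (fixed diagonal pivots, no row
    # swaps) on each of the n! permutation matrices. Only the identity
    # permutation avoids a zero pivot, so over R = k^(n!) no single
    # elimination branch works for every factor of R at once.
    from fractions import Fraction
    from itertools import permutations

    def naive_eliminate(M):
        """Eliminate below fixed diagonal pivots; False on a zero pivot."""
        M = [[Fraction(x) for x in row] for row in M]
        n = len(M)
        for k in range(n):
            if M[k][k] == 0:
                return False  # this branch divides by zero on this factor
            for i in range(k + 1, n):
                c = M[i][k] / M[k][k]
                M[i] = [M[i][j] - c * M[k][j] for j in range(n)]
        return True

    n = 3
    mats = [[[1 if r == sigma[c] else 0 for c in range(n)] for r in range(n)]
            for sigma in permutations(range(n))]
    print(sum(map(naive_eliminate, mats)), 'of', len(mats))  # 1 of 6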

  9. Michael Soltys wrote his thesis at Toronto under the direction of Steve Cook on this topic. I published a paper with Michael in 2004, showing exponential lower bounds for a certain proof system.

    “Matrix identities and the pigeonhole principle” (joint work with Michael Soltys), Archive for Mathematical Logic 43 (2004), pp. 351–357.

    There was some more recent work by people in the Prague proof complexity school, but I don’t have the reference at hand at the moment. I can see if I can find it if anybody is interested.

  10. A vaguely related proof-length challenge that perhaps people will have thoughts about. The inspiration for this is the problem of proving nontrivial lower bounds on tensor rank, a problem in computer science. Given an m-index tensor, we define its tensor rank to be the smallest n such that it can be written as a sum of n products of m 1-index tensors. For example, for a 3-index tensor A_{ijk}, we would like to write A_{ijk}=\sum_{\alpha=1}^n x^\alpha_i y^\alpha_j z^\alpha_k. For a 2-index tensor (a matrix), this agrees with the usual definition of matrix rank.

    The claim that a given 3-index tensor does not have a certain rank is a claim that the above equations have no solution. I thought about seeing how well the method of Gröbner bases could be applied to this problem, but decided to start with the simpler case of matrices. We know that a D-by-D identity matrix has rank D, so the following D^2 equations (one for each choice of i, j below) have no solution for n < D.

    \displaystyle{\sum_{\alpha=1}^n x_i^\alpha y_j^\alpha - \delta_{ij} = 0}

    So, I tried a package for finding Gröbner bases to see whether it could be used to verify this fact. For D=4, n=3, in a couple of seconds a computer verifies that the Gröbner basis for the ideal generated by the left-hand sides of the above equations is just {1}, confirming that the equations have no solution. For D=7, n=3, the job took so long that I killed it.

    So, the proof-length challenge is: given those D^2 equations, for n < D is there a quick, purely algebraic proof of a contradiction?
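    For anyone who wants to reproduce the experiment, here is a sympy sketch of the same computation (the package used above isn’t named, and sympy is probably slower, so start small):

    # Sketch (sympy): for candidate rank n < D, the Groebner basis of the
    # ideal generated by sum_a x_i^a y_j^a - delta_ij should be {1},
    # certifying that the equations have no solution.
    from sympy import symbols, groebner

    D, n = 3, 2  # sympy is slow; D = 4, n = 3 may already take a while
    x = [[symbols('x_%d_%d' % (i, al)) for al in range(n)] for i in range(D)]
    y = [[symbols('y_%d_%d' % (j, al)) for al in range(n)] for j in range(D)]

    eqs = [sum(x[i][al] * y[j][al] for al in range(n)) - (1 if i == j else 0)
           for i in range(D) for j in range(D)]
    gens = [v for row in x + y for v in row]

    G = groebner(eqs, *gens, order='grevlex')
    print(G.exprs)  # [1] means the ideal is everything: no solution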

  11. Alasdair, does your result give a lower bound on the size of the P_{ij}^{rs} in David’s Nullstellensatz identity? It’s not immediately clear to me that small P_{ij}^{rs} would directly yield a bounded-depth Frege proof.
