
A calculus-free proof of the spectral theorem December 3, 2012

Posted by David Speyer in Uncategorized.

Let A be an n \times n real symmetric matrix. Let \lambda be an eigenvalue (a priori complex) and v a corresponding eigenvector (a priori with complex entries). Then

\displaystyle{\bar{v}^T A v = \bar{v}^T \lambda v = \lambda \bar{v}^T v}.

But, also

\displaystyle{\bar{v}^T A v = (\bar{\lambda} \bar{v}^T) v = \bar{\lambda} \bar{v}^T v}.

We deduce that \lambda \bar{v}^T v  = \bar{\lambda} \bar{v}^T v. And \bar{v}^T v = \sum_i |v_i|^2 is clearly a positive real, since v \neq 0, so \lambda = \bar{\lambda}.

This immediately shows that the characteristic polynomial of A has only real roots.

It is also easy to see that eigenvectors for distinct eigenvalues are orthogonal. If A v = \lambda v and A w = \mu w with \lambda \neq \mu, then w^T A v is both \lambda w^T v and \mu w^T v, so (\lambda - \mu) w^T v=0 and w^T v=0.
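As a quick numerical sanity check of these two facts (a NumPy sketch of my own, not part of the argument): feed a random real symmetric matrix to the general-purpose eigensolver `np.linalg.eig` (not `eigh`, which would assume the conclusion) and verify that the eigenvalues come out real and the eigenvectors pairwise orthogonal.

```python
# Sanity check: for a random real symmetric matrix, np.linalg.eig
# (which makes no symmetry assumption) returns real eigenvalues, and
# eigenvectors for distinct eigenvalues are orthogonal.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B + B.T  # real symmetric

eigvals, eigvecs = np.linalg.eig(A)
assert np.allclose(eigvals.imag, 0)  # all eigenvalues are real

# The eigenvalues of a random symmetric matrix are distinct with
# probability 1, so the Gram matrix of the eigenvectors is diagonal.
G = eigvecs.real.T @ eigvecs.real
assert np.allclose(G, np.diag(np.diag(G)))
```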

The remaining point is to rule out the possibility of nontrivial Jordan blocks or, in the language of intro linear algebra classes, the possibility that geometric multiplicity is less than algebraic multiplicity. For a class where the Jordan canonical form theorem is proved, this is simple enough. If there is a nontrivial Jordan block, then there is some v such that (A - \lambda) v \neq 0 but (A -\lambda)^2 v=0. But then

\displaystyle{ ( (A - \lambda) v)^T (A - \lambda) v = v^T (A - \lambda)^2 v = 0 }
so |(A-\lambda)v|^2 =0, contradicting that (A - \lambda)v \neq 0.
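As a numerical illustration of this step (again a NumPy sketch of my own): build a symmetric matrix with a repeated eigenvalue and check that \mathrm{Ker}\ (A - \lambda)^2 = \mathrm{Ker}\ (A - \lambda), i.e. that there is no nontrivial Jordan block.

```python
# For a real symmetric matrix with a repeated eigenvalue, the kernels
# of (A - lambda I) and (A - lambda I)^2 coincide: no Jordan blocks.
import numpy as np

rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # random orthogonal Q
A = Q @ np.diag([2.0, 2.0, 5.0, -1.0]) @ Q.T      # eigenvalue 2 repeated

N = A - 2.0 * np.eye(4)
# Ker(N) is always contained in Ker(N^2), so equal ranks means equal
# kernels. Both ranks are 2, matching the multiplicity of lambda = 2.
assert np.linalg.matrix_rank(N) == np.linalg.matrix_rank(N @ N) == 2
```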

However, the Linear Algebra course that I am currently teaching doesn’t do Jordan canonical form. I haven’t figured out how to show that geometric mult. less than algebraic mult. implies that \mathrm{Ker} \ (A-\lambda)^2 \supsetneq \mathrm{Ker} \ (A - \lambda) simply enough to fit in a lecture where I also talk about the rest of this stuff.

Nonetheless, I am amazed that every textbook I have seen uses the "optimize a quadratic form on the unit ball" argument rather than this algebraic one. Lots of students don't remember multivariable calculus well, and the existence of maxima of continuous functions on multidimensional bounded domains is complicated. Plus, I find a lot of students have trouble with an inductive process like getting one eigenvector and splitting off an orthogonal complement.

This argument is just shuffling algebra around, combined with the fact that a sum of squares is nonzero. It seems clearly easier to me.

I'm just a little too short on time to present the whole argument this term. I'll do the two easy parts above and hand wave at the algebraic versus geometric issue. Next time I teach this course, I'll try to plan my discussion of algebraic multiplicity better so that I get the key lemma in.

I'm excited! I had thought that the spectral theorem would just be way too hard to prove in this class, but I think this will at least do a large part of it.


Comments

1. Aaron - December 3, 2012

Are you talking about the spectral theorem for finite dimensional Hermitian matrices? What’s wrong with the proof where you use the fundamental theorem of algebra to get the existence of an eigenvalue and proceed by induction (as on Wikipedia for example)?

2. Maksim - December 3, 2012

David, if I understand correctly, the calculus part is just for getting the first eigenvector, which, if you have ever talked about complex numbers, is indeed more easily done by observing that the characteristic polynomial has only real roots. After that, however, the calculus ends. I find the "if Av = kv, then \langle v, w \rangle = 0 implies \langle v, Aw \rangle = \langle Av, w \rangle = k \langle v, w \rangle = 0, so the orthogonal subspace is invariant" argument pretty short and the resulting proof quite natural. Additionally, I think the reason many books use optimization to find the first eigenvector is that it does not use complex numbers, which may not have been introduced.

5. Vladimir Dotsenko - December 4, 2012

This proof is cute, surely, but in all fairness I find it very strange that you discard the other argument so easily. First of all, saying that "existence of maxima of continuous functions on multidimensional bounded domains is complicated" might be true in general, but for the sphere there is absolutely no problem in furnishing a quick inductive argument that even students who don't remember much of calculus would love (cut the sphere by the plane x_n=c; you will get a sphere of dimension one less; let F(c) be the maximal value of our function on that spherical cut; it can easily be seen to be a continuous function of c, QED). Second, this kind of reasoning helps students to put linear algebra into a broader context, and to see how, done in the reverse direction, this could help to use linear algebra to maximise something. Third, splitting off the orthogonal complement is something students need to learn. If you don't rub their noses in it in this course, when will it be done?

6. Lior Silberman - December 4, 2012

The Jordan canonical form is overkill for this proof. Do you teach the abstract notion of an inner product space and a self-adjoint map? If so, there’s an easy proof by induction: the orthogonal complement to an eigenvector is again an inner product space, is invariant by the map, and the restriction of the map there is still self-adjoint. That the algebraic multiplicity equals the geometric multiplicity is a corollary of the theorem rather than a tool in the proof.

This proof depends on the claim that every such matrix has at least one eigenvector. There are two natural proofs: either apply the fundamental theorem of algebra to find a complex eigenvalue and then show that it is real, or use the Rayleigh quotient and Lagrange multipliers to specifically show that real symmetric matrices have eigenvectors. When I discussed the spectral theorem in my course last week I gave both proofs.

The advantage of the second approach to finding an eigenvector is that it is purely a real-variable method, so you don’t need to discuss complex numbers. Also, in North America many universities have the multivariable course precede the linear algebra course (or at least teach them in parallel), so Lagrange multipliers are something your students have probably seen before.
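To illustrate the Rayleigh-quotient route numerically (a sketch of my own, not Lior's code): the maximum of v^T A v / v^T v over nonzero v is the largest eigenvalue, and power iteration climbs to the maximizer. The second-difference matrix below is just a convenient symmetric test matrix with well-separated eigenvalues.

```python
# Maximizing the Rayleigh quotient v^T A v / v^T v recovers the
# largest eigenvalue; power iteration converges to the maximizer.
import numpy as np

n = 6
# Second-difference matrix: symmetric, positive definite, and its
# eigenvalues 2 - 2 cos(k pi / (n+1)) are distinct.
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def rayleigh(v):
    return (v @ A @ v) / (v @ v)

v = np.arange(1.0, n + 1)  # generic start, not orthogonal to the top eigenvector
for _ in range(500):
    v = A @ v
    v /= np.linalg.norm(v)

lam_max = np.linalg.eigvalsh(A).max()
assert abs(rayleigh(v) - lam_max) < 1e-10
```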

Finally, note that there can’t be an analysis-free proof that the complex numbers are algebraically closed.

7. jessemckeown - December 4, 2012

Having read David’s argument from the beginning, he is clearly taking as basic that every polynomial over \mathbb{R} has a complex root at least, and arguing from constructed things that all the polynomials he has have only real roots. The existence of enough eigenvalues is a non-issue. The issue he’s admitting, and for which he would want to use Jordan forms, is for measuring the dimension of each eigenspace.

Of course, Jordan forms per se are not strictly necessary for this — my own preference would be to use rational canonical form, but in either case the key lemma is the Cayley-Hamilton theorem, \chi_A(A) = 0. Whether or not a proof of that fits in David's course, it's the sort of thing that is well worth mentioning anyway; it's why generalized eigenspaces are the right generalization of eigenspaces, and it's why showing (A - \lambda)^2 v =0 \implies (A-\lambda) v = 0 is the right thing here.

8. abel - December 4, 2012

We can establish that if a symmetric matrix A has only the zero eigenvalue, then A must be identically zero: we must have \mathrm{range}(A) = 0, because otherwise there is an eigenvector in \mathrm{range}(A), implying a nonzero real eigenvalue for A. Now we use \mathbb{R}^n = \ker(A) \oplus \mathrm{range}(A) to conclude \mathbb{R}^n = \ker(A) and A = 0.

Does this not imply that all the Jordan blocks are simple?

9. James Tener - December 4, 2012

In Friedberg, Insel and Spence, they get the spectral theorem as a corollary of Schur upper triangularization, which is just straightforward induction once you know the characteristic polynomial splits.
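The Schur route is also easy to check numerically (a sketch of my own, assuming SciPy's `schur` is available): for a real symmetric A, the Schur form Q^T A Q = T is upper triangular, and symmetry then forces T to be diagonal, which is exactly the spectral theorem.

```python
# Real Schur decomposition of a symmetric matrix: the triangular
# factor T is forced to be diagonal, since Q^T A Q inherits symmetry.
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
B = rng.standard_normal((5, 5))
A = B + B.T  # real symmetric

T, Q = schur(A)                       # A = Q T Q^T, Q orthogonal
assert np.allclose(Q @ T @ Q.T, A)
assert np.allclose(T, np.diag(np.diag(T)))  # T is in fact diagonal
```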

10. Louigi - December 5, 2012

The proof suggested by the combination of David’s post and Lior’s comment (using the fundamental theorem of algebra) is the route taken to prove the spectral theorem in Chapter 7 of Axler’s “Linear Algebra Done Right”.

11. anonymous - December 5, 2012

Seconding Louigi’s comment: if every textbook you look at gives the same argument for that, you aren’t looking at enough textbooks! (IMVHO, Axler’s “Linear Algebra Done Right” deserves its title, although it is not the last word on the subject. If you want to teach a lot of “matrix theory”, or feel that it’s irresponsible to only do linear algebra over the real and complex fields, you need to supplement it with another text.)

In defense of the Rayleigh quotient: outside of purely algebraic contexts (in the infinite dimensional setting, even in situations where eigenvectors might not exist a priori), it is helpful to know you can look at things this way. You might have compactness (or enough of something like it) to make something of the method, when there is no “characteristic polynomial” to speak of and no finite dimensionality to induct on.

12. Yemon Choi - December 5, 2012

Having recently finished teaching a 2nd course in linear algebra, my initial reaction was the same as Aaron’s.

The "route taken to prove the spectral theorem in Chapter 7 of Axler's 'Linear Algebra Done Right'" is, if memory serves, the proof I was shown as an undergraduate in a baby course on inner products and bilinear maps and quadratic forms. I don't know where the lecturer got it from.

(I have not actually read Axler’s book but I am assuming he follows the same approach as in the book’s precursor, “Down With Determinants” http://www.axler.net/DwD.html.)

13. Pace Nielsen - December 10, 2012

David, I taught linear algebra this semester using Lay, and he does the same proof as you. I too like this proof.

For the algebraic multiplicity = geometric multiplicity, this is worked out in exercises (for the advanced student). The basic idea is similar to Lior’s argument, except that Lay puts it in the language of Schur decompositions. (And once you know a Schur decomposition exists, and everything is symmetric, your upper triangular matrix must be diagonal. Done.)

14. teobanica - January 31, 2013

Hello world! Just to come up with an MO link http://mathoverflow.net/questions/118626/real-symmetric-matrix-has-real-eigenvalues-elementary-proof/118640#118640 containing some interesting related stuff. I was actually personally involved in "saving" that question, under the war nickname Z254R at that time. When I first retagged it, it was severely downvoted, just waiting to be closed — but then I fought and fought, got it alive, and now it has a whole series of interesting answers, with a funny comment thread as well :)

15. S. Carnahan - February 9, 2013

@Teo B: I don’t mean to start a fight on David’s blog post, but it looks to me like the question was never in danger of being closed. For example, there are no comments like “I’m afraid this question is not appropriate here” or “please ask your question at math.stackexchange.com”. In fact, it looks more like people had valid reasons for requesting clarification, and you responded to their requests by attacking them.


