Sadly, this blog is still dead. I have a list of things I would like to write on it, but I have no idea when or if I will find the time.

Got a draft of the course schedule for next year. Looks like I might get to teach real analysis.

I probably need someone to talk me out of trying to do everything in R^n.

A subsequent update indicates that the more standard alternative is teaching one variable analysis.

This is my second go around teaching rigorous multivariable analysis — key points are the multivariate chain rule, the inverse and implicit function theorems, Fubini’s theorem, the multivariate change of variables formula, the definition of manifolds, differential forms, Stokes’ theorem, the degree of a differentiable map and some preview of de Rham cohomology. I wouldn’t say I’m doing a great job, but at least I know why it’s hard to do. I haven’t taught single variable, but I have read over the day-to-day syllabus and problem sets of our most experienced instructor.

Here is the conceptual difference: It is quite doable to start with the axioms of an ordered field and build rigorously to the Fundamental Theorem of Calculus. Doing this gives students a real appreciation for the nontrivial power of mathematical reasoning. I don’t want to say that it is actually impossible to do the same for Stokes’ theorem (according to rumor, Brian Conrad did it), but I never manage: there comes a point where I start waving my hands and saying “Oh yes, and throw in a partition of unity” or “Yes, there is an inverse function theorem for maps between $n$-folds just like the one for maps between open subsets of $\mathbb{R}^n$.” I think most students probably benefit from seeing things done carefully for a term first.

Below the fold, a list of specific topics much harder in more than one variable. If you have found ways not to make them harder, please chime in in the comments!

• No need for linear algebra. Just defining the multivariate derivative uses the concept of a linear map, and stating the chain rule requires you to compose them. If you want your students to ever be able to check the hypotheses of the inverse function theorem, they have to be able to check if matrices are invertible.

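• One variable Riemann sums are so nice! If $\phi : [a,b] \to [c,d]$ is an increasing bijection, and $a = x_0 < x_1 < \cdots < x_N = b$ is a partition of $[a,b]$, then $c = \phi(x_0) < \phi(x_1) < \cdots < \phi(x_N) = d$ is a partition of $[c,d]$; the $u$-substitution formula for integrals follows immediately. If $\phi : U \to V$ is a smooth bijection between open subsets of $\mathbb{R}^n$, and we have a partition of $U$ into rectangles, its image in $V$ is quite hard to describe. This is why the change of variables formula is such a pain.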

• Regions of integration: In one variable we always integrate over an interval. In many variables, we integrate over complicated regions, so we need to describe the geometry of complicated regions. If you want to cut a region up into simpler pieces, you need to introduce some rudimentary notion of "measure zero", to make sure the boundaries you cut along aren't too large.

• Improper integrals: In one variable, we always take limits as the bounds of the integral go somewhere. In many variables, there are uncountably many different limiting processes which could define $\int_{\mathbb{R}^n} f$.
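The one-variable partition argument in the Riemann sums item above can be checked numerically. This is only a sketch: the choices $\phi(t) = t^2$ on $[1,2]$ and $f = \cos$ are my own illustrative assumptions, as is the function name. It pushes a partition of $[1,2]$ forward through $\phi$ and forms the resulting Riemann sum for $f$ on $[1,4]$.

```python
import math

def pushed_forward_riemann_sum(f, phi, a, b, n):
    """Left Riemann sum of f over the partition phi(x_0) < ... < phi(x_n).

    Because phi is an increasing bijection, the image of a partition of
    [a, b] is again a partition of [phi(a), phi(b)]; no geometry is needed.
    """
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    total = 0.0
    for left, right in zip(xs, xs[1:]):
        # Sample f at the image of the left endpoint; the width is the
        # length of the image interval.
        total += f(phi(left)) * (phi(right) - phi(left))
    return total

# Hypothetical example: phi(t) = t^2 maps [1, 2] onto [1, 4].
approx = pushed_forward_riemann_sum(math.cos, lambda t: t * t, 1.0, 2.0, 20000)
exact = math.sin(4.0) - math.sin(1.0)  # integral of cos over [1, 4]
```

Nothing comparable is available in several variables, since the image of a rectangle under a smooth bijection is usually not a rectangle.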

And that's not even getting into manifolds, or multilinear algebra…

While I’m shamelessly plugging stuff for early career people, Dave Penneys, Julia Plavnik, and I are running an MRC this summer in Quantum Symmetry. It’s aimed at people at -2 to +5 years from Ph.D. working in tensor categories, subfactors, topological phases of matter, and related fields, and the deadline is Feb 15.


As many of you know, the US House and Senate have passed revisions to the tax code. Under the House draft, but not the Senate draft, graduate tuition remissions are taxed as income. For example, here at U Michigan, our graduate stipend is 19K and our tuition is 12K. If the House version takes effect, our students would be taxed as if receiving 31K without getting a penny more to pay it with.

It is thus crucial which version of the bill goes forward. The first meeting of the reconciliation committee is TONIGHT, at 6 PM Eastern Time. Please contact your congress people. You can look up their contact information here. Even if they are clearly on the right side of this issue, they need to be able to report how many calls they have gotten about it when arguing with their colleagues. Remember — be polite, make it clear what you are asking for, and make it clear that you live and vote in their district. If you work at a large public university in their district, you may want to point out the effect this will have on that university.

I’ll try to look up information about congress people who are specifically on the committee or otherwise particularly vulnerable. Jordan Ellenberg wrote a “friends only” facebook post relevant to this, which I encourage him to repost publicly, on his blog or in the comments here.

**UPDATE** According to the Wall Street Journal, the grad tax is out. Ordinarily, I thank my congress people when they’ve been on the right side of an issue and won. (Congress people are human, and appreciate thanks too!) In this case, I believe the negotiations happened largely in secret, so I’m not sure who deserves thanks. If anyone knows, feel free to post below.

Unfortunately, sometime in the intervening period they have quietly withdrawn some of the rights they gave to that content. In particular, they no longer give the right to redistribute on non-commercial terms. Of course, the 2013 licence is no longer available on their website, but thankfully David Roberts saved a copy at https://plus.google.com/u/0/+DavidRoberts/posts/asYgXTq9Y2r. The critical sentence there is

“Users may access, download, copy, display, redistribute, adapt, translate, text mine and data mine the articles provided that: …”

The new licence, at https://www.elsevier.com/about/our-business/policies/open-access-licenses/elsevier-user-license now reads

“Users may access, download, copy, translate, text and data mine (but may not redistribute, display or adapt) the articles for non-commercial purposes provided that users: …”

I think this is pretty upsetting. The big publishers hold the copyright on our collective cultural heritage, and they can deny us access to the mathematical literature at a whim. The promise that we could redistribute on a non-commercial basis was a guarantee that we could preserve the literature. If this is to be taken away, I hope that mathematicians will go to war again.

Hopefully Elsevier will soon come out with a “oops, this was a mistake, those lawyers, you know?” but this will only happen if we get on their case.

What to do:

- Elsevier journal editors: please contact your Elsevier representatives, and ask that the licence for the open archives be restored to what it was, to assure the mathematical community that we have ongoing access to the old literature.
- Elsevier referees and authors: please contact your journal editors, to ask them to contact Elsevier. If you are currently refereeing or submitting, please bring up this issue directly.
- Everyone: contact Elsevier, either by email or social media (Twitter, Facebook, Google+).
- Happily, as we have a copy of the 2013 licence, all the Elsevier open mathematics archive up to 2009 is still available for non-commercial redistribution under their terms. You can find these at https://tqft.net/misc/elsevier-oa/.

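Here is the result we are proving: Let $q$ be a prime power and let $C_q$ be the cyclic group of order $q$. Let $A \subseteq C_q^n$ be a set which does not contain any three term arithmetic progression, except for the trivial progressions $(a,a,a)$. Then

$$|A| \leq 3 \cdot \#\left\{ (m_1, \ldots, m_n) \in \{0, 1, \ldots, q-1\}^n : \sum m_i \leq n(q-1)/3 \right\}.$$

The exciting thing about this bound is that it is exponentially better than the obvious bound of $q^n$. Until recently, all people could prove was bounds like $q^n/n^c$, and this is still the case if $q$ is not a prime power.

All of our bounds extend to the colored version: Let $(a_1, b_1, c_1), \ldots, (a_N, b_N, c_N)$ be a list of triples in $(C_q^n)^3$ such that $a_i - 2b_i + c_i = 0$, but $a_i - 2b_j + c_k \neq 0$ if $(i,j,k)$ are not all equal. Then the same bound applies to $N$. To see that the previous problem is a special case of this one, take the triples $(a,a,a)$ for $a \in A$. Once the problem is cast this way, if $q$ is odd, one might as well define $b'_i = -2 b_i$, so our hypotheses are that $a_i + b'_i + c_i = 0$ but $a_i + b'_j + c_k \neq 0$ if $(i,j,k)$ are not all equal. We will prove our bounds in this setting.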

I must admit, this is the least slick of the three arguments. The reader who wants to cut to the slick versions may want to scroll down to the other sections.

We will put an abelian group structure on the set which is isomorphic to , using formulas found by Witt. I give an example first: Define an addition on by

The reader may enjoy verifying that this is an associative addition, and makes into a group isomorphic to . For example, and .
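For readers who would rather let a computer do the verifying, here is a minimal sketch. It assumes the standard length-two Witt addition over $\mathbb{F}_p$, namely $(x_0, x_1) + (y_0, y_1) = \left(x_0 + y_0,\ x_1 + y_1 + \tfrac{x_0^p + y_0^p - (x_0+y_0)^p}{p}\right)$, which may be presented differently from the formulas in this post. The function names, and the map to $\mathbb{Z}/p^2$ via $x_0^p + p x_1$, are my own choices for illustration.

```python
def witt_add(x, y, p):
    """Length-two Witt addition on pairs with entries in {0, ..., p-1} (assumed formula)."""
    x0, x1 = x
    y0, y1 = y
    # x0^p + y0^p - (x0 + y0)^p is divisible by p (binomial theorem),
    # so the integer division below is exact.
    carry = (x0**p + y0**p - (x0 + y0)**p) // p
    return ((x0 + y0) % p, (x1 + y1 + carry) % p)

def to_cyclic(x, p):
    """Candidate isomorphism to Z/p^2: Teichmuller-style lift of x0 plus p times x1."""
    x0, x1 = x
    return (x0**p + p * x1) % (p * p)

def is_group_isomorphism(p):
    elems = [(a, b) for a in range(p) for b in range(p)]
    bijective = sorted(to_cyclic(x, p) for x in elems) == list(range(p * p))
    additive = all(
        to_cyclic(witt_add(x, y, p), p) == (to_cyclic(x, p) + to_cyclic(y, p)) % (p * p)
        for x in elems
        for y in elems
    )
    return bijective and additive
```

For $p = 2$ the carry term reduces to $x_0 y_0$, which is just binary addition with carry.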

In general, Witt found formulas

such that becomes an abelian group isomorphic to . If we define and to have degree , then is homogeneous of degree . (Of course, Witt did much more: See Wikipedia or Rabinoff.)

Write

.

and set

.

For example, when , we have

.

So if and only if in .

We now work with variables, , and , where and . Consider the polynomial

.

Here each is a polynomial in variables.

So is a polynomial on . We identify this domain with . Then if and only if in the group .

We define the degree of a monomial in the , and by setting . In this section, “degree” always has this meaning, not the standard one. The degree of is ; the degree of is and the degree of is .

From each monomial in , extract whichever of , or has lowest degree. We see that we can write

where , and are monomials of degree .

The now-standard argument (I like Terry Tao’s exposition) shows that is bounded by three times the number of monomials of degree . One needs to check that the argument works when the “degree” of a variable need not be , but this is straightforward.

Except we have a problem! There are too many monomials. To solve this issue, let be the polynomial obtained from by replacing every monomial by where with if and if . So coincides with as a function on , but uses smaller monomials. For example, the reader who multiplies out the expression for when will find a term . In , this is replaced by . The polynomial does not have the nice factorization of , but it is much smaller. For example, when , has nonzero monomials and has . Replacing by can only lower degree, so . Now rerun the argument with . Our new bound is three times the number of monomials of degree , **with the additional condition** that all exponents are .

Now, the monomial has degree . Identify with by sending to . We can thus think of as . We get the bound , just as in the prime case.
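To get a feel for the size of this bound, here is a small sketch that counts the relevant exponent vectors directly. It assumes the count takes the usual form: vectors $(m_1, \ldots, m_n)$ with $0 \leq m_i \leq q-1$ and $\sum m_i \leq n(q-1)/3$. The function names are hypothetical.

```python
def count_bounded_monomials(q, n):
    """Count (m_1, ..., m_n) with 0 <= m_i <= q-1 and sum(m_i) <= n(q-1)/3."""
    max_total = n * (q - 1)
    ways = [0] * (max_total + 1)  # ways[s] = number of vectors so far with sum s
    ways[0] = 1
    for _ in range(n):
        new = [0] * (max_total + 1)
        for s, w in enumerate(ways):
            if w:
                for m in range(q):
                    if s + m <= max_total:
                        new[s + m] += w
        ways = new
    # Compare 3*s with n(q-1) to avoid fractions in the threshold.
    return sum(w for s, w in enumerate(ways) if 3 * s <= max_total)

def three_times_monomial_count(q, n):
    """Three times the monomial count, to be compared with the trivial bound q**n."""
    return 3 * count_bounded_monomials(q, n)
```

For small $n$ the factor of three makes the bound worse than $q^n$; the exponential saving only shows up as $n$ grows.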

Let’s be much slicker. Here is how Naslund and Sawin do it (original here).

Notice that, by Lucas’s theorem, the function is a well defined function when . Moreover, using Lucas again,

Define a function by

.

Here we have expanded by Vandermonde’s identity and used .
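Both ingredients are easy to sanity-check by machine. The sketch below assumes the statements being used are (i) for $k < q = p^r$, the residue of $\binom{a}{k}$ modulo $p$ depends only on $a$ modulo $q$, which is the consequence of Lucas's theorem needed for well-definedness, and (ii) Vandermonde's identity $\binom{a+b}{k} = \sum_j \binom{a}{j} \binom{b}{k-j}$. The function names are mine.

```python
from math import comb

def binom_mod_p_is_periodic(p, r):
    """Check that a -> C(a, k) mod p has period q = p**r for every k < q."""
    q = p ** r
    return all(
        comb(a + q, k) % p == comb(a, k) % p
        for k in range(q)
        for a in range(2 * q)
    )

def vandermonde_holds(a, b, k):
    """Check C(a+b, k) == sum_j C(a, j) * C(b, k-j) for nonnegative integers."""
    return comb(a + b, k) == sum(comb(a, j) * comb(b, k - j) for j in range(k + 1))
```

Note that the periodicity check only covers a finite range of $a$; it is a spot check, not a proof.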

Define a function by just as before. So if and only if in the abelian group . Expanding gives a sum of terms of the form . Considering such a term to have “degree” , we see that has degree .

As in the standard proof, factor out whichever of , or , has least degree. We obtain

where , and are products of binomial coefficients and, taking , we have , and .

We derive the bound , exactly as before.

I have taken the most liberties in rewriting this argument, to emphasize the similarity with the other arguments. The reader can see the original here.

Let . Let be the ring of functions with pointwise operations, and let be the group ring of . We think of acting on by .

Let , , …, be generators for . Let the functions annihilated by the operators where . For example, is the functions which obey for any , and . We think of as polynomials of degree , and the dimension of is the number of monomials in variables of total degree where each variable has degree .

Define by and otherwise. Define by .

We write , and for the generators of the three factors in .

Then we have

So, if , then as a function on .

On the other hand, we can expand for , and in . We see that, if , then

.

We make the familiar deduction: We can write in the form

where , and run over a basis for .

Once more, we obtain the bound .

Petrov’s method has an advantage not seen in the other proofs: It generalizes well to the case that is non-abelian. For any finite group , let be a one-sided ideal in obeying . In our case, this is the ideal generated by with . Then we obtain a bound for sum free sets in .

I find Petrov’s proof immensely clarifying, because it explains why the arguments all give the same bound. We are all working with functions . I write them as polynomials in variables , Naslund and Sawin use binomial coefficients . The formulas to translate between our variables are a mess: For example, my is their . However, we both agree on what it means to be a polynomial of degree : It means to be annihilated by .

In both cases, we take the indicator function of the identity and pull it back to along the addition map. The first two proofs use explicit identities to see that the result has degree . The third proof points out this is an abstract property of functions pulled back along addition of groups, and has nothing to do with how we write the functions as explicit formulas.

I sometimes think that mathematical progress consists of first finding a dozen proofs of a result, then realizing there is only one proof. My mental image is settling a wilderness — first there are many trails through the dark woods, but later there is an open field where we can run wherever we like. But can we get anywhere beyond the current bounds with this understanding? All I can say is not yet…

The preprint is here.

Let me first explain the problem. Let $G$ be an abelian group. A subset $A$ of $G$ is said to be free of three term arithmetic progressions if there are no solutions to $a - 2b + c = 0$ with $a$, $b$, $c \in A$, other than the trivial solutions $(a,a,a)$. I’ll write $C_N$ for the cyclic group of order $N$. Ellenberg and Gijswijt, building on work by Croot, Lev and Pach, have recently shown that such an $A$ in $C_3^n$ can have size at most $3 \cdot \#\{ (m_1, \ldots, m_n) \in \{0,1,2\}^n : \sum m_i \leq 2n/3 \}$, which is $O(2.756^n)$. This was the first upper bound better than $3^n/n^c$, and has set off a storm of activity on related questions.

Robert Kleinberg pointed out the argument extends just as well to bound colored sets without arithmetic progressions. A colored set is a collection of triples $(a_i, b_i, c_i)$ in $G^3$, and we say that it is free of arithmetic progressions if we have $a_i - 2 b_j + c_k = 0$ if and only if $i = j = k$. So, if $(a_i, b_i, c_i) = (a_i, a_i, a_i)$, then this is the same as a set free of three term arithmetic progressions, but the colored version allows us the freedom to set the three coordinates separately.

Moreover, once $a$, $b$ and $c$ are treated separately, if $|G|$ is odd, we may as well replace $b_i$ by $-2 b_i$ and just require that $a_i + b_j + c_k = 0$ if and only if $i = j = k$. This is the three-colored sum-free set problem. Three-colored sum-free sets are easier to construct than three-term arithmetic-progression free sets, but the Croot-Lev-Pach/Ellenberg-Gijswijt bounds apply to them as well^{*}.

Our result is a matching of upper and lower bounds: There is a constant such that

(1) We can construct three-colored sum-free subsets of of size and

(2) For a prime power, we can show that three-colored sum-free subsets of have size at most .

So, what is ? We suspect it is the same number as in Ellenberg-Gijswijt, but we don’t know!

When is prime, Ellenberg and Gijswijt establish the bound . Petrov, and independently Naslund and Sawin (in preparation), have extended this argument to prime powers.

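In the set $\{ (m_1, \ldots, m_n) \in \{0, \ldots, q-1\}^n : \sum m_i \leq n(q-1)/3 \}$, almost all the $n$-tuples have a particular mix of components. For example, when $q = 3$, almost all tuples have roughly $0.514n$ zeroes, $0.305n$ ones and $0.181n$ twos. The number of such tuples is roughly $\exp(n \, \eta(0.514, 0.305, 0.181))$, where $\eta$ is the entropy $\eta(p_0, \ldots, p_k) = -\sum p_j \log p_j$.

In general, let $p = (p_0, p_1, \ldots, p_{q-1})$ be the probability distribution on $\{0, 1, \ldots, q-1\}$ which maximizes entropy, subject to the constraint that the expected value $\sum j p_j$ is $(q-1)/3$. Then almost all $n$-tuples in the set above have roughly $p_j n$ copies of $j$, and the number of such $n$-tuples grows like $\exp(n \, \eta(p))$. I’ll call $p$ the EG-distribution.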
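Here is a minimal sketch of how one might compute this distribution numerically. It relies on the standard Lagrange multiplier fact, not spelled out in this post, that the entropy maximizer under a mean constraint has the form $p_j \propto x^j$ for some $x > 0$; the code finds $x$ by bisection. The function names are hypothetical.

```python
import math

def eg_distribution(q, tol=1e-12):
    """Max-entropy distribution on {0, ..., q-1} with mean (q-1)/3, assuming p_j ∝ x^j."""
    target = (q - 1) / 3

    def mean(x):
        weights = [x ** j for j in range(q)]
        return sum(j * w for j, w in enumerate(weights)) / sum(weights)

    # mean(0) = 0 and mean(1) = (q-1)/2 > target, and mean is increasing in x.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean(mid) < target:
            lo = mid
        else:
            hi = mid
    x = (lo + hi) / 2
    weights = [x ** j for j in range(q)]
    z = sum(weights)
    return [w / z for w in weights]

def growth_rate(q):
    """exp(entropy) of the distribution above, i.e. the base of the exponential bound."""
    p = eg_distribution(q)
    entropy = -sum(pj * math.log(pj) for pj in p if pj > 0)
    return math.exp(entropy)
```

For $q = 3$ the equation for $x$ is the quadratic $4x^2 + x - 2 = 0$, so the bisection can be compared against a closed form.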

So Robert and I set out to construct three-colored sum-free sets of size

. What we were actually able to do was to construct such sets whenever there was an -symmetric probability distribution on such that was the marginal probability that the first coordinate of was , and the same for the and coordinates. For example, in the case, if we pick , and with probability and , and with probability , then the resulting distribution on each of the three coordinates is the EG-distribution , and we can realize the growth rate of the EG bound for .

Will pointed out to us that, if such a probability distribution on does not exist, then we can lower the upper bound! So, here is our result:

Consider all -symmetric probability distributions on . Let be the corresponding marginal distribution, with the probability that the first coordinate of will be . Let be the largest value of for such a . Then

(1) There are three-colored sum-free subsets of of size and

(2) If is a prime power, such sets have size at most .

Any marginal of an -symmetric distribution on has expected value , so our upper bound is at least as strong as the Ellenberg-Gijswijt/Petrov-Naslund-Sawin bound. We suspect they are the same: That their optimal probability distribution is such a marginal. But we don’t know!

Here are a few remarks:

(1) The restriction to -symmetric distributions is a notational convenience. Really, all we need is that all three marginals are equal to each other. But we might as well impose -symmetry because, if all the marginals of a distribution are equal, we can just take the average over all permutations of that distribution.

(2) Our lower bound does not need $q$ to be a prime power. I’d love to know whether the upper bound can also remove that restriction.

(3) If the largest entropy of a marginal comes from a distribution on with all , then the marginal distribution is the EG distribution. The problem is about the distributions at the boundary; it seems hard to show that it is always beneficial to perturb inwards.

(4) For , there is more than one distribution on with the required marginal. One canonical choice would be the one which has largest entropy given that marginal. If the optimal solution has all , then one can show that it factors as for some function .

* One exception is that Fedor Petrov has lowered the bound for AP free sets in to , whereas the bound for sum-free is still . But, as you will see, I am chasing much rougher bounds here.


Let me first say that the cc-by-0 license is no problem at all as it allows for other publications without restrictions. Second, our copyright statement of course only talks about the version published in one of our journals, with our copyright line (or the copyright line of a partner society if applicable, or the author’s copyright if Open Access is chosen) on it.

At least if you are publishing in a Springer journal, and more generally, I would strongly encourage you to post your papers to the arXiv under the more permissive CC-BY-0 license, rather than the minimal license the arXiv requires.

As a question to any legally-minded readers: does copyright law genuinely distinguish between “the version published in one of our journals, with our copyright line”, and the “author-formatted post-peer-review” version which is substantially identical, barring the journal’s formatting and copyright line?


According to the Talmud, in order for the Sanhedrin to sentence a man to death, the majority of them must agree to it. However

R. Kahana said: If the Sanhedrin unanimously find [the accused] guilty, he is acquitted. (Babylonian Talmud, Tractate Sanhedrin, Folio 17a)

Scott Alexander has a devious mind and considers how he would respond to this rule as a criminal:

[F]irst I’d invite a bunch of trustworthy people over as eyewitnesses, then I’d cover all available surfaces of the crime scene with fingerprints and bodily fluids, and finally I’d make sure to videotape myself doing the deed and publish the video on YouTube.

So, suppose you were on a panel of judges, all of whom had seen overwhelming evidence of the accused’s guilt, and wanted to make sure that a majority of you would vote to convict, but not all of you. And suppose you cannot communicate. With what probability would you vote to convict?

Test your intuition by guessing an answer now, then click below:

My gut instincts were that (1) we should choose $p$ really close to $1$, probably approaching $1$ as $n \to \infty$ and (2) there is no way this question would have a precise round answer. As you will see, I was quite wrong.

Tumblr user lambdaphagy is smarter than I was and wrote a program. Here are his or her results:

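As you can see, it appears that $p$ is not approaching $1$, or even coming close to it, but is somewhere near $0.8$. Can we explain this?

We want to avoid two events: unanimity, and a majority vote to acquit. The probability of unanimity is $p^n$.

The probability of a majority vote to acquit is $\sum_{k \leq n/2} \binom{n}{k} p^k (1-p)^{n-k}$. Assuming that $p > 1/2$, and it certainly should be, almost all of the contribution to that sum will come from terms where $k \approx n/2$. In that case, $\binom{n}{k} p^k (1-p)^{n-k} \approx \frac{2^n}{\sqrt{n}} \left( p(1-p) \right)^{n/2}$, up to a constant factor. And we’ll roughly care about $\sqrt{n}$ such terms. So the odds of acquittal are roughly $\left( 2 \sqrt{p(1-p)} \right)^n$.

So we roughly want $p^n + \left( 2 \sqrt{p(1-p)} \right)^n$ to be as small as possible. For $n$ large, one of the two terms will be much larger than the other, so it is the same to ask that $\max\left( p, 2 \sqrt{p(1-p)} \right)$ be as small as possible.

Here is a plot of $\max\left( p, 2 \sqrt{p(1-p)} \right)$:

Ignore the part with $p$ below $1/2$; that’s clearly wrong and our approximation that the sum is dominated by $k \approx n/2$ won’t be good there. Over the range $(1/2, 1)$, the minimum is where $p = 2 \sqrt{p(1-p)}$.

Let’s do some algebra: $p^2 = 4p(1-p)$, $p = 4(1-p)$ (since $p = 0$ is clearly wrong), $p = 4/5$. Holy cow, $4/5$ is actually right!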
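The heuristic is easy to test against exact numbers. The sketch below is my own, not lambdaphagy's program. It counts a strategy as failing if the judges are unanimous for conviction or if there is no strict majority for conviction, and it finds the best conviction probability by a crude grid search. The function names and the grid resolution are assumptions.

```python
from math import comb

def failure_probability(n, p):
    """Chance that n independent judges, each convicting with probability p,
    either convict unanimously or fail to produce a strict majority to convict."""
    unanimous = p ** n
    no_majority = sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if 2 * k <= n  # k convictions is not a strict majority
    )
    return unanimous + no_majority

def best_conviction_probability(n, steps=1000):
    """Grid search over p in [0, 1] for the smallest failure probability."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda p: failure_probability(n, p))
```

Running `best_conviction_probability(n)` for increasing `n` should show the optimum creeping up toward $4/5$ from below, since for finite $n$ the polynomial factors slightly favor smaller $p$.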

First of all, actually do some computations.

Secondly, I was wrongly thinking that failing by acquittal would be much more important than failing by unanimity. I think I was misled because one of them occurs for $n/2$ values of $k$ and the other only occurs for one value. I should have realized two things: (1) the bell curve is tightly peaked, so it is really only the $k$ very close to $n/2$ which matter and (2) exponentials are far more powerful than the ratio between $n/2$ or $\sqrt{n}$ and $1$ anyway.

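Finally, for the skeptics, here is an actual proof. Assuming $p > 1/2$, we have

$$\sum_{k \leq n/2} \binom{n}{k} p^k (1-p)^{n-k} \leq \sum_{k \leq n/2} \binom{n}{k} \left( p(1-p) \right)^{n/2} \leq \left( 2 \sqrt{p(1-p)} \right)^n.$$

The main step is to replace each $p^k (1-p)^{n-k}$ by the largest it can be.

But also, taking $n$ even for simplicity (the odd case only changes a constant factor),

$$\sum_{k \leq n/2} \binom{n}{k} p^k (1-p)^{n-k} \geq \binom{n}{n/2} \left( p(1-p) \right)^{n/2} \geq \frac{1}{n+1} \left( 2 \sqrt{p(1-p)} \right)^n.$$

Here we have lower bounded the sum by one of its terms, and then used the easy bound $\binom{n}{n/2} \geq 2^n/(n+1)$ since it is the largest of the $n+1$ entries in a row of Pascal’s triangle which sums to $2^n$.

So the odds of failure are bounded between $p^n + \frac{1}{n+1} \left( 2 \sqrt{p(1-p)} \right)^n$ and $p^n + \left( 2 \sqrt{p(1-p)} \right)^n$. We further use the convenient trick of replacing a $+$ with a $\max$, up to bounded error, to get that the odds of failure are bounded between $\frac{1}{n+1} \max\left( p, 2 \sqrt{p(1-p)} \right)^n$ and $2 \max\left( p, 2 \sqrt{p(1-p)} \right)^n$.

Now, let $q$ be a probability greater than $1/2$ other than $4/5$. We claim that choosing conviction probability $4/5$ will be better than $q$ for $n$ large. Indeed, the $q$-strategy will fail with odds at least $\frac{1}{n+1} \max\left( q, 2 \sqrt{q(1-q)} \right)^n$, and the $4/5$ strategy will fail with odds at most $2 (4/5)^n$. Since $q \neq 4/5$, one of the two exponentials in the first case is larger than $4/5$, and the $q$-strategy is more likely to fail, as claimed.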

Of course, for a Sanhedrin of members, , so our upper bound predicts only a one percent probability of failure. More accurate computations give . So the whole conversation deals with the overly detailed analysis of an unlikely consequence of a bizarre hypothetical event. Fortunately, this is not a problem in the study of Talmud!

This is the corner of a crystal of salt, as seen under an electron microscope. (I took the image from here, unfortunately I couldn’t find better information about the sourcing.) As you can see, the corner is a bit rounded, where some of the molecules have rubbed away. They ask the question: “What is the shape of that rounded corner?”

The molecules of a salt crystal form a cubical lattice. We can index them as $(i,j,k) \in \mathbb{Z}_{\geq 0}^3$. But it’s not the ones from the interior that are missing: if $(i,j,k)$ is missing and $i' \leq i$, $j' \leq j$ and $k' \leq k$, then $(i',j',k')$ is missing as well.

A finite subset $\pi$ of $\mathbb{Z}_{\geq 0}^3$ with the property that $(i,j,k) \in \pi$ implies $(i',j',k') \in \pi$ for $i' \leq i$, $j' \leq j$, $k' \leq k$ is what I’ll call a **three dimensional partition**. (Here I am deliberately rebelling against the awful classical terminology, which is a “plane partition”.)

The opening of Kenyon, Okounkov and Sheffield’s paper discusses the question: “What is the shape of a random three dimensional partition of size $n$?”
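To make the definition concrete, here is a small sketch. The first function checks the closure condition for a finite set of lattice points with nonnegative coordinates. The second counts three dimensional partitions by size using MacMahon's product formula $\prod_{k \geq 1} (1 - x^k)^{-k}$, a classical result that I am importing here; it is not derived in this post. The function names are hypothetical.

```python
def is_three_dimensional_partition(cells):
    """True if the finite set of (i, j, k) is closed under decreasing any coordinate."""
    cells = set(cells)
    for (i, j, k) in cells:
        if min(i, j, k) < 0:
            return False
        # Checking the three immediate predecessors suffices, by induction.
        for pred in ((i - 1, j, k), (i, j - 1, k), (i, j, k - 1)):
            if min(pred) >= 0 and pred not in cells:
                return False
    return True

def count_by_size(max_size):
    """Coefficients of prod_{k>=1} (1 - x^k)^(-k) up to x^max_size (MacMahon)."""
    coeffs = [1] + [0] * max_size
    for k in range(1, max_size + 1):
        for _ in range(k):  # multiply by 1/(1 - x^k), k times
            for s in range(k, max_size + 1):
                coeffs[s] += coeffs[s - k]
    return coeffs
```

For instance, the three partitions of size two are the three ways of placing a second cube next to the corner cube, one along each axis.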

It was only after I had worked through this paper, and its sequels 1 2 fairly carefully that I realized they hadn’t actually answered their motivating question. They certainly implied what the answer should be, and they laid out all the necessary tools, but they never came back and said “let’s do the salt crystal example”.

In this post, I want to lay out in outline how this question is answered. The previous posts on Legendre transforms in statistical mechanics, and on random partitions, were meant as warm ups, where it is easier to make complete arguments.

Let me acknowledge right out that I am making no attempt at rigor here. What I want to do is sketch the argument, and hope this encourages some of you to read the amazing sequence of papers I have linked to.

In the previous posts, our first step was to analyze partitions of a given slope. Namely, we showed that there are roughly partitions that fit in a box, where . We would now like to similarly analyze three dimensional partitions with a given slope.

We could specify a normal vector and ask for the partition to have slope roughly , but there is an approach which turns out to be equivalent and is a bit easier to formulate. Let be positive real numbers with . The boundary of a plane partition is made up of squares which lie either in the plane, the plane or the plane. Let us suppose there were some function such that the number of partitions with some specified boundary, using roughly squares of the first type, of the second type and of the third type, is roughly . We would like to know what this function is.

I said “some specified boundary”. What boundary shall we use? As is often the case in mathematical physics, the nicest thing to do is to use “periodic boundary conditions” — which is to say, to avoid the issue of boundaries by wrapping the problem on a torus.

A perspective drawing can give us inspiration. The boundary of a three dimensional partition looks like a tiling of the plane by rhombi with angles $60^\circ$ and $120^\circ$. The three planes which squares can lie in turn into the three possible orientations of the rhombi (colored red, black and blue below).

(Image taken from Eventually Almost Everywhere, who has further nice discussion of the relation between rhombus tilings and plane partitions.)

Tile the plane by equilateral triangles in the standard manner and then quotient that plane by some lattice to produce a torus. Let that torus contain upward pointing and downward pointing triangles; so we can hope to tile it with rhombi.

Let be the number of such tilings which use , and rhombi of the three potential types. So our goal is to determine .

As in the case of two dimensional partitions, the best strategy is to form a generating function. Let .

There is an amazing explicit formula for this generating function.

To tantalize you, I will state it without proof, and simply point you to the search term “Kasteleyn’s method” if you want to learn more.

Let us suppose our torus has a fundamental domain which is , so , and let be even. Set

.

We have (up to possible sign errors on my part)

.

Asymptotically, all four terms contributing to are about the same size, so . And how big is ? We can approximate that sum by an integral:

.

Set . This is known as the “Ronkin function”. Then, as before, we conclude that is the Legendre transform of the Ronkin function. And we conclude that a random three dimensional partition has the shape for some constant .

Something interesting happens. Suppose that . So cannot be zero for any complex numbers and with , . So is a harmonic function when we restrict and to those discs. The average of a harmonic function over the boundary of a disc is the value at the center of the disc. So simply equals when .
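The mean value argument can be checked numerically. The sketch below assumes the function being averaged is $\log |a + bz + cw|$ over the torus $|z| = |w| = 1$, with positive weights $a > b + c$; the names `a`, `b`, `c` and the function name are my own assumptions about the notation. Under that assumption the average should be exactly $\log a$.

```python
import cmath
import math

def torus_average_log(a, b, c, grid=64):
    """Average of log|a + b z + c w| over a grid of points with |z| = |w| = 1."""
    total = 0.0
    for s in range(grid):
        z = cmath.exp(2j * math.pi * s / grid)
        for t in range(grid):
            w = cmath.exp(2j * math.pi * t / grid)
            total += math.log(abs(a + b * z + c * w))
    return total / grid**2

# Hypothetical weights with a > b + c, so a + b z + c w has no zeros in the closed bidisc.
flat_value = torus_average_log(3.0, 1.0, 1.0)
```

When $a < b + c$ the function has zeros on or inside the torus, the mean value argument no longer applies, and this naive grid average also becomes unreliable near the logarithmic singularities.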

Thus, the surface contains, in particular, the planar region , , and similar planar regions in the and planes. This is a major difference between the two dimensional and three dimensional limit shapes — no part of the two dimensional limit curve is contained in the coordinate axes. While I'm not sure how seriously this picture should really be taken, it might do a bit to explain why those salt crystals above look so cubical.

There is another very important difference between the two and three dimensional pictures. In the two dimensional case, the only solutions of the variational problem were of the form . But, in the three dimensional case, is only one of many solutions to the resulting PDE. There is so much more to say about all of this. If you want to read more, I refer you to Chapter One of Kenyon and Okounkov.
