# Random partitions I

With trepidation, I am attempting to start another series of mathematical blog posts. The goal of this one will be to present random partitions as a toy model for thinking about random dimer configurations. Ideally, these posts will help prepare the reader to take on the ground breaking papers of Kenyon and Okounkov.

I say ideally because I have an infant daughter at home — her name is Josephine, she is adorable and demanding — so the installments of this series may be spaced very far apart. I make no plans to apologize for this. But I’ve been feeling the lack of mathematical blogging the last few months, so I’m giving it a shot.

The motivating question for today’s post will be: For $\alpha \in (0,1)$, approximately how many paths are there from $(0,0)$ to $(\alpha n, (1-\alpha) n)$, made up of steps in direction $(1,0)$ and $(0,1)$?

This may seem like a silly question. The exact number of such paths is $\binom{n}{\alpha n }$. There are $n$ steps, $\alpha n$ of which are north and $(1-\alpha) n$ of which are east. So the count is the binomial coefficient.

If we need an approximation, we can haul out Stirling’s approximation. In a weak form, this says $n! = \exp(n \log n - n + o(n))$ So, after a little work we get

$\displaystyle{\binom{n}{\alpha n} = \exp\left(n [ \alpha \log(1/\alpha) + (1-\alpha) \log(1/(1-\alpha)) ] + o(n) \right).}$

We’ll define $s(\alpha) = \alpha \log(1/\alpha) + (1-\alpha) \log(1/(1-\alpha))$, so this says that the number of paths of (taxicab) length $n$ is $\exp(n s(\alpha) + o(n))$. Conceptually, one can think that, for every unit of length of the diagonal, there are $s(\alpha)$ ways to configure that section of the partition.

Even with this crude bound, we can deduce a nonobvious statement: Almost all paths from $(0,0)$ to $(\alpha n, (1-\alpha)n)$ are within $o(n)$ of a straight line path.

Proof“: Suppose, to the contrary, that there were a positive proportion of paths that passed through a point which was significantly off a straight line. These paths can be divided into two shorter paths, one of size $(\alpha' n') \times (1-\alpha') n'$ and one of size $(\alpha'' n'') \times (1-\alpha'') n''$, with $n=n'+n''$ and $\alpha n = \alpha' n' + \alpha'' n''$.

The number of paths of these two types is the product of the number of paths of each type. In other words,

$\displaystyle{ \exp( n' s(\alpha') + n'' s(\alpha'') + o(n) )}.$

But $s$ is easily seen to be concave, so $n' s(\alpha') + n'' s(\alpha'')$ $\leq (n'+n'') s((n' \alpha' + n'' \alpha'')/(n'+n''))$ $= n s(\alpha)$. Moreover, the only way that $n' s(\alpha') + n'' s(\alpha'')$ can even be close to $n s(\alpha)$ is if $\alpha = \alpha' = \alpha''$, meaning that the point is on the main diagonal. In other words, if the point is off the straight line route, the number of paths passing through that point is exponentially less then the number of total paths from $(0,0)$ to $(\alpha n, (1-\alpha)n)$. So almost all paths in the box stay near the straight line. “QED”

By the way, that is the level of rigor you should expect throughout this series. I want to focus on concepts, not on precise bounds.

## Another approach to deriving $s(\alpha)$

Now, let’s forget binomial coefficients and Stirlings formula. Suppose that we suspected there was a function $s(\alpha)$ such that the number of paths was $\exp(n s(\alpha) + o(n))$, but we didn’t know what $s$ was. One can see that $s$ is concave by thinking about concatenating paths, as above. I now explain how, using generating functions, we can understand the properties of $s$.

Let $g_n(x,y)$ be the generating function for all north-easterly paths of length $n$, where we weight a northern step by $e^{x}$ and an eastern step by $e^y$. Of course, $g_n(x,y)$ is simply $(e^x + e^y)^n$. Suppose that we didn’t know this exact formula, but we did know that there was a function $r(x,y)$ such that $g_n(x,y) = \exp( n r(x,y) + o(n))$. So $r(x,y) = \log(e^x + e^y)$, but we are pretending not to know this. We now have two functions $r$ and $s$, which we are pretending not to know, and we would like to figure out how they are related.

We can break the generating function $g_n(x,y)$ up according to how many northern and how many eastern steps we take. So

$\displaystyle{ g_n(x,y) \approx \sum_{k=0}^n \exp(n s(k/n)) e^{kx + (n-k) y}}$

or

$\displaystyle{ \exp(n r(x,y)) \approx \sum_{k=0}^n \exp\left( n \cdot [s(k/n) + (k/n) x + (n-k)/n y] \right)}$

When $n$ gets large, only the largest term will contribute to the sum. So we should have

$\displaystyle{ r(x,y) = \max_{\alpha} \left( s(\alpha) + \alpha x + (1-\alpha) y \right). }$

Since $s$ is concave, this maximum will be achieved at a unique $\alpha$. Assuming that $s$ is sufficiently differentiable, this $\alpha$ will obey $s'(\alpha) = y-x$. In short we have shown:

 If $s'(\alpha) = y-x$, then $s(\alpha) + \alpha x + (1-\alpha) y = r(x,y)$.

The reader may enjoy verifying that our $r$ and $s$ do in fact verify this relation. The standard term is that $r$ and $s$ are Legendre dual.

## Coloring paths

Let’s do a slightly harder problem. We’ll count paths from $(0,0)$ to $(\alpha n, (1-\alpha) n)$ again only, this time, when we take a step from $(x,y)$ to $(x,y+1)$, and $x+y$ is even, we can color the step either black or white. Approximately how many such colored paths are there?

This time, there is no simple formula for $s(\alpha)$. But the same method works to obtain a differential equation. The generating function which used to be $(e^x+e^y)^n$ would now be $(e^x + e^y)^{n/2} (e^x + 2 e^y)^{n/2}$ and $r(x,y)$ would be $(1/2) \log(e^x+e^y) + (1/2) \log(e^x + 2 e^y)$. So $s$ is Legendre dual to $(1/2) \log(e^x+e^y) + (1/2) \log(e^x + 2 e^y)$ — whatever that may be.

## Next time

My goal for next time is to derive the shape of a randomly chosen partition of size $N$. See you then!

## 8 thoughts on “Random partitions I”

1. This looks promising but I got confused by some typos, there is a small s function and a big s function, one s I think should be an r (before “the reader may enjoy”), one g_n became r_n… if you fix these I will try again to read it.

Congratulations on your very young researcher at home!

2. Several typos fixed. Did I get them all?

3. Much improved! It is readable to me, I saw just one typo… after “there was a function r(x, y) such that” I guess the next formula should start with g_n.

I see how to compute r from s, in two ways. But how do we compute s from r? Or is that for a future episode?

4. @Daveagp Let me outline the basics, and point you to the Wikipedia article on Legendre transforms for the rest. To make things confusing, I will do this in multiple variables for concave functions (which will be the relevant set up for us), while Wikipedia does just one variable and does convex functions.

So, let $V$ be a real vector space. Let $r$ be a concave function $V \to \mathbb{R}$ defined on some convex subset $K$ of $V$. For $\lambda \in V^*$, define $s(\lambda) = \max_x(r(x)+\lambda(x))$. Let $K^*$ be the set of $\lambda$ for which this maximum is finite. Then $K^*$ is convex and $s$ is concave on $K^*$. Moreover, this relation is self-dualizing up to getting signs right: we have $r(x) = \max_{\lambda}(s(\lambda) - \lambda(x))$. This is called Legendre duality. In particular, you get $r$ from $s$ in the same way (up to getting signs right) that you get $s$ from $r$.

In our setup, $r(x) = \log(e^x+1)$, with $K$ being all of $\mathbb{R}$, and $s(\alpha) = - \alpha \log(\alpha) - (1-\alpha) \log(1-\alpha)$ with $K^*$ being $[0,1]$.

Now, is there a simpler formula for the relation of $r$ to $s$? Sort of. Consider $dr$ as a map $V \to V^*$ and $ds$ as a map $V^* \to V$. Then the maps $ds$ and $dr$ are inverse. See “Another definition” at Wikipedia.

Let’s try using this in our case. $r(x) = \log(e^x+1)$. So $r'(x) = e^x/(e^x+1)$. Inverting $e^x/(e^x+1)$, we get $s'(\alpha) = \log(\alpha/(1-\alpha))$. Then integrating this should give the stated formula for $s$, possibly with a sign error.

Of course, actually getting a closed form expression this way usually won’t work. You have to invert a fairly unconstrained function from $V \to V^*$, then integrate the resulting $1$-form. But does this make it clear what you need to do?

5. As they say, “I see what you did there.” Thanks! I was originally caught off-guard by the fact that the original s and r took a different number of variables, which did not seem to jibe with the wikipedia definition. But I see that r can be thought of a single-input function in the sense that r(x, y)=r(x-y, 0)+y, then I also understand your comment about r of a single variable, and I also see this works in the 2nd setting you introduce at the end. Also, thanks for posting the pair of dual formulas.

6. David Speyer says:

The transition between the one variable and the two variable functions is one of these little awkwardnesses that keeps showing up — the two variable version manifests better the symmetries of the problem, but the one variable version makes easier technical statements.

Having just written this out has given me a nicer way to think about it. Forget about going to the one variable case. Look at $r(x,y) = \log(e^x+e^y)$, a function on all of $\mathbb{R}^2$. Then $K^*$ is the line segment $\{ (\alpha, \beta) : \alpha+\beta =1,\ \alpha, \beta \geq 0 \}$. Really, it is: $\max(r(x,y) - \alpha x - \beta y)$ is not finite for any other values of $(\alpha, \beta)$. So, if you start in the two variable set up, your formula tells you that $s$ should be defined on a one dimensional set.

7. jc says:

I’m quite looking forward to the rest of this series! I learned the stuff in this post as “taking the thermodynamic limit”; In a statistical mechanics course, the function s in the first model of counting paths (equivalently, a lattice gas with particles named “North” and “East”) is known as the entropy of mixing http://en.wikipedia.org/wiki/Entropy_of_mixing , and treating x and y as chemical potentials of the two species, r is a grand potential (I think). I’m having more trouble making sense of the second model in these terms, though. My advisor is fond of saying that “Laplace is the same as Legendre”, referring to the relationship between “thermodynamic potentials” r and s (Legendre transform) and the relationship between “partition functions” g and the total number of paths (Laplace transform) in different statistical mechanical ensembles.