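There are a number of sums over primes whose asymptotic behavior is easy to determine. For example, we know that
$$\log N! = N \log N + O(N)$$
and also $\log N! = \sum_{p \le N} \left( \lfloor N/p \rfloor + \lfloor N/p^2 \rfloor + \cdots \right) \log p$. A bit of rearrangement and some elementary bounds yield
$$\sum_{p \le N} \frac{\log p}{p} = \log N + O(1). \qquad (1)$$
On the other hand, the estimate
$$\pi(N) = \sum_{p \le N} 1 \sim \frac{N}{\log N}$$
is the Prime Number Theorem, whose proof was a major triumph of nineteenth century mathematics.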
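As a quick numerical sanity check, here is a minimal Python sketch that compares the two kinds of estimate on actual primes. It assumes the elementary estimate in question is the Mertens-type bound $\sum_{p \le N} \log(p)/p = \log N + O(1)$ and takes PNT in the form $\pi(N) \sim N/\log N$; the function names `primes_up_to`, `mertens_gap` and `pnt_ratio` are made up for this illustration.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of all primes <= n (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Cross out multiples of i, starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def mertens_gap(n):
    """Return sum_{p <= n} log(p)/p - log(n), which should stay bounded."""
    return sum(math.log(p) / p for p in primes_up_to(n)) - math.log(n)


def pnt_ratio(n):
    """Return pi(n) / (n / log n), which PNT says tends to 1."""
    return len(primes_up_to(n)) / (n / math.log(n))


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5, 10**6):
        print(n, round(mertens_gap(n), 4), round(pnt_ratio(n), 4))
```

The first column of output stays bounded, as the elementary estimate predicts, while the second drifts towards 1 only slowly. Of course a table like this proves nothing; it only shows what the two statements look like on real data.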
If you spend a lot of time reading about the Prime Number Theorem, you’ll see a lot of messing around with sums over primes, and it isn’t clear which ones are elementary and which ones are deep. This post is about a heuristic I found a while ago to separate the two: Would the given asymptotic still hold if primes hated to have first digit nine?
Acknowledgements: Much of this appeared earlier in an answer I left on MO. There is also a lot of similarity to Terry Tao’s post on conspiracies and the prime number theorem. One can think of “primes hate to start with nine” as a “conspiracy” whose consequences I find easy to work with.
Seeing that (1) will not prove PNT
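To be more precise, suppose that
There are no primes between $9 \cdot 10^k$ and $10^{k+1}$, for $k$ large.
There are about $\frac{\log 10}{\log 9} \int_a^b \frac{dx}{\log x}$ primes between $a$ and $b$ for $10^k \le a < b \le 9 \cdot 10^k$ (and $a$ and $b$ not too close together).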
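This would not be consistent with the Prime Number Theorem. PNT gives
$$\#\{ p \text{ prime} : 9 \cdot 10^k < p \le 10^{k+1} \} \sim \frac{10^k}{k \log 10},$$
which goes to $\infty$ as $k \to \infty$.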
So the proposed conspiracy is very inconsistent with the PNT, and any asymptotic property of primes which is strong enough to imply PNT should violate the conspiracy. But (1) is perfectly compatible with this conspiracy.
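Let's see that the conspiracy is consistent with (1). Between $10^j$ and $9 \cdot 10^j$ the conspiracy gives the primes $\frac{\log 10}{\log 9}$ times their usual density $\frac{1}{\log x}$, so for $N$ of the form $9 \cdot 10^k$, the sum would be
$$\sum_{p \le 9 \cdot 10^k} \frac{\log p}{p} \approx \sum_{j=0}^{k} \frac{\log 10}{\log 9} \int_{10^j}^{9 \cdot 10^j} \frac{\log x}{x} \cdot \frac{dx}{\log x} = \sum_{j=0}^{k} \frac{\log 10}{\log 9} \log 9 = (k+1) \log 10.$$
For $N = 10^{k+1}$, the sum would be equal to the same quantity, because there are no additional terms between $9 \cdot 10^k$ and $10^{k+1}$, so we also have
$\sum_{p \le 10^{k+1}} \frac{\log p}{p} \approx (k+1) \log 10$. So $\sum_{p \le 9 \cdot 10^k} \frac{\log p}{p} \approx \log(9 \cdot 10^k) + \log \frac{10}{9}$ and $\sum_{p \le 10^{k+1}} \frac{\log p}{p} \approx \log(10^{k+1})$. Actually, if you are careful about the bounds, you’ll see that the right hand sides should be $\log(9 \cdot 10^k) + \log \frac{10}{9} + C$ and $\log(10^{k+1}) + C$, where $C$ is a constant that has to do with how large the contributions from small primes are. So this conspiracy would imply $\sum_{p \le N} \frac{\log p}{p} = \log N + O(1)$, where the $O(1)$ term oscillates between $C$ and $C + \log \frac{10}{9}$.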
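The oscillation can be seen in a toy model. The sketch below is only an idealisation, under assumptions that are mine and not part of the argument above: primes are replaced by a continuous density, equal to $\frac{\log 10}{\log 9} \cdot \frac{1}{\log x}$ on each interval $[10^j, 9 \cdot 10^j]$ and zero on $(9 \cdot 10^j, 10^{j+1})$, with the factor $\frac{\log 10}{\log 9}$ chosen so that each decade contributes exactly $\log 10$ to the sum. The constant coming from small primes is ignored, and the names `conspiracy_sum` and `error_term` are hypothetical.

```python
import math

# Assumed boost factor: makes each decade [10^j, 9*10^j] contribute log(10).
DENSITY = math.log(10) / math.log(9)


def conspiracy_sum(n):
    """Idealised value of sum_{p <= n} log(p)/p under the toy conspiracy.

    Each decade contributes the integral of
    DENSITY * (log x / x) * (1 / log x) dx = DENSITY * dx / x
    over the part of [10^j, 9*10^j] that lies below n.
    """
    total = 0.0
    j = 0
    while 10**j <= n:
        lo, hi = 10**j, min(n, 9 * 10**j)
        total += DENSITY * (math.log(hi) - math.log(lo))
        j += 1
    return total


def error_term(n):
    """The would-be O(1) term: conspiracy_sum(n) - log(n)."""
    return conspiracy_sum(n) - math.log(n)


if __name__ == "__main__":
    for n in (10**5, 3 * 10**5, 9 * 10**5, 10**6, 9 * 10**6, 10**7):
        print(n, round(error_term(n), 6))
```

In this model the error term climbs from $0$ at $N = 10^k$ to $\log \frac{10}{9}$ at $N = 9 \cdot 10^k$, then falls back to $0$ at $N = 10^{k+1}$: bounded, but never convergent.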
So (1) should not be powerful enough to rule out the possibility that primes hate to start with $9$, and therefore should not be good enough to prove PNT.
On the other hand, if we had a proof that $\sum_{p \le N} \frac{\log p}{p} = \log N + C + o(1)$, that would NOT be compatible with the possibility that primes hate to start with $9$, so it might be good enough to imply PNT and, indeed, it is.
To point out that people really do ask questions where this heuristic helps, see this math.SE question.
Connection to the zeroes of the zeta function
As you’ve probably heard, the key to most proofs of the PNT is to show that $\zeta(1+it) \neq 0$ for any real $t \neq 0$. Let’s see how this is related to the possibility of $9$-avoidance. Remember that we have $\zeta(s) = \prod_p (1 - p^{-s})^{-1}$, so $\log \zeta(s) = -\sum_p \log(1 - p^{-s}) \approx \sum_p p^{-s}$. So $\log |\zeta(1+it)| \approx \sum_p \frac{\cos(t \log p)}{p}$. Now, if $\zeta(1+it)$ were $0$, then we would expect $\sum_p \frac{\cos(t \log p)}{p}$ to be $-\infty$. (In practice, one usually prefers to work with $\zeta'/\zeta$ rather than $\log \zeta$, but either will work for the current discussion.)
Looking at the sum $\sum_p \frac{\cos(t \log p)}{p}$, we expect it to be absolutely divergent, because $\sum_p \frac{1}{p}$ diverges, and there is no reason to expect the cosine to be particularly small. On the other hand, the cosine will have both positive and negative signs, so as long as both signs occur about equally often, the sum probably will converge conditionally. The challenge of proving PNT is showing that there is enough cancellation of signs that we do indeed get conditional convergence.
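Suppose that primes hated to start with $9$, and consider $t = \frac{2\pi}{\log 10}$. Then $\frac{t \log p}{2\pi} = \log_{10} p$ never lies between $k + \log_{10} 9$ and $k+1$ for any sufficiently large integer $k$. Since $\cos(2\pi x)$ is positive on such an interval, a whole lot of positive terms are missing from the sum $\sum_p \frac{\cos(t \log p)}{p}$.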
So one can imagine that, if primes hated to start with $9$, then $\zeta$ would vanish at $1 + \frac{2\pi i}{\log 10}$. This argument isn’t rigorous (one could imagine the primes had other peculiar properties which made the sum converge after all) but it starts to make it clear why the function $\zeta$ comes up. Of course, if you want to actually read a rigorous proof of the PNT, there are plenty of good ones available, and you’ll see that they work with $\zeta$ from the start, without going through any thoughts about writing primes in base ten.
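To make the "missing positive terms" observation concrete, here is a small Python sketch. It assumes the relevant sum is $\sum_p \cos(t \log p)/p$ with $t = 2\pi/\log 10$, so that $\cos(t \log p) = \cos(2\pi \log_{10} p)$, and it simply deletes the real primes whose leading digit is 9. That is a cruder experiment than the conspiracy itself (the surviving primes are not made denser to compensate), and the helper names are made up for the illustration.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of all primes <= n (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


# Assumed frequency: one full turn of the cosine per decade.
T = 2 * math.pi / math.log(10)


def cosine_sum(primes):
    """Partial sum of cos(T log p) / p over the given primes."""
    return sum(math.cos(T * math.log(p)) / p for p in primes)


def starts_with_nine(p):
    """True if the decimal expansion of p has leading digit 9."""
    return str(p)[0] == "9"


if __name__ == "__main__":
    primes = primes_up_to(10**6)
    kept = [p for p in primes if not starts_with_nine(p)]
    print("all primes:      ", cosine_sum(primes))
    print("no leading nines:", cosine_sum(kept))
```

Every deleted term is positive, because a leading digit of 9 puts the fractional part of $\log_{10} p$ in $[\log_{10} 9, 1)$, where the cosine is positive. So the second sum is always smaller than the first, and the gap keeps growing as the cutoff increases.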
Of course, $\zeta$ does have zeroes; they just lie on $\mathrm{Re}(s) = \frac{1}{2}$ rather than $\mathrm{Re}(s) = 1$. The first such zero is at $\frac{1}{2} + i t_1$ where $t_1 \approx 14.1347$. Speaking vaguely, this means that $\cos(t_1 \log p)$ should have a bias towards having one sign over another, although this preference only creates an imbalance of size about $\sqrt{N}$ for primes near $N$. In other words, primes do prefer to have certain “leading digits” in “base” $e^{2\pi/t_1} \approx 1.56$.
In the following graphic, the blue points show how the fractional part of $\frac{t_1 \log p}{2\pi}$ prefers to avoid the left and right end points of the interval $[0,1]$, so the cosine $\cos(t_1 \log p)$ tends to be negative. The red points show an analogous chart computed with a different value of $t$ in place of $t_1$, where no such trend appears. See this post for details of what I am plotting.