Here is a fun problem, with a great story and a surprising answer.
According to the Talmud, in order for the Sanhedrin to sentence a man to death, the majority of them must agree to it. However
R. Kahana said: If the Sanhedrin unanimously find [the accused] guilty, he is acquitted. (Babylonian Talmud, Tractate Sanhedrin, Folio 17a)
Scott Alexander has a devious mind and considers how he would respond to this rule as a criminal:
[F]irst I’d invite a bunch of trustworthy people over as eyewitnesses, then I’d cover all available surfaces of the crime scene with fingerprints and bodily fluids, and finally I’d make sure to videotape myself doing the deed and publish the video on YouTube.
So, suppose you were on a panel of judges, all of whom had seen overwhelming evidence of the accused’s guilt, and wanted to make sure that a majority of you would vote to convict, but not all of you. And suppose you cannot communicate. With what probability
would you vote to convict?
Test your intuition by guessing an answer now, then click below:
My gut instincts were that (1) we should choose really close to
, probably approaching
as
and (2) there is no way this question would have a precise round answer. As you will see, I was quite wrong.
Tumblr user lambdaphagy is smarter than I was and wrote a program. Here are his or her results:
As you can see, it appears that is not approaching
, or even coming close to it, but is somewhere near
. Can we explain this?
Heuristic solution
We want to avoid two events: unanimity, and a majority vote to acquit. The probability of unanimity is .
The probability of a majority vote to acquit is . Assuming that
, and it certainly should be, almost all of the contribution to that sum will come from terms where
. In that case,
. And we’ll roughly care about
such terms. So the odds of acquittal are roughly
.
So we roughly want to be as small as possible. For
large, one of the two terms will be much larger than the other, so it is the same to ask that
be as small as possible.
Here is a plot of :
Ignore the part with below
; that’s clearly wrong and our approximation that
is dominated by
won’t be good there. Over the range
, the minimum is where
.
Let’s do some algebra: ,
,
(since
is clearly wrong),
. Holy cow,
is actually right!
Lessons learned
First of all, actually do some computations.
Secondly, I was wrongly thinking that failing by acquittal would be much more important than failing by unanimity. I think I was mislead because one of them occurs for values of
and the other only occurs for one value. I should have realized two things (1) the bell curve is tightly peaked, so it is really only the
very close to
which matter and (2) exponentials are far more powerful than the ratio between
or
and
anyway.
Rigorous computation
Finally, for the skeptics, here is an actual proof. Assuming , we have
The main step is to replace each by the largest it can be.
But also,
Here we have lower bounded the sum by one of its terms, and then used the easy bound since it is the largest of the
entries in a row of Pascal’s triangle which sums to
.
So the odds of failure are bounded between
and
. We further use the convenient trick of replacing a
with a
, up to bounded error to get that the odds of failure are bounded between
and
.
Now, let be a probability greater than
other than
. We claim that choosing conviction probability
will be better than
for
large. Indeed, the
-strategy will fail with odds at least
, and the
strategy will fail with odds at most
. Since
, one of the two exponentials in the first case is larger than
, and the
-strategy is more likely to fail, as claimed.
Of course, for a Sanhedrin of members,
, so our upper bound predicts only a one percent probability of failure. More accurately computations give
. So the whole conversation deals with the overly detailed analysis of an unlikely consequence of a bizarre hypothetical event. Fortunately, this is not a problem in the study of Talmud!
Here’s a faster rigorous proof. Consider the derivative of the success probability with respect to a single judge’s probability of convicting. If we show this is >0 for p< x and x then it follows that the derivative with respect to all judge’s probability of convicting is the same, and the correct value is x. But the function of a single judge’s probability is linear, so we just have to consider the difference between if the judge convicts and if the judge acquits. (This is a Nash equilibrium-type argument. In the equilibrium, each judge must be indifferent between his two choices. Normally this is done for competitive or partially cooperative sitatuations, but it works equally well for fully competitive.)
Say there are n+1 judges and n is even. Otherwise just add some floor and ceiling functions.
It only helps if the judge convicts if the other ones are exactly tied, with probability p^{n/2} (1-p)^{n/2}times n choose n/2 . It only helps if the judge acquits if the other ones are exactly unanimous, with probability p^{n}. It’s sufficient to show that the ratio of these is >1 for p<x and x. The ratio is just ((1-p)/p )^{n/2} times n choose n /2, so it clearly satisfies this where x is the ( n choose n/2 )^{2/n}/ ( 1+ ( n choose n/2 )^{2/n} ) n choose n/2 is 2^n divided by a function that grows slower than exponential, so its nth root is 2 – o(1), so the asymptotic is 2^2/ (1 + 2^2) =4/5.
Oh, wow. So you are saying the exact solution is when we have equality in
, not just that this is asymptotically right. That’s very cool, but almost seems to good to be true.
For example, if
, you seem to be saying that the exact solution comes from solving
, which gives the root
, whereas a direct solution would involve solving
, or
.
I suggest plotting the exact answer for the conviction probability for the (canonical value) of N=71 (or, really any sufficiently large N), as a function of p. The answer is essentially flat (and almost exactly 1) for p between 0.6 and 0.9. The maximum is at p=0.79, but it hardly matters what value you choose for p in that range. Scott Alexander is screwed.
I should have said: in the limit of large N, the conviction probability becomes essentially a step function: vanishing below p=0.5, rising rapidly to 1 above p=0.5, and then dropping to zero near p=1. Knowing that the true maximum is at p=0.8 is actually irrelevant.
For small N, things are more interesting
No, I reduced n by 1 to simplify the formulas because there is no tex here. I said that there were n+1 judges. So for 3 judges I’m saying we want 2p(1-p)=p^2, which gives p=2/3.
Gotcha, thanks. For the record, LaTeX is available in comments by writing Slatex 2p(1-p) = p^2S, and replacing the S’s by dollar signs.
Look this problem:
A succinct representation of a set of (distinct) b-bits positive integers is a Boolean circuit C with b input gates. The set represented by C, denoted S_{C}, is defined as follows: Every possible integer of S_{C} should be between 0 and (2^{b} – 1). And j is an element of S_{C} if and only if C accepts the binary representations of the b-bits integer j as input. The problem SUCCINCT MAXIMUM is now this: Given the succinct representation C of a set S_{C} and a b-bits integer x, where C is a Boolean circuit with b input gates, is x the maximum in S_{C}?
It is very easy to show this problem is not in P, because we should need n comparisons to know whether x is the maximum in a set of n (distinct) positive integers when the set is arbitrary. And this number of comparisons will be optimal. This would mean we cannot always accept every instance (C; x) of SUCCINCT MAXIMUM in polynomial-time, because we must use at least n = |S_{C}| comparisons for infinite amount of cases, where |S_{C}| is the cardinality of S_{C}. However, n could be exponentially more large than the size of (C; x).
But, at the same time, it is so easy to show this problem is in coNP. Certainly, given a b-bits integer y, we can check whether C accepts the binary representation of y (which means that y is an element of S_{C}) and x < y in polynomial-time, and thus, we could verify whether (C; x) is a "no" instance SUCCINCT MAXIMUM in polynomial-time.
However, the existence of a problem in coNP and not in P is sufficient to show that P is not equal to NP, because if P would be equal to NP, then P = coNP.
Basically is this, but you could see more in
https://hal.archives-ouvertes.fr/hal-01281254/document