In previous post “YAPP (Yet Another Prisoners Problem)” we considered the problem of 100 prisoners each opening 50 out of 100 boxes, surviving iff all 100 prisoners found their unique id in one of the 50 boxes.
Remarkably there was a strategy enabling the prisoners to survive with probability > 30%. Even more remarkably we established that the survival probability continues to stay over 30% as the number of prisoners and boxes tend to infinity.
In this post I wanted to consider a variation of the problem, similar spirit, yet seemingly more difficult than the previous.
Let’s state the problem I have in mind (in it’s full generality):
Let’s first make a few easy remarks:
- The independent probability of all prisoners surviving is now , as each prisoners individual chance of success is . To see this, each person chooses out of boxes, so the probability that the first of the two instances of id is not among the boxes is . The same goes for the second instance of id . Therefore the probability that they are both not among the chosen boxes is i.e the probability that at least one is chosen is . Selections are independent so total survival probability among prisoners is hence
- When we compare the problems, we compare the settings where we have n prisoners over n boxes in the original problem vs n prisoners over 2n boxes in this problem.
- The previous strategy no longer works as we are no longer dealing with classical permutations. Because of the double multiplicity of each number, paths may self-intersect prematurely (below picture illustrates this phenomenon).
A few Questions:
- Is this problem more or less hard than the previous problem? After all, we have twice as many boxes, but on the other hand each prisoner has the opportunity to find their number in two boxes instead of one and is allowed to open twice as many boxes.
- Can we salvage anything from the previous strategy, i.e “can we fix the argument?”
- Is there a correspondence between the two problems? (can we translate a configuration (or a group of configurations) in one problem to a configuration in the second problem?)
- What is the total number of configurations (permutations) in this problem?
- If we can find an alternative argument, how would we go about proving optimality? (can we adapt/reuse the proof by Curtin and Warshauer(see previous post)?)
I should probably mention at this point that I have unfortunately not managed to find an argument that works as well as the argument for the previous problem (please feel free to comment if you have any nice ideas). Nevertheless I have a partial remediation of the previous strategy that is somewhat better than the default independent strategy. I will begin discussing this strategy in what remains of this post.
Let’s first look at the easiest question – Question 4.
The number of ways of permuting distinct numbers is as familiar
In this problem we are dealing with a so called multiset of n distinct numbers, each of multiplicity 2.
In general a multiset with multiplicities can be permuted in ways. To see this just note that there are ways of permuting as distinct objects. In each rearrangement we can permute the objects with multiplicity in ways among eachother, each leading to the same rearrangement. Therefore there are only as many true rearrangements. The same argument inductively for each establishes the formula.
In this problem we therefore have different permutations.
As we might have expected this space is vastly larger than the space of permutations in the previous problem.
Let’s get to the proposed strategy.
As we saw, the obvious issue with pointer following in the new problem setting is that paths no longer partition into disjoint cycles. The paths have a tendency to curl back into themselves leading to degenerate cycles.
To straighten out the curls we can impose a new rule:
If we encounter number for the second time, then visit box number .
What are the implications of this rule?
We certainly won’t visit each box more than once, since each number has multiplicity 2.
As the rule implies, box is visited the first time number is encountered, then box the second time and thereafter we will never see number again, so both boxes are henceforth off limits.
Because of this fact the path will eventually lead to the discovery of every number id (of course we will stop once we have discovered the number id we are looking for).
This is somewhat good news, we have a procedure for exploiting structure in the permutation.
We are interested in the proportion of permutations s.t every path taken by the prisoners have length at most under the rules of the strategy.
The next natural question is if we can quantify how well this strategy ought to work?
In the previous problem we counted the number of permutations containing a -cycle () by indirectly expanding the following recursion:
In this problem we have more choice.
Let denote the number of permutations including a cycle of length where denotes #(remaining numbers with multiplicity 1) and denotes #(remaining numbers with multiplicity 2). At each stage, we can either choose a number with multiplicity 2 (i.e a new number) or we choose a number of multiplicity 1 (i.e a number already discovered). Finally once we have our cycle of length we can permute the remaining multiset at will.
This leads to the following recursion:
We are interested in .
Unfortunately unwinding this recursion into a closed form expression looks too difficult (don’t hesitate to let me know if you can think of a nice combinatorial argument for counting k-cycles).
Things are also made further complicated by the fact that cycles of length greater than need not be unique occurrences in any given permutation, making the probability of failure difficult to evaluate.
Resorting to Monte-Carlo methods we obtain the following simulated survival curve.
Unfortunately it seems the combinatorial explosion of long cycles take over fairly quickly, rendering the strategy far from ideal.
Below you can try running a few simulations yourself, using the Flash implementation hacked together for this purpose.
(Click here if you see a blank space area below)
I’ll admit defeat for now and come back to this problem again some time in the future.