Tuesday, 27 November 2007

pr.probability - How many trial picks expectedly sufficient to cover a sample space?

The expected number of picks needed equals the sum of the probabilities that at least t picks are needed, which means that t1 subsets left at least one value uncovered. We can use inclusion-exclusion to get the probability that at least one value is uncovered.



The probability that a particular set of k values is uncovered after t1 subsets are chosen is



Bigg(fracnkchoosernchooserBigg)t1



So, by inclusion-exclusion, the probability that at least one value is uncovered is



sumk=1nnchoosek(1)k1Bigg(fracnkchoosernchooserBigg)t1



And then the expected number of subsets needed to cover everything is



sumt=1inftysumk=1nnchoosek(1)k1Bigg(fracnkchoosernchooserBigg)t1



Change the order of summation and use s=t1:



sumk=1nnchoosek(1)k1sums=0inftyBigg(fracnkchoosernchooserBigg)s



The inner sum is a geometric series.



sumk=1nnchoosek(1)k1fracnchoosernchoosernkchooser



nchoosersumk=1n(1)k1fracnchooseknchoosernkchooser



I'm sure that should simplify further, but at least now it's a simple sum. I've checked that this agrees with the coupon collection problem for r=1.



Interestingly, Mathematica "simplifies" this sum for particular values of r, although what it returns even for the next case is too complicated to repeat, involving EulerGamma, the gamma function at half-integer values, and PolyGamma[0,1+n].

No comments:

Post a Comment