pr.probability - How many trial picks expectedly sufficient to cover a sample space?

Tuesday, 27 November 2007

pr.probability - How many trial picks expectedly sufficient to cover a sample space?

The expected number of picks needed equals the sum of the probabilities that at least $t$ picks are needed, which means that $t-1$ subsets left at least one value uncovered. We can use inclusion-exclusion to get the probability that at least one value is uncovered.

The probability that a particular set of $k$ values is uncovered after $t-1$ subsets are chosen is

$$Bigg(frac{n-k choose r}{n choose r}Bigg)^{t-1}$$

So, by inclusion-exclusion, the probability that at least one value is uncovered is

$$ sum_{k=1}^n {n choose k}(-1)^{k-1}Bigg(frac{n-k choose r}{n choose r}Bigg) ^{t-1} $$

And then the expected number of subsets needed to cover everything is

$$ sum_{t=1}^infty sum_{k=1}^n {n choose k}(-1)^{k-1} Bigg(frac{n-k choose r}{n choose r}Bigg)^{t-1} $$

Change the order of summation and use $s=t-1$:

$$ sum_{k=1}^n {n choose k}(-1)^{k-1} sum_{s=0}^infty Bigg( frac{n-k choose r}{n choose r}Bigg)^s$$

The inner sum is a geometric series.

$$ sum_{k=1}^n {n choose k} (-1)^{k-1}frac{n choose r}{{n choose r}-{n-k choose r}}$$

$$ {n choose r} sum_{k=1}^n (-1)^{k-1}frac{n choose k}{{n choose r}-{n-k choose r}}$$

I'm sure that should simplify further, but at least now it's a simple sum. I've checked that this agrees with the coupon collection problem for $r=1$.

Interestingly, Mathematica "simplifies" this sum for particular values of $r$, although what it returns even for the next case is too complicated to repeat, involving EulerGamma, the gamma function at half-integer values, and PolyGamma[0,1+n].

Answer Desk

Tuesday, 27 November 2007

pr.probability - How many trial picks expectedly sufficient to cover a sample space?

No comments:

Post a Comment