Wednesday 28 November 2007

pr.probability - What's the standard name for sets of a given size with maximal probability (or a given probability and minimal size)?

The definition I'm going to give isn't quite the concept I really want, but it's a good approximation. I don't want to make the definition too technical and specific because if there's a standard name for a slightly different definition, then I want to know about it.



Let $(X,mu)$ be a measure space, and let $rho$ be a probability measure on $X$. I call a subset $A$ of $X$ special if for all measurable $Bsubseteq X$,



  1. $mu(B)leqmu(A)$ implies $rho(B)leqrho(A)$, and

  2. $mu(B)=mu(A)$ and $rho(B)=rho(A)$ implies $B=A$ up to measure zero (with respect to both $mu$ and $rho$).

What is the standard name for my "special" sets? Equivalently, one could stipulate $mu(A)leqbeta$ and call $A$ "special" if it is essentially the unique maximizer of $rho(A)$ given that constraint.



Also equivalently, we could stipulate a particular $rho$-measure and consider sets achieving that $rho$-measure having the smallest possible $mu$-measure. That's probably the most intuitive way to think about this: we're looking for sets that contain a certain (heuristically: large) fraction of of the mass of $rho$ but are as small as possible (with respect to $mu$). That seems like a completely natural and obvious concept, which is why I think it should have a standard name. But I have almost no training in statistics, so I don't know what the name is.



This example might be far-fetched, but just to illustrate: suppose the FBI has knowledge that somebody is going to attempt a terrorist attack in a certain huge city at a particular hour. They might not know where, but they might have (some estimate of) a probability distribution for the location of the attack. They want to distribute agents strategically throughout the city, but they probably don't have enough agents to cover the entire city. Let's say every agent can forestall an attack if it occurs within a certain radius of his/her position (which is unrealistic, since the number of nearby agents surely also matters, but ignore that); then, to maximize the probability that the attack will be stopped, to an approximation, they should distribute their agents uniformly over a special subset of the city's area. To approach this from the other perspective, it could be the case that 99% of the mass of their probability distribution is contained in a region with very small area. (The one with the smallest area will be a special set.) Then, to save resources, if they're okay with 99:1 odds (c'est la vie), they might only distribute a relatively small number of agents to that small special region.



If $rho$ has a density $f$ with respect to $mu$ (when it makes sense to talk about such), then special sets are closely related to the superlevel sets of $f$, i. e., sets of the form ${x:f(x)geq c}$ for $cgeq 0$. (I think they're basically the same, but specialness of $A$ is unaffected by changing $A$ by a set of measure zero, so a superlevel set actually corresponds to an equivalence class of special sets.) I mention this here because (1) the connection to superlevel sets is one of my reasons for caring about specialness, and (2) "superlevel sets of the density" is not the answer I'm looking for.


Example 1

Here's a very simple example in which special sets can be completely characterized. Let $X={x_1,ldots,x_n}$ be a finite set, and let $mu$ be counting measure on $X$. Let $rho$ be any probability distribution on $X$, which necessarily has a density function $f:Xtomathbb{R}_+$, so by definition, $f(x_1) + ldots + f(x_n) = 1$ and $rho(A) = sum_{xin A} f(x)$. Suppose that no two points have the same $f$-value; then, without loss of generality, $f(x_1) > f(x_2) > ldots > f(x_n)$. It's easy to see that the special sets in this setup are exactly the sets $A_k = {x_1,x_2,ldots,x_k}$, i. e., which contain the largest $k$ points as measured by $rho$, for $k=0,ldots,n$. (Why: if you have some other candidate special set $B$, then $A_{\#B}$ has the same $mu$-measure as $B$ but higher $rho$-measure, so $B$ can't be special.) It's easy to generalize this example to the case in which $f$ isn't necessarily one-to-one: you have to treat all points with the same $f$-value as a block: either all of them are in the special set, or none of them are. (Otherwise, there's no way to satisfy the "uniqueness" part (point 2) of the definition.)


Example 2

Here's a generalization of the first example that hopefully clarifies what I said above. Let $(X,mu)$ be some nice measure space on which integration of functions makes sense (like a Riemannian manifold, or just $mathbb{R}^d$). Let $f:Xtomathbb{R}_+$ be a nonnegative integrable function with $int_X f(x) dmu = 1$, and let $rho$ be the probability measure $rho(Y) = int_Y f(x) dmu$, so $f$ is the density of $rho$ with respect to $mu$. Fix some $cgeq 0$ and let $A={x:f(x)geq c}$.




Claim: $A$ is a special set.




Proof: It suffices to show that if $mu(B) = mu(A)$, then $rho(B)leq rho(A)$, with equality if and only if $B$ and $A$ differ by a set of measure zero. If $mu(B) = mu(A)$, then $mu(B-A) = mu(A-B)$. Now we write
$begin{align*} rho(A) - rho(B) &= int_A f(x) dmu - int_B f(x) dmu \\
&= int_{A-B} f(x) dmu - int_{B-A} f(x) dmu \\
&= int_{A-B} f(x) dmu - int_{A-B}c\,dmu - int_{B-A} f(x) dmu + int_{B-A}c\,dmu\\
&= int_{A-B} (f(x)-c) dmu - int_{B-A} (f(x)-c) dmu.
end{align*}$



By construction, $f(x) geq c$ on $A$ and $f(x) < c$ on $B-A$, so the first integral is nonnegative and the second integral is nonpositive, and is in fact negative unless $mu(B-A)=0$, in which case $mu(A-B)=0$ as well. Thus, $rho(A)-rho(B)geq 0$, with strict inequality unless $A$ and $B$ differ by measure zero, QED.

No comments:

Post a Comment