The definition I'm going to give isn't quite the concept I really want, but it's a good approximation. I don't want to make the definition too technical and specific because if there's a standard name for a slightly different definition, then I want to know about it.
Let $(X, \mu)$ be a measure space, and let $\nu$ be a probability measure on $X$. I call a subset $S$ of $X$ special if for all measurable $T \subseteq X$,
- $\mu(T) \le \mu(S)$ implies $\nu(T) \le \nu(S)$, and
- $\mu(T) \le \mu(S)$ and $\nu(T) = \nu(S)$ implies $T = S$ up to measure zero (with respect to both $\mu$ and $\nu$).
What is the standard name for my "special" sets? Equivalently, one could stipulate $\mu(T) \le c$ and call $S$ "special" if it is essentially the unique maximizer of $\nu(T)$ given that constraint.
Also equivalently, we could stipulate a particular $\nu$-measure and consider sets achieving that $\nu$-measure having the smallest possible $\mu$-measure. That's probably the most intuitive way to think about this: we're looking for sets that contain a certain (heuristically: large) fraction of the mass of $\nu$ but are as small as possible (with respect to $\mu$). That seems like a completely natural and obvious concept, which is why I think it should have a standard name. But I have almost no training in statistics, so I don't know what the name is.
This example might be far-fetched, but just to illustrate: suppose the FBI has knowledge that somebody is going to attempt a terrorist attack in a certain huge city at a particular hour. They might not know where, but they might have (some estimate of) a probability distribution for the location of the attack. They want to distribute agents strategically throughout the city, but they probably don't have enough agents to cover the entire city. Let's say every agent can forestall an attack if it occurs within a certain radius of his/her position (which is unrealistic, since the number of nearby agents surely also matters, but ignore that); then, to maximize the probability that the attack will be stopped, to an approximation, they should distribute their agents uniformly over a special subset of the city's area. To approach this from the other perspective, it could be the case that 99% of the mass of their probability distribution is contained in a region with very small area. (The one with the smallest area will be a special set.) Then, to save resources, if they're okay with 99:1 odds (c'est la vie), they might only distribute a relatively small number of agents to that small special region.
If $\nu$ has a density $f$ with respect to $\mu$ (when it makes sense to talk about such), then special sets are closely related to the superlevel sets of $f$, i.e., sets of the form $\{x \in X : f(x) \ge t\}$ for $t \ge 0$. (I think they're basically the same, but specialness of $S$ is unaffected by changing $S$ by a set of measure zero, so a superlevel set actually corresponds to an equivalence class of special sets.) I mention this here because (1) the connection to superlevel sets is one of my reasons for caring about specialness, and (2) "superlevel sets of the density" is not the answer I'm looking for.
Example 1
Here's a very simple example in which special sets can be completely characterized. Let $X$ be a finite set, and let $\mu$ be counting measure on $X$. Let $\nu$ be any probability distribution on $X$, which necessarily has a density function $f$, so $f(x) = \nu(\{x\})$ by definition, and $\sum_{x \in X} f(x) = 1$. Suppose that no two points have the same $f$-value; then, without loss of generality, $X = \{x_1, \dots, x_n\}$ with $f(x_1) > f(x_2) > \dots > f(x_n)$. It's easy to see that the special sets in this setup are exactly the sets $S_k = \{x_1, \dots, x_k\}$, i.e., which contain the $k$ largest points as measured by $f$, for $0 \le k \le n$. (Why: if you have some other candidate special set $S$ with $k$ elements, then $S_k$ has the same $\mu$-measure as $S$ but higher $\nu$-measure, so $S$ can't be special.) It's easy to generalize this example to the case in which $f$ isn't necessarily one-to-one: you have to treat all points with the same $f$-value as a block: either all of them are in the special set, or none of them are. (Otherwise, there's no way to satisfy the "uniqueness" part (point 2) of the definition.)
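To make the finite case concrete, here is a small brute-force sketch in Python. It checks the two conditions of the definition directly, with counting measure as the reference measure. The function names `is_special` and `special_sets` and the example weights are my own illustrative choices, and the search is exponential in the number of points, so this is only meant for toy examples.

```python
from fractions import Fraction
from itertools import combinations


def is_special(S, f):
    """Brute-force check of the two conditions in the definition.

    S is a frozenset of points; f maps each point to its probability mass.
    The reference measure is counting measure, so mu(T) = len(T), and
    "equal up to measure zero" just means equal as sets.
    """
    points = list(f)

    def nu(T):
        return sum((f[x] for x in T), Fraction(0))

    for r in range(len(S) + 1):            # every T with mu(T) <= mu(S)
        for T in combinations(points, r):
            T = frozenset(T)
            if nu(T) > nu(S):              # condition 1 fails
                return False
            if nu(T) == nu(S) and T != S:  # condition 2 (uniqueness) fails
                return False
    return True


def special_sets(f):
    """Return all special subsets of the finite set on which f is defined."""
    points = list(f)
    subsets = (frozenset(c)
               for r in range(len(points) + 1)
               for c in combinations(points, r))
    return {S for S in subsets if is_special(S, f)}


# Hypothetical example with distinct masses: the special sets should be
# the "top k" prefixes (k = 0 gives the empty set).
distinct = {"a": Fraction(1, 2), "b": Fraction(3, 10), "c": Fraction(1, 5)}
prefixes = special_sets(distinct)

# Hypothetical example with a tie between b and c: they act as one block.
tied = {"a": Fraction(2, 5), "b": Fraction(3, 10), "c": Fraction(3, 10)}
blocks = special_sets(tied)
```

Exact `Fraction` arithmetic is used so that the equality test in the uniqueness condition is not disturbed by floating-point rounding.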
Example 2
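Here's a generalization of the first example that hopefully clarifies what I said above. Let $(X, \mu)$ be some nice measure space on which integration of functions makes sense (like a Riemannian manifold, or just $\mathbb{R}^n$). Let $f$ be a nonnegative integrable function with $\int_X f \, d\mu = 1$, and let $\nu$ be the probability measure $\nu(A) = \int_A f \, d\mu$, so $f$ is the density of $\nu$ with respect to $\mu$. Fix some $t \ge 0$ and let $S = \{x \in X : f(x) \ge t\}$.
Claim: $S$ is a special set.
Proof: It suffices to show that if $\mu(T) = \mu(S)$, then $\nu(T) \le \nu(S)$, with equality if and only if $S$ and $T$ differ by a set of measure zero. If $\mu(T) = \mu(S)$, then $\mu(S \setminus T) = \mu(T \setminus S)$. Now we write
$$\nu(S) - \nu(T) = \int_{S \setminus T} f \, d\mu - \int_{T \setminus S} f \, d\mu = \int_{S \setminus T} (f - t) \, d\mu - \int_{T \setminus S} (f - t) \, d\mu,$$
where the second equality uses $\mu(S \setminus T) = \mu(T \setminus S)$.
By construction, $f \ge t$ on $S \setminus T$ and $f < t$ on $T \setminus S$, so the first integral is nonnegative and the second integral is nonpositive, and is in fact negative unless $\mu(T \setminus S) = 0$, in which case $\mu(S \setminus T) = 0$ as well. Thus, $\nu(T) \le \nu(S)$, with strict inequality unless $S$ and $T$ differ by measure zero, QED.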
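As a numerical illustration of the "smallest set containing a given fraction of the mass" viewpoint, here is a rough Python sketch on a discretized real line. Everything in it is an assumption made for illustration: the helper name `smallest_region`, the choice of a standard normal density, the grid over $[-6, 6]$, and the target fraction 0.95. It greedily collects grid cells in order of decreasing density until the target mass is reached, so the result is (a discretization of) a superlevel set of the density.

```python
import math


def smallest_region(density, cell_measure, p):
    """Greedy sketch: pick cells in order of decreasing density until
    their total mass reaches p.

    density[i] is the density value on cell i, and every cell is assumed
    to have the same reference measure cell_measure.
    Returns (indices of chosen cells, density threshold t).
    """
    order = sorted(range(len(density)), key=lambda i: density[i], reverse=True)
    chosen, mass = [], 0.0
    for i in order:
        if mass >= p:
            break
        chosen.append(i)
        mass += density[i] * cell_measure
    # The last cell added has the lowest density among the chosen cells.
    t = density[chosen[-1]] if chosen else float("inf")
    return chosen, t


# Hypothetical setup: standard normal density sampled at cell midpoints.
dx = 0.001
xs = [-6 + (i + 0.5) * dx for i in range(12000)]
f = [math.exp(-x * x / 2) / math.sqrt(2 * math.pi) for x in xs]

cells, t = smallest_region(f, dx, 0.95)
lo = min(xs[i] for i in cells)
hi = max(xs[i] for i in cells)
length = len(cells) * dx  # total reference measure of the chosen region
```

For this particular density, `lo` and `hi` should come out close to the familiar central 95% interval of the standard normal, roughly $\pm 1.96$, up to discretization error.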