Thursday, 9 August 2007

Fourier transform for dummies

One of the main uses of Fourier transforms is to diagonalize convolutions. In fact, many of the most useful properties of the Fourier transform can be summarized in the sentence "the Fourier transform is a unitary change of basis for functions (or distributions) that diagonalizes all convolution operators." I've been ambiguous about the domain of the functions and the inner product. The domain is an abelian group, and the inner product is the L2 inner product with respect to Haar measure. (There are more general definitions of the Fourier transform, but I won't attempt to deal with those.)



I think a good way to motivate the definition of convolution (and thus eventually of the Fourier transform) starts with probability theory. Let's say we have an abelian group (G, +, -, 0) and two independent random variables X and Y that take values in G, and we are interested in the value of X + Y. For simplicity, let's assume G = {x_1, ..., x_n} is finite. For example, X and Y could be (possibly biased) six-sided dice, which we can roll to get two independent elements of Z/6Z. The sum of the die rolls mod 6 gives another element of the group.



For x ∈ G, let f(x) be the probability P(X = x), and let g(x) = P(Y = x). What we care about is h(x) := P(X + Y = x). We can compute this as a sum of joint probabilities:



h(x) = P(X + Y = x) = Σ_{y+z=x} P(X = y & Y = z)



However, since X and Y are independent, P(X = y & Y = z) = P(X = y)P(Y = z) = f(y)g(z), so the sum is actually



h(x) = Σ_{y+z=x} f(y)g(z) = Σ_{y∈G} f(y)g(x-y).
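
As a quick sanity check on this formula, here is a small numerical sketch (using NumPy; the particular biases chosen for the two dice are arbitrary) that computes h directly from the sum above for G = Z/6Z and compares it with a Monte Carlo estimate of P(X + Y = x):

    import numpy as np

    n = 6  # the group Z/6Z
    rng = np.random.default_rng(0)

    # Two biased distributions on Z/6Z (the biases are made up for illustration)
    f = np.array([0.3, 0.1, 0.1, 0.2, 0.1, 0.2])    # f(x) = P(X = x)
    g = np.array([0.05, 0.25, 0.2, 0.2, 0.1, 0.2])  # g(x) = P(Y = x)

    # h(x) = sum over y of f(y) g(x - y), with x - y taken mod 6
    h = np.array([sum(f[y] * g[(x - y) % n] for y in range(n)) for x in range(n)])

    # Monte Carlo check: sample X and Y independently and histogram X + Y mod 6
    X = rng.choice(n, size=200_000, p=f)
    Y = rng.choice(n, size=200_000, p=g)
    empirical = np.bincount((X + Y) % n, minlength=n) / len(X)

    print(np.round(h, 4))          # exact distribution of X + Y
    print(np.round(empirical, 4))  # should agree to two or three decimal places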



The sum Σ_{y∈G} f(y)g(x-y) is called the convolution of f and g and is denoted by f*g. In words, the convolution of two probability distributions is the probability distribution of the sum of two independent random variables having those respective distributions. From that, one can easily deduce that convolution satisfies nice properties: commutativity, associativity, and the existence of an identity. Moreover, convolution has the same relationship to addition and scalar multiplication as pointwise multiplication does (namely, bilinearity). In the finite setting, there's also an obvious L2 inner product on distributions, with respect to which, for each f, the transformation g -> f * g is normal. Since these transformations also commute with one another, a big theorem from finite-dimensional linear algebra (commuting normal operators are simultaneously diagonalizable) gives an orthonormal basis with respect to which all of them are diagonal. It's then not difficult to deduce that in such a basis, convolution must be represented by coordinatewise multiplication. That basis is the Fourier basis, and the process of obtaining the coordinates in the Fourier basis from the coordinates in the standard basis (the values f(x) for x ∈ G) is the Fourier transform. Since both bases are orthonormal, that transformation is unitary.
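
To make the diagonalization claim concrete, here is a sketch (again with NumPy, for G = Z/nZ and an arbitrary f) that writes down the matrix of g -> f*g in the standard basis, conjugates it by the unitary discrete Fourier transform matrix, and checks that the result is diagonal, with the Fourier transform of f on the diagonal:

    import numpy as np

    n = 6
    rng = np.random.default_rng(1)
    f = rng.random(n)
    f /= f.sum()  # a distribution on Z/nZ (the normalization isn't essential)

    # Matrix of the convolution operator g -> f*g in the standard basis:
    # (f*g)(x) = sum_y f(x - y) g(y), so the (x, y) entry is f((x - y) mod n).
    C = np.array([[f[(x - y) % n] for y in range(n)] for x in range(n)])

    # Unitary DFT matrix: U[k, j] = exp(-2*pi*i*j*k/n) / sqrt(n)
    j = np.arange(n)
    U = np.exp(-2j * np.pi * np.outer(j, j) / n) / np.sqrt(n)

    # Conjugating the convolution operator by U gives a diagonal matrix whose
    # diagonal is the (unnormalized) DFT of f.
    D = U @ C @ U.conj().T
    print(np.allclose(D, np.diag(np.fft.fft(f))))  # True

    # Equivalently, the transform of f*g is the pointwise product of transforms.
    g = rng.random(n)
    print(np.allclose(np.fft.fft(C @ g), np.fft.fft(f) * np.fft.fft(g)))  # True

The same check passes for any f, which is exactly the statement that the Fourier basis simultaneously diagonalizes all of the convolution operators.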



If G is infinite, then much of the above has to be modified, but a lot of it still works. (Most importantly, for now, the intuition works.) For example, if G = R^n, then the sum Σ_{y∈G} f(y)g(x-y) must be replaced by the integral ∫_{y∈G} f(y)g(x-y) dy to define convolution, or even more generally, by Haar integration over G. The Fourier "basis" still has the important property of representing convolution by "coordinatewise" (or pointwise) multiplication and therefore of diagonalizing all convolution operators.
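
As a rough numerical illustration for G = R (an approximation, not a proof), the convolution integral can be discretized by a Riemann sum on a grid. Below, the density of a standard normal is convolved with itself; since the sum of independent normals is normal with the variances added, the result should match the N(0, 2) density. The grid spacing and cutoff are arbitrary choices:

    import numpy as np

    # Grid on [-8, 8]; the densities involved are negligible outside this window.
    n, L = 1601, 8.0
    x = np.linspace(-L, L, n)
    dx = x[1] - x[0]

    def normal_pdf(t, var):
        return np.exp(-t**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

    f = normal_pdf(x, 1.0)  # density of X ~ N(0, 1)
    g = normal_pdf(x, 1.0)  # density of Y ~ N(0, 1)

    # Riemann-sum approximation of the convolution integral (f*g)(x)
    h = np.convolve(f, g, mode='same') * dx

    # X + Y ~ N(0, 2), so f*g should be (very close to) the N(0, 2) density
    print(np.max(np.abs(h - normal_pdf(x, 2.0))))  # tiny (discretization error only)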



The fact that the Fourier transform diagonalizes convolutions has more implications than may appear at first. Sometimes, as above, the operation of convolution is itself of interest, but sometimes one of the arguments (say f) is fixed, and we want to study the transformation T(g) := f*g as a linear transformation of g. A lot of common operators fall into this category. For example:



  • Translation: T(g)(x) = g(x-a) for some fixed a. This is convolution with a "unit mass" at a.

  • Differentiation: T(g)(x) = g'(x). This is convolution with the derivative of a unit mass at 0 (the distributional derivative of the Dirac delta).

  • Indefinite integration (say on R): T(g)(x) = ∫_{-∞}^{x} g(t) dt. This is convolution with the Heaviside step function.

In the Fourier basis, all of those are therefore represented by pointwise multiplication by an appropriate function (namely the Fourier transform of the respective convolution kernel). That makes Fourier analysis very useful, for example, in studying differential operators.
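
For instance, here is a sketch of the differentiation example on a periodic domain, where the discrete Fourier transform applies cleanly: differentiate a smooth 2π-periodic function by multiplying its coefficient at integer frequency k by ik and transforming back. The test function is an arbitrary smooth periodic choice:

    import numpy as np

    # Sample a smooth 2*pi-periodic function on an evenly spaced grid
    n = 128
    x = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    g = np.exp(np.sin(x))
    g_prime_exact = np.cos(x) * np.exp(np.sin(x))

    # In the Fourier basis, d/dx acts diagonally: it multiplies the coefficient
    # at integer frequency k by i*k.
    k = np.fft.fftfreq(n, d=1.0 / n)        # frequencies 0, 1, ..., -2, -1
    g_hat = np.fft.fft(g)
    dg = np.fft.ifft(1j * k * g_hat).real   # back to the standard basis

    print(np.max(np.abs(dg - g_prime_exact)))  # very small: spectral accuracy

Multiplying the coefficients by a different function of k would instead apply a different convolution operator, which is the sense in which studying such operators reduces to studying their Fourier multipliers.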
