
Fourier and the Need for Swaparoo
Note: This lesson is mostly to provide context and meaning to measure theory. Therefore, it is not stated with full rigor. It is intentionally written in a kind of "discovery phase" where ideas become progressively clearer. This lesson is meant only to set us on the road to measure theory, after which statements, theorems, and proofs will become more formal.

Fourier Series in Very Brief
A shockingly large amount of modern mathematics has its roots in Fourier series. For measure theory, the ancestry could not be more direct.

Since this is not a course on Fourier series, we will cover only the absolute minimum of Fourier analysis needed to appreciate why measure theory became necessary.

From a physics perspective, Fourier series give us tools to take a complicated waveform and decompose it into simple sinusoidal waves. It is easy to go the other way: take a few simple sinusoidal waves like
 * $$2\cos x$$
 * $$-3\sin x$$
 * $$\cos(2x)$$

and then just sum them to obtain a "complex" wave


 * $$2\cos x-3\sin x+\cos(2x)$$

But going the other way around is tougher. Imagine observing a wave (or, mathematically, just having any periodic function), and then trying to discover its representation as a combination of sinusoids.

We will not try to do this with any real depth. Still, it is worth appreciating how foundational waves are in physics. Sinusoids describe the motion of the water waves that we see daily, but waves also describe sound, light, springs, and many other physical phenomena. Therefore, there was and is good reason to study Fourier series.

We define the Fourier series of a function f to be the infinite series


 * $$ f(x) = a_0 + \sum_{k=1}^\infty a_k\cos(kx) + \sum_{k=1}^\infty b_k\sin(kx) $$

if there are coefficients $$a_0,a_k,b_k$$ which make this equation correct.

For now we can assume that the quantities here are all real numbers. Modern studies in Fourier series extend these ideas, but that effort won't be necessary for us.

Also, we will restrict the domain of all functions to the interval $$[0,2\pi]$$. The reasons for this need not concern us, although you're likely to learn about it in a course on Fourier analysis.

In the expression


 * $$2\cos x-3\sin x+\cos(2x)$$

identify the indexed coefficients of its Fourier series. That is to say, identify


 * $$a_0,a_1,a_2,\dots$$

and


 * $$b_1,b_2,\dots$$

When we think of "finding" the Fourier series of f, we are interested in determining the coefficients $$a_0,a_k,b_k$$ by using our knowledge of the function $$f(x)$$.

Fourier (the eponymous discoverer of Fourier series) realized that one could find $$a_n$$ by computing a certain integral. Namely, $$\int_0^{2\pi}f(x)\cos(nx)\ dx$$.

Now why does computing this help at all? The following exercises try to illuminate this progressively.

Show that $$\int_0^{2\pi} \cos(kx)\sin(nx)\ dx = 0$$ for every choice of $$k,n\in\Bbb N^+$$.

It will be easiest to recall the product-to-sum formula for an arbitrary $$\cos\alpha\sin\beta$$.

Then show that if $$k,n\in\Bbb N$$ and $$k\ne n$$ then $$\int_0^{2\pi} \cos(kx)\cos(nx)\ dx = 0$$.

Then show that if $$k=n\in\Bbb N^+$$ then $$\int_0^{2\pi}\cos(kx)\cos(nx)\ dx =\pi$$. (When $$k=n=0$$ the integral is $$2\pi$$; this is why the constant term $$a_0$$ gets its own formula.)

Then show the analogous results for sine: $$\int_0^{2\pi}\sin(kx)\sin(nx)\ dx$$ equals $$0$$ when $$k\ne n$$ and $$\pi$$ when $$k=n\in\Bbb N^+$$.
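Before proving these orthogonality relations by hand, it can be reassuring to check them numerically. Here is a minimal sketch: the helper `integrate` is our own (not a standard API), and it uses a simple midpoint-rule quadrature on $$[0,2\pi]$$.

```python
import numpy as np

# Numeric sanity check of the orthogonality relations on [0, 2*pi].
# `integrate` is our own midpoint-rule helper, not a library function.
def integrate(f, a=0.0, b=2 * np.pi, n=100_000):
    h = (b - a) / n
    x = a + h * (np.arange(n) + 0.5)   # midpoints of n equal cells
    return np.sum(f(x)) * h

# cos(kx) * sin(nx) integrates to 0 for positive integers k, n:
mixed = integrate(lambda x: np.cos(3 * x) * np.sin(5 * x))

# cos(kx) * cos(nx) integrates to 0 when k != n ...
unequal = integrate(lambda x: np.cos(2 * x) * np.cos(7 * x))

# ... and to pi when k = n >= 1:
equal = integrate(lambda x: np.cos(4 * x) ** 2)

print(mixed, unequal, equal)   # approximately 0, 0, pi
```

The midpoint rule on a full period happens to be essentially exact for trigonometric integrands like these, so the printed values agree with the exercises up to floating-point error.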

Suppose that the function $$f:[0,2\pi]\to\Bbb R$$ has a valid Fourier series, which is to say, there exist coefficients for which the equation


 * $$f(x) = a_0+\sum_{k=1}^\infty a_k\cos(kx)+\sum_{k=1}^\infty b_k\sin(kx)$$

is true. Then if we multiply both sides by $$\cos(nx)$$ and integrate, we obtain


 * $$\int_0^{2\pi} f(x)\cos(nx) \ dx = \int_0^{2\pi} \left(a_0+\sum_{k=1}^\infty a_k\cos(kx) + \sum_{k=1}^\infty b_k\sin(kx)\right)\cos(nx) \ dx$$

Fourier inferred that this equals


 * $$ \int_0^{2\pi} a_0\cos nx \ dx + \sum_{k=1}^\infty a_k\int_0^{2\pi}\cos(kx)\cos(nx) \ dx + \sum_{k=1}^\infty b_k \int_0^{2\pi} \sin(kx)\cos(nx) \ dx $$

Using the assumptions above, compute the one remaining integral, $$\int_0^{2\pi}a_0\cos(nx)\ dx$$ (which is $$0$$ for $$n\ge 1$$), and use the earlier exercises to show that the above reduces to just


 * $$ \int_0^{2\pi}f(x)\cos(nx)\ dx = a_n\pi $$

Then by merely dividing by $$\pi$$, obtain a formula for $$a_n$$.
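The resulting formula $$a_n = \frac{1}{\pi}\int_0^{2\pi}f(x)\cos(nx)\ dx$$ (with the analogous sine integral for $$b_n$$) can be tried out on our earlier example, $$f(x)=2\cos x-3\sin x+\cos(2x)$$. The sketch below is our own: the helper names are invented for illustration, and quadrature is a simple midpoint rule.

```python
import numpy as np

# Apply a_n = (1/pi) * integral of f(x)cos(nx) over [0, 2*pi], and the
# sine analogue for b_n, to f(x) = 2cos(x) - 3sin(x) + cos(2x).
# `integrate`, `a`, `b` are our own helper names.
def integrate(g, n=100_000):
    h = 2 * np.pi / n
    x = h * (np.arange(n) + 0.5)       # midpoints of [0, 2*pi]
    return np.sum(g(x)) * h

f = lambda x: 2 * np.cos(x) - 3 * np.sin(x) + np.cos(2 * x)
a = lambda n: integrate(lambda x: f(x) * np.cos(n * x)) / np.pi
b = lambda n: integrate(lambda x: f(x) * np.sin(n * x)) / np.pi

# The coefficients we built into f come back out:
print(round(a(1), 6), round(a(2), 6), round(b(1), 6))   # 2.0 1.0 -3.0
```

All other coefficients (e.g. $$a_3$$, $$b_2$$) come out as $$0$$, matching the earlier exercise on identifying the coefficients of this expression.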

The Swaparoo
The above is beautiful, simple, and powerful ... but ultimately flawed.

The problem comes from the step at which we go from


 * $$\int_0^{2\pi} \left(a_0+\sum_{k=1}^\infty a_k\cos(kx) + \sum_{k=1}^\infty b_k\sin(kx)\right)\cos(nx) \ dx$$

to


 * $$ \int_0^{2\pi} a_0\cos nx \ dx + \sum_{k=1}^\infty a_k\int_0^{2\pi}\cos(kx)\cos(nx) \ dx + \sum_{k=1}^\infty b_k \int_0^{2\pi} \sin(kx)\cos(nx) \ dx $$

This is a bit messy, so let's focus only on what's important, in a cleaner and more abstract setting.

In general, consider any sequence of integrable functions $$\langle f_k\rangle$$. Is the following justified?


 * $$ \int_a^b\sum_{k=1}^\infty f_k(x) \ dx =\sum_{k=1}^\infty \int_a^b f_k(x)\ dx$$

The equation is certainly true when the sum is finite. Here is how that looks:


 * $$ \int_a^b\sum_{k=1}^m f_k(x) \ dx = \sum_{k=1}^m \int_a^b f_k(x) \ dx $$

The above is valid simply because we know that integration distributes over a sum: $$\int_a^b(f+g)\ dx = \int_a^b f\ dx+\int_a^b g \ dx$$. Applying this $$m-1$$ times justifies the interchange of $$\int_a^b$$ and $$\sum_{k=1}^m$$ above.

But if the sum is infinite, the infinite sum is by definition a limit of partial sums. Therefore if we write the finite equation which we know to be true,


 * $$\int_a^b \sum_{k=1}^m f_k(x)\ dx=\sum_{k=1}^m\int_a^b f_k(x) \ dx$$

and then take the limit as $$m\to\infty$$ then the right-hand side of the equation looks like the object that we want.


 * $$\lim_{m\to\infty}\int_a^b\sum_{k=1}^m f_k(x) \ dx = \sum_{k=1}^\infty\int_a^b f_k(x)\ dx$$

If we knew that the limit could be moved inside the integral, then we could write the thing that we actually want,


 * $$\int_a^b\sum_{k=1}^\infty f_k(x)\ dx = \sum_{k=1}^\infty\int_a^b f_k(x)\ dx$$

So really, the heartache comes from not knowing if we are justified in swapping a limit and an integral. The fundamental problem, from which measure theory originated, was trying to make the move


 * $$\lim_{m\to\infty}\int_a^b g_m(x)\ dx = \int_a^b \lim_{m\to\infty} g_m(x)\ dx$$

Although the original interest was in functions of the form $$g_m(x)=\sum_{k=1}^mf_k(x)$$, nothing much is lost by ignoring the issue of the sum and focusing abstractly on just any sequence of functions, $$\langle g_m\rangle$$.

In this exercise, we will see that the concern about the swaparoo is not merely theoretical. It really is not valid for some sequences of functions.

Set $$g_m(x) = \frac{1}{1+mx}$$.

1. Show that for any fixed $$x\in(0,\infty)$$, the limit $$\lim_{m\to\infty}g_m(x) = 0$$. (At $$x=0$$ we have $$g_m(0)=1$$ for every $$m$$, but a single point does not change the value of the integral.) Infer that $$\int_0^\infty\lim_{m\to\infty}g_m(x)\ dx = 0$$

2. Show that for a fixed $$m\in\Bbb Z^+$$, the integral $$\int_0^\infty g_m(x)\ dx = \infty$$. Infer that $$\lim_{m\to\infty}\int_0^\infty g_m(x)\ dx = \infty$$

3. Reflect upon what this exercise was meant to demonstrate.

A Brief Prehistory of the Swaparoo
The invalidity of the "swaparoo" launched a program among mathematicians, starting near the end of the 19th century. They sought to find conditions that the functions $$g_m$$ could satisfy, which would then guarantee the equation


 * $$\lim_{m\to\infty}\int_a^b g_m(x)\ dx = \int_a^b \lim_{m\to\infty}g_m(x)\ dx$$

These mathematicians found one very significant result. If the functions converge uniformly to an integrable function, and if they all share the same compact domain, then this swaparoo is valid.

This was a nice and productive result! However, it was imperfect because there were some sequences of functions for which the convergence was not uniform, and yet we still desired to know if the swaparoo is valid. The condition of uniform convergence was, after all, a sufficient but not a necessary condition.

Mathematicians later made progress in specifying further and further conditions on functions, which would ensure the validity of the swaparoo.

But the results continued to be weak and not capture classes of functions that mathematicians wanted to study. Moreover, the proofs became so long and subtle that simply managing the complexity of the forest of theorems became burdensome.

Lebesgue's New Integral
It was in this context that the mathematician Lebesgue had an idea about how to solve these difficulties.

It will be useful to contrast it with the standard Riemann integral, so recall how that is constructed.

Summary of Riemann Integration
Consider a function on a closed bounded interval,


 * $$a,b\in\Bbb R, \quad a<b,\quad f:[a,b]\to\Bbb R$$.

First we partition the interval with any $$\mathcal P = \{a=x_0 < x_1 < x_2 < \cdots < x_n=b\}$$.

On each interval $$[x_i,x_{i+1}]$$ we use f (either at the left end-point, or the right, or anywhere else) to determine a height.

Then we compute the area of the rectangle determined by this interval and height. Then we sum all of the rectangles to get an approximation of the area under the curve. Then we take finer and finer partitions $$\mathcal P$$ resulting in more accurate approximations.

Then we consider some kind of limiting procedure, or perhaps take a supremum and infimum. (There are several different equivalent constructions of these integrals, and it's not important to focus on exactly which strategy we use here.)

The important summary of how this integral is constructed is:
 * Partition the domain,
 * use the partition to determine heights,
 * use these to determine rectangle areas,
 * then take a limit of the procedure as the mesh of the partition (the width of its largest cell) goes to zero.

Let $$f(x)=x^2+1$$ on $$[-1,1]$$. Use the partition $$\mathcal P = \{-1,0,1\}$$ to compute the corresponding lower Riemann sum.
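If you want to check your hand computation, here is a generic lower Riemann sum as a sketch: the infimum on each cell is approximated by dense sampling, and `lower_riemann_sum` is our own helper name.

```python
import numpy as np

# A generic lower Riemann sum; the infimum on each partition cell is
# approximated by sampling (a sketch, not a rigorous infimum).
def lower_riemann_sum(f, partition, samples=10_001):
    total = 0.0
    for a, b in zip(partition, partition[1:]):
        x = np.linspace(a, b, samples)            # sample the cell [a, b]
        total += (b - a) * float(np.min(f(x)))    # width * (approx. inf)
    return total

f = lambda x: x**2 + 1
print(lower_riemann_sum(f, [-1, 0, 1]))   # 2.0
```

On each of $$[-1,0]$$ and $$[0,1]$$ the infimum of $$x^2+1$$ is attained at $$x=0$$, so both rectangles have height 1 and width 1.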

Lebesgue Integration
Lebesgue integration reverses part of the Riemann process.

Rather than first partitioning the domain, this time we start from partitioning the range of the function. It will be useful to represent this partition as a collection of intervals.

Using the same function and domain as in ''Exercise 5. Do a Quick Riemann'', we may choose the partition


 * $$\mathcal P=\{(-\infty,0],(0,1],(1,2],(2,\infty)\}$$

(This partition is chosen arbitrarily, merely for demonstration purposes.)

Now from this partition of the range, we would like to find corresponding regions of the domain. Then we will construct rectangles that approximate the area under the curve.

To each cell of the partition of the range, there corresponds a preimage in the domain.


 * $$E_1 = f^{-1}((-\infty,0])$$
 * $$E_2 = f^{-1}((0,1])$$
 * $$E_3 = f^{-1}((1,2])$$
 * $$E_4 = f^{-1}((2,\infty))$$

We can compute these:


 * $$ E_1 = \emptyset $$
 * $$ E_2 = \{0\} $$
 * $$ E_3 = [-1,0)\sqcup (0,1] $$
 * $$ E_4 = \emptyset $$

The preimages act like the bases of rectangles which approximate the area under f.

If we take each preimage and write $$\lambda(E_1),\dots,\lambda(E_4)$$ to denote their lengths (understood somewhat intuitively for now), then


 * $$\lambda(E_1)=0$$
 * $$\lambda(E_2)=0$$
 * $$\lambda(E_3) = 1+1=2$$
 * $$\lambda(E_4) = 0$$

We will use the infimum of the function on each preimage, to obtain rectangle heights. This will cause us to under-approximate the area under the function.


 * $$\inf_{x\in E_1} f(x)$$ does not exist.
 * $$\inf_{x\in E_2} f(x)=1$$
 * $$\inf_{x\in E_3} f(x)=1$$
 * $$\inf_{x\in E_4} f(x)$$ does not exist.

Where the domain set is empty, we simply discard the set.

Or if you prefer to think of it this way, we set the corresponding rectangle area to 0.

Therefore we obtain the approximation


 * $$ \overbrace{\lambda(E_1)}^{\text{width}_1}\overbrace{\left(\inf_{x\in E_1}f(x)\right)}^{\text{height}_1}+\lambda(E_2)\left(\inf_{x\in E_2}f(x)\right)+\lambda(E_3)\left(\inf_{x\in E_3}f(x)\right)+\lambda(E_4)\left(\inf_{x\in E_4}f(x)\right)$$

If an infimum is actually undefined then, as we said above, we set the corresponding term to zero.

Then we obtain


 * $$\overbrace{\lambda(E_1)\left(\inf_{x\in E_1}f(x)\right)}^{0}+\overbrace{\lambda(E_2)\left(\inf_{x\in E_2}f(x)\right)}^{0\cdot 1}+\overbrace{\lambda(E_3)\left(\inf_{x\in E_3}f(x)\right)}^{2\cdot 1}+\overbrace{\lambda(E_4)\left(\inf_{x\in E_4}f(x)\right)}^0$$


 * $$ = 2$$

Predictably, now that we have a system for constructing approximations, we then take the integral to be, in some sense, the limit as the partition widths tend to zero.
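The hand computation above can also be replayed numerically. In this sketch, each preimage's "length" is approximated by the fraction of sample points it captures, which is an informal stand-in for the measure $$\lambda$$ that this course goes on to build rigorously.

```python
import numpy as np

# A numeric sketch of the Lebesgue-style lower sum for f(x) = x^2 + 1
# on [-1, 1], with the range partition used in the text.  Preimage
# "lengths" are approximated by sample-point fractions (informal!).
f = lambda x: x**2 + 1
x = np.linspace(-1.0, 1.0, 200_001)
y = f(x)
cells = [(-np.inf, 0.0), (0.0, 1.0), (1.0, 2.0), (2.0, np.inf)]  # (lo, hi]

total = 0.0
for lo, hi in cells:
    mask = (y > lo) & (y <= hi)                 # samples landing in this cell
    if mask.any():                              # empty preimages contribute 0
        width = 2.0 * mask.sum() / len(x)       # approx. lambda(preimage)
        total += width * float(y[mask].min())   # width * inf of f on preimage
print(round(total, 3))   # approximately 2.0, matching the hand computation
```

Only the cell $$(1,2]$$ contributes appreciably, just as in the hand computation: its preimage has approximate length 2 and the infimum of f there is essentially 1.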

At this point, the whole idea of this new approach to integration is just a "shot in the dark". We have no reason to think that it will be any better, or really any different from Riemann integration.

For the same example as in ''Exercise 5. Do a Quick Riemann'', this time use the partition $$\{(-\infty,1),[1,1.5],(1.5,2),[2,\infty)\}$$ to compute an under-estimate of the area under the curve.

Measuring Preimages
Lebesgue invented this new form of integration in the hope that it might somehow ensure the interchange of limit and integral.

Let us summarize how one computes a single approximation in Lebesgue's way.

1. Take any partition of the range of f.

2. Use this partition to determine preimages of each cell.

3. The "length" of the preimage is a "rectangle" width.

4. The "height" of the "rectangle" is the infimum of the function, over the preimage.

5. Sum the "rectangle" areas to get an approximate area under the curve.

6. Take a limit of this procedure as the partition widths go to zero.

Every step of this procedure is straightforward and does not require too much conversation, with the exception of step 3.

But that exception turns out to be a somewhat monumental challenge.

If we imagine that the preimage is just some set of real numbers, then we need to develop a system of measuring sets of real numbers. Whatever system we choose must be consistent with the measurement of intervals being equal to their length. That will hopefully ensure that we obtain something like rectangle areas.

For certain sets, we can make reasonable guesses about what the measure should be. Of course, if I is an interval with finite bounds $$a<b$$, then the measure of I should probably be its length, $$b-a$$.

What is a reasonable measure of $$[0,1]\sqcup [3,4]$$? Just add the individual lengths. So it should be $$1+1=2$$.

But sets of real numbers can get quite weird. What about the measure of the set of all rational numbers? It is a countable set, but it is infinite. Maybe its measure should be infinity? Maybe zero? Maybe somewhere in-between?

We have now done enough "reasonable guessing" and we need to start coming up with a rigorous system of measurement. That is the focus of the first section of this course.

Outline of the Course
The general structure of this course will be to

1. Find a rigorous system of measuring sets of real numbers.

2. Use this system of measurement to build a new theory of integration.

3. Show that this integration has the desired limit-integral exchange properties.

4. Study how the new system of integration interacts with differentiation (just to make sure everything's on the up-and-up, and maybe learn a thing or two).

5. Study how this system of integration can be used to measure the "distance between functions", which becomes the study of $$L^2(E)$$ function spaces.

6. Generalize $$L^2(E)$$ to $$L^p(E)$$ for $$1\le p\le \infty$$.