Measure Theory/Convexity and the Product-to-Sum Bound

Convexity
Recall from Lesson 0: Generalizing to Lp that we are interested in convexity because it may help us to find a bound for the product of any two real numbers. In particular, we would like to prove the inequality


 * $$w_1\ln a+w_2\ln b \le \ln(w_1a+w_2b)$$ for arbitrary $$a,b\in\Bbb R^+$$ and weights $$w_1,w_2\in\Bbb R^+$$ such that $$w_1+w_2=1$$

Why is this inequality an instance of convexity? It will help to think about convexity in the context of an abstract vector space.

Intuitively, convexity means that for any two points in the shape, the line-segment between them lies inside the shape. For example, any triangle is always convex, a circle is convex, but there does exist a rectangle which is not convex. A shape that is not convex is called concave.

Draw a convex shape, just with the intuitive definition.

More rigorously, let's consider any abstract vector space V. Naturally you want to primarily imagine $$\Bbb R^n$$ for $$n=1,2,3$$.

In this exercise, consider the vector spaces $$\Bbb R^n$$ for positive natural numbers n.

Prove that the open unit circle is convex (i.e. $$\{\vec v: \|\vec v\|< 1\}$$).

Also prove convex shapes are closed under positive scalar multiplication and translation. That is to say, if $$c\in \Bbb R^+$$ and $$S\subseteq V$$ is convex, then


 * $$cS = \{c\vec v:\vec v\in S\}$$

is convex. And if $$\vec w\in \Bbb R^n$$ then


 * $$\vec w+S = \{\vec w +\vec v: \vec v\in S\}$$

is convex.

Now trying to relate this all back to our mission of proving the inequality at the start, we want to have a notion of the convexity of a function. We then home that this notion relates to the logarithm and the inequality at the beginning.

Therefore the following definition seems reasonable:

One may wonder, as I did, why we define a function to be convex if its epigraph is convex, rather than, say, its "hypograph" (i.e. the set of points below the graph)? As far as I can tell this is a mere convention -- a coin was tossed and I guess we live with the legacy of defining convexity of functions by their epigraph rather than hypograph.

Nothing much hinges on this. Although the logarithm is not convex, its negative is convex, i.e. $$-\ln x$$ is a convex function, and that is enough to use the results that we will find about convex functions.

Show that a function's convexity is determined by its graph, in the sense that one does not need to consider any points not on the graph. More precisely, a function is convex if and only if for any points on the graph, the segment between them is contained in its epigraph.

Even more rigorously, let $$f:I\to\Bbb R$$ be a real function on an interval I. Also let $$a,b\in I$$. Note that any point on the segment between $$(a,f(a))$$ and $$(b,f(b))$$ is identified with a value of $$t\in[0,1]$$ such that the point is given by


 * $$t\begin{bmatrix}a\\f(a)\end{bmatrix}+(1-t)\begin{bmatrix}b\\f(b)\end{bmatrix}$$

(You will need to identify points with vectors, which is no big whoop, you just understand that $$(a,f(a))$$ corresponds to the vector $$\begin{bmatrix}a\\f(a)\end{bmatrix}$$.)

For the function to be convex, the point must be contained in the epigraph. For the point to be contained in the epigraph, this is the same as


 * $$\overbrace{f(ta+(1-t)b)}^{\text{on the graph}}\le \overbrace{tf(a)+(1-t)f(b)}^{\text{on the segment}}$$

So what you must show is that f is convex if and only if, for all $$a,b\in I$$ and $$t\in [0,1]$$,


 * $$ f(ta+(1-t)b)\le tf(a)+(1-t)f(b)$$

Suppose that f is twice differentiable on an interval I. Now show that f is convex if and only if the second derivative is positive throughout I.

Infer that $$x^2$$ and $$-\ln x$$ are both convex.

Product-to-Sum Bound
Finally we are in a position to deliver on what put us down this path so long ago!

Infer


 * $$w_1\ln a+w_2\ln b \le \ln(w_1a+w_2b)$$ for all $$a,b\in\Bbb R^+$$ and $$w_1,w_2\in\Bbb R^+$$ such that $$w_1+w_2=1$$

Infer further the weighted AM-GM inequality.


 * $$a^{w_1}b^{w_2} \le w_1a+w_2b$$

Recall that the point of proving the weighted AM-GM inequality was in the hopes that it will give us a useful bound on an arbitrary product of real numbers. Since we are interested in values $$1\le p$$, this seems primarily useful only if we consider $$1/p$$ in order to meet the condition on the weights, and correspondingly use the equation


 * $$\frac 1 p +\frac 1 q = 1$$

in order to have another weighted value, q, for which the theorem applies.

But note that if we consider the equality $$p=1$$ then there is no corresponding q. Could we take $$q=\infty$$ in some sense? We will return to this idea in the last section, so for now only consider $$1<p$$.

Use this to infer a product-to-sum bound for the product of any two real numbers.

Hint: We said in Lesson 0: Generalizing to $$L^p$$ that we could expect to have powers of p in our bound, since it is fine if our result is related to the $$\|\cdot\|_p$$ norm. So replace a in the weighted AM-GM inequality by $$a^{p}$$ and so on.