Applied linear operators and spectral methods/Lecture 2

Norms in inner product spaces
Inner product spaces can be equipped with $$L_p$$ norms, defined componentwise as

$$ \lVert\mathbf{x}\rVert_{p} = \left(\sum_k |x_k|^p\right)^{1/p}, \quad p = 1, 2, \dots, \infty $$ When $$p = 1$$, we get the $$L_1$$ norm

$$ \lVert\mathbf{x}\rVert_{1} = \sum_k |x_k| $$ When $$p = 2$$, we get the $$L_2$$ norm, the one that comes from the inner product,

$$ \lVert\mathbf{x}\rVert_{2} = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle} $$ In the limit as $$p \rightarrow \infty$$ we get the $$L_\infty$$ norm, or sup norm,

$$ \lVert\mathbf{x}\rVert_{\infty} = \max_k |x_k| $$ The adjacent figure shows a geometric interpretation of the three norms.
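
As a concrete check, here is a minimal Python/NumPy sketch (the vector `x` is an arbitrary example) that evaluates the three norms directly from the definitions above and compares them against `numpy.linalg.norm`:

```python
import numpy as np

x = np.array([3.0, -4.0, 1.0])  # arbitrary example vector

# Norms computed directly from the definitions
l1   = np.sum(np.abs(x))        # L_1: sum of absolute components
l2   = np.sqrt(np.dot(x, x))    # L_2: induced by the inner product
linf = np.max(np.abs(x))        # L_inf: largest absolute component

# Cross-check against NumPy's built-in norms
assert np.isclose(l1,   np.linalg.norm(x, 1))
assert np.isclose(l2,   np.linalg.norm(x, 2))
assert np.isclose(linf, np.linalg.norm(x, np.inf))
print(l1, l2, linf)  # 8.0 5.099... 4.0
```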

If a vector space has an inner product then the norm

$$ \lVert\mathbf{x}\rVert = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle} = \lVert\mathbf{x}\rVert_2 $$ is called the induced norm. Clearly, the induced norm is nonnegative and zero only if $$\mathbf{x} = \mathbf{0}$$. It also scales linearly under scalar multiplication, $$\lVert\alpha~\mathbf{x}\rVert = |\alpha|~\lVert\mathbf{x}\rVert$$. You can think of the induced norm as a measure of length for the vector space.

Some useful results that follow from the definition of the norm are discussed below.

Schwarz inequality
In an inner product space

$$ |\langle \mathbf{x}, \mathbf{y} \rangle| \le \lVert\mathbf{x}\rVert~\lVert\mathbf{y}\rVert $$

Proof

This statement is trivially true if $$\mathbf{y} = \mathbf{0}$$, since both sides are zero.

If $$\mathbf{y} \ne \mathbf{0}$$ we have

$$ 0 \le \lVert\mathbf{x} - \alpha~\mathbf{y}\rVert^2 = \langle (\mathbf{x} - \alpha~\mathbf{y}), (\mathbf{x} - \alpha~\mathbf{y}) \rangle = \langle \mathbf{x}, \mathbf{x} \rangle - \langle \mathbf{x}, \alpha~\mathbf{y} \rangle - \langle \alpha~\mathbf{y}, \mathbf{x} \rangle + |\alpha|^2~\langle \mathbf{y}, \mathbf{y} \rangle $$ Now

$$ \langle \mathbf{x}, \alpha~\mathbf{y} \rangle + \langle \alpha~\mathbf{y}, \mathbf{x} \rangle = \overline{\alpha}~\langle \mathbf{x}, \mathbf{y} \rangle + \alpha~\overline{\langle \mathbf{x}, \mathbf{y} \rangle} = 2~\text{Re}\left(\overline{\alpha}~\langle \mathbf{x}, \mathbf{y} \rangle\right) $$ Therefore,

$$ \lVert\mathbf{x}\rVert^2 - 2~\text{Re}\left(\overline{\alpha}~\langle \mathbf{x}, \mathbf{y} \rangle\right) + |\alpha|^2~\lVert\mathbf{y}\rVert^2 \ge 0 $$ Let us choose $$\alpha$$ to minimize the left hand side. The minimizing value is

$$ \alpha = \cfrac{\langle \mathbf{x}, \mathbf{y} \rangle}{\lVert\mathbf{y}\rVert^2} $$ which gives us

$$ \lVert\mathbf{x}\rVert^2 - 2~\cfrac{|\langle \mathbf{x}, \mathbf{y} \rangle|^2}{\lVert\mathbf{y}\rVert^2} + \cfrac{|\langle \mathbf{x}, \mathbf{y} \rangle|^2}{\lVert\mathbf{y}\rVert^2} \ge 0 $$ Therefore,

$$ \lVert\mathbf{x}\rVert^2~\lVert\mathbf{y}\rVert^2 \ge  |\langle \mathbf{x}, \mathbf{y} \rangle|^2 \qquad \square $$
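
A quick numerical spot check of the inequality in Python, using arbitrary random vectors; note that equality holds when one vector is a scalar multiple of the other:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
y = rng.standard_normal(5)

# |<x, y>| <= ||x|| ||y||
assert abs(np.dot(x, y)) <= np.linalg.norm(x) * np.linalg.norm(y)

# Equality when x is a scalar multiple of y (here x = 2y)
assert np.isclose(abs(np.dot(2 * y, y)),
                  np.linalg.norm(2 * y) * np.linalg.norm(y))
```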

Triangle inequality
The triangle inequality states that

$$ \lVert\mathbf{x} + \mathbf{y}\rVert \le \lVert\mathbf{x}\rVert + \lVert\mathbf{y}\rVert $$

Proof



$$ \lVert\mathbf{x} + \mathbf{y}\rVert^2 = \langle (\mathbf{x} + \mathbf{y}), (\mathbf{x} + \mathbf{y}) \rangle = \lVert\mathbf{x}\rVert^2 + 2~\text{Re}\langle \mathbf{x}, \mathbf{y} \rangle + \lVert\mathbf{y}\rVert^2 $$ From the Schwarz inequality, $$\text{Re}\langle \mathbf{x}, \mathbf{y} \rangle \le |\langle \mathbf{x}, \mathbf{y} \rangle| \le \lVert\mathbf{x}\rVert~\lVert\mathbf{y}\rVert$$, so

$$ \lVert\mathbf{x} + \mathbf{y}\rVert^2 \le \lVert\mathbf{x}\rVert^2 + 2~\lVert\mathbf{x}\rVert~\lVert\mathbf{y}\rVert + \lVert\mathbf{y}\rVert^2 = (\lVert\mathbf{x}\rVert + \lVert\mathbf{y}\rVert)^2 $$ Hence

$$ \lVert\mathbf{x} + \mathbf{y}\rVert \le \lVert\mathbf{x}\rVert + \lVert\mathbf{y}\rVert \qquad \square $$
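
The same kind of spot check works for the triangle inequality:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(5)
y = rng.standard_normal(5)

# ||x + y|| <= ||x|| + ||y||
assert np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y)
```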

Angle between two vectors
In $$\mathbb{R}^2$$ or $$\mathbb{R}^3$$ we have

$$ \cos\theta = \cfrac{\langle \mathbf{x}, \mathbf{y} \rangle}{\lVert\mathbf{x}\rVert \lVert\mathbf{y}\rVert} $$ By the Schwarz inequality this ratio always lies in $$[-1, 1]$$, so it makes sense to define $$\cos\theta$$ in this way for any real vector space.

We then have

$$ \lVert\mathbf{x} + \mathbf{y}\rVert^2 = \lVert\mathbf{x}\rVert^2 + 2~\lVert\mathbf{x}\rVert~\lVert\mathbf{y}\rVert\cos\theta + \lVert\mathbf{y}\rVert^2 $$
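
As an illustration, a short sketch (with two example vectors in the plane) that computes $$\theta$$ from the definition and verifies the law-of-cosines identity above:

```python
import numpy as np

# Two example vectors in the plane, 45 degrees apart
x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])

cos_theta = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
print(np.degrees(np.arccos(cos_theta)))  # ~45.0

# Verify the law-of-cosines expansion of ||x + y||^2
lhs = np.linalg.norm(x + y) ** 2
rhs = (np.linalg.norm(x) ** 2
       + 2 * np.linalg.norm(x) * np.linalg.norm(y) * cos_theta
       + np.linalg.norm(y) ** 2)
assert np.isclose(lhs, rhs)
```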

Orthogonality
In particular, if $$\cos\theta = 0$$ we have an analog of the Pythagoras theorem.

$$ \lVert\mathbf{x} + \mathbf{y}\rVert^2 = \lVert\mathbf{x}\rVert^2 + \lVert\mathbf{y}\rVert^2 $$ In that case the vectors are said to be orthogonal.

If $$\langle \mathbf{x}, \mathbf{y} \rangle = 0$$ then the vectors are said to be orthogonal even in a complex vector space.

Orthogonal vectors have a lot of nice properties.

Linear independence of orthogonal vectors

 * A set of nonzero orthogonal vectors is linearly independent.

Suppose the vectors $$\boldsymbol{\varphi}_i$$ are orthogonal and satisfy a linear relation

$$ \alpha_1~\boldsymbol{\varphi}_1 + \alpha_2~\boldsymbol{\varphi}_2 + \dots + \alpha_n~\boldsymbol{\varphi}_n = \mathbf{0} $$ Taking the inner product of this relation with $$\boldsymbol{\varphi}_j$$ gives

$$ \alpha_j~\langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_j \rangle = 0 \quad \implies \quad \alpha_j = 0 ~\forall j $$ since

$$ \langle \boldsymbol{\varphi}_i, \boldsymbol{\varphi}_j \rangle = 0 \quad \text{if}~ i \ne j $$ and $$\langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_j \rangle \ne 0$$ for nonzero $$\boldsymbol{\varphi}_j$$. Therefore the only vanishing linear combination is the trivial one, and the vectors are linearly independent.
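
A numerical restatement of this fact, with an example orthogonal set stored as matrix rows: orthogonality forces the matrix to have full rank.

```python
import numpy as np

# Rows: a set of nonzero, mutually orthogonal example vectors
Phi = np.array([[1.0,  1.0, 0.0],
                [1.0, -1.0, 0.0],
                [0.0,  0.0, 2.0]])
G = Phi @ Phi.T
assert np.allclose(G, np.diag(np.diag(G)))  # pairwise orthogonal
assert np.linalg.matrix_rank(Phi) == 3      # hence linearly independent
```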

Expressing a vector in terms of an orthogonal basis
If we have a basis $$\{\boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \dots, \boldsymbol{\varphi}_n\}$$ and wish to express a vector $$\mathbf{f}$$ in terms of it we have

$$ \mathbf{f} = \sum_{j=1}^n \beta_j~\boldsymbol{\varphi}_j $$ The problem is to find the $$\beta_j$$s.

If we take the inner product with respect to $$\boldsymbol{\varphi}_i$$, we get

$$ \langle \mathbf{f}, \boldsymbol{\varphi}_i \rangle = \sum_{j=1}^n \beta_j~\langle \boldsymbol{\varphi}_i, \boldsymbol{\varphi}_j \rangle $$ In matrix form,

$$ \boldsymbol{\eta} = \boldsymbol{B}~\boldsymbol{\beta} $$ where $$B_{ij} = \langle \boldsymbol{\varphi}_i, \boldsymbol{\varphi}_j \rangle$$ and $$\eta_i = \langle \mathbf{f}, \boldsymbol{\varphi}_i \rangle$$.

Generally, finding the $$\beta_j$$s involves inverting the $$n \times n$$ matrix $$\boldsymbol{B}$$. If the basis is orthonormal, however, $$\boldsymbol{B}$$ is the identity matrix $$\boldsymbol{I}_n$$ because $$\langle \boldsymbol{\varphi}_i, \boldsymbol{\varphi}_j \rangle = \delta_{ij}$$, where $$\delta_{ij}$$ is the Kronecker delta, and no inversion is needed.

If the $$\boldsymbol{\varphi}_i$$s are merely orthogonal (not necessarily normalized), $$\boldsymbol{B}$$ is diagonal and we have

$$ \beta_j = \cfrac{\langle \mathbf{f}, \boldsymbol{\varphi}_j \rangle}{\lVert\boldsymbol{\varphi}_j\rVert^2} $$ and the quantity

$$ \mathbf{p} = \cfrac{\langle \mathbf{f}, \boldsymbol{\varphi}_j \rangle}{\lVert\boldsymbol{\varphi}_j\rVert^2}~\boldsymbol{\varphi}_j $$ is called the projection of $$\mathbf{f}$$ onto $$\boldsymbol{\varphi}_j$$.

Therefore the sum

$$\mathbf{f} = \sum_j \beta_j~\boldsymbol{\varphi}_j$$

says that $$\mathbf{f}$$ is just a sum of its projections onto the orthogonal basis.

Let us check whether $$\mathbf{p}$$ is actually a projection. Let

$$ \mathbf{a} = \mathbf{f} - \mathbf{p} = \mathbf{f} - \cfrac{\langle \mathbf{f}, \boldsymbol{\varphi} \rangle}{\lVert\boldsymbol{\varphi}\rVert^2}~\boldsymbol{\varphi} $$ Then,

$$ \langle \mathbf{a}, \boldsymbol{\varphi} \rangle = \langle \mathbf{f}, \boldsymbol{\varphi} \rangle - \cfrac{\langle \mathbf{f}, \boldsymbol{\varphi} \rangle}{\lVert\boldsymbol{\varphi}\rVert^2}~\langle \boldsymbol{\varphi}, \boldsymbol{\varphi} \rangle = 0 $$ Therefore $$\mathbf{a}$$ and $$\boldsymbol{\varphi}$$ are indeed orthogonal.
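
As a sketch (with an assumed example orthogonal basis of $$\mathbb{R}^3$$ and an arbitrary $$\mathbf{f}$$), the following code recovers the $$\beta_j$$ by projection, reconstructs $$\mathbf{f}$$ as the sum of its projections, and verifies that the residual $$\mathbf{f} - \mathbf{p}$$ is orthogonal to the corresponding basis vector:

```python
import numpy as np

# An orthogonal (not normalized) example basis of R^3
phi = [np.array([1.0,  1.0, 0.0]),
       np.array([1.0, -1.0, 0.0]),
       np.array([0.0,  0.0, 2.0])]
f = np.array([3.0, 1.0, 5.0])

# beta_j = <f, phi_j> / ||phi_j||^2, so f is the sum of its projections
beta = [np.dot(f, p) / np.dot(p, p) for p in phi]
f_rec = sum(b * p for b, p in zip(beta, phi))
assert np.allclose(f, f_rec)

# The residual a = f - (projection onto phi_0) is orthogonal to phi_0
proj = beta[0] * phi[0]
assert np.isclose(np.dot(f - proj, phi[0]), 0.0)
```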

Note that we can normalize $$\boldsymbol{\varphi}_i$$ by defining

$$ \tilde{\boldsymbol{\varphi}}_i = \cfrac{\boldsymbol{\varphi}_i}{\lVert\boldsymbol{\varphi}_i\rVert} $$ Then the basis $$\{\tilde{\boldsymbol{\varphi}}_1, \tilde{\boldsymbol{\varphi}}_2, \dots, \tilde{\boldsymbol{\varphi}}_n\}$$ is called an orthonormal basis.

It follows from the equation for $$\beta_j$$ that

$$ \tilde{\beta}_j = \langle \mathbf{f}, \tilde{\boldsymbol{\varphi}}_j \rangle $$ and

$$ \mathbf{f} = \sum_{j=1}^n \tilde{\beta}_j~\tilde{\boldsymbol{\varphi}}_j $$ You can think of the vectors $$\tilde{\boldsymbol{\varphi}}_i$$ as orthogonal unit vectors in an $$ n $$-dimensional space.

Biorthogonal basis
However, an orthogonal basis is not the only convenient choice. An alternative that is useful (for instance when working with wavelets) is a biorthonormal basis.

In this approach, given any basis $$\{\boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \dots, \boldsymbol{\varphi}_n\}$$, we seek another set of vectors $$\{\boldsymbol{\psi}_1, \boldsymbol{\psi}_2, \dots, \boldsymbol{\psi}_n\}$$ such that

$$ \langle \boldsymbol{\varphi}_i, \boldsymbol{\psi}_j \rangle = \delta_{ij} $$ In that case, if

$$ \mathbf{f} = \sum_{j=1}^n \beta_j~\boldsymbol{\varphi}_j $$ it follows that

$$ \langle \mathbf{f}, \boldsymbol{\psi}_k \rangle = \sum_{j=1}^n \beta_j~\langle \boldsymbol{\varphi}_j, \boldsymbol{\psi}_k \rangle = \beta_k $$ So the coefficients $$\beta_k$$ can easily be recovered. You can see a schematic of the two sets of vectors in the adjacent figure.
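
A minimal sketch of this construction in $$\mathbb{R}^n$$ with the standard inner product: if the $$\boldsymbol{\varphi}_j$$ are stored as the columns of a matrix `Phi`, the dual vectors $$\boldsymbol{\psi}_i$$ can be read off as the rows of `Phi` inverse, since $$\boldsymbol{\Phi}^{-1}\boldsymbol{\Phi} = \boldsymbol{I}$$ is exactly the biorthonormality condition. The basis here is an arbitrary example.

```python
import numpy as np

# A non-orthogonal example basis of R^2 (columns of Phi)
Phi = np.array([[1.0, 1.0],
                [0.0, 1.0]])
# Dual (biorthonormal) vectors: rows of Phi^{-1}, so <phi_i, psi_j> = delta_ij
Psi = np.linalg.inv(Phi)
assert np.allclose(Psi @ Phi, np.eye(2))

f = np.array([2.0, 3.0])
beta = Psi @ f                     # beta_k = <f, psi_k>
assert np.allclose(Phi @ beta, f)  # f = sum_k beta_k phi_k
```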

Gram-Schmidt orthogonalization
One technique for obtaining an orthogonal basis is the process of Gram-Schmidt orthogonalization.

The goal is to produce an orthogonal set of vectors $$\{\boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \dots, \boldsymbol{\varphi}_n\}$$ given a linearly independent set $$\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n\}$$.

We start by setting $$\boldsymbol{\varphi}_1 = \mathbf{x}_1$$. Then $$\boldsymbol{\varphi}_2$$ is obtained by subtracting from $$\mathbf{x}_2$$ its projection onto $$\boldsymbol{\varphi}_1$$, i.e.,

$$ \boldsymbol{\varphi}_2 = \mathbf{x}_2 - \cfrac{\langle \mathbf{x}_2, \boldsymbol{\varphi}_1 \rangle}{\lVert\boldsymbol{\varphi}_1\rVert^2}~\boldsymbol{\varphi}_1 $$ Thus $$\boldsymbol{\varphi}_2$$ is clearly orthogonal to $$\boldsymbol{\varphi}_1$$. For $$\boldsymbol{\varphi}_3$$ we use

$$ \boldsymbol{\varphi}_3 = \mathbf{x}_3 - \cfrac{\langle \mathbf{x}_3, \boldsymbol{\varphi}_1 \rangle}{\lVert\boldsymbol{\varphi}_1\rVert^2}~\boldsymbol{\varphi}_1 - \cfrac{\langle \mathbf{x}_3, \boldsymbol{\varphi}_2 \rangle}{\lVert\boldsymbol{\varphi}_2\rVert^2}~\boldsymbol{\varphi}_2 $$ More generally,

$$ \boldsymbol{\varphi}_n = \mathbf{x}_n - \sum_{j=1}^{n-1} \cfrac{\langle \mathbf{x}_n, \boldsymbol{\varphi}_j \rangle}{\lVert\boldsymbol{\varphi}_j\rVert^2}~\boldsymbol{\varphi}_j $$ If you want an orthonormal set, you can normalize the resulting orthogonal vectors.
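
The recursion translates almost verbatim into Python; this is a minimal sketch of classical Gram-Schmidt, without the optional normalization step, applied to example vectors:

```python
import numpy as np

def gram_schmidt(X):
    """Orthogonalize the rows of X by the classical Gram-Schmidt recursion."""
    basis = []
    for x in X:
        # Subtract the projection of x onto each previously computed phi_j
        phi = x - sum(np.dot(x, p) / np.dot(p, p) * p for p in basis)
        basis.append(phi)
    return np.array(basis)

X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])  # linearly independent example vectors
Phi = gram_schmidt(X)
# All off-diagonal inner products should vanish
G = Phi @ Phi.T
assert np.allclose(G, np.diag(np.diag(G)))
```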

We can check by induction that the vectors $$\boldsymbol{\varphi}_j$$ are indeed orthogonal. Assume that the vectors $$\boldsymbol{\varphi}_j$$, $$j \le n-1$$, are mutually orthogonal. Pick $$k < n$$. Then

$$ \langle \boldsymbol{\varphi}_n, \boldsymbol{\varphi}_k \rangle = \langle \mathbf{x}_n, \boldsymbol{\varphi}_k \rangle - \sum_{j=1}^{n-1} \cfrac{\langle \mathbf{x}_n, \boldsymbol{\varphi}_j \rangle}{\lVert\boldsymbol{\varphi}_j\rVert^2}~\langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_k \rangle $$ Now $$\langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_k \rangle = 0$$ unless $$j = k$$, so only the $$j = k$$ term survives in the sum. That term equals $$\langle \mathbf{x}_n, \boldsymbol{\varphi}_k \rangle$$ and cancels the first term, giving $$\langle \boldsymbol{\varphi}_n, \boldsymbol{\varphi}_k \rangle = 0$$. Hence the vectors are orthogonal.

Note that you have to be careful when numerically computing an orthogonal basis with the Gram-Schmidt technique, because rounding errors accumulate in the projection sums and the computed vectors gradually lose orthogonality.
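
To see the effect, the sketch below repeats the classical Gram-Schmidt code from above on nearly dependent example vectors; the value `eps = 1e-8` is an assumption of this example, chosen so that `eps**2` is lost to double-precision rounding:

```python
import numpy as np

def gram_schmidt(X):
    # Classical Gram-Schmidt on the rows of X (same sketch as above)
    basis = []
    for x in X:
        phi = x - sum(np.dot(x, p) / np.dot(p, p) * p for p in basis)
        basis.append(phi)
    return np.array(basis)

# Nearly dependent input vectors amplify the accumulated rounding error
eps = 1e-8
X = np.array([[1.0, eps, 0.0],
              [1.0, 0.0, eps],
              [1.0, 0.0, 0.0]])
Q = gram_schmidt(X)
Q /= np.linalg.norm(Q, axis=1, keepdims=True)
# For an orthonormal set, Q @ Q.T would be the identity; here it is not
print(np.abs(Q @ Q.T - np.eye(3)).max())  # O(1): orthogonality is lost
```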

Linear operators
The object $$\boldsymbol{A}$$ is a linear operator from $$\mathcal{S}$$ into $$\mathcal{S}$$ if

$$ \boldsymbol{A}~\mathbf{x} \equiv \boldsymbol{A}(\mathbf{x}) \in \mathcal{S} \quad \forall~\mathbf{x}\in\mathcal{S} $$

A linear operator satisfies the properties


 * 1) $$\boldsymbol{A}~(\alpha~\mathbf{x}) = \alpha~\boldsymbol{A}(\mathbf{x})$$.
 * 2) $$\boldsymbol{A}~(\mathbf{x}+\mathbf{y}) = \boldsymbol{A}(\mathbf{x}) + \boldsymbol{A}(\mathbf{y})$$.

Note that $$\boldsymbol{A}$$ is independent of basis. However, the action of $$\boldsymbol{A}$$ on a basis $$\{\boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \dots, \boldsymbol{\varphi}_n\}$$ determines $$\boldsymbol{A}$$ completely since

$$ \boldsymbol{A}~\mathbf{f} = \boldsymbol{A}~\left(\sum_j \beta_j~\boldsymbol{\varphi}_j\right) = \sum_j \beta_j~\boldsymbol{A}(\boldsymbol{\varphi}_j) $$ Since $$\boldsymbol{A}~\boldsymbol{\varphi}_j \in \mathcal{S}$$ we can write

$$ \boldsymbol{A}~\boldsymbol{\varphi}_j = \sum_i A_{ij}~\boldsymbol{\varphi}_i $$ where $$A_{ij}$$ is the $$n \times n$$ matrix representing the operator $$\boldsymbol{A}$$ in the basis $$\{\boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \dots, \boldsymbol{\varphi}_n\}$$.

Note the location of the indices: the sum in $$\boldsymbol{A}~\boldsymbol{\varphi}_j = \sum_i A_{ij}~\boldsymbol{\varphi}_i$$ runs over the first index, unlike ordinary matrix-vector multiplication, where it runs over the second. For example, in $$\mathbb{R}^2$$, we have

$$ \boldsymbol{A}~\mathbf{e}_2 = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} A_{12} \\ A_{22} \end{bmatrix} = A_{12}~ \begin{bmatrix} 1 \\ 0 \end{bmatrix} + A_{22}~ \begin{bmatrix} 0 \\ 1 \end{bmatrix} = A_{12}~\mathbf{e}_1 + A_{22}~\mathbf{e}_2 = \sum_i A_{i2}~\mathbf{e}_i $$
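
As a sketch of how this works, the example below takes a 90-degree rotation of $$\mathbb{R}^2$$ (an assumed example operator, given by its standard-basis matrix `L`) and computes its matrix $$A_{ij}$$ in a non-standard basis from the defining relation $$\boldsymbol{A}~\boldsymbol{\varphi}_j = \sum_i A_{ij}~\boldsymbol{\varphi}_i$$:

```python
import numpy as np

# The operator: rotation of R^2 by 90 degrees (matrix in the standard basis)
L = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# A non-standard example basis, stored as the columns of Phi
Phi = np.array([[1.0, 1.0],
                [0.0, 1.0]])

# A phi_j = sum_i A_ij phi_i  =>  column j of A holds the coefficients of
# A phi_j in the basis, i.e. A = Phi^{-1} L Phi
A = np.linalg.solve(Phi, L @ Phi)

# Check the defining relation column by column
for j in range(2):
    assert np.allclose(L @ Phi[:, j], Phi @ A[:, j])
```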

We will get into more details in the next lecture.