User:SamHB/MVCalc3



This is the third and final lecture in multivariable calculus.

=Generalized Vectors=

In this lecture we will examine in detail the topic of vectors in generalized (curvilinear) coordinate systems. In the previous lecture we gave the definition for infinitesimal vectors; that is, we showed the connection between a vector's components (the triple of numbers used to describe the vector when doing calculations) and its geometric meaning.

A few very important points to remember:
 * A given point in space may have different coordinates (that is, the triple of numbers that are used to describe the point) when different coordinate systems are used.
 * Furthermore, a given vector at a given point in space may have different components according to different coordinate systems.

The way the components of a vector are measured depends on the coordinate system, according to a fixed rule. The rule that we use is the rule of natural vector components derived in the previous lecture. Once a coordinate system is chosen, the correspondence between geometrical vectors and the components that describe them is fixed. The correspondence is that, for infinitesimal vectors, the components are the amounts by which the coordinates would have to change to go from the base of the vector to its end point.

The reason this definition requires that the vectors be infinitesimal is that, if an infinitesimal vector is extended to macroscopic size, the coordinates of its end point cease to be meaningful, since they depend on aspects of the coordinate system far away from the point at which the vector is defined. In fact, for a curved manifold, such as the surface of a sphere, that end point might not be in the manifold at all. So we define the geometric interpretation of vector components only for the infinitesimal case, but we still allow vectors to be large. That is, vectors form a genuine vector space; the space just isn't associated with the manifold except near zero.

A way to think of a large vector is that, at a point $$(u, v, w)\,$$, the vector $$[2, -3, 0]\,$$ is visualized as "one million times bigger than the tiny vector that points to $$(u+.000002, v-.000003, w)\,$$."

The notion of "geometrical universality" may be seen in the definitions of the dot product and the cross product. They have purely geometrical definitions that can be visualized without regard to any coordinate system. Then, for any Cartesian coordinate system, these operations can be defined in terms of the components of the vectors in that system, and it can be shown that those results match the purely geometrical definitions. What we are going to do next is develop the tools to define vectors, and the dot product and cross product, in any coordinate system at all. Later, we will introduce new operations&mdash;the divergence, the curl, and various integrals&mdash;and show how to calculate them in any coordinate system.

For example, Stokes' theorem makes a very powerful geometric statement about integrals of various vector fields over very general surfaces and lines. The concepts appearing in the theorem are purely geometric, and therefore independent of the coordinate system. Once we know how to manipulate those concepts (vector fields, integrals, and the "curl" operator) in any coordinate system, we will be able to choose a coordinate system that makes the surface simple. Specifically, we will change from the usual x/y/z Cartesian system to a u/v/w system for which the surface is defined by holding w=0. (Another way of looking at this is that the surface has been "parameterized" in terms of parameters u and v.) Once this is done, we will be able to calculate the curl operator, perform the integration, and prove Stokes' theorem in the u/v/w system.

As another example, Maxwell's equations make statements about the curl and divergence of the electric and magnetic field. While the formulas for curl and divergence are simpler in Cartesian coordinates than in spherical, some problems, such as the electric field in the vicinity of a point charge, have symmetry that makes spherical coordinates more natural. When we work out the divergence theorem in spherical coordinates, we will be able to solve problems of this sort.

==Vectors in Arbitrary Coordinate Systems==
A vector has different components in different coordinate systems. If the components of the vector $$\vec{A}$$, in the u/v/w coordinate system, are:
 * $$\vec{A} = \begin{bmatrix} A_u \\ A_v \\ A_w \end{bmatrix}$$

and, in the standard Cartesian system, they are:
 * $$\vec{A} = \begin{bmatrix} A_x \\ A_y \\ A_z \end{bmatrix}$$

then the transformation rule is:
 * $$\begin{bmatrix} A_u \\ A_v \\ A_w \end{bmatrix} = \begin{bmatrix} \displaystyle\frac{\partial u}{\partial x} & \displaystyle\frac{\partial u}{\partial y} & \displaystyle\frac{\partial u}{\partial z} \\ \\ \displaystyle\frac{\partial v}{\partial x} & \displaystyle\frac{\partial v}{\partial y} & \displaystyle\frac{\partial v}{\partial z} \\ \\ \displaystyle\frac{\partial w}{\partial x} & \displaystyle\frac{\partial w}{\partial y} & \displaystyle\frac{\partial w}{\partial z}\end{bmatrix} \begin{bmatrix} A_x \\ A_y \\ A_z \end{bmatrix}$$

In other words, the transformation of vector components from x/y/z to u/v/w is just a matrix multiplication by the Jacobian matrix
 * $$[J\langle uvw/xyz\rangle]$$

To go the other way, multiply by the matrix
 * $$[J\langle xyz/uvw\rangle]$$

We make a simplification to avoid having our notation get completely out of control. When we are dealing with a specific general coordinate system and some "reference" Cartesian system, we will call [J] the "Jacobian of the coordinate system":
 * The matrix that we have been calling $$[J\langle xyz/uvw\rangle]$$ we will call just $$[J]\,$$.
 * The matrix that goes the other way, formerly called $$[J\langle uvw/xyz\rangle]$$, is just the inverse of the first matrix. We will call it $$[J]^{-1}\,$$.


 * The astute reader will notice that this is not a good definition: [J] depends on the choice of the "reference" Cartesian system, so we are not strictly justified in calling it just "the Jacobian". But it turns out that this won't make any difference. All our answers will be in terms of the metric, which is indifferent to the choice of reference system.

For polar coordinates in two dimensions, we have:
 * $$\begin{bmatrix} A_r \\ A_\theta \end{bmatrix} = [J]^{-1} \begin{bmatrix} A_x \\ A_y \end{bmatrix} = \begin{bmatrix} \displaystyle\frac{\partial r}{\partial x} & \displaystyle\frac{\partial r}{\partial y} \\ \\ \displaystyle\frac{\partial \theta}{\partial x} & \displaystyle\frac{\partial \theta}{\partial y}\end{bmatrix} \begin{bmatrix} A_x \\ A_y \end{bmatrix} = \begin{bmatrix} \cos \theta & \sin \theta \\ \\ \displaystyle\frac{- \sin \theta}{r} & \displaystyle\frac{\cos \theta}{r} \end{bmatrix} \begin{bmatrix} A_x \\ A_y \end{bmatrix}\,$$

And, going the other way:
 * $$\begin{bmatrix} A_x \\ A_y \end{bmatrix} = [J] \begin{bmatrix} A_r \\ A_\theta \end{bmatrix} = \begin{bmatrix} \cos \theta & - r \sin \theta \\ \sin \theta & r \cos \theta \end{bmatrix} \begin{bmatrix} A_r \\ A_\theta \end{bmatrix}$$
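These two matrices can be checked numerically. The following sketch (using NumPy; the helper names and the sample point are our own choices, not standard) converts a vector's polar components to Cartesian components and back, and confirms that $$[J]\,$$ and $$[J]^{-1}\,$$ really are inverses:

```python
import numpy as np

def J_polar(r, theta):
    """[J] = d(x,y)/d(r,theta): takes natural polar components to Cartesian."""
    return np.array([[np.cos(theta), -r * np.sin(theta)],
                     [np.sin(theta),  r * np.cos(theta)]])

def J_polar_inv(r, theta):
    """[J]^-1 = d(r,theta)/d(x,y): takes Cartesian components to polar."""
    return np.array([[np.cos(theta),           np.sin(theta)],
                     [-np.sin(theta) / r,  np.cos(theta) / r]])

r, theta = 2.0, 0.7
A_polar = np.array([1.5, -0.3])           # (A_r, A_theta), chosen arbitrarily
A_cart = J_polar(r, theta) @ A_polar      # convert to Cartesian...
back = J_polar_inv(r, theta) @ A_cart     # ...and back again

print(np.allclose(back, A_polar))                                     # True
print(np.allclose(J_polar_inv(r, theta) @ J_polar(r, theta), np.eye(2)))  # True
```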

Let's look at some vectors whose components are given in natural general coordinates&mdash;there are some important and striking aspects of this.

In polar coordinates, a "radial basis vector" at any point has components:
 * $$(A_r=1, A_\theta=0)\,$$

Its Cartesian components (from the formulas above) are:
 * $$(A_x=\cos \theta, A_y=\sin \theta)\,$$

The actual (Cartesian) direction in which the vector points depends on its location. It always points radially outward. Its length is $$\sqrt{A_x^2 + A_y^2} = \sqrt{\cos^2 \theta + \sin^2 \theta} = 1\,$$. So the radial basis vector is a unit vector.

The "angular basis vector" has components:
 * $$(A_r=0, A_\theta=1)\,$$

Its Cartesian components are:
 * $$(A_x=- r \sin \theta, A_y=r \cos \theta)\,$$

It always points "laterally", at right angles to the radial direction. More surprisingly, its length is $$\sqrt{A_x^2 + A_y^2} = r\,$$. The angular basis vector is not a unit vector! It is bigger when farther away from the origin, even though the sum of the squares of its components is 1. In general curvilinear coordinates, the "square root of the sum of the squares of the components" rule is not correct for the length of a vector. The correct rule will be given below.
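Both of these facts can be verified numerically. The sketch below (NumPy; the helper name and the sample point are our own) converts each basis vector to Cartesian components, where the square-root-of-sum-of-squares rule does hold, and measures its length:

```python
import numpy as np

def to_cartesian(r, theta, A_polar):
    """Convert natural polar components (A_r, A_theta) to Cartesian using [J]."""
    J = np.array([[np.cos(theta), -r * np.sin(theta)],
                  [np.sin(theta),  r * np.cos(theta)]])
    return J @ A_polar

r, theta = 3.0, 1.1
radial  = to_cartesian(r, theta, np.array([1.0, 0.0]))  # (A_r, A_theta) = (1, 0)
angular = to_cartesian(r, theta, np.array([0.0, 1.0]))  # (A_r, A_theta) = (0, 1)

print(np.linalg.norm(radial))   # length 1 (up to rounding): a unit vector
print(np.linalg.norm(angular))  # length r = 3 (up to rounding): not a unit vector
```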

==The Dot Product in Arbitrary Coordinates==
We already have the formula for the dot product in Cartesian coordinates:
 * $$\vec{A} \cdot \vec{B} = A_x B_x + A_y B_y + A_z B_z\,$$

In the previous lecture we worked out the formula for the dot product of infinitesimal vectors, using the metric, which is, of course, derived from the Jacobian. That formula is:
 * $$\vec{A} \cdot \vec{B} = \sum_{i, j = 1}^3 g_{ij} A_i B_j\,$$

It can be calculated in practice using matrix manipulations:
 * $$\vec{A} \cdot \vec{B} = \begin{bmatrix} A & \text{as} & \text{row} \end{bmatrix} \begin{bmatrix} & & \\ & g & \\ & & \end{bmatrix} \begin{bmatrix} B \\ \text{as} \\ \text{column} \end{bmatrix}\,$$.

The reader can check that this is the same as the result that one would get by converting each vector to a Cartesian system (any Cartesian system) and calculating the dot product there.

Here are the formulas for two popular systems. In 2-dimensional polar coordinates:


 * $$[g] = \begin{bmatrix} 1 & 0 \\ 0 & r^2 \end{bmatrix}$$


 * $$\vec{A} \cdot \vec{B} = A_1 B_1 + r^2 A_2 B_2$$

or, going back to giving the coordinates names:
 * $$\vec{A} \cdot \vec{B} = A_r B_r + r^2 A_\theta B_\theta$$

For spherical coordinates in 3 dimensions:


 * $$[g] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2 \sin^2 \theta \end{bmatrix}$$


 * $$\vec{A} \cdot \vec{B} = A_r B_r + r^2 A_\theta B_\theta + r^2 \sin^2 \theta A_\phi B_\phi$$
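The claim that the metric formula matches the Cartesian calculation can be spot-checked numerically. In this sketch (NumPy; the helper names and test vectors are ours, and we assume the standard convention $$x = r \sin\theta \cos\phi\,$$, $$y = r \sin\theta \sin\phi\,$$, $$z = r \cos\theta\,$$, which yields the metric above), we compute the dot product both ways:

```python
import numpy as np

def dot_spherical(r, theta, A, B):
    """Dot product of natural spherical components via the metric [g]."""
    g = np.diag([1.0, r**2, r**2 * np.sin(theta)**2])
    return A @ g @ B

def to_cartesian(r, theta, phi, A):
    """Natural spherical components -> Cartesian, via [J] = d(x,y,z)/d(r,theta,phi)."""
    st, ct, sp, cp = np.sin(theta), np.cos(theta), np.sin(phi), np.cos(phi)
    J = np.array([[st * cp, r * ct * cp, -r * st * sp],
                  [st * sp, r * ct * sp,  r * st * cp],
                  [ct,     -r * st,       0.0]])
    return J @ A

r, theta, phi = 2.0, 0.8, 0.3
A = np.array([1.0, 0.5, -0.2])
B = np.array([0.4, -1.0, 0.7])

via_metric = dot_spherical(r, theta, A, B)
via_cartesian = to_cartesian(r, theta, phi, A) @ to_cartesian(r, theta, phi, B)
print(np.isclose(via_metric, via_cartesian))  # True: same answer either way
```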

==The Length of a Vector in Arbitrary Coordinates==
The length of a vector (sometimes called the "norm") is the square root of the dot product of the vector with itself. Note that this is defined in terms of something that we have defined in a purely geometrical way, independent of any coordinate system. Note also that it is consistent with the definition of the dot product of two vectors as the product of their lengths, times the cosine of the angle between them.

The length of a vector is usually written with a sort of double absolute value sign, as in $$\Vert\vec{A}\Vert\,$$.

In polar coordinates, we have:
 * $$\Vert\vec{A}\Vert = \sqrt{A_r^2 + r^2 A_\theta^2}$$

In spherical coordinates:
 * $$\Vert\vec{A}\Vert = \sqrt{A_r^2 + r^2 A_\theta^2 + r^2 \sin^2 \theta A_\phi^2}$$

==The Cross Product in Arbitrary Coordinates==
The cross product formula is not simple. First, see the cross product page for a derivation showing that, in Cartesian coordinates, the geometric definition is equivalent to this formula:
 * $$\vec{A} \times \vec{B} =\begin{vmatrix}\hat{x}&\hat{y}&\hat{z}\\ A_x & A_y & A_z \\ B_x & B_y & B_z \end{vmatrix}$$

Now we can calculate the components of a cross product of two vectors in general coordinates by converting the components to Cartesian (use the Jacobian matrix), using the above formula, and converting back to general coordinates. When this is done, the formula involves the metric but not the Jacobian, showing that it is independent of the choice of Cartesian system. In its most general form, the result is:
 * $$(\vec{A} \times \vec{B})_i = \sqrt{\mathbf{g}}\ \sum_{l} g^{-1}_{il} \sum_{j, k} \epsilon_{ljk}\ A_j B_k$$

where $$\epsilon_{ljk}\,$$ is the permutation symbol and $$\mathbf{g}\,$$ is the determinant of the metric.

One rarely needs this formula in its most general form, because common coordinate systems are usually orthogonal, with Lamé coefficients. In these cases, the diagonal entries of the metric matrix are the only ones that are nonzero, the Lamé coefficients are the square roots of the diagonal entries in the metric, and the inverse matrix has reciprocal values. So the formula takes the much simpler "determinant form":
 * $$\vec{A} \times \vec{B} = \begin{vmatrix} \displaystyle\frac{\sqrt{\mathbf{g}}}{g_{11}}\ \hat{u} & \displaystyle\frac{\sqrt{\mathbf{g}}}{g_{22}}\ \hat{v} & \displaystyle\frac{\sqrt{\mathbf{g}}}{g_{33}}\ \hat{w} \\[3ex] A_u & A_v & A_w \\[3ex] B_u & B_v & B_w \end{vmatrix}$$

In spherical coordinates, this becomes:
 * $$\vec{A} \times \vec{B} = \begin{vmatrix}r^2 \sin \theta\ \hat{r}&\sin \theta\ \hat{\theta}&\displaystyle\frac{\hat{\phi}}{\sin \theta}\\[3ex] A_r & A_\theta & A_\phi \\[3ex] B_r & B_\theta & B_\phi \end{vmatrix}$$

Example: Suppose $$\vec{A}$$ is a radial basis vector:
 * $$\vec{A} = \hat{r} = \begin{bmatrix}1 \\ 0 \\ 0 \end{bmatrix}$$

(The length of $$\vec{A}$$ is 1.) Let $$\vec{B}$$ be a $$\theta$$-pointing basis vector:
 * $$\vec{B} = \hat{\theta} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$$

(The length of $$\vec{B}$$ is r.) $$\vec{B}$$ is a vector at the surface of a sphere of radius r that points "south", tangential to the surface. $$\vec{A}$$ points outward from the surface, that is, "up". $$\vec{A}$$ and $$\vec{B}$$ are orthogonal. (Prove it; calculate $$\vec{A} \cdot \vec{B}$$.)

By the formula for the cross product, we have:
 * $$\vec{A} \times \vec{B} = \begin{vmatrix}r^2 \sin\theta\ \hat{r} & \sin\theta\ \hat{\theta} & \frac{\hat{\phi}}{\sin\theta} \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{vmatrix} = \frac{\hat{\phi}}{\sin\theta} = \begin{bmatrix} 0 \\ 0 \\ \frac{1}{\sin \theta} \end{bmatrix}$$

This points "east" at the surface of the sphere of radius r, is perpendicular to the other two vectors, and has length r, which is the product of the lengths of the other two vectors. The geometrical properties of the cross product (perpendicular to each of the given vectors, and of length equal to the product of their lengths times the sine of the angle between them) are obeyed. The cross product, like all other vector operations, is a geometric invariant. It doesn't matter what coordinate system was used to calculate the components&mdash;the answer is the same.
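A computer makes such checks painless. The sketch below (NumPy; the helper names and the test vectors are our own choices, assuming the standard spherical convention $$x = r \sin\theta \cos\phi\,$$, etc.) checks the spherical determinant formula against the route of converting to Cartesian, taking the ordinary cross product, and converting back:

```python
import numpy as np

def J_spherical(r, theta, phi):
    """[J] = d(x,y,z)/d(r,theta,phi) for x = r sin(theta) cos(phi), etc."""
    st, ct, sp, cp = np.sin(theta), np.cos(theta), np.sin(phi), np.cos(phi)
    return np.array([[st * cp, r * ct * cp, -r * st * sp],
                     [st * sp, r * ct * sp,  r * st * cp],
                     [ct,     -r * st,       0.0]])

def cross_spherical(r, theta, A, B):
    """The determinant formula for the cross product in natural spherical
    components, expanded along its first row."""
    Ar, At, Ap = A
    Br, Bt, Bp = B
    return np.array([r**2 * np.sin(theta) * (At * Bp - Ap * Bt),
                     -np.sin(theta) * (Ar * Bp - Ap * Br),
                     (Ar * Bt - At * Br) / np.sin(theta)])

r, theta, phi = 2.0, 0.8, 0.3
A = np.array([1.0, 0.5, -0.2])
B = np.array([0.4, -1.0, 0.7])

J = J_spherical(r, theta, phi)
# Convert to Cartesian, take the ordinary cross product, convert back with [J]^-1.
expected = np.linalg.solve(J, np.cross(J @ A, J @ B))
print(np.allclose(cross_spherical(r, theta, A, B), expected))  # True
```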

==Derivatives of Scalar and Vector Fields==
A vector field is an assignment of a vector to every point on a manifold. Examples are the electric and magnetic fields in space. A scalar field is similarly an assignment of a scalar to every point on a manifold. A scalar is just a number, but we don't say "number field". "Scalar field" sounds nicer.

These fields typically meet requirements of continuity, differentiability, or integrability, as needed for whatever we are doing. When we perform manipulations of fields (addition, differentiation or integration, for example) we do them in terms of mathematical manipulation of the components. For example, the divergence operator will be expressed in terms of partial derivatives of the vector components (these are numbers, of course) relative to the coordinates on the manifold. This might make it seem as though the result is dependent on the choice of coordinate system. In fact, all of these operations are intrinsic, just as the dot product and cross product are. They have a geometrical meaning, independent of the choice of coordinate system. We will prove that the results that we get for these operations are independent of the coordinate system. We already did this for the vector dot and cross product, and, in an earlier lecture, for the integration of a scalar field.

==The Gradient in Arbitrary Coordinates==


The gradient is a vector field that is a type of derivative of a scalar field. It points in the direction in which the scalar field is increasing most rapidly, and its length indicates the rate of increase. If the scalar field is electric potential measured in volts, the gradient is the electric field strength in volts per meter. In this case the gradient vector field indicates the force acting on a charged particle.

The gradient is written with the symbol "$$\nabla\,$$", pronounced "del" :
 * $$\vec{A} = \nabla \Psi\,$$

It is sometimes written:
 * $$\vec{A} = \operatorname{grad}\ \Psi\,$$

The gradient, and all the formulas for it, work in 2, 3, or any number of dimensions.

In Cartesian coordinates, the components of the gradient are simply the partial derivatives of the scalar field:
 * $$\nabla \Psi = \begin{bmatrix} \displaystyle\frac{\partial \Psi}{\partial x} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial y} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial z} \end{bmatrix}$$

or
 * $$(\nabla \Psi)_i = \displaystyle\frac{\partial \Psi}{\partial x_i}$$

It is often useful to think of $$\nabla$$ as being a fictional vector field with components $$\left[\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right]$$. Of course this isn't actually a vector field, but it makes it easier to remember the definitions of the gradient, and, later, the divergence and curl.

We have not shown that this gives a consistent vector, independent of the choice of Cartesian coordinate system; that will become clear shortly.

To get the expression for the gradient in arbitrary curvilinear coordinates, we first calculate it in Cartesian coordinates:
 * $$\begin{bmatrix} (\nabla \Psi)_x \\[2.5ex] (\nabla \Psi)_y \\[2.5ex] (\nabla \Psi)_z \end{bmatrix} = \begin{bmatrix} \displaystyle\frac{\partial \Psi}{\partial x} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial y} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial z} \end{bmatrix}$$

Then we convert that vector to the general coordinate system, using the vector transformation rule given above:
 * $$\begin{bmatrix} (\nabla \Psi)_u \\[2.5ex] (\nabla \Psi)_v \\[2.5ex] (\nabla \Psi)_w \end{bmatrix} = [J]^{-1} \begin{bmatrix} \displaystyle\frac{\partial \Psi}{\partial x} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial y} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial z} \end{bmatrix}$$

But, by the chain rule:
 * $$\displaystyle\frac{\partial \Psi}{\partial x} = \displaystyle\frac{\partial \Psi}{\partial u}\ \displaystyle\frac{\partial u}{\partial x} + \displaystyle\frac{\partial \Psi}{\partial v}\ \displaystyle\frac{\partial v}{\partial x} + \displaystyle\frac{\partial \Psi}{\partial w}\ \displaystyle\frac{\partial w}{\partial x} = J^{-1}_{11}\displaystyle\frac{\partial \Psi}{\partial u} + J^{-1}_{21}\displaystyle\frac{\partial \Psi}{\partial v} + J^{-1}_{31}\displaystyle\frac{\partial \Psi}{\partial w}$$


 * $$\displaystyle\frac{\partial \Psi}{\partial y} = \displaystyle\frac{\partial \Psi}{\partial u}\ \displaystyle\frac{\partial u}{\partial y} + \displaystyle\frac{\partial \Psi}{\partial v}\ \displaystyle\frac{\partial v}{\partial y} + \displaystyle\frac{\partial \Psi}{\partial w}\ \displaystyle\frac{\partial w}{\partial y} = J^{-1}_{12}\displaystyle\frac{\partial \Psi}{\partial u} + J^{-1}_{22}\displaystyle\frac{\partial \Psi}{\partial v} + J^{-1}_{32}\displaystyle\frac{\partial \Psi}{\partial w}$$


 * $$\displaystyle\frac{\partial \Psi}{\partial z} = \displaystyle\frac{\partial \Psi}{\partial u}\ \displaystyle\frac{\partial u}{\partial z} + \displaystyle\frac{\partial \Psi}{\partial v}\ \displaystyle\frac{\partial v}{\partial z} + \displaystyle\frac{\partial \Psi}{\partial w}\ \displaystyle\frac{\partial w}{\partial z} = J^{-1}_{13}\displaystyle\frac{\partial \Psi}{\partial u} + J^{-1}_{23}\displaystyle\frac{\partial \Psi}{\partial v} + J^{-1}_{33}\displaystyle\frac{\partial \Psi}{\partial w}$$

so
 * $$\begin{bmatrix} \displaystyle\frac{\partial \Psi}{\partial x} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial y} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial z} \end{bmatrix} = {[J]^{-1}}^t \begin{bmatrix} \displaystyle\frac{\partial \Psi}{\partial u} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial v} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial w} \end{bmatrix}$$

or
 * $$\begin{bmatrix} (\nabla \Psi)_u \\[2.5ex] (\nabla \Psi)_v \\[2.5ex] (\nabla \Psi)_w \end{bmatrix} = [J]^{-1} {[J]^{-1}}^t \begin{bmatrix} \displaystyle\frac{\partial \Psi}{\partial u} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial v} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial w} \end{bmatrix} = [J^t J]^{-1} \begin{bmatrix} \displaystyle\frac{\partial \Psi}{\partial u} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial v} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial w} \end{bmatrix} = [g]^{-1} \begin{bmatrix} \displaystyle\frac{\partial \Psi}{\partial u} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial v} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial w} \end{bmatrix}$$

When the general coordinates $$(u, v, w)\,$$ are just another set of Cartesian coordinates, we know that $$[g]\,$$ and $$[g]^{-1}\,$$ are identity matrices, so we get the same formula that we started with:
 * $$\nabla \Psi = \begin{bmatrix} \displaystyle\frac{\partial \Psi}{\partial u} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial v} \\[2.5ex] \displaystyle\frac{\partial \Psi}{\partial w} \end{bmatrix}$$

which shows that the choice of the "reference" Cartesian system didn't matter.
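Here is a numerical spot check of the $$[g]^{-1}\,$$ formula in polar coordinates (NumPy; the scalar field $$\Psi = x^2 + y\,$$ and the sample point are our own arbitrary choices). We compute the gradient in natural polar components, convert it to Cartesian with $$[J]\,$$, and compare with the directly computed Cartesian gradient $$(2x, 1)\,$$:

```python
import numpy as np

r, theta = 1.7, 0.6
x, y = r * np.cos(theta), r * np.sin(theta)

# Psi = x^2 + y written in polar coordinates is r^2 cos^2(theta) + r sin(theta);
# these are its partial derivatives with respect to r and theta, worked by hand:
dPsi_dr = 2 * r * np.cos(theta)**2 + np.sin(theta)
dPsi_dtheta = -2 * r**2 * np.cos(theta) * np.sin(theta) + r * np.cos(theta)

# Gradient in natural polar components: [g]^-1 times the column of partials.
g_inv = np.diag([1.0, 1.0 / r**2])
grad_polar = g_inv @ np.array([dPsi_dr, dPsi_dtheta])

# Convert to Cartesian with [J] and compare with (dPsi/dx, dPsi/dy) = (2x, 1).
J = np.array([[np.cos(theta), -r * np.sin(theta)],
              [np.sin(theta),  r * np.cos(theta)]])
print(np.allclose(J @ grad_polar, [2 * x, 1.0]))  # True
```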

==The Divergence in Arbitrary Coordinates==
The divergence of a vector field is a number at each point, that is, a scalar field. Intuitively, it is a measure of the extent to which the vectors are "diverging", that is, pointing away from each other. (If the vectors point toward each other, the divergence is negative.) If a vector field represents flow of a fluid (for example, the fluid is air and the vector field is wind velocity), the divergence is the rate at which matter is leaving a given region. If the divergence of the wind velocity is positive, air is leaving a given region, and the air pressure is decreasing. In fact, a principle of fluid flow says that, under the assumption of conservation of matter, the divergence of the fluid motion plus the rate of density increase is zero.

The divergence (and, later, the curl) can be tricky! The gravitational field around the Sun consists of vectors pointing inward, which would seem to make the divergence negative. But, because the vectors are smaller at greater distances from the Sun, the divergence is in fact zero. We will prove this shortly.



The divergence is written as though it were the dot product of the artificial vector-like "del" symbol "$$\nabla\,$$" and the vector field in question:
 * $$\nabla \cdot \vec{A}\,$$

This is pronounced "del dot A". The divergence is sometimes written:
 * $$\operatorname{div} \vec{A}\,$$

The geometric definition of the divergence at a given point is the limit of a volume integral as that volume shrinks to that point:
 * $$\operatorname{div}\,\vec{F} = \lim_{V \rightarrow \{p\}} \frac{1}{|V|} \iint_{\partial V} \vec{F} \cdot \vec{da}$$

where the integral is over the boundary surface $$\partial V\,$$ of the tiny volume $$V\,$$. Since we haven't yet defined surface integrals (the "$$\vec{F} \cdot \vec{da}\,$$"), we can't really say much about the calculation of this limit. It is, in any case, hard to show that the limit exists. So we will use a different definition of the divergence, in terms of partial derivatives of the vector components with respect to the coordinates. We will show that this definition is independent of the coordinate system, and takes the same well-known form in all Cartesian systems. We will show that it makes sense in terms of the above integral when the coordinate system is Cartesian and the volume $$V\,$$ is rectangular. Later, when we prove the divergence theorem, the definition as an integral will become clear.

In Cartesian coordinates in 3 dimensions, the divergence is:
 * $$\nabla \cdot \vec{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}$$

It works in any number of dimensions:
 * $$\nabla \cdot \vec{A} = \sum_{i} \displaystyle\frac{\partial A_i}{\partial x_i}$$

If one thinks of $$\nabla$$ as being a fictional vector field with components $$\left[\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right]$$, one can sort of see that the dot product notation makes sense. This is also useful for remembering how to calculate a divergence.

To the right are three vector fields. In the first, the x component is increasing as x increases. The vector arrows are clearly diverging away from each other, and this has positive divergence (but zero curl.) In the second, the y component is increasing as x increases. This has a divergence of zero (but nonzero curl.) The third field could be the electric field around a point charge. (Or the negative of the gravitational field around the Sun.) The vectors all seem to be pointing away from each other, but notice that, along the x axis, the x components of the vectors get smaller as x increases. If the field is in 3 dimensions and is inversely proportional to the square of the distance, the two effects cancel, and the divergence is zero. (Its curl is also zero.)

The formula for the divergence in arbitrary curvilinear coordinates, while reasonably simple, requires a bit of work. We derive it by transformation from the vector field in Cartesian coordinates. The formula and its derivation work for any number of coordinates.

We will denote the components of the vector field $$\vec{A}\,$$ relative to the curvilinear coordinate system as $$A_i\,$$, so, in 3 dimensions, $$A_1 = A_u\,$$, $$A_2 = A_v\,$$, and $$A_3 = A_w\,$$. We will denote the components of the same vector relative to the "reference" Cartesian system with a tilde over the symbol: $$\tilde{A}_i\,$$, so $$\tilde{A}_1 = A_x\,$$, $$\tilde{A}_2 = A_y\,$$, and $$\tilde{A}_3 = A_z\,$$. This way, we can use numerical indices for both the curvilinear components and the reference Cartesian components.

The transformation formulas, derived above, are:
 * $$A_i = \sum_j J^{-1}_{ij}\ \tilde{A}_j\qquad\text{and}\qquad \tilde{A}_i = \sum_j J_{ij}\ A_j$$

Now the divergence is defined relative to the reference system as:
 * $$\nabla \cdot \vec{A} = \sum_{i} \displaystyle\frac{\partial \tilde{A}_i}{\partial x_i}$$

We haven't shown that this gives a result that is independent of the choice of Cartesian coordinate system; that will become clear shortly.

To get the formula for the divergence in arbitrary curvilinear coordinates, we find the Cartesian components of the vector field and use the definition given above:
 * $$\nabla \cdot \vec{A} = \sum_{i} \displaystyle\frac{\partial \tilde{A}_i}{\partial x_i} = \sum_{i} \displaystyle\frac{\partial}{\partial x_i} \left(\sum_{k} J_{ik} A_k\right)$$

Using the chain rule:
 * $$\nabla \cdot \vec{A} = \sum_{i, j} \displaystyle\frac{\partial u_j}{\partial x_i}\ \displaystyle\frac{\partial}{\partial u_j} \left(\sum_{k} J_{ik} A_k\right)$$

But $$\frac{\partial u_j}{\partial x_i} = J^{-1}_{ji}\,$$, so:
 * $$\nabla \cdot \vec{A} = \sum_{i, j} J^{-1}_{ji}\ \displaystyle\frac{\partial}{\partial u_j} \left(\sum_{k} J_{ik} A_k\right) = \sum_{i, j, k} J^{-1}_{ji}\ \displaystyle\frac{\partial}{\partial u_j} \left(J_{ik} A_k\right)$$

Using the product rule:
 * $$\nabla \cdot \vec{A} = \sum_{i, j, k} J^{-1}_{ji}\ J_{ik}\ \displaystyle\frac{\partial A_k}{\partial u_j} + \sum_{i, j, k} J^{-1}_{ji}\ A_k\ \displaystyle\frac{\partial J_{ik}}{\partial u_j}$$
 * $$= \sum_{j, k} (J^{-1} J)_{jk}\ \displaystyle\frac{\partial A_k}{\partial u_j} + \sum_{i, j, k} J^{-1}_{ji}\ A_k\ \displaystyle\frac{\partial J_{ik}}{\partial u_j}$$
 * $$= \sum_{j}\displaystyle\frac{\partial A_j}{\partial u_j} + \sum_{i, j, k} J^{-1}_{ji}\ A_k\ \displaystyle\frac{\partial J_{ik}}{\partial u_j}$$

Now we note that, for all i, j, and k:
 * $$\displaystyle\frac{\partial J_{ik}}{\partial u_j} = \displaystyle\frac{\partial J_{ij}}{\partial u_k}\,$$

Why? Because
 * $$J_{ik} = \displaystyle\frac{\partial x_i}{\partial u_k}\qquad\text{and}\qquad J_{ij} = \displaystyle\frac{\partial x_i}{\partial u_j}\,$$

So the equation is actually
 * $$\displaystyle\frac{\partial^2 x_i}{\partial u_j \partial u_k} = \displaystyle\frac{\partial^2 x_i}{\partial u_k \partial u_j}\,$$

which is true because mixed partial derivatives commute.

So we have:
 * $$\nabla \cdot \vec{A} = \sum_{j}\displaystyle\frac{\partial A_j}{\partial u_j} + \sum_{i, j, k} J^{-1}_{ji}\ A_k\ \displaystyle\frac{\partial J_{ij}}{\partial u_k}\,$$

Now we use the fact that, for all k, where boldface $$\mathbf{J}\,$$ denotes the determinant of the matrix $$[J]\,$$:
 * $$\displaystyle\frac{\partial \mathbf{J}}{\partial u_k} = \mathbf{J}\ \sum_{i, j} J^{-1}_{ji}\ \displaystyle\frac{\partial J_{ij}}{\partial u_k}\,$$

See the determinant page for a proof of this.

So we have:
 * $$\nabla \cdot \vec{A} = \sum_{i}\displaystyle\frac{\partial A_i}{\partial u_i} + \sum_{k} \displaystyle\frac{1}{\mathbf{J}}\ \displaystyle\frac{\partial \mathbf{J}}{\partial u_k}\ A_k\,$$

We can remove all mention of the Jacobian matrix, since we only need its determinant, and we can get that from the determinant of the metric: $$\mathbf{g} = \mathbf{J}^2\,$$, so $$\mathbf{J} = \sqrt{\mathbf{g}}\,$$. Therefore, we have removed all connection with the chosen reference Cartesian coordinate system.
 * $$\nabla \cdot \vec{A} = \sum_{i}\displaystyle\frac{\partial A_i}{\partial u_i} + \displaystyle\frac{1}{\sqrt{\mathbf{g}}} \sum_{i} \displaystyle\frac{\partial \sqrt{\mathbf{g}}}{\partial u_i}\ A_i\,$$

By the product rule for derivatives, we can put this into another form:
 * $$\nabla \cdot \vec{A} = \displaystyle\frac{1}{\sqrt{\mathbf{g}}} \sum_{i}\displaystyle\frac{\partial}{\partial u_i}\left(\sqrt{\mathbf{g}}\ A_i\right)\,$$

When the general coordinates $$u_i\,$$ are just another set of Cartesian coordinates, we know that $$\mathbf{g}\,$$ is 1, so we get the same formula that we started with:
 * $$\nabla \cdot \vec{A} = \sum_{i} \displaystyle\frac{\partial A_i}{\partial u_i}$$

which shows that the choice of the "reference" Cartesian system didn't matter.

Example&mdash;spherical coordinates.

In spherical coordinates, we have $$\sqrt{\mathbf{g}} = r^2 \sin\theta\,$$, so
 * $$\nabla \cdot \vec{A} = \displaystyle\frac{\partial A_r}{\partial r} + \displaystyle\frac{\partial A_\theta}{\partial \theta} + \displaystyle\frac{\partial A_\phi}{\partial \phi} + \frac{2}{r}\ A_r + \cot\theta\ A_\theta\,$$

The last two terms are the "correction terms" that deal with the fact that the coordinate system is curved.

Example&mdash;an inverse-square field.

An inverse-square field, such as the gravitational field around a point mass, or the electric field around a point charge, is:
 * $$F_r = \frac{K}{r^2}\,$$


 * $$F_\theta = 0\,$$


 * $$F_\phi = 0\,$$

We have:
 * $$\nabla \cdot \vec{F} = \displaystyle\frac{\partial F_r}{\partial r} + \frac{2}{r}\ F_r = - \frac{2K}{r^3} + \frac{2K}{r^3} = 0\,$$
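This cancellation can also be checked symbolically. The sketch below (SymPy) applies the $$\frac{1}{\sqrt{\mathbf{g}}} \sum_{i}\frac{\partial}{\partial u_i}\left(\sqrt{\mathbf{g}}\ A_i\right)$$ form of the divergence to the inverse-square field:

```python
import sympy as sp

r, theta, phi, K = sp.symbols('r theta phi K', positive=True)
sqrt_g = r**2 * sp.sin(theta)   # square root of the spherical metric determinant

# Natural spherical components of an inverse-square field: (K/r^2, 0, 0).
F = (K / r**2, sp.Integer(0), sp.Integer(0))
coords = (r, theta, phi)

# div F = (1/sqrt(g)) * sum_i d(sqrt(g) F_i)/du_i
div_F = sp.simplify(sum(sp.diff(sqrt_g * Fi, ui) for Fi, ui in zip(F, coords)) / sqrt_g)
print(div_F)  # 0
```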



==The Curl in Arbitrary Coordinates==
The curl of a vector field is a vector at each point, that is, another vector field. Intuitively, it measures the degree to which a vector field is "rotating" around in circles. It involves a right-hand rule, rather like the cross product. (In fact, the curl and the cross product are closely related.) If the vectors are rotating around in the direction of the curled fingers of the right hand, the curl is a vector pointing in the direction of the thumb.

Here's the definition of the curl operator in arbitrary curvilinear coordinates:

First, to find the curl of a vector field $$\vec{A}$$, let the matrix $$[g]$$ operate on $$\vec{A}$$, yielding $$\vec{B}$$:
 * $$\vec{B} = [g] \vec{A}$$
 * Special case: If the coordinate system is orthogonal, so that $$[g]$$ is diagonal, this is very easy:
 * $$B_u = g_{11} A_u\ \ \ B_v = g_{22} A_v\ \ \ B_w = g_{33} A_w$$

Then the curl is:
 * $$\nabla \times \vec{A} = \frac{1}{\sqrt{g}} \begin{vmatrix} \hat{u} & \hat{v} & \hat{w} \\ \\ \displaystyle\frac{\partial}{\partial u} & \displaystyle\frac{\partial}{\partial v} & \displaystyle\frac{\partial}{\partial w} \\ \\ B_u & B_v & B_w \end{vmatrix}$$


 * Or, if the coordinate system is orthogonal:


 * $$\nabla \times \vec{A} = \frac{1}{\sqrt{g}} \begin{vmatrix} \hat{u} & \hat{v} & \hat{w} \\ \\ \displaystyle\frac{\partial}{\partial u} & \displaystyle\frac{\partial}{\partial v} & \displaystyle\frac{\partial}{\partial w} \\ \\ g_{11}A_u & g_{22}A_v & g_{33}A_w \end{vmatrix}$$

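One sanity check on this formula: the curl of a gradient must vanish. The sketch below (SymPy; the scalar field $$\Psi\,$$ is an arbitrary choice of ours) builds $$\vec{A} = \nabla \Psi\,$$ in natural spherical components, forms $$\vec{B} = [g]\vec{A}\,$$, and expands the determinant by hand into its three components:

```python
import sympy as sp

r, theta, phi = sp.symbols('r theta phi', positive=True)
u = (r, theta, phi)
g = sp.diag(1, r**2, r**2 * sp.sin(theta)**2)   # spherical metric
sqrt_g = sp.sqrt(g.det())

# Take A to be the gradient of an arbitrarily chosen scalar field Psi.
Psi = r**2 * sp.sin(theta) * sp.cos(phi) + r * sp.cos(theta)
A = g.inv() * sp.Matrix([sp.diff(Psi, ui) for ui in u])  # natural components

# Curl via the formula above: B = [g] A, then the determinant, row by row.
B = g * A
curl = sp.Matrix([
    sp.diff(B[2], theta) - sp.diff(B[1], phi),
    sp.diff(B[0], phi) - sp.diff(B[2], r),
    sp.diff(B[1], r) - sp.diff(B[0], theta),
]) / sqrt_g

print(sp.simplify(curl))  # the zero vector: the curl of a gradient vanishes
```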
==Integration Over Parametrically Defined Regions==
A variation of this method is used when integrating over curves in 2- or 3-dimensional spaces, or surfaces in 3-dimensional space. Recall that a parametric description of such an object is closely related to a change of coordinate system, but with different numbers of coordinates in the two systems. This means that the matrix of partial derivatives is not square, so it has no determinant.

The general problem, in arbitrary dimensions, goes deeply into advanced topics of differential geometry. We will state the methods, without proof, for the cases of parametrically defined curves and surfaces.

===Curve Integrals===
For a curve in a 2-dimensional plane or 3-dimensional volume, let the parameter (the single coordinate on the 1-dimensional curve) be denoted by t. Work out the partial derivatives, and form the vector
 * $$\left(\frac{\partial x}{\partial t}, \frac{\partial y}{\partial t}\right)$$

for a curve in a plane, or
 * $$\left(\frac{\partial x}{\partial t}, \frac{\partial y}{\partial t}, \frac{\partial z}{\partial t}\right)$$

for a curve in a volume. Let J (not really the Jacobian, but we treat it that way) be the norm of that vector, that is, the square root of the sum of the squares of its 2 or 3 components.
 * $$J = \Bigg\Vert\left(\frac{\partial x}{\partial t}, \frac{\partial y}{\partial t}, \frac{\partial z}{\partial t}\right)\Bigg\Vert = \sqrt{\left(\frac{\partial x}{\partial t}\right)^2 + \left(\frac{\partial y}{\partial t}\right)^2 + \left(\frac{\partial z}{\partial t}\right)^2}$$

Then
 * $$\int f(t)\ J\ dt$$

is the integral over the curve, in terms of the parameter t.

Example:

Suppose a circle of radius R is described in terms of a parameter t, as
 * $$x = R \cos t\,$$


 * $$y = R \sin t\,$$

We have
 * $$\frac{\partial x}{\partial t} = - R \sin t$$


 * $$\frac{\partial y}{\partial t} = R \cos t$$


 * $$J = \sqrt{R^2 \sin^2 t + R^2 \cos^2 t} = \sqrt{R^2} = R$$

So, to integrate any function over the full circle (with t running from 0 to $$2 \pi$$), we have:
 * $$\int_0^{2\pi} f(t)\ R\ dt$$

The length of the curve (that is, the circumference of the circle) is:
 * $$\int_0^{2\pi} 1\ R\ dt = 2 \pi R$$
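The circle example above is easy to check numerically. The sketch below (illustrative names; finite differences stand in for the exact derivatives) applies the recipe: form the vector of derivatives, take its norm to get J, and integrate J dt by the midpoint rule.

```python
# Numerical check of the curve-integral recipe for x = R cos t, y = R sin t.
import math

def curve_length(x, y, t0, t1, n=100000):
    """Integrate J dt by the midpoint rule, where J = ||(dx/dt, dy/dt)||
    is approximated by central finite differences (a sketch, not production code)."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        dxdt = (x(t + 1e-6) - x(t - 1e-6)) / 2e-6
        dydt = (y(t + 1e-6) - y(t - 1e-6)) / 2e-6
        total += math.hypot(dxdt, dydt) * h   # J dt
    return total

R = 2.0
length = curve_length(lambda t: R * math.cos(t),
                      lambda t: R * math.sin(t),
                      0.0, 2.0 * math.pi)
print(length, 2.0 * math.pi * R)   # both close to 12.566
```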

===Surface Integrals===
For a surface in a 3-dimensional volume, let the parameters be u and v. Work out the six partial derivatives, and form these two vectors:
 * $$\stackrel{\textstyle{\rightarrow}}{U} = \left(\frac{\partial x}{\partial u}, \frac{\partial y}{\partial u}, \frac{\partial z}{\partial u}\right)$$
 * $$\stackrel{\textstyle{\rightarrow}}{V} = \left(\frac{\partial x}{\partial v}, \frac{\partial y}{\partial v}, \frac{\partial z}{\partial v}\right)$$

Calculate their cross product, using the cross product formula, and let J be the norm of the result.
 * $$J = ||\stackrel{\textstyle{\rightarrow}}{U} \times\stackrel{\textstyle{\rightarrow}}{V}||$$

Then
 * $$\int\int f(u, v)\ J\ dv\ du$$

is the integral over the surface, in terms of the parameters u and v.

Unless one is careful to choose the order of the parameters so that the "outward" direction of the surface follows a right-hand rule, and to use that order in forming the cross product, there may be ambiguity in the sign of J. In practice, one simply checks the orientation and adjusts the sign accordingly.

Example:

Parameterize a sphere of radius R in terms of $$\theta$$ and $$\phi$$. (These are two of the same coordinates that are used in 3-dimensional spherical coordinates&mdash;$$r$$, $$\theta$$ and $$\phi$$&mdash;but $$r$$ is held constant, so it is no longer a coordinate.)
 * $$x = R \sin \theta \cos \phi\,$$
 * $$y = R \sin \theta \sin \phi\,$$
 * $$z = R \cos \theta\,$$


 * $$\frac{\partial x}{\partial \theta} = R \cos \theta \cos \phi$$

etc.
 * $$\stackrel{\textstyle{\rightarrow}}{U} = \left(R \cos \theta \cos \phi, R \cos \theta \sin \phi, - R \sin \theta\right)$$
 * $$\stackrel{\textstyle{\rightarrow}}{V} = \left(- R \sin \theta \sin \phi, R \sin \theta \cos \phi, 0\right)$$
 * $$\stackrel{\textstyle{\rightarrow}}{U} \times\stackrel{\textstyle{\rightarrow}}{V} = R^2 \left(\sin^2 \theta \cos \phi, \sin^2 \theta \sin \phi, \sin \theta \cos \theta\right)$$
 * $$J = ||\stackrel{\textstyle{\rightarrow}}{U} \times\stackrel{\textstyle{\rightarrow}}{V}|| = R^2 \sin \theta$$
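That value of J can be spot-checked numerically: at any sample point, the norm of the cross product of U and V should come out equal to $$R^2 \sin \theta$$. The function names below are illustrative.

```python
# Spot-check of J = ||U x V|| for the sphere parameterization above.
import math

def cross(a, b):
    """Standard cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def J(R, th, ph):
    """Norm of U x V for the sphere of radius R at (theta, phi)."""
    U = (R * math.cos(th) * math.cos(ph), R * math.cos(th) * math.sin(ph), -R * math.sin(th))
    V = (-R * math.sin(th) * math.sin(ph), R * math.sin(th) * math.cos(ph), 0.0)
    return math.sqrt(sum(c * c for c in cross(U, V)))

R = 1.5
for th, ph in [(0.3, 1.0), (1.2, 4.0), (2.5, 0.1)]:
    assert abs(J(R, th, ph) - R**2 * math.sin(th)) < 1e-12
print("J matches R^2 sin(theta) at all sample points")
```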

So an integral over the entire surface would be:
 * $$\int_0^{\pi}\int_0^{2\pi} f(\theta, \phi)\ R^2 \sin \theta\ d\phi\ d\theta$$

The area of a sphere is:
 * $$\int_0^{\pi}\int_0^{2\pi} R^2 \sin \theta\ d\phi\ d\theta = \int_0^{\pi} R^2 \sin \theta \left(\int_0^{2\pi} d\phi\right) d\theta$$


 * $$= 2 \pi R^2 \int_0^{\pi} \sin \theta\ d\theta = 4 \pi R^2$$

To find, for example, the area of a sphere only from $$\theta = 0\,$$ to $$\theta = T\,$$ (that is, the area of the Earth north of latitude $$\pi/2 - T\,$$), we have
 * $$\int_0^{T}\int_0^{2\pi} R^2 \sin \theta\ d\phi\ d\theta = 2 \pi R^2 \int_0^{T} \sin \theta\ d\theta = 2 \pi R^2\ (1 - \cos T)$$
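Both area results can be verified by carrying out the double integral numerically, using the parameterization above. This is a midpoint-rule sketch with illustrative names; it recomputes J from the cross product rather than assuming the closed form.

```python
# Numerical check of the sphere-area integrals above.
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def J(R, th, ph):
    """||U x V|| for the sphere of radius R at (theta, phi)."""
    U = (R * math.cos(th) * math.cos(ph), R * math.cos(th) * math.sin(ph), -R * math.sin(th))
    V = (-R * math.sin(th) * math.sin(ph), R * math.sin(th) * math.cos(ph), 0.0)
    return math.sqrt(sum(c * c for c in cross(U, V)))

def area(R, th_max, n=300):
    """Midpoint-rule integral of J dphi dtheta over 0<=theta<=th_max, 0<=phi<=2 pi."""
    dth = th_max / n
    dph = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * dth
        for j in range(n):
            ph = (j + 0.5) * dph
            total += J(R, th, ph) * dth * dph
    return total

R = 2.0
T = 1.0
full = area(R, math.pi)
cap = area(R, T)
print(full, 4.0 * math.pi * R**2)                    # full sphere: both near 50.27
print(cap, 2.0 * math.pi * R**2 * (1.0 - math.cos(T)))   # cap: both near 11.55
```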

=More About Coordinate Systems=

The study of various coordinate systems (sometimes called "curvilinear coordinate systems") is central to the subject of multivariable calculus. You are already familiar with some of the common coordinate systems such as polar coordinates in 2 dimensions, and spherical or cylindrical coordinates in 3 dimensions. In this section we will examine the general topic of coordinate systems, and how one can perform calculations&mdash;involving vectors, surfaces, integrals, and so on&mdash;directly in specialized coordinate systems. Physical problems can often be solved more easily in a coordinate system that matches the symmetries of the problem.

An extremely important consideration to be aware of is that geometrical or physical phenomena (points, vectors, regions, surfaces, etc.) have a fundamental existence and meaning that is independent of any coordinate system. Coordinate systems simply give us ways to attach numbers to them. Different coordinate systems will attach different numbers to a given vector field, for example, but the vector field has an underlying geometric meaning that does not change. The various vector field operations have an underlying geometrical or physical meaning, and coordinate systems simply let us perform mathematical calculations on them.

You have already seen the notion of "geometrical universality" in the definitions of the dot product and the cross product. They were given purely geometrical definitions that can be visualized without regard to any coordinate system. Then, for any Cartesian coordinate system, these operations were defined in terms of the components of the vectors in that system, and it was shown that those results matched the purely geometrical definitions. What we are going to do next is develop the tools to define vectors, and the dot product and cross product, in any coordinate system at all. Later, we will introduce new operations&mdash;the divergence, the curl, and various integrals&mdash;and show how to calculate them in any coordinate system.

For example, Stokes' theorem makes a very powerful geometric statement about integrals of various vector fields over very general surfaces and lines. The concepts appearing in the theorem are purely geometric, and therefore independent of the coordinate system. Once we know how to manipulate those concepts (vector fields, integrals, and the "curl" operator) in any coordinate system, we will be able to choose a coordinate system that makes the surface simple. Specifically, we will change from the usual x/y/z Cartesian system to a u/v/w system for which the surface is defined by holding w=0. (Another way of looking at this is that the surface has been "parameterized" in terms of parameters u and v.) Once this is done, we will be able to calculate the curl operator, perform the integration, and prove Stokes' theorem in the u/v/w system.

As another example, Maxwell's equations make statements about the curl and divergence of the electric and magnetic field. While the formulas for curl and divergence are simpler in Cartesian coordinates than in spherical, some problems, such as the electric field in the vicinity of a point charge, have symmetry that makes spherical coordinates more natural. When we work out the divergence theorem in spherical coordinates, we will be able to solve problems of this sort.

We will use "u" and "v" in many of our general examples of alternative coordinate systems in 2 dimensions, and u, v, and w in 3 dimensions. For the very common cases of polar coordinates in 2 dimensions and spherical in 3 dimensions, we will often use the more familiar r/$$\theta$$ or r/$$\theta$$/$$\phi$$.

More material: Integrate the moments of inertia (explain what this is about, and torque, and angular acceleration, and rotational energy) to show why a sphere rolls down an inclined plane faster than a cylinder.

=Footnotes and References=