Simplex Method

Learning project summary

 * Project code: Simplex Method
 * School: Computer Science
 * Department: Scientific_Computing

Content Summary
A brief introduction of the Simplex Method and related topics in Linear Programming.

Goals
The lessons in this learning project start with describing what Linear Programming is and lead up to demonstrating how the Simplex Method works. After completing these lessons, one should have a basic understanding of how to use the Simplex Method and why it works. Assignments are given at the end to reinforce the material.

Texts
[1] Numerical Mathematics and Computing Chapter 17

Lesson 1: Linear Programming
The purpose of the Simplex Method is to solve a Linear Programming problem, maximizing a linear function subject to certain constraints. Any linear programming problem can be written in what is called First Primal Form:

Maximize: $$\sum_{j=1}^{n}c_jx_j$$ Given the constraints: $$\sum_{j=1}^{n}a_{ij}x_j \le b_i$$ for i = 1,...,m and $$x_j \ge 0$$ for j = 1, ..., n Everything is known except for the xj's. There are n dimensions and m constraints.

This problem can also be written in matrix form: Maximize: $$\textbf{c}^T\textbf{x}$$ given constraints: $$\textbf{x} \ge \textbf{0}$$ and $$\textbf{Ax} \le \textbf{b}$$

c and x are vectors of length n, A is a matrix of dimensions $$m \times n$$, and b is a vector of length m.

Note that using this notation, for two vectors v and w, $$\textbf{v} \le \textbf{w}$$ means that each component of v is less than or equal to the corresponding component of w.



Example: Consider a 2-dimensional linear programming problem:

Maximize $$\left( \begin{array}{rcl} 1&2 \end{array} \right) \left( \begin{array}{rcl} x_1\\x_2 \end{array} \right)$$ given the constraints $$\left( \begin{array}{rcl} x_1\\x_2 \end{array} \right) \ge 0$$ and $$\left( \begin{array}{rcl} 11&2 \\ 1&3 \\ 5&6 \end{array} \right) \left( \begin{array}{rcl} x_1\\x_2 \end{array} \right) \le \left( \begin{array}{rcl} 44\\12\\30 \end{array} \right)$$ The image to the right, made with the help of Matlab, shows this example in graphic form. The goal is to maximize x1+2x2 in the region bounded by the axes and the three lines.

Lesson 2: Idea Behind the Simplex Method
Suppose that we are given a Linear Programming problem. We are given matrix A and vectors b and c. Let K be the set of all vectors that obey the constraints: $$\textbf{K} = \{\textbf{x} \epsilon \mathbb{R}^n: \textbf{x} \ge 0 \ and \ \textbf{Ax} \le \textbf{b}\}$$. We are trying to find an x in K such that $$\forall \textbf{y} \epsilon \textbf{K}, \textbf{c}^T\textbf{x} \ge \textbf{c}^T\textbf{y}$$. To guarantee that there is such a solution, let us for the moment assume that K is non-empty and bounded.

In this situation, the region K is a convex polytope. Convex means that if you draw a line between any two points of K, the line will be in K. Formally, $$\forall \textbf{x,y} \epsilon \textbf{K}, \forall c \epsilon [0,1], (c\textbf{x} + (1-c)\textbf{y}) \epsilon \textbf{K}$$. A "polytope" is an extension of a polygon to higher dimensions. The boundaries of a polytope are segments of hyperplanes, like the edges of a polygon are segments of lines.

Because the polytope is convex, it can be shown that one of the vertices of K is a solution to the linear programming problem maximizing $$\textbf{c}^T\textbf{x}$$. Also because the polytope is convex, if a vertex has a higher (or equal) $$\textbf{c}^T\textbf{x}$$ value than its neighboring vertices, it can be concluded that this vertex also has a higher (or equal) $$\textbf{c}^T\textbf{x}$$ value than all other vertices in the polytope. In other words, a local maximum is a global maximum.

The Simplex Method takes advantage of these features. In order to maximize $$\textbf{c}^T\textbf{x}$$ given an initial vertex, travel along the edges of the polytope to vertices with increasing $$\textbf{c}^T\textbf{x}$$ values. Once a local maximum is found, stop, because you know that is also a global maximum.

Lesson 3: Second Primal Form
The next question to answer is how to find the neighboring vertices of a vertex in a convex polytope. The simplex method solves this by redefining the problem. The form of this redefined problem is called second primal form. The main difference between second primal form and first primal form is that second primal form uses equalities instead of inequalities for the main constraints.

Consider a constraint in first primal form: $$\sum_{j=1}^{n}a_{ij}x_j \le b_i$$. This is the same as saying for some number $$p_i \ge 0$$, we know $$(\sum_{j=1}^{n}a_{ij}x_j) + p_i = b_i$$. Since we have m inequalities and each pi is an unknown, let us add m unknowns to the x vector to be our pi values. These new x variables are called slack variables. So our new constraints are, for i = 1,...,m, $$(\sum_{j=1}^{n}a_{ij}x_j) + x_{n+i} = b_i$$.

In matrix form, second primal form looks like: maximize: $$\textbf{c}^T\textbf{x}$$ given the constraints: $$\textbf{x} \ge 0$$ and $$\textbf{Ax} = \textbf{b}$$. A is an $$m \times (n+m)$$ matrix, which has the same values as the matrix from first primal form except with an $$m \times m$$ identity matrix added to the right side. x is an unknown vector of length n+m. b is a vector of length m which is the same as in first primal form. c is a vector of length n+m, which is the same as it was in first primal form except with m zeros added on the end (we want to add zeros, because the slack variables should not improve the solution).

Note that in this form, the number of dimensions (n+m) exceeds the number of constraints (m). This will make it easier to find the vertices. A solution x in second primal form is also a solution for first primal form if you ignore the last m values of x.

Recall our example linear programming problem in first primal form. Let us now write it in second primal form: Maximize: $$\left( \begin{array}{rcl}1&2\end{array} \begin{array}{rcl}0&0&0 \end{array} \right) \left( \begin{array}{rcl} x_1\\x_2\\x_3\\x_4\\x_5 \end{array} \right)$$ given the constraints: $$\left( \begin{array}{rcl} x_1\\x_2\\x_3\\x_4\\x_5 \end{array} \right) \ge 0$$ and $$\left( \begin{array}{rcl} 11&2 \\ 1&3 \\ 5&6 \end{array}            \begin{array}{rcl} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{array} \right) \left( \begin{array}{rcl} x_1\\x_2\\x_3\\x_4\\x_5 \end{array} \right) = \left( \begin{array}{rcl} 44\\12\\30 \end{array} \right)$$ Since there are so many dimensions to handle in even simple problems like this, it is difficult to picture the second primal form graphically.

Let K be the set of all vectors that obey the constraints. Notice that K is an intersection of hyperplanes, bounded so that no component of a vector in K is negative. Typically, an intersection of hyperplanes is a lower-dimensional hyperplane, but in the same space (for example, a line in 3-D space is the intersection of two planes in 3-D space). The region K is smooth until it hits the point where a component of the vector is zero, when it halts abruptly. Thus the vertices are at locations with one or more zeros in the vector. If you consider a plane in 3-D space, say x+y+z=1, bounded so that x>=0 and y>=0 and z>=0, it results in a triangle with one vertex on the x-axis, one vertex on the y-axis, and one vertex on the z-axis, so each vertex has two zero components. If you consider an appropriate line in 3-D space, bounded so that x>=0 and y>=0 and z>=0, it results in a line segment, where the endpoints typically have only one zero component.

There is a theorem which says for any x in K, the following two statements are equivalent: 1) x is a vertex of K 2) The set of columns {a(i): the ith component of x is non-zero} from A is linearly independent.

One interesting thing to note about this theorem is that it restricts the number of non-zero components of a vertex to be no larger than m. If it were larger than m, then the set described in #2 would have more columns than the length of a column, so it would be linearly dependent.

Lesson 4: The Algorithm of the Simplex Method
To start with, we are given a linear programming problem in second primal form: maximize: $$\textbf{c}^T\textbf{x}$$ given the constraints: $$\textbf{x} \ge 0$$ and $$\textbf{Ax} = \textbf{b}$$.

Since we are traveling from vertex to vertex along edges, we need a start vertex. Let k1, k2, ..., km be the indices of the non-zero components of our vertex. For a starting vertex, let k1 = n+1, k2 = n+2, ..., and km = n+m. These are the slack variables.

On each iteration of the simplex method, we do the following:

Put the k1th, k2th, ..., and kmth columns of A into its own matrix called B. Then solve $$\textbf{Bx} = \textbf{b}$$. Since B is an $$m \times m$$ matrix, this will have a unique solution as long as the columns of B are linearly independent. Using k1 = n+1, k2 = n+2, ..., km = n+m for the first iteration makes B be the identity matrix, which is definitely linearly independent.

In a sense, x and k together define a vertex of K. To demonstrate, let x' be an n+m component vector with values $$\ \ x'_i = \left\{\begin{array}{lr} x_j, & for \ some \ j, \ k_j = i \\ 0, & k_1,\ k_2,\ ...\, k_m \ne i \end{array}\right. \ \ $$. Then x' is a solution to Ax'=b, and by the theorem we know that x' is a vertex of K. Note that all components of x should be positive. If a component of x is negative, the algorithm has failed.

So we have a vertex. Next, we need to travel along an edge to get to another vertex which brings us closer to the solution. To travel along an edge, we want to swap one of the zero components of the vertex with a non-zero component. In other words, we want to change the value of ki for some i. We need to decide which ki to change, and what value to change it to. The value of cTx' should increase as a result of this swap.

Do the following to decide which component that is currently zero should be made non-zero: Let $$\textbf{e} = \left( \begin{array}{rcl} c_{k_1} & c_{k_2} \ ... & c_{k_m} \end{array} \right)^T$$, and solve $$\textbf{B}^T\textbf{y} = \textbf{e}$$. Let s (the component index to be made non-zero) be a number such that $$c_s - \textbf{y}^T\textbf{a}^{(s)}$$ is maximized (a(i) denotes the ith column of A). Candidates for s are any number between 1 and n+m, but not one of the current ki's. If $$c_s - \textbf{y}^T\textbf{a}^{(s)}$$ is very small or negative (essentially zero or less, but allowing for some computational error), it means that moving along an edge will no longer help; in this case, exit the algorithm, since the current vertex described by x and k is the solution.

Next, do the following to decide which component that is currently non-zero should be made zero: Solve $$\textbf{Bz} = \textbf{a}^{(s)}$$. If all components of z are very small or negative (essentially zero or less, but allowing for some computation error), this means K is unbounded, so exit the algorithm and there is no solution. Otherwise, look at the ratios of xi/zi for i = 1,...,m and zi > 0. Choose r (the non-zero component index to be made zero) so that xr/zr is the smallest such ratio.

Now, change kr to equal s, and move on to the next iteration. Keep on doing this until the algorithm finishes.

At the end, you will have two vectors x and k of length m. As shown above, this describes a vertex of K which is the solution to the linear programming problem.

Many modifications have been made to this basic algorithm to make it run more efficiently. You should not normally try to code your own simplex method. Rather, use a program that is already made.

Question 1: First Primal Form
Consider the following scenario:
 * A factory produces clocks, music boxes, and watches.
 * Each clock costs 5 units of wood and 3 units of metal to make.
 * Each music box costs 3 units of wood and 2 units of metal to make.
 * Each watch costs 3 units of metal to make.
 * The factory has 100 units of wood and 80 units of metal to work with.
 * The profit for each clock is $40, for each music box is $20, and for each watch is $35.

The goal is to maximize profit. Set up a linear programming problem in first primal form that represents this problem.

Answer to Question 1
Let x1 be the number of clocks produced, x2 be the number of music boxes produced, and x3 be the number of watches produced. We are trying to solve for these three variables. We can set up inequalities based on the available units of wood and metal: $$5x_1 + 3x_2 + 0x_3 \le 100$$ $$3x_1 + 2x_2 + 3x_3 \le 80$$ We are also trying to maximize profit, which is $$40x_1 + 20x_2 + 35x_3$$ In matrix form: Maximize: $$\left( \begin{array}{rcl} 40&20&35 \end{array} \right) \left( \begin{array}{rcl} x_1\\x_2\\x_3 \end{array} \right)$$ Given the constraints: $$\left( \begin{array}{rcl} x_1\\x_2\\x_3 \end{array} \right) \ge 0$$ and $$\left( \begin{array}{rcl} 5&3&0\\3&2&3 \end{array} \right) \left( \begin{array}{rcl} x_1\\x_2\\x_3 \end{array} \right) \le \left( \begin{array}{rcl} 100\\80 \end{array} \right)$$

Question 2: Second Primal Form
Convert the answer from Question 1 into second primal form.

Answer to Question 2
Maximize: $$\left( \begin{array}{rcl} 40&20&35 \end{array} \begin{array}{rcl} 0&0 \end{array} \right) \left( \begin{array}{rcl} x_1\\x_2\\x_3\\x_4\\x_5 \end{array} \right)$$ Given the constraints: $$\left( \begin{array}{rcl} x_1\\x_2\\x_3\\x_4\\x_5 \end{array} \right) \ge 0$$ and $$\left( \begin{array}{rcl} 5&3&0\\3&2&3 \end{array} \begin{array}{rcl} 1&0\\0&1 \end{array} \right) \left( \begin{array}{rcl} x_1\\x_2\\x_3\\x_4\\x_5 \end{array} \right) = \left( \begin{array}{rcl} 100\\80 \end{array} \right)$$

Question 3: Matlab Example
Consider the example problem from the lessons: Maximize $$\left( \begin{array}{rcl} 1&2 \end{array} \right) \left( \begin{array}{rcl} x_1\\x_2 \end{array} \right)$$ given the constraints $$\left( \begin{array}{rcl} x_1\\x_2 \end{array} \right) \ge 0$$ and $$\left( \begin{array}{rcl} 11&2 \\ 1&3 \\ 5&6 \end{array} \right) \left( \begin{array}{rcl} x_1\\x_2 \end{array} \right) \le \left( \begin{array}{rcl} 44\\12\\30 \end{array} \right)$$ Solve this problem using the Simplex method in Matlab. Don't write you're own code for the Simplex method. Rather, use the "linprog" function in Matlab (type "help linprog" in Matlab for information).

Answer to Question 3
Use the following code in Matlab: options = optimset('LargeScale', 'off', 'Simplex', 'on'); lb = zeros(2,1); A = [11,2; 1,3; 5,6]; b = [44; 12; 30]; c = [1;2]; x = linprog(-c, A, b, [], [], lb, [], [], options); disp(x); The output is: Optimization terminated. 2.0000    3.3333 The solution vector is (2, 10/3). Note that you have to specify at the beginning that you prefer to use the Simplex method. Setting lb (lower bound) to zero requires that $$\textbf{x} \ge 0$$. Also, note that -c is input instead of c, because linprog finds a minimum value instead of a maximum value.

Question 4: Example Walkthrough
Consider the same example from question 3. Perform one iteration of the Simplex Method using the algorithm described in lesson 4.

Recall that the second primal form of this example is as follows: Maximize: $$\left( \begin{array}{rcl}1&2\end{array} \begin{array}{rcl}0&0&0 \end{array} \right) \left( \begin{array}{rcl} x_1\\x_2\\x_3\\x_4\\x_5 \end{array} \right)$$ given the constraints: $$\left( \begin{array}{rcl} x_1\\x_2\\x_3\\x_4\\x_5 \end{array} \right) \ge 0$$ and $$\left( \begin{array}{rcl} 11&2 \\ 1&3 \\ 5&6 \end{array}            \begin{array}{rcl} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{array} \right) \left( \begin{array}{rcl} x_1\\x_2\\x_3\\x_4\\x_5 \end{array} \right) = \left( \begin{array}{rcl} 44\\12\\30 \end{array} \right)$$

Answer to Question 4
Start out with k1 = 3, k2 = 4, and k3 = 5. In the first iteration of the algorithm, the matrix B is the identity matrix. This makes solving $$\textbf{Bx} = \textbf{b}$$ very simple: x = b = (44,12,30)T. Thus, the start vertex in the algorithm is (0,0,44,12,30)T. The 2-D vertex is just (0,0)T.



Next we need to solve $$\textbf{B}^T\textbf{y} = \textbf{e}$$, where e = (0,0,0)T. Since BT is still the identity matrix, y = (0,0,0)T is the solution. We want to choose s from the numbers 1 and 2 to maximize $$c_s - \textbf{y}^T\textbf{a}^{(s)}$$. Since y is zero, we just need to maximize cs. c1=1 and c2=2, so cs is maximized when s=2.

Next solve $$\textbf{Bz} = \textbf{a}^{(2)}$$, which again is a trivial system of equations. a(2) is the second column of A, which is (2,3,6)T. So z = (2,3,6)T. x1/z1 = 44/2 = 22 x2/z2 = 12/3 = 4 x3/z3 = 30/6 = 5 Thus, choosing r = 2 minimizes xr/zr.

Next we let kr=s, which means k2=2. Go to the next iteration with k1=3, k2=2, and k3=5.

The next vertex will be (0,4,36,0,6)T, which makes the 2-D vertex (0,4)T. The image at the right, made with the help of Matlab, shows the step that the first iteration takes. The eventual solution is marked with a circle.