LQR Control for an Inverted Pendulum on a Cart

This page derives the linearized dynamics of the highly non-linear cart-pole system and proposes an algorithm for calculating the optimal balancing input with an LQR controller. The cart-pole is a classic benchmark system in non-linear control.

An LQR controller is used to stabilize the cart-pole system around its unstable equilibrium, with the cart at the origin and the pole in its upright position. The cart-pole is described by the following dynamics:

$$(m_c + m_p) \ddot{x} + m_p l \ddot{\theta} \cos \theta - m_p l \dot{\theta}^2 \sin \theta = u$$

$$m_p l \ddot{x} \cos \theta + m_p l^2 \ddot{\theta} + m_p g l \sin \theta = 0$$

This may be put into standard form with $$\mathbf{q} = [x,\theta]^T$$:

$$\mathbf{H} (\mathbf{q}) \mathbf{\ddot{q}} + \mathbf{C} (\mathbf{q}, \mathbf{\dot{q}}) \mathbf{\dot{q}} + \mathbf{G} (\mathbf{q}) = \mathbf{Bu}$$

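where, matching terms with the two equations of motion, the matrices are defined as:

$$\mathbf{H}(\mathbf{q}) = \begin{bmatrix} m_c + m_p & m_p l \cos \theta \\ m_p l \cos \theta & m_p l^2 \end{bmatrix}, \quad \mathbf{C}(\mathbf{q}, \mathbf{\dot{q}}) = \begin{bmatrix} 0 & -m_p l \dot{\theta} \sin \theta \\ 0 & 0 \end{bmatrix}$$

$$\mathbf{G}(\mathbf{q}) = \begin{bmatrix} 0 \\ m_p g l \sin \theta \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$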
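Reading the terms off the two equations of motion, the accelerations follow from solving the 2×2 system $$\mathbf{H}\mathbf{\ddot{q}} = \mathbf{Bu} - \mathbf{C}\mathbf{\dot{q}} - \mathbf{G}$$. The Python sketch below does this in closed form. The function name `cartpole_accel` and the default parameter values are illustrative assumptions, not values taken from this page.

```python
import math


def cartpole_accel(theta, theta_dot, u, m_c=1.0, m_p=1.0, l=1.0, g=9.81):
    """Solve H(q) q_ddot = B u - C(q, q_dot) q_dot - G(q) for (x_ddot, theta_ddot).

    All default parameter values are assumptions chosen for illustration.
    """
    s, c = math.sin(theta), math.cos(theta)
    # Entries of the symmetric inertia matrix H(q).
    h11 = m_c + m_p
    h12 = m_p * l * c
    h22 = m_p * l ** 2
    # Right-hand side B u - C q_dot - G, written out row by row.
    r1 = u + m_p * l * theta_dot ** 2 * s
    r2 = -m_p * g * l * s
    # Closed-form inverse of a 2x2 matrix; det > 0 whenever m_c > 0.
    det = h11 * h22 - h12 ** 2
    x_ddot = (h22 * r1 - h12 * r2) / det
    theta_ddot = (h11 * r2 - h12 * r1) / det
    return x_ddot, theta_ddot
```

With zero input and zero velocity the function returns (numerically) zero accelerations at both $$\theta = 0$$ and $$\theta = \pi$$, which is a quick check that these are the two fixed points of the equations.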



The linearized dynamics are found by performing a Taylor expansion around the fixed point of interest:
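A worked version of this expansion is sketched here as a reconstruction from the two equations of motion, so the notation $$\mathbf{A}_c$$, $$\mathbf{B}_c$$ is an assumption. In those equations the pole hangs down at $$\theta = 0$$, so the upright fixed point is $$\mathbf{q}^* = [0, \pi]^T$$ with $$\mathbf{\dot{q}}^* = 0$$ and $$u^* = 0$$. Taking the state as $$\mathbf{x} = [x, \theta - \pi, \dot{x}, \dot{\theta}]^T$$ and noting that $$\mathbf{C}\mathbf{\dot{q}}$$ is second order in the velocities, the first-order terms are:

$$\dot{\mathbf{x}} \approx \mathbf{A}_c \mathbf{x} + \mathbf{B}_c u, \quad \mathbf{A}_c = \begin{bmatrix} \mathbf{0} & \mathbf{I} \\ -\mathbf{H}^{-1} \frac{\partial \mathbf{G}}{\partial \mathbf{q}} & \mathbf{0} \end{bmatrix}_{\mathbf{q} = \mathbf{q}^*}, \quad \mathbf{B}_c = \begin{bmatrix} \mathbf{0} \\ \mathbf{H}^{-1} \mathbf{B} \end{bmatrix}_{\mathbf{q} = \mathbf{q}^*}$$

Substituting $$\cos \theta \approx -1$$ and $$\sin \theta \approx -(\theta - \pi)$$ into the equations of motion and solving for the accelerations gives:

$$\mathbf{A}_c = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & \frac{m_p g}{m_c} & 0 & 0 \\ 0 & \frac{(m_c + m_p) g}{m_c l} & 0 & 0 \end{bmatrix}, \quad \mathbf{B}_c = \begin{bmatrix} 0 \\ 0 \\ \frac{1}{m_c} \\ \frac{1}{m_c l} \end{bmatrix}$$

The cost used for the controller is a sum over time steps $$k$$, so these continuous-time matrices still have to be sampled. The sampling scheme is not stated on this page; forward Euler, $$\mathbf{A} = \mathbf{I} + \Delta t \, \mathbf{A}_c$$ and $$\mathbf{B} = \Delta t \, \mathbf{B}_c$$, is one simple assumption.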



With the linearized dynamics, a stabilizing LQR controller may be calculated by minimizing the discounted sum, with discount factor $$\gamma \in (0, 1]$$, of a quadratic stage cost $$g$$:

$$g(x,u) = (x^T \textbf{Q}x + u^T \textbf{R}u)$$

$$\sum^\infty_{k=0} \gamma^k g(x_k,u_k)$$
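To make the objective concrete, the Python sketch below evaluates the stage cost and a truncated version of the discounted sum over a finite rollout. It assumes a scalar input, as in the cart-pole; the function names `stage_cost` and `discounted_cost` are hypothetical.

```python
def stage_cost(x, u, Q, R):
    """g(x, u) = x^T Q x + u^T R u for a state list x and a scalar input u."""
    n = len(x)
    quad = sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))
    return quad + R * u * u


def discounted_cost(states, inputs, Q, R, gamma):
    """Finite-rollout truncation of sum_k gamma^k g(x_k, u_k)."""
    return sum(gamma ** k * stage_cost(x, u, Q, R)
               for k, (x, u) in enumerate(zip(states, inputs)))
```

Comparing this number for rollouts under two different gains is a simple way to check that one gain is actually cheaper than the other.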

The policy that minimizes this cost is a linear state feedback. Its gain may be found with dynamic programming by iterating the following two updates until they converge:

$$\mathbf{K}_{i + 1} = -\gamma (\mathbf{R} + \gamma\mathbf{B}^T \mathbf{P}_i \mathbf{B})^{-1} \mathbf{B}^T \mathbf{P}_i \mathbf{A}$$

$$\mathbf{P}_{i + 1} = \mathbf{Q} + \mathbf{K}^T_{i + 1} \mathbf{R} \mathbf{K}_{i + 1} + \gamma (\mathbf{A} + \mathbf{B} \mathbf{K}_{i + 1})^T \mathbf{P}_i (\mathbf{A} + \mathbf{B} \mathbf{K}_{i + 1})$$

The following code may be used in MATLAB to converge on the optimal gain matrix:
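As a language-neutral illustration of the same iteration, here is a minimal Python sketch (it is not the MATLAB code referred to above). Several things are assumptions made only for this example: the physical parameters, the forward-Euler sampling with `dt = 0.01`, the starting point $$\mathbf{P}_0 = \mathbf{Q}$$, the default `gamma = 1.0`, and the helper names `cartpole_linear_model` and `lqr_gain`. The code is specialized to a single input, so the matrix inverse reduces to a scalar division.

```python
def cartpole_linear_model(m_c=1.0, m_p=1.0, l=1.0, g=9.81, dt=0.01):
    """Forward-Euler sampling of the upright linearization (all values assumed)."""
    a_c = [[0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0],
           [0.0, m_p * g / m_c, 0.0, 0.0],
           [0.0, (m_c + m_p) * g / (m_c * l), 0.0, 0.0]]
    b_c = [0.0, 0.0, 1.0 / m_c, 1.0 / (m_c * l)]
    A = [[(1.0 if i == j else 0.0) + dt * a_c[i][j] for j in range(4)]
         for i in range(4)]
    B = [dt * b for b in b_c]
    return A, B


def lqr_gain(A, B, Q, R, gamma=1.0, tol=1e-10, max_iter=20000):
    """Iterate the K and P updates for a single-input system; returns (K, P)."""
    n = len(A)
    P = [row[:] for row in Q]  # assumed starting point P_0 = Q
    K = [0.0] * n
    for _ in range(max_iter):
        # Row vector B^T P and scalar R + gamma * B^T P B.
        BtP = [sum(B[i] * P[i][j] for i in range(n)) for j in range(n)]
        denom = R + gamma * sum(BtP[j] * B[j] for j in range(n))
        # Gain minimizing the discounted one-step cost (u = K x). The leading
        # gamma comes from the gradient of that cost and drops out at gamma = 1.
        K = [-gamma * sum(BtP[i] * A[i][j] for i in range(n)) / denom
             for j in range(n)]
        # Closed-loop matrix A + B K, then P_new = Q + K^T R K + gamma Acl^T P Acl.
        Acl = [[A[i][j] + B[i] * K[j] for j in range(n)] for i in range(n)]
        PAcl = [[sum(P[i][k] * Acl[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]
        P_new = [[Q[i][j] + R * K[i] * K[j]
                  + gamma * sum(Acl[k][i] * PAcl[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
        # Stop once P changes by less than a relative tolerance.
        change = max(abs(P_new[i][j] - P[i][j]) for i in range(n) for j in range(n))
        scale = max(abs(P_new[i][j]) for i in range(n) for j in range(n))
        P = P_new
        if change <= tol * scale:
            break
    return K, P


A, B = cartpole_linear_model()
Q = [[500.0, 0.0, 0.0, 0.0], [0.0, 100.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
K, P = lqr_gain(A, B, Q, R=1.0)
print("K =", K)
```

Convergence is checked on $$\mathbf{P}$$ rather than on $$\mathbf{K}$$ because, with zero velocity weights in $$\mathbf{Q}$$, the first gain computed from $$\mathbf{P}_0 = \mathbf{Q}$$ is exactly zero and a test on $$\mathbf{K}$$ alone would stop immediately. Since the parameters, time step and discount factor are assumed, the printed gain should not be expected to match any particular published value.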



For the given system, the optimal gain matrix can be found to be:

$$\mathbf{K} = \begin{bmatrix} 21.13 & -320.74 & 30.23 & -70.18 \end{bmatrix}$$

With the Q and R weights as follows:

$$\mathbf{Q} = \begin{bmatrix} 500 & 0 & 0 & 0 \\ 0 & 100 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad \mathbf{R} = 1$$

The optimal control input is then the linear state feedback, where $$x(t)$$ is the state measured relative to the upright fixed point:

$$u(t) = \mathbf{K}x(t)$$

This policy can be empirically shown to balance the pendulum at its upright position:
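One way to run such a check numerically is to roll out the non-linear equations of motion under the feedback law. The Python sketch below uses the gain quoted above, but the physical parameters ($$m_c = m_p = 1$$, $$l = 1$$, $$g = 9.81$$), the explicit-Euler integrator, the 1 ms step, and the 0.1 rad initial tilt are all assumptions made for illustration. The system for which the gain was originally computed may differ.

```python
import math

# Gain from the text; the physical parameters below are assumed values.
K = [21.13, -320.74, 30.23, -70.18]
M_C, M_P, L, G = 1.0, 1.0, 1.0, 9.81


def accel(theta, theta_dot, u):
    """Cart and pole accelerations from the two non-linear equations of motion."""
    s, c = math.sin(theta), math.cos(theta)
    r1 = u + M_P * L * theta_dot ** 2 * s
    r2 = -M_P * G * L * s
    det = (M_C + M_P) * M_P * L ** 2 - (M_P * L * c) ** 2
    x_ddot = (M_P * L ** 2 * r1 - M_P * L * c * r2) / det
    theta_ddot = ((M_C + M_P) * r2 - M_P * L * c * r1) / det
    return x_ddot, theta_ddot


def simulate(theta0, t_end=20.0, dt=1e-3):
    """Explicit-Euler rollout of the closed loop u = K (state - upright)."""
    x, theta, x_dot, theta_dot = 0.0, theta0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        # State deviation from the upright fixed point (theta = pi).
        err = [x, theta - math.pi, x_dot, theta_dot]
        u = sum(k * e for k, e in zip(K, err))
        x_ddot, theta_ddot = accel(theta, theta_dot, u)
        x, theta = x + dt * x_dot, theta + dt * theta_dot
        x_dot, theta_dot = x_dot + dt * x_ddot, theta_dot + dt * theta_ddot
    return x, theta - math.pi, x_dot, theta_dot


final = simulate(math.pi + 0.1)
print("final deviation from upright:", final)
```

For these assumed parameters the linearized closed loop satisfies the Routh-Hurwitz conditions, so a small initial tilt should decay back to the upright position. Larger tilts leave the region where the linearization is a good approximation, and the same gain is not guaranteed to recover from them.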