Two Axis Cart Pole

Environment

Modeling

The state of the system at any instant can be expressed as: \(\textbf{x} = \left[ x, y, \theta_x, \theta_y, \dot{x}, \dot{y}, \dot{\theta}_x, \dot{\theta}_y \right]\)

Variable	Description	Min	Max
\(x\)	cart's position along the x-axis (written \(x_c\) in the derivations)	-2	2
\(y\)	cart's position along the y-axis (written \(y_c\) in the derivations)	-2	2
\(\theta_x\)	pole's angle along the x-axis about the cart, in radians	\(0\)	\(2\pi\)
\(\theta_y\)	pole's angle along the y-axis about the cart, in radians	\(0\)	\(2\pi\)
\(\dot x\)	linear velocity of the cart along the x-axis	\(-\infty\)	\(\infty\)
\(\dot y\)	linear velocity of the cart along the y-axis	\(-\infty\)	\(\infty\)
\(\dot \theta_x\)	angular velocity of the pole along the x-axis about the cart	\(-\infty\)	\(\infty\)
\(\dot \theta_y\)	angular velocity of the pole along the y-axis about the cart	\(-\infty\)	\(\infty\)
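As a minimal sketch of how the state table could be carried in code, the following uses a named tuple with the eight entries in the same order. The names `CartPoleState`, `wrap_angle`, `normalize`, and `in_bounds` are illustrative assumptions, not part of the original environment; only the position bound of 2 and the angle range \([0, 2\pi)\) come from the table.

```python
import math
from typing import NamedTuple

POSITION_LIMIT = 2.0  # cart position bound on both axes, taken from the table


class CartPoleState(NamedTuple):
    """Hypothetical container for the 8-element state vector."""
    x: float
    y: float
    theta_x: float
    theta_y: float
    x_dot: float
    y_dot: float
    theta_x_dot: float
    theta_y_dot: float


def wrap_angle(theta: float) -> float:
    """Map an angle in radians into the range [0, 2*pi)."""
    return theta % (2.0 * math.pi)


def normalize(state: CartPoleState) -> CartPoleState:
    """Return a copy of the state with both pole angles wrapped."""
    return state._replace(
        theta_x=wrap_angle(state.theta_x),
        theta_y=wrap_angle(state.theta_y),
    )


def in_bounds(state: CartPoleState) -> bool:
    """Check the cart position limits; velocities are unbounded."""
    return abs(state.x) <= POSITION_LIMIT and abs(state.y) <= POSITION_LIMIT
```

Velocities are left unchecked because the table lists them as unbounded.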

Kinematics

The position and velocity of the mass at the end of the pendulum can be obtained from the cart position \((x_c, y_c)\), the pole angles, and the pole length \(\ell\) as follows:

\[\textbf{x}_{p} = \begin{bmatrix} x_p \\ y_p \\ z_p \end{bmatrix} = \begin{bmatrix} x_c + \ell\sin(\theta_x) \\ y_c + \ell\sin(\theta_y) \\ \ell\cos(\theta_x)\cos(\theta_y) \end{bmatrix}\]

\[\dot{\textbf{x}}_{p} = \begin{bmatrix} \dot x_p \\ \dot y_p \\ \dot z_p \end{bmatrix} = \begin{bmatrix} \dot x_c + \ell \cos(\theta_x) \dot \theta_x \\ \dot y_c + \ell \cos(\theta_y) \dot \theta_y \\ -\ell \sin{(\theta_x)}\cos{(\theta_y)} \dot\theta_x - \ell \sin{(\theta_y)}\cos{(\theta_x)} \dot \theta_y \end{bmatrix}\]
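The two kinematic expressions above can be checked against each other numerically. The sketch below evaluates both and compares the analytic velocity with a central finite difference of the position. The function names and the default pole length `ell=1.0` are illustrative assumptions.

```python
import math


def pole_tip_kinematics(x_c, y_c, th_x, th_y, dx_c, dy_c, dth_x, dth_y, ell=1.0):
    """Return (position, velocity) of the pole-tip mass as 3-tuples."""
    pos = (
        x_c + ell * math.sin(th_x),
        y_c + ell * math.sin(th_y),
        ell * math.cos(th_x) * math.cos(th_y),
    )
    vel = (
        dx_c + ell * math.cos(th_x) * dth_x,
        dy_c + ell * math.cos(th_y) * dth_y,
        -ell * math.sin(th_x) * math.cos(th_y) * dth_x
        - ell * math.sin(th_y) * math.cos(th_x) * dth_y,
    )
    return pos, vel


def finite_difference_velocity(q, dq, ell=1.0, h=1e-6):
    """Central difference of the tip position along the direction dq."""
    zeros = (0.0, 0.0, 0.0, 0.0)
    fwd = [qi + h * dqi for qi, dqi in zip(q, dq)]
    bwd = [qi - h * dqi for qi, dqi in zip(q, dq)]
    p_fwd, _ = pole_tip_kinematics(*fwd, *zeros, ell=ell)
    p_bwd, _ = pole_tip_kinematics(*bwd, *zeros, ell=ell)
    return tuple((a - b) / (2.0 * h) for a, b in zip(p_fwd, p_bwd))
```

If the analytic velocity is correct, the two results should agree to within the finite-difference error for any choice of coordinates and rates.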

Lagrangian

The first step in creating an LQR controller is to linearize the system's equations of motion about a fixed point. We can derive these equations from the Lagrangian, which is defined as \[L = T - V\] where \(T\) and \(V\) represent the kinetic and potential energy of the system, respectively.

Kinetic Energy

Applying the equation for translational kinetic energy (\(KE = \frac{1}{2}mv^2\)) to the pendulum mass and the cart yields: \[T_p = \frac{1}{2} m_p(\dot x_p^2 + \dot y_p^2 + \dot z_p^2)\] \[T_c = \frac{1}{2} m_c(\dot x_c^2 + \dot y_c^2)\] We must also account for the rotational kinetic energy of the pole (\(KE = \frac{1}{2} I \omega^2\)): \[T_{r} = \frac{1}{2} I (\dot \theta_x^2 + \dot\theta_y^2)\] where \(I = m_p \ell^2\) represents the moment of inertia of the pendulum mass. Thus the total kinetic energy is: \[T = T_p + T_c + T_{r}\]
Potential Energy

The only potential energy considered is that of gravity acting on the pendulum mass (\(V = mgh\)), with the height \(h = z_p\) measured from the cart. The potential energy of the cart is not considered.

\[V = m_p g \ell \cos(\theta_x) \cos(\theta_y) \]

Full Equations
\[L =\frac{1}{2} \left(m_p (\dot x_p^2 + \dot y_p^2 + \dot z_p^2) + I (\dot \theta_x^2 + \dot \theta_y^2) + m_c (\dot x_c^2 + \dot y_c^2) \right) - m_p g \ell \cos(\theta_x) \cos(\theta_y) \]
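To make the energy bookkeeping concrete, here is a small sketch that evaluates \(T\), \(V\), and \(L\) for a given configuration using the expressions above. The function name `lagrangian` and the default values for `m_p`, `m_c`, `ell`, and `g` are assumptions chosen only for illustration.

```python
import math


def lagrangian(q, dq, m_p=0.1, m_c=1.0, ell=0.5, g=9.81):
    """Return (T, V, L) for q = [x_c, y_c, th_x, th_y] and its rates dq."""
    _, _, th_x, th_y = q
    dx_c, dy_c, dth_x, dth_y = dq
    inertia = m_p * ell**2  # I = m_p * l^2, as defined in the text

    # Velocity of the pendulum mass (from the kinematics section)
    dx_p = dx_c + ell * math.cos(th_x) * dth_x
    dy_p = dy_c + ell * math.cos(th_y) * dth_y
    dz_p = (-ell * math.sin(th_x) * math.cos(th_y) * dth_x
            - ell * math.sin(th_y) * math.cos(th_x) * dth_y)

    t_pole = 0.5 * m_p * (dx_p**2 + dy_p**2 + dz_p**2)
    t_cart = 0.5 * m_c * (dx_c**2 + dy_c**2)
    t_rot = 0.5 * inertia * (dth_x**2 + dth_y**2)

    kinetic = t_pole + t_cart + t_rot
    potential = m_p * g * ell * math.cos(th_x) * math.cos(th_y)
    return kinetic, potential, kinetic - potential
```

At rest in the upright position the kinetic energy is zero and \(V = m_p g \ell\), which is the maximum of the potential energy.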

Equations of Motion

Using the method of Lagrange, we can write the equations of motion in generalized coordinates. The generalized force (\(Q_i\)) for each generalized coordinate (\(q_i\)) is obtained from the Euler-Lagrange equation: \[\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) - \frac{\partial L}{\partial q_i} = Q_i\] In this system, the generalized coordinates chosen are \(q =\left[x_c , y_c, \theta_x, \theta_y \right]\). Since \(L\) contains variables that are not in \(q\), substituting their equivalents so that only values from \(q\) remain gives:

\[L = \frac{1}{2}\left( m_p\left[ \left(\dot x_c + \ell\cos(\theta_x)\dot \theta_x\right)^2 + \left(\dot y_c +\ell\cos(\theta_y)\dot\theta_y \right)^2 + \left(-\ell \sin{(\theta_x)}\cos{(\theta_y)} \dot\theta_x - \ell \sin{(\theta_y)}\cos{(\theta_x)} \dot \theta_y\right)^2\right] + I(\dot\theta_x^2 + \dot\theta_y^2) + m_c(\dot x_c^2 +\dot y_c^2)\right) -m_p g\ell\cos(\theta_x)\cos(\theta_y)\]

The following is the grueling process of deriving each equation of motion and substituting until only generalized coordinates and constants remain.

\(x_c\)
- \(\frac{\partial L}{\partial \dot x_c}\) \[\boxed{\dot x_c(m_c + m_p) + m_p \ell \cos(\theta_x)\dot \theta_x}\]
- \(\frac{d}{dt} \left( \frac{\partial L}{\partial \dot x_c}\right)\) \[\boxed{\ddot x_c( m_c + m_p)+ m_p \ell \left(\cos(\theta_x)\ddot \theta_x - \sin(\theta_x)\dot \theta_x^2 \right)}\]
- \(\frac{\partial L}{\partial x_c}\) \[\boxed{0}\]
- \(F_x\) \[\boxed{F_x = \ddot x_c(m_c+m_p) + m_p\ell\left(\cos(\theta_x)\ddot \theta_x-\sin(\theta_x)\dot \theta_x^2 \right)}\]
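The boxed \(F_x\) expression can be read as an inverse-dynamics formula: given the accelerations and the pole state, it returns the force required on the cart. A minimal sketch, where the function name `cart_force_x` and the default masses and length are illustrative assumptions:

```python
import math


def cart_force_x(ddx_c, th_x, dth_x, ddth_x, m_c=1.0, m_p=0.1, ell=0.5):
    """Force along x needed to produce the given accelerations."""
    return ddx_c * (m_c + m_p) + m_p * ell * (
        math.cos(th_x) * ddth_x - math.sin(th_x) * dth_x**2
    )
```

When the pole is not rotating or accelerating, the expression reduces to \(F_x = (m_c + m_p)\ddot x_c\), the force needed to accelerate the combined mass.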
\(\theta_x\)
- \(\frac{\partial L}{\partial \dot \theta_x}\) \[m_p\ell\cos(\theta_x)\dot x_c+\dot{\theta_x}\left[m_p\ell^2\cos^2(\theta_x)+m_p\ell^2\sin^2(\theta_x)\cos^2(\theta_y)+ I\right] + m_p\ell^2\sin(\theta_x)\sin(\theta_y)\cos(\theta_x)\cos(\theta_y)\dot \theta_y\]
Since the LQR controller will only stabilize the system near its unstable equilibrium, we can simplify the equations by applying small-angle approximations such as:
- \(\cos^2(\theta) \approx 1\)
- \(\sin^2(\theta) \approx 0\)
- \(\sin(\theta_x)\sin(\theta_y)\cos(\theta_x)\cos(\theta_y) \approx 0\)
- \(\cos(\theta)\sin(\theta) \approx 0\)
producing: \[\boxed{m_p\ell\cos(\theta_x)\dot x_c + \dot \theta_x\left[m_p\ell^2 +I\right]}\]
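- \(\frac{d}{dt}\left(\frac{\partial L}{\partial \dot \theta_x}\right)\) \[\boxed{m_p\ell\cos(\theta_x)\ddot x_c-m_p\ell\sin(\theta_x)\dot\theta_x\dot x_c+\ddot\theta_x[m_p\ell^2+I]}\]
- \(\frac{\partial L}{\partial \theta_x}\) \[{m_p\left[(\dot x_c + \ell\cos(\theta_x)\dot\theta_x)\cdot(-\ell\sin(\theta_x)\dot\theta_x) + (-\ell\sin(\theta_x)\cos(\theta_y)\dot\theta_x-\ell\sin(\theta_y)\cos(\theta_x)\dot\theta_y)\cdot(\ell\sin(\theta_y)\sin(\theta_x)\dot\theta_y-\ell\cos(\theta_x)\cos(\theta_y)\dot\theta_x)\right]+m_pg\ell\sin(\theta_x)\cos(\theta_y)}\]

Every velocity product except the first contains a factor \(\cos(\theta)\sin(\theta)\) of the same angle and vanishes under the approximations above. The gravity term comes from \(-\frac{\partial V}{\partial \theta_x}\) and is kept:

\[\boxed{-m_p \ell \dot x_c \sin(\theta_x)\dot\theta_x + m_p g \ell \sin(\theta_x)\cos(\theta_y)}\]

\(F_{\theta_x}\)

Subtracting \(\frac{\partial L}{\partial \theta_x}\) from \(\frac{d}{dt}\left(\frac{\partial L}{\partial \dot \theta_x}\right)\), the \(\dot x_c \dot\theta_x\) terms cancel and the gravity term enters with a negative sign: \[\boxed{m_p\ell\cos(\theta_x)\ddot x_c+\ddot\theta_x[m_p\ell^2+I]-m_pg\ell\sin(\theta_x)\cos(\theta_y)}\]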
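To get a feel for how much the small-angle simplification of \(\frac{\partial L}{\partial \dot \theta_x}\) discards, the sketch below compares the exact coefficients of \(\dot\theta_x\) and \(\dot\theta_y\) in that expression with their approximated values (\(m_p\ell^2 + I\) and \(0\)). The function names and the default `m_p` and `ell` are illustrative assumptions.

```python
import math


def theta_x_momentum_coeffs(th_x, th_y, m_p=0.1, ell=0.5):
    """Exact coefficients of dth_x and dth_y in dL/d(dth_x)."""
    inertia = m_p * ell**2  # I = m_p * l^2
    coeff_dth_x = (
        m_p * ell**2 * math.cos(th_x) ** 2
        + m_p * ell**2 * math.sin(th_x) ** 2 * math.cos(th_y) ** 2
        + inertia
    )
    coeff_dth_y = (
        m_p * ell**2
        * math.sin(th_x) * math.sin(th_y)
        * math.cos(th_x) * math.cos(th_y)
    )
    return coeff_dth_x, coeff_dth_y


def small_angle_relative_errors(th_x, th_y, m_p=0.1, ell=0.5):
    """Errors of the approximation, relative to m_p*l^2 + I."""
    exact_x, exact_y = theta_x_momentum_coeffs(th_x, th_y, m_p, ell)
    approx_x = m_p * ell**2 + m_p * ell**2  # m_p*l^2 + I with I = m_p*l^2
    return abs(exact_x - approx_x) / approx_x, abs(exact_y) / approx_x
```

For tilts of about 0.1 rad on both axes, both relative errors stay below one percent with these parameters; the dropped cross-coupling term is the larger of the two.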

The remaining equations, for \(y_c\) and \(\theta_y\), can be derived by symmetry.
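As a worked instance of the symmetry argument, exchanging \(x \leftrightarrow y\) in the boxed \(F_x\) result gives the cart equation for the y-axis. This is a sketch of that substitution rather than an independent derivation:

\[\boxed{F_y = \ddot y_c(m_c+m_p) + m_p\ell\left(\cos(\theta_y)\ddot \theta_y-\sin(\theta_y)\dot \theta_y^2 \right)}\]

It follows from \(\frac{\partial L}{\partial \dot y_c} = \dot y_c(m_c + m_p) + m_p \ell \cos(\theta_y)\dot \theta_y\) and \(\frac{\partial L}{\partial y_c} = 0\). The \(\theta_y\) equation is obtained from the \(\theta_x\) result with the same index swap.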

Linearization

LQR requires that the dynamics of the system be modeled as the linear equation \(\dot{\textbf{x}} = A\textbf{x} + Bu\) around a fixed point. Here the fixed point is the pendulum-up position, an unstable equilibrium, with \(\textbf{x}_{f} =[x_c, y_c, 0, 0]\) and \(\dot{\textbf{x}}_f = [0, 0, 0, 0]\), and the only control inputs are \(u = [F_x, F_y]\). Near this point we apply the small-angle approximations:

- \(\cos(\theta) \approx 1\)
- \(\sin(\theta) \approx \theta\)
- \(\theta^2 \approx 0\) and \(\dot\theta^2 \approx 0\) (second-order terms are dropped)

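For the x-axis, applying these approximations to \(F_x\) and \(F_{\theta_x}\) gives a pair of coupled equations that we can solve for \(\ddot x_c\) and \(\ddot \theta_x\). No external torque acts on the pole, so the second equation equals zero. Its gravity term is negative because \(\frac{\partial L}{\partial \theta_x}\) contributes \(+m_p g \ell \sin(\theta_x)\cos(\theta_y)\), which is subtracted in the Euler-Lagrange equation.

\[\begin{aligned} \ddot x_c (m_c + m_p) + m_p \ell \ddot \theta_x &= F_{x}\\ m_p \ell \ddot x_c + \ddot \theta_x (I + m_p\ell^2) - m_p g \ell \theta_x &= 0 \end{aligned}\]

In matrix form, with \(M\) denoting the matrix on the left:

\[\begin{bmatrix} m_c + m_p & m_p\ell \\ m_p\ell & I + m_p\ell^2 \end{bmatrix} \begin{bmatrix}\ddot x_c \\ \ddot \theta_x \end{bmatrix} = \begin{bmatrix} F_x \\ m_p g\ell\theta_x \end{bmatrix}\]

The inverse of the \(2 \times 2\) matrix \(M\) is its adjugate divided by its determinant: \[M^{-1} = \frac{1}{D}\begin{bmatrix} I + m_p\ell^2 & -m_p\ell \\ -m_p\ell & m_c + m_p \end{bmatrix}\] where \(D = \det(M) = (m_c + m_p)(I + m_p\ell^2) - m_p^2\ell^2\).

\[\begin{bmatrix}\ddot x_c \\ \ddot \theta_x \end{bmatrix} = M^{-1}\begin{bmatrix} F_x \\ m_p g\ell\theta_x \end{bmatrix} = \frac{1}{D} \begin{bmatrix} F_x(I + m_p\ell^2) - \theta_x m_p^2 g \ell^2\\ -F_x m_p \ell + \theta_x (m_p g \ell )(m_c + m_p) \end{bmatrix}\]

Revealing the coefficients to be:

\[ \ddot x_c = F_x \frac{I + m_p\ell^2}{D} - \theta_x \frac{m_p^2 g \ell^2}{D}\]

\[\ddot \theta_x = -F_x \frac{m_p \ell}{D} + \theta_x \frac{(m_p g \ell)(m_c + m_p)}{D}\]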
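The coefficients of the linearized x-axis dynamics can be assembled into the \(A\) and \(B\) matrices that LQR expects. The sketch below does this for the single-axis state \([x_c, \theta_x, \dot x_c, \dot\theta_x]\). It rests on two assumptions that should be checked against the derivation: \(M\) is inverted through its adjugate, and the gravity torque is destabilizing about the upright equilibrium (consistent with \(V = m_p g \ell \cos(\theta_x)\cos(\theta_y)\)). The state ordering, the function name `linearized_axis`, and the default parameter values are also illustrative.

```python
def linearized_axis(m_c=1.0, m_p=0.1, ell=0.5, g=9.81):
    """Return (A, B) for the state [x_c, th_x, dx_c, dth_x] and input F_x."""
    inertia = m_p * ell**2  # I = m_p * l^2, as defined in the text

    # Entries of the symmetric mass matrix M = [[a, b], [b, d]]
    a = m_c + m_p
    b = m_p * ell
    d = inertia + m_p * ell**2
    det = a * d - b * b  # D = det(M)

    # [ddx_c, ddth_x] = M^{-1} [F_x, m_p*g*l*th_x], with
    # M^{-1} = (1/D) * [[d, -b], [-b, a]]  (adjugate over determinant)
    gravity = m_p * g * ell
    ddx_from_theta = -b * gravity / det
    ddx_from_force = d / det
    ddth_from_theta = a * gravity / det  # assumed positive: upright is unstable
    ddth_from_force = -b / det

    A = [
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
        [0.0, ddx_from_theta, 0.0, 0.0],
        [0.0, ddth_from_theta, 0.0, 0.0],
    ]
    B = [[0.0], [0.0], [ddx_from_force], [ddth_from_force]]
    return A, B
```

By the symmetry of the model, the y-axis block would have the same form with \(F_y\) and \(\theta_y\), so the full eight-state system could be built from two such blocks.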
