In this blog post I will explain how I achieved smooth altitude control of a quadrotor model in Mujoco using LQR. I will explain the math and code used to achieve this (see video). And hopefully you will learn something too!
Disclaimer: I am a total noob at this and have no idea what I am doing. This post is meant to give you some insight into how I did this; it is not meant to be a tutorial or something to replicate. This is also not a Mujoco tutorial, so a basic understanding of Mujoco is expected.
For this project I wanted to teach myself a new control method, and after searching online for a while I discovered the linear quadratic regulator (LQR). At a high level, LQR controls a system by finding the control inputs that minimize a certain cost function. In fancier terms, LQR is an optimal control strategy.
The cost function in LQR consists of two parts: the accuracy cost and the efficiency/energy cost. These two are weighted and then summed to form the total cost. When designing the control system, you decide how to weight each of these costs. If you care more about the accuracy or speed of the system, you'll probably want to weight the accuracy cost higher. In other cases your actuators might require a lot of energy that you don't have to spare, and then you'll want to weight the energy/efficiency cost higher. As you can probably tell, a big part of designing a good LQR controller is choosing these cost weights wisely.
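To make the weighting idea a bit more concrete, here is a minimal sketch of how such a cost could be evaluated for a sampled trajectory. It assumes the usual quadratic form, where a matrix `Q` weights the state error (accuracy) and a matrix `R` weights the control effort (energy). The names `Q`, `R`, `quadratic_cost`, and all the numbers are my own illustrative assumptions, not values from the actual quadrotor model.

```python
import numpy as np

def quadratic_cost(states, controls, Q, R, dt):
    """Approximate a quadratic LQR-style cost as a sum over samples.

    states:   array of shape (N, n), state error at each time step
    controls: array of shape (N, m), control input at each time step
    Q, R:     weight matrices for accuracy and control effort (assumed names)
    dt:       time step used to approximate the integral by a sum
    """
    total = 0.0
    for x, u in zip(states, controls):
        accuracy_cost = x @ Q @ x  # grows with distance from the target state
        effort_cost = u @ R @ u    # grows with how hard the actuators work
        total += (accuracy_cost + effort_cost) * dt
    return total

# Hypothetical example: state = [altitude error, vertical velocity], one thrust input.
Q = np.diag([10.0, 1.0])  # care a lot about altitude error, less about velocity
R = np.array([[0.1]])     # thrust is relatively cheap here
states = np.array([[1.0, 0.0], [0.5, -0.5], [0.0, 0.0]])
controls = np.array([[2.0], [1.0], [0.0]])

cost = quadratic_cost(states, controls, Q, R, dt=0.1)
# Same trajectory, but with control effort weighted 100x more heavily.
expensive_cost = quadratic_cost(states, controls, Q, 100.0 * R, dt=0.1)
print(cost, expensive_cost)
```

With a larger `R`, the same trajectory gets a higher cost, so a controller minimizing it would prefer gentler inputs at the expense of reaching the target more slowly.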
I must admit, the description of LQR uses a bit of “scary” math notation; however, the general concept and the intuition behind it are not super hard.
Note: this is not an in-depth walkthrough of the math, and a lot of things could be covered much more deeply. This is mostly to give you an overview of what LQR is and how it can be solved.
In LQR you model the system using a linear (duh) state-space model, which looks like this:
$$ \dot{x} = Ax(t) + Bu(t) $$
$$ y = Cx(t) + Du(t) $$
Variables:
x: Represents the state of the system
u: Represents the control variables / inputs of the system
A: “system” matrix
B: “input” matrix