Dynamic Programming and Optimal Control: Session 7

Subsequent to the class I have expanded my solution to question 3.2.

The linear-quadratic regulator (LGR) that I discussed today is one part of the solution to the linear-quadratic-Gaussian control problem (LQG). The LQG problem is perhaps the most important problem in control theory. It has a full solution: given in terms of the Kalman Filter (a linear-quadratic estimator) and the linear-quadratic regulator.

Following today's lecture you can do the assignment of Questions 4-7 on Problem Sheet 3. Here is an important hint. Do not try to solve these problems by plugging in values to the general Riccati equation solution in equations. It is always better to figure out the solution from scratch, by the method of backwards induction. Conjecture for yourself that the optimal value function, $F(x,t)$, is of a form that is quadratic in the state variable, plus some $\gamma_t$ (if it is a problem with white noise). Then find the explicit form of the recurrences, working backwards inductively from a terminal time $h$.

In general, solutions to a Riccati equation can be computed numerically, but not algebraically. However, one can find an full solution in the special case that $x_t$ and $u_t$ are both one-dimensional (Question 4). There is also a fully-solvable special case in which $x_t$ is two-dimensional and $u_t$ is one-dimensional (Question 6). A useful trick in some problems is to observe that $\Pi_t^{-1}$ satisfies a recurrence relation (Question 6).

No one enjoys memorizing the Riccati equation. The important thing is to remember enough of what it is for, how it looks, and how it is derived, so that you could reconstruct its derivation in the context of a particular problem.

The idea of controllability is straightforward and we can understand it using ideas of linear algebra.

Consider again the broom balancing problem. I mentioned that two upright brooms are not controllable around their equilibrium simultaneously if they are the same lengths, but it is possible if the lengths are different. This is because of the forms:

$A=\begin{pmatrix}0 & 1 & 0 & 0\\ \alpha & 0 & 0& 0\\ 0 & 0 & 0& 1\\ 0 & 0& \beta & 0\end{pmatrix}\quad B=\begin{pmatrix}0 \\-\alpha\\0\\-\beta\end{pmatrix}\quad M_4=\begin{pmatrix}0 & -\alpha & 0 & -\alpha^2\\ -\alpha & 0 & -\alpha^2& 0\\ 0 & -\beta & 0& -\beta^2\\ -\beta & 0& -\beta^2 & 0\end{pmatrix}$

where $\alpha = g/L_1$, $\beta=g/L_2$. So $M_4$ is of rank 4 iff $\alpha$ and $\beta$ are different. However, this assumes that as you move your hand you must keep it at constant height from the ground. What do you think might happen if you can move your hand up and down also?

Dynamic Programming and Optimal Control

Monday, February 25, 2013

Session 7

Statcounter