This repository provides MATLAB and Python functions for solving general nonlinear optimal control problems using a gradient descent approach.
You define your optimal control problem symbolically, using the MATLAB Symbolic Math Toolbox or Python's SymPy, and the functions solve it. Several demos are provided alongside the main functions.
In optimal control theory, a standard optimal control problem is defined as

$$\min_{u} \; J = \phi\big(x(t_f)\big) + \int_{0}^{t_f} g\big(x(t), u(t)\big)\, dt \qquad \text{subject to} \quad \dot{x} = f(x, u), \quad x(0) = x_0$$
The solution to this problem comes from the calculus of variations. A Hamiltonian function is defined as

$$\mathcal{H}(x, u, p) = g(x, u) + p^T f(x, u)$$
The optimal state, co-state, and (in the absence of input limits) control trajectories then satisfy:

$$\dot{x} = \frac{\partial \mathcal{H}}{\partial p} = f(x, u), \qquad \dot{p} = -\frac{\partial \mathcal{H}}{\partial x}, \qquad \frac{\partial \mathcal{H}}{\partial u} = 0$$
Here $p$ denotes the co-states. This set of equations is generally hard to solve because it forms a nonlinear two-point boundary-value problem: the initial values of $x$ are known, and for $p$ only the final values, $p(t_f) = \frac{\partial \phi}{\partial x}\big|_{t_f}$, are known.
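As a concrete illustration of these relationships (a standalone SymPy sketch, not the repo's code), consider the scalar problem with dynamics $f = -x + u$ and running cost $g = x^2 + u^2$:

```python
import sympy as sp

x, u, p = sp.symbols('x u p')
f = -x + u                    # dynamics: x' = f(x, u)
g = x**2 + u**2               # running cost g(x, u)
H = g + p * f                 # Hamiltonian: H = g + p^T f

p_dot = -sp.diff(H, x)        # co-state equation: p' = -dH/dx
dH_du = sp.diff(H, u)         # stationarity condition: dH/du = 0
u_star = sp.solve(sp.Eq(dH_du, 0), u)[0]

print(u_star)                 # the unconstrained optimal control: -p/2
```

Substituting `u_star` back into the state and co-state equations leaves the two-point boundary-value problem in $x$ and $p$ described above.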
One way to solve this set of equations is the gradient descent algorithm: an initial guess of the control input $u$ is selected, the state and co-state equations are integrated forward and backward respectively, using the known boundary values. Then $u$ is corrected along the negative gradient of the Hamiltonian, and the process is repeated until the gradient is sufficiently small.
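The iteration can be sketched as follows. This is a simplified, self-contained illustration (explicit Euler integration and a fixed step size, not the repo's implementation) for the scalar problem $\dot{x} = -x + u$, $g = x^2 + u^2$, $\phi = 0$:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 101)
dt = t[1] - t[0]
U = np.zeros_like(t)          # initial guess for the control
x0 = 1.0

for it in range(200):
    # forward sweep: integrate x' = -x + u with x(0) = x0 (explicit Euler)
    X = np.empty_like(t); X[0] = x0
    for k in range(len(t) - 1):
        X[k + 1] = X[k] + dt * (-X[k] + U[k])
    # backward sweep: integrate p' = -dH/dx = -2x + p with p(tf) = dphi/dx = 0
    P = np.empty_like(t); P[-1] = 0.0
    for k in range(len(t) - 1, 0, -1):
        P[k - 1] = P[k] - dt * (-2.0 * X[k] + P[k])
    # correct the control along the negative gradient dH/du = 2u + p
    grad = 2.0 * U + P
    U = U - 0.1 * grad        # fixed step size, for simplicity only

print(float(np.max(np.abs(2.0 * U + P))))   # gradient norm: small near the optimum
```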
When the final time $t_f$ is free, an additional optimality condition, the transversality condition, must hold at $t_f$:

$$\mathcal{H}(t_f) + \frac{\partial \phi}{\partial t}\bigg|_{t_f} = 0$$

For a problem with free $t_f$, at the optimal solution not only must the control satisfy $\frac{\partial \mathcal{H}}{\partial u} = 0$, but the condition above must also hold. This condition arises because when $t_f$ is perturbed by a small amount $\delta t_f$:

- the terminal cost changes as $\frac{\partial \phi}{\partial t} \delta t_f$;
- the integral cost changes by adding a time slice of width $\delta t_f$ with integrand $g + p^T f = \mathcal{H}$.

At optimality, these combined effects must sum to zero.

The solver implements this by performing alternating optimization: first updating the control $u$ via gradient descent on $\frac{\partial \mathcal{H}}{\partial u}$ with $t_f$ fixed, then updating $t_f$ with a gradient step on the transversality residual $\mathcal{H}(t_f) + \frac{\partial \phi}{\partial t}\big|_{t_f}$.
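The outer $t_f$ update can be sketched as a single gradient step on the transversality residual. This is an illustrative helper, not the repo's code; `eta` and `tf_min` are made-up safeguards:

```python
def update_tf(tf, H_tf, phi_t_tf, eta=0.05, tf_min=1e-3):
    """One gradient step on the free final time.

    H_tf:     Hamiltonian evaluated at t = tf on the current trajectory.
    phi_t_tf: partial derivative of the terminal cost w.r.t. t at tf.
    """
    residual = H_tf + phi_t_tf            # transversality residual, driven to 0
    tf_new = max(tf_min, tf - eta * residual)
    return tf_new, residual

# example: H(tf) = 0.3 and dphi/dt(tf) = -0.1 give a positive residual,
# so the final time is shortened slightly
tf_new, r = update_tf(1.0, 0.3, -0.1)
```

Because the residual equals the first-order sensitivity of the cost to $\delta t_f$, stepping $t_f$ against it decreases the cost, and the iteration stops when the transversality condition is met.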
The logic explained in the theoretical background above is implemented in the function optimalControlSolver (with an equivalent function for Python). Below, we go over the variables, inputs, and outputs of the function.
Only the MATLAB function is documented here, but the Python function is similar, and the same names are used in both wherever possible.
Problem:
Usage:
`[sol, info] = optimalControlSolver(symF, symG, symPhi, xSym, uSym, tGrid, x0, U0, opts)`
Inputs:
- `symF`: symbolic vector field f(x,u) of size [n x 1]
- `symG`: symbolic scalar running cost g(x,u)
- `symPhi`: symbolic scalar terminal cost Phi(x)
- `xSym`: symbolic state vector [x1; x2; ...; xn]
- `uSym`: symbolic control vector [u1; u2; ...; um]
- `tGrid`: time grid (column or row) of size [N x 1] or [1 x N], increasing, with tGrid(1) = 0
- `x0`: initial state (numeric) [n x 1]
- `U0`: initial control trajectory over tGrid [N x m]
- `opts`: options struct (all fields optional):
  - `maxIters` (default 50)
  - `alpha` (default 1.0): initial step size for gradient descent
  - `beta` (default 0.5): backtracking reduction factor (0 < beta < 1)
  - `c1` (default 1e-4): Armijo condition constant
  - `tol` (default 1e-6): stopping tolerance on ||grad_u||_F
  - `odeOptions` (default []): options set by odeset
  - `interp` (default 'linear'): 'linear' or 'zoh' for u/x interpolation
  - `uLower` (default []): lower bounds on u (1 x m) or scalar
  - `uUpper` (default []): upper bounds on u (1 x m) or scalar
  - `maxLineSearch` (default 10)
  - `verbose` (default true)
Outputs:
- `sol.t`: time grid [N x 1]
- `sol.X`: state trajectory along tGrid [N x n]
- `sol.U`: control trajectory along tGrid [N x m]
- `sol.P`: co-state trajectory along tGrid [N x n]
- `sol.J`: final cost value at solution
- `sol.J_hist`: cost history per iteration
- `sol.grad_norm_hist`: gradient-norm history per iteration
- `info.iters`: number of iterations performed
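To illustrate how the symbolic inputs might be posed on the Python side (a hedged sketch: the system matrices below are made up for illustration, and demo.py shows the repo's actual setup), a two-state, two-input linear system with a quadratic cost could be defined as:

```python
import sympy as sp

x1, x2, u1, u2 = sp.symbols('x1 x2 u1 u2')
xSym = sp.Matrix([x1, x2])                      # state vector [n x 1]
uSym = sp.Matrix([u1, u2])                      # control vector [m x 1]

A = sp.Matrix([[0, 1], [-2, -3]])               # illustrative system matrices
B = sp.eye(2)
symF = A * xSym + B * uSym                      # dynamics f(x, u), [n x 1]
symG = (xSym.T * xSym + uSym.T * uSym)[0, 0]    # running cost g(x, u), scalar
symPhi = sp.Integer(0)                          # terminal cost phi(x)
```

These objects, together with a numeric time grid, initial state, initial control trajectory, and options, are what the solver expects.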
Requirements:
- MATLAB Symbolic Math Toolbox for MATLAB
- `sympy` for Python
- Instead of simple gradient descent with a constant step size, the Armijo condition is checked at every iteration and backtracking is used to find an appropriate step size.
- Three sample scripts have been provided:
- demo.m (demo.py for Python) contains a linear system with two states and two inputs.
- CSTR.m (CSTR.py for Python) solves the optimal control problem for a CSTR system (example 6.2-2 from Kirk's book).
- Free-time CSTR (CSTR_freeTf.py for Python) solves the same optimal control problem as CSTR.m, but with free final time.
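As a side note on the line search mentioned above, a minimal sketch of Armijo backtracking (illustrative only; the parameter names mirror the `alpha`, `beta`, `c1`, and `maxLineSearch` options):

```python
import numpy as np

def armijo_backtrack(J, U, grad, alpha=1.0, beta=0.5, c1=1e-4, max_ls=10):
    """Shrink the step until the Armijo sufficient-decrease condition holds:
    J(U - a*grad) <= J(U) - c1 * a * ||grad||^2."""
    J0 = J(U)
    g2 = np.sum(grad ** 2)
    a = alpha
    for _ in range(max_ls):
        if J(U - a * grad) <= J0 - c1 * a * g2:
            return a
        a *= beta                 # backtrack: reduce the step size
    return a                      # smallest step tried, used as a fallback

# example: quadratic cost J(U) = ||U||^2 / 2 with gradient grad = U;
# the full step alpha = 1.0 already satisfies the Armijo condition here
step = armijo_backtrack(lambda v: 0.5 * np.dot(v, v),
                        np.array([2.0, -1.0]), np.array([2.0, -1.0]))
```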
This repo is maintained by me. Contributions are welcome; feel free to create a pull request.