Control Barriers in Bayesian Learning of System Dynamics

This paper focuses on learning a model of system dynamics online while satisfying safety constraints. Our objective is to avoid offline system identification or hand-specified models and allow a system to safely and autonomously estimate and adapt its own model during operation. Given streaming observations of the system state, we use Bayesian learning to obtain a distribution over the system dynamics. Specifically, we use a matrix variate Gaussian process (MVGP) regression approach with efficient covariance factorization to learn the drift and input gain terms of a nonlinear control-affine system. The MVGP distribution is then used to optimize the system behavior and ensure safety with high probability, by specifying control Lyapunov function (CLF) and control barrier function (CBF) chance constraints. We show that a safe control policy can be synthesized for systems with arbitrary relative degree and probabilistic CLF-CBF constraints by solving a second order cone program (SOCP). Finally, we extend our design to a self-triggering formulation, adaptively determining the time at which a new control input needs to be applied in order to guarantee safety.
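As a sketch of why a probabilistic barrier constraint can lead to a second-order cone constraint, consider the simplest case. The block below assumes that, for a fixed state, the constraint value is Gaussian with a mean that is affine in the control input and a standard deviation that is the norm of a linear function of the input. The symbols $\mathbf{m}$, $\mathbf{L}$, $\underline{\mathbf{u}}$, and $c(p)$ are introduced here for illustration only and are not notation from the paper.

```latex
% Illustrative assumption: for fixed x, the constraint value is Gaussian,
% with \underline{u} = [1; u] the augmented input.
\begin{align}
  \mathrm{CBC}(\mathbf{x}, \mathbf{u})
    &\sim \mathcal{N}\!\big(\mu(\mathbf{u}),\, \sigma^2(\mathbf{u})\big),
    \qquad
    \mu(\mathbf{u}) = \mathbf{m}^\top \underline{\mathbf{u}},
    \quad
    \sigma(\mathbf{u}) = \big\| \mathbf{L}\, \underline{\mathbf{u}} \big\|_2, \\
  \mathbb{P}\big(\mathrm{CBC}(\mathbf{x}, \mathbf{u}) \ge 0\big) \ge p
    &\iff
    \Phi\!\left(\frac{\mu(\mathbf{u})}{\sigma(\mathbf{u})}\right) \ge p
    \iff
    \mathbf{m}^\top \underline{\mathbf{u}}
      \;\ge\; c(p)\, \big\| \mathbf{L}\, \underline{\mathbf{u}} \big\|_2,
    \qquad c(p) = \Phi^{-1}(p).
\end{align}
% For p >= 1/2 we have c(p) >= 0, so the last inequality bounds the norm of a
% linear function of u by an affine function of u: a second-order cone constraint.
```

Under these assumptions, minimizing a convex quadratic control cost subject to such constraints can be written as an SOCP.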

Comparing two controllers: one that accounts for the variance of the estimated dynamics (Bayes CBF) and one that does not (Mean CBF).

Comparison of enforcing CBF constraints with Ackermann dynamics when accounting (Bayes CBF) and not accounting (Mean CBF) for the variance in the dynamics estimate. The top row shows the Ackermann vehicle trajectory in dashed blue with two obstacles in red. The contour plot shows the minimum of the SCBC values corresponding to the two obstacles, evaluated on the $(x,y)$ grid while keeping $\theta$ and $\bfu$ fixed. The middle row shows the magnitude of the velocity input over time. The bottom row shows the minimum of the two SCBC values over time. Enforcing safety using only the mean CBC (Mean CBF) results in a collision, while accounting for the stochastic CBC constraint (Bayes CBF) causes the Ackermann vehicle to slow down and turn away from the unsafe region.
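The difference between the two checks can be illustrated with a small numeric sketch. It assumes a scalar constraint value that is Gaussian and linear in the augmented input `[1, u]`. The function names, the coefficient mean `m`, the covariance `Sigma`, and the probability level `p` are hypothetical toy values, not quantities taken from the paper or its code.

```python
import math
from statistics import NormalDist


def cbc_mean_and_std(m, Sigma, u):
    """Mean and std of z = w^T [1, u] when w ~ N(m, Sigma) (toy model)."""
    ubar = [1.0] + list(u)  # augmented input [1, u]
    n = len(ubar)
    mean = sum(m[i] * ubar[i] for i in range(n))
    var = sum(ubar[i] * Sigma[i][j] * ubar[j] for i in range(n) for j in range(n))
    return mean, math.sqrt(var)


def mean_cbf_satisfied(m, Sigma, u):
    """Mean-only check: ignores the variance of the estimate."""
    mean, _ = cbc_mean_and_std(m, Sigma, u)
    return mean >= 0.0


def bayes_cbf_satisfied(m, Sigma, u, p=0.95):
    """Chance-constraint check: P(z >= 0) >= p for Gaussian z."""
    mean, std = cbc_mean_and_std(m, Sigma, u)
    c = NormalDist().inv_cdf(p)  # standard normal quantile at level p
    return mean - c * std >= 0.0


# Hypothetical numbers: the mean constraint value is 0.3, but it is uncertain.
m = [-0.2, 1.0]
Sigma = [[0.04, 0.0], [0.0, 0.25]]
u = [0.5]

print(mean_cbf_satisfied(m, Sigma, u))   # True
print(bayes_cbf_satisfied(m, Sigma, u))  # False
```

In this toy case the input passes the mean-only check but is rejected once the variance is taken into account, which is the qualitative behavior the figure describes.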

Comparing two controllers: one that uses online learning and one that does not.

The effect of online dynamics learning (right) versus no online learning (left) on the safe control of an Ackermann vehicle. The top row shows the vehicle trajectory in dashed blue with two obstacles in red. The middle row shows the trace of the covariance matrix $\tr(\bfB_k(\bfx, \bfx) \otimes \bfA)$, which we use as a measure of uncertainty. The bottom row shows the minimum of the two probabilistic safety constraints over time. Without learning, the vehicle gets stuck between the two obstacles because the uncertainty in the dynamics is too high, i.e., the safety condition cannot be rendered positive. With online learning, the uncertainty is reduced enough for the safety condition to become positive in the area between the two obstacles. The dynamics distribution is updated every 40 time steps; note the drop in uncertainty in the middle row at these time steps.
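The uncertainty measure above is the trace of a Kronecker product, which satisfies $\tr(\bfB \otimes \bfA) = \tr(\bfB)\,\tr(\bfA)$, so it can be computed without forming the full covariance matrix. The sketch below checks this identity on small hypothetical matrices. The values of `B` and `A` and the helper names are illustrative assumptions, not data or code from the paper.

```python
import math


def kron(B, A):
    """Kronecker product B (x) A of two matrices given as lists of rows."""
    return [[b * a for b in row_b for a in row_a] for row_b in B for row_a in A]


def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))


# Hypothetical factors of a matrix variate Gaussian covariance.
B = [[0.5, 0.1], [0.1, 0.2]]  # stands in for B_k(x, x)
A = [[2.0, 0.0], [0.0, 1.0]]  # stands in for A

full_trace = trace(kron(B, A))        # forms the 4 x 4 Kronecker product
factored_trace = trace(B) * trace(A)  # uses only the two small factors

print(math.isclose(full_trace, factored_trace))  # True
```

The factored form only needs the diagonals of the two factors, which matters when the measure is evaluated at every time step.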

Code

[github]

Citation

If you find our papers/code useful for your research, please cite our work as follows.

1. V. Dhiman, M. Khojasteh, M. Franceschetti, N. Atanasov. Control Barriers in Bayesian Learning of System Dynamics. In submission, 2020.

@misc{dhiman2020control,
  title={Control Barriers in Bayesian Learning of System Dynamics},
  author={Dhiman, Vikas and Khojasteh, Mohammad Javad and Franceschetti, Massimo and Atanasov, Nikolay},
  url = {https://vikasdhiman.info/Bayesian_CBF},
  year={2020}
}

2. M. Khojasteh, V. Dhiman, M. Franceschetti, N. Atanasov. Probabilistic Safety Constraints for Learned High Relative Degree System Dynamics. In Learning for Dynamics and Control, PMLR 120:781-792, 2020.

@InProceedings{pmlr-v120-khojasteh20a,
 title = {Probabilistic Safety Constraints for Learned High Relative Degree System Dynamics},
 author = {Khojasteh, Mohammad Javad and Dhiman, Vikas and Franceschetti, Massimo and Atanasov, Nikolay},
 booktitle = {Learning for Dynamics and Control},
 pages = {781--792},
 year = {2020},
 volume = {120},
 series = {Proceedings of Machine Learning Research},
 address = {The Cloud},
 month = {10--11 Jun},
 publisher = {PMLR},
 pdf = {http://proceedings.mlr.press/v120/khojasteh20a/khojasteh20a.pdf},
 url = {http://proceedings.mlr.press/v120/khojasteh20a.html},
 }

Acknowledgements

We gratefully acknowledge support from ARL DCIST CRA W911NF-17-2-0181 and NSF awards CNS-1446891, ECCS-1917177, and IIS-2007141.
This webpage template was borrowed from https://thaipduong.github.io/sbkm/.