Safety guarantees in stochastic control-affine systems were formulated as
quadratic constraints on the control signal using Exponential Control
More experiments (closer to the Motivation).
What if QCQP is non-convex?
Entropy objective to pick optimal actions for reducing uncertainty.
Application of Hanson-Wright-like inequalities for tighter bounds on \( \CBCr \)
Shengyang Sun, Changyou Chen,
and Lawrence Carin. Learning Structured Weight Uncertainty in
Bayesian Neural Networks. In International Conference on Artificial
Intelligence and Statistics (AISTATS), pages 1283–1292, 2017.
A. D. Ames, S. Coogan, M. Egerstedt, G.
Notomista, K. Sreenath, and P. Tabuada. Control barrier functions:
Theory and applications. In 2019 18th European Control Conference
(ECC), pages 3420–3431, June 2019. doi:
Mauricio A Alvarez, Lorenzo Rosasco, and Neil D Lawrence. Kernels
for vector-valued functions: A review. Foundations and Trends in
Machine Learning, 4(3):195–266, 2012.
Niranjan Srinivas, Andreas Krause, Sham M Kakade, and Matthias
Seeger. Gaussian process optimization in the bandit setting: No
regret and experimental design. arXiv preprint arXiv:0912.3995,
Quan Nguyen and Koushil Sreenath. Exponential control barrier
functions for enforcing high relative-degree safety-critical
constraints. In 2016 American Control Conference (ACC), pages
322–328. IEEE, 2016a.
Louizos, Christos, and Max Welling. "Structured and efficient
variational deep learning with matrix Gaussian posteriors."
International Conference on Machine Learning. 2016.
Khojasteh, M. J., Dhiman, V., Franceschetti, M., & Atanasov, N. (2020). Probabilistic safety constraints for learned high relative degree system dynamics. L4DC 2020. Available https://arXiv.org/abs/1912.10116
Learning from Interventions using Hierarchical Policies for Safe Learning
J Bi, V Dhiman, T Xiao, C Xu - AAAI 2020. Available https://arXiv.org/abs/1912.02241
Learning Navigation Costs from Demonstration in Partially Observable Environments
T Wang, V Dhiman, N Atanasov. ICRA 2020. Available https://arXiv.org/abs/2002.11637
Andrychowicz, Marcin, et al. "Hindsight experience replay." Advances in Neural Information Processing Systems. 2017.
Mutual localization: Two camera relative 6-dof pose estimation from reciprocal fiducial observation. V Dhiman, J Ryde, JJ Corso. IROS 2013
Learning Compositional Sparse Models of Bimodal Percepts. S Kumar, V Dhiman, JJ Corso. AAAI, 2014
Voxel planes: Rapid visualization and meshification of point cloud ensembles.
J Ryde, V Dhiman, R Platt. IROS, 2013
Modern MAP inference methods for accurate and fast occupancy grid mapping on higher order factor graphs. V Dhiman, A Kundu, F Dellaert, JJ Corso ICRA 2014
Continuous occlusion models for road scene understanding. M Chandraker, V Dhiman. US Patent 9,821,813, 2017
A continuous occlusion model for road scene understanding. V Dhiman, QH Tran, JJ Corso, M Chandraker. CVPR 2016
A Critical Investigation of DRL for Navigation. V Dhiman, S Banerjee, B Griffin, JM Siskind, JJ Corso. NeurIPS DRL Workshop, 2017.
Learning Compositional Sparse Bimodal Models. S Kumar, V Dhiman, PA Koch, JJ Corso. PAMI, 2017.
(Mirowski et al. 2017) Learning to navigate in complex environments. In ICLR 2017.
Multi-Goal Reinforcement Learning: Challenging Robotics Environments and
Request for Research. Matthias Plappert and Marcin Andrychowicz and Alex
Ray and Bob McGrew and Bowen Baker and Glenn Powell and Jonas Schneider
and Josh Tobin and Maciek Chociej and Peter Welinder and Vikash Kumar and
Wojciech Zaremba. arXiv:1802.09464, 2018.
Kaelbling, Leslie Pack. "Learning to achieve goals." IJCAI. 1993.
V. Dhiman, S. Banerjee, J. M. Siskind, and J. J. Corso. Learning goal-conditioned
value functions with one-step path rewards rather than goal-rewards.
Submitted to ICLR, 2019. Under review.
Zachariou, Peter et al. “SPEEDING Effects on hazard perception and reaction time.” (2011).
Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529.
Watkins, Christopher JCH, and Peter Dayan. "Q-learning." Machine learning 8.3-4 (1992): 279-292.
Pearl, Judea. "Fusion, propagation, and structuring in belief networks." Artificial intelligence 29.3 (1986): 241-288.
Jojic, Vladimir, Stephen Gould, and Daphne Koller. "Accelerated dual decomposition for MAP inference." ICML. 2010.
Merali, Rehman S., and Timothy D. Barfoot. "Occupancy grid mapping with Markov chain Monte Carlo Gibbs sampling." Robotics and Automation (ICRA), 2013 IEEE International Conference on. IEEE, 2013.
Shayle R Searle and Marvin HJ Gruber. Linear models. John Wiley & Sons, 1971.