
RBE PhD Final Defense : Lening Li | Optimal Control and Reinforcement Learning for Stochastic Systems under Temporal Logic Specifications

Tuesday, August 23, 2022
10:00 am to 12:00 pm


Location: Unity Hall, Room 150E

Abstract: We commonly encounter stochastic dynamic systems in diverse disciplines, such as economics, robotics, military operations, and cybersecurity. Different applications demand various system properties, expressed as high-level specifications. These specifications include liveness (something good will always eventually happen), safety (nothing bad will ever happen), and fairness (every constituent process makes progress, and none starves). This research addresses how to efficiently synthesize control policies for stochastic dynamic systems that satisfy such properties.
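For readers unfamiliar with how such properties are formalized, the canonical Linear Temporal Logic templates for the three classes (standard textbook shapes, not formulas taken from the thesis) are:

```latex
% Standard LTL templates for the three property classes
\begin{align*}
\text{liveness:} \quad & \mathbf{G}\,\mathbf{F}\, p
  && \text{($p$ holds infinitely often)} \\
\text{safety:}   \quad & \mathbf{G}\,\neg\,\mathit{bad}
  && \text{(the bad condition never holds)} \\
\text{fairness:} \quad & \textstyle\bigwedge_i \mathbf{G}\,\mathbf{F}\,\mathit{run}_i
  && \text{(every process $i$ runs infinitely often)}
\end{align*}
```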

This thesis presents a comprehensive probabilistic optimal planning framework for stochastic dynamic systems given high-level specifications. The first contribution defines translations from a class of Probabilistic Computation Tree Logic (PCTL) formulas, which specify hard constraints on the system, to chance constraints. We present an efficient approximate dynamic programming (ADP) algorithm that adopts on-policy sampling to learn a near-optimal value function and its corresponding policy. The second contribution exploits the structural information of task automata translated from Linear Temporal Logic (LTL) formulas that express temporal objectives. We leverage this topological information to improve convergence in the presence of sparse, temporally extended rewards. The third contribution is a variant of the actor-critic algorithm that efficiently learns policies for continuous-state systems. To further boost performance, we incorporate modular learning into the proposed actor-critic algorithm and guide the learning with topological information. Lastly, we extend the framework to a two-player setting: we introduce a class of hypergame models that capture adversarial interactions under asymmetric, incomplete information and establish a solution concept for this class of hypergames.
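To give a rough flavor of the second contribution's idea, here is a hypothetical toy example (not code from the thesis): compose a small MDP with the automaton of a sequential task ("visit a, then b"), then sweep value iteration over the automaton states in reverse topological order, so that values for later task stages are fixed before earlier stages are updated. All names and the grid-world setup are illustrative assumptions.

```python
# Minimal sketch: value iteration on the product of a tiny deterministic
# 1-D grid MDP with a task automaton (DFA), updating automaton states in
# reverse topological order. Illustrative toy example only.

GAMMA = 0.95
POS = range(5)                      # grid positions 0..4
ACTIONS = {"L": -1, "R": +1}
LABEL = {0: "a", 4: "b"}            # atomic propositions at positions

# DFA for the sequential task F(a & F b): q0 -a-> q1 -b-> q2 (accepting)
def dfa_step(q, sym):
    if q == 0 and sym == "a":
        return 1
    if q == 1 and sym == "b":
        return 2
    return q

ACCEPT = 2
TOPO = [2, 1, 0]                    # reverse topological order of DFA states

def step(pos, q, act):
    """Deterministic product-MDP transition with reward on task completion."""
    nxt = min(max(pos + ACTIONS[act], 0), 4)
    nq = dfa_step(q, LABEL.get(nxt, ""))
    reward = 1.0 if (nq == ACCEPT and q != ACCEPT) else 0.0
    return nxt, nq, reward

V = {(p, q): 0.0 for p in POS for q in (0, 1, 2)}
for q in TOPO:                      # sweep automaton levels bottom-up:
    for _ in range(50):             # downstream values are already final
        for p in POS:
            V[(p, q)] = max(
                r + GAMMA * V[(np_, nq)]
                for np_, nq, r in (step(p, q, a) for a in ACTIONS)
            )
```

Because the DFA has no cycles between distinct states, each level's Bellman updates only read values of the same or already-converged levels, which is the kind of structural ordering the abstract refers to.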

Advisor: Jie Fu, Assistant Professor of Electrical & Computer Engineering, University of Florida

Committee members: Andrew Clark, Associate Professor of Electrical & Systems Engineering, Washington University in St. Louis

Raghvendra V. Cowlagi, Associate Professor of Aerospace Engineering, WPI

Carlo Pinciroli, Assistant Professor of Robotics Engineering, WPI