- This event has passed.
STOR Colloquium: Zaiwei Chen, Georgia Institute of Technology
10 Jan @ 3:30 pm - 4:30 pm
STOR Colloquium: Zaiwei Chen, Georgia Institute of Technology
10 Jan @ 3:30 pm – 4:30 pmA Lyapunov Theory for Finite-Sample Analysis of Reinforcement Learning Algorithms
This work develops a unified framework to study finite-sample convergence guarantees of a large class of reinforcement learning (RL) algorithms. We do this by first reformulating the RL algorithms as Markovian Stochastic Approximation (SA) algorithms to solve fixed-point equations. We then develop a Lyapunov analysis and derive mean-square error bounds on the convergence of the Markovian SA, where we introduce the Generalized Moreau Envelope as the novel Lyapunov function. Based on this powerful result on SA, we establish finite-sample mean-square bounds for a variety class of RL algorithms such as Q-learning, n-step TD, TD(\lambda), and off-policy TD algorithms including V-trace. As a by-product, by analyzing the convergence bounds of n-step TD and TD(\lambda), we provide theoretical insights into the bias-variance trade-off, i.e., efficiency of bootstrapping in RL. This was first posed as an open problem in (Sutton1999).