Policy Gradient Descent for Control: Global Optimality and Convex Parameterization
Speaker: Maryam Fazel
This talk is co-sponsored with the Center for Applied Mathematics (CAM)
Date: Friday, May 6, 3:45 p.m.
Where: 253 Rhodes Hall
Watch the video of the talk
Abstract: Policy Optimization methods enjoy wide practical use in reinforcement learning (RL) for applications ranging from robotic manipulation to game-playing, partly because they are easy to implement and allow for richly parameterized policies. Yet their theoretical properties, from optimality to statistical complexity, are still not fully understood. To help develop a theoretical basis for these methods, and to bridge the gap between RL and control theoretic approaches, recent work has studied whether gradient-based policy optimization can succeed in designing feedback control policies.
In this talk, we start by showing the convergence and optimality of these methods for controlling linear dynamical systems with quadratic costs, where despite nonconvexity, we show convergence to the optimal policy occurs under mild assumptions. Next, we make a connection between convex parameterizations in control theory on one hand, and the Polyak-Lojasiewicz property of the nonconvex cost function, on the other. Such a connection between the nonconvex and convex (in non-policy parameters) landscapes provides a unified approach towards extending the results to much more complex control problems.
Bio: Maryam Fazel is the Moorthy Family Professor of Electrical and Computer Engineering at the University of Washington, with adjunct appointments in Computer Science and Engineering, Mathematics, and Statistics. Maryam received her MS and PhD from Stanford University, and her BS from Sharif University of Technology in Iran, and was a postdoctoral scholar at Caltech before joining UW. She is a recipient of the NSF Career Award, UWEE Outstanding Teaching Award, and a UAI conference Best Student Paper Award with her student. She directs the Institute for Foundations of Data Science (IFDS), a multi-site NSF TRIPODS Institute. Her current research interests are in the area of optimization in machine learning and control.