Liu awarded NSF grant

September 30, 2022

Professor Yufeng Liu recently received an NSF grant along with Professor. Wei Sun at Purdue University. The title of the 3-year grant is “Trustworthy Reinforcement Learning for Online Decision Making”.

This research project will advance the frontiers of modern reinforcement learning for online decision-making problems. Reinforcement learning deals with how intelligent agents ought to take actions in an uncertain environment in order to maximize the cumulative reward. It has achieved phenomenal success in diverse business and scientific fields. However, less attention has been paid to the trustworthy aspects of reinforcement learning. This project will investigate different aspects of trustworthy issues like robustness, fairness, causality, and explainability in important online decision-making tasks. The results of this research will benefit many different fields such as statistics, machine learning, operations research, marketing, economics, and finance. Open-source software will be developed to provide applied researchers with cutting-edge tools. The project will recruit students, especially those from unrepresented groups, to be involved in the research and will develop new courses on statistical reinforcement learning and decision making.

This research project will focus on three interconnected trustworthy reinforcement learning methods for online decision-making problems: dynamic pricing, dynamic assortment selection, and matching in two-sided markets. Issues of robustness, fairness, causality, and explainability will be addressed in these decision-making tasks, which will advance the exploration techniques used in existing reinforcement learning algorithms. The project will develop new theoretical tools to analyze the statistical properties of these modern reinforcement learning algorithms. Regret upper bounds and matching lower bound will be thoroughly investigated. One important goal of online decision making is to identify an optimal policy that maximizes the overall gain, based on the contextual information and historical interactions with the environment. Due to the complex nature of such problems, there is a high demand for trustworthy tools for learning optimal personalized policy in various settings. The knowledge gained from this research will benefit learning in online auctions and other complex market design problems.