Hotelling Lecture: Yuval Peres, Microsoft Research
Towards Optimal Algorithms for Prediction with Expert Advice
We study the classical problem of prediction with expert advice in the adversarial setting with a geometric stopping time. Cover (1965) gave the optimal algorithm that minimizes worst-case regret for the case of 2 experts. In this talk, I will describe the optimal algorithm, adversary and regret for the case of 3 experts. We will see that optimal algorithm for 2 and 3 experts is a probability matching algorithm (analogous to Thompson sampling) against a particular randomized adversary. Remarkably, it turns out that this algorithm is not only optimal against this adversary, but also minimax optimal against all possible adversaries. We establish a constant factor separation between the regrets achieved by the optimal algorithm and the widely used multiplicative weights algorithm. A novel aspect of our analysis is that upper and lower bounds are proved simultaneously, analogous to the primal-dual method. The analysis of the optimal adversary relies on delicate random walk estimates. At the end of the talk, I will discuss the case of “Bandit feedback”, when we just learn the gain of the action we chose, and analyze the effects of imposing a switching cost.
(Talk based on joint works with Nick Gravin and Balu Sivan and with Ofer Dekel, Jian Ding and Tomer Koren.)
Refreshments will be served at 3:00pm in the 3rd floor lounge of Hanes Hall