Alternation Makes the Adversary Weaker in Two-player Games

发布者:梁慧丽发布时间:2025-05-08浏览次数:10

报告人

Stratis Skoulakis

Assistant Professor in the Department of Computer Science at Aarhus University

时间

2025年2月20日 星期四

上午 10:00-11:00

地点

308会议室


Abstract


Motivated by alternating game-play in two-player games, we present an altenating variant of the Online Linear Optimization (OLO). In alternating OLO,  a learner at each round t, selects a vector x^t and then an adversary selects a cost-vector c^t. The learner then experiences costinstead ofas in standard OLO. We establish that under this small twist, the Ω(√T) lower bound on the regret is no longer valid. More precisely, we present two online learning algorithms for alternating OLO that respectively admit O(T^{1/3}) regret for the simplex and O(ρ log T) regret for the ball of radius ρ>0. Our results imply that in alternating game-play, an agent can always guarantee O(T^{1/3}) regardless the strategies of the other agent while the regret bound improves to O(log T) in case the agent admits only two actions.


Biography


Stratis Skoulakis is an Assistant Professor in the Department of Computer Science at Aarhus University. Before that he was a postdoctoral researcher at EPFL and Singapore University of Technology and Design. He received his PhD from National Technical University of Athens in algorithmic game theory and theoretical computer science. His research focuses on the mathematical foundations of game theory, machine learning, and optimization.



搜索
您想要找的