|
Alexandre Proutiere:
Structured Bandit Optimization: Fundamental Limits and Efficient Algorithms
Abstract:
Bandit optimization problems constitute the most fundamental and basic instances of sequential decision problems with an exploration-exploitation trade-off. They naturally arise in many contemporary applications found in communication networks, e-commerce and recommendation systems. This lecture provides a survey of recent results for stochastic bandit problems with known structure. In such problems, the expected reward, seen as a function of the various possible decisions or arms, exhibits a known structure, e.g. unimodal, linear, convex, sub modular; and of course, any efficient sequential decision algorithm should leverage this structure. We present fundamental performance limits satisfied by any sequential decision algorithm, and devise algorithms that approach these limits, i.e., that optimally exploit the structural properties of the problem. Results are applied to the design of protocols and resource sharing algorithms in wireless systems, and to recommendation systems.
|