|
Benjamin Van Roy:
Learning to Optimize: Delayed Consequences
Abstract: Learning to make effective decisions that may influence observations appearing after subsequent decisions poses challenges beyond those faced when all consequences are immediate. In particular, observations must somehow be attributed to past actions. The area of reinforcement learning addresses this issue alongside the challenges of exploration and generalization. I will discuss reinforcement learning algorithms and results pertaining to them.
|