Landelijk Netwerk Mathematische Besliskunde
Course MDP: Markov Decision Processes
Time:
Monday 15.15 - 17.00 (September 9 - November 11)
Location:
All LNMB courses can again be attended on the Campus Utrecht Science Park. Details about lecture rooms, as well as online facilities for students and lecturers, follow upon registration.
Lecturers:
Dr. Aleida Braaksma (UT), Prof. Dr. F.M. Spieksma (UL)
Course description:
(for participants of this course: see the lecturers' website)
The theory of Markov decision processes (MDPs), also known as
sequential decision theory, stochastic control, or stochastic dynamic
programming, studies sequential optimization of stochastic systems by
controlling their transition mechanism over time. Each control policy
defines a stochastic process and the values of the objective functions
associated with this process. The goal is to select a control policy
that optimizes a function of the values generated by these objective
functions.
In real life, decisions usually have two types of impact. Firstly, they
cost or save resources, such as money or time. Secondly, by influencing
the dynamics of the system, they affect the future as well. Therefore,
in many situations the decision with the largest immediate profit is
not the best one once future rewards are taken into account. MDPs model
this trade-off and can be used for many important practical
applications. In this course we provide results on the structure and
existence of good policies and on methods for the computation of
optimal policies, and we illustrate them with applications.
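The point that the largest immediate profit need not be the best choice can be illustrated with a small sketch. The two-state model below is purely hypothetical and not taken from the lecture notes: the state and action names, the rewards, and the discount factor 0.9 are all assumptions chosen for illustration. It runs value iteration on the discounted criterion and compares the result with a myopic policy that looks only at the immediate reward.

```python
# Hypothetical two-state MDP (all numbers are illustrative assumptions).
# In state 0, "grab" pays 1 now and stays in state 0, while "invest"
# pays nothing now but moves to state 1, where "harvest" pays 3.
GAMMA = 0.9  # assumed discount factor

# TRANSITIONS[state][action] = (immediate_reward, [(next_state, prob), ...])
TRANSITIONS = {
    0: {"grab": (1.0, [(0, 1.0)]),
        "invest": (0.0, [(1, 1.0)])},
    1: {"harvest": (3.0, [(0, 1.0)])},
}


def q_value(state, action, values):
    """Immediate reward plus discounted expected value of the next state."""
    reward, successors = TRANSITIONS[state][action]
    return reward + GAMMA * sum(p * values[s] for s, p in successors)


def value_iteration(tol=1e-10, max_iter=10_000):
    """Repeat the Bellman optimality update until the values stop changing."""
    values = {s: 0.0 for s in TRANSITIONS}
    for _ in range(max_iter):
        new_values = {
            s: max(q_value(s, a, values) for a in TRANSITIONS[s])
            for s in TRANSITIONS
        }
        change = max(abs(new_values[s] - values[s]) for s in TRANSITIONS)
        values = new_values
        if change < tol:
            break
    # Read off a policy that is greedy with respect to the final values.
    policy = {
        s: max(TRANSITIONS[s], key=lambda a: q_value(s, a, values))
        for s in TRANSITIONS
    }
    return values, policy


def myopic_policy():
    """Pick the action with the largest immediate reward in every state."""
    return {
        s: max(TRANSITIONS[s], key=lambda a: TRANSITIONS[s][a][0])
        for s in TRANSITIONS
    }


values, policy = value_iteration()
print("long-run policy:", policy)
print("myopic policy:  ", myopic_policy())
print("values:", values)
```

Under these assumed numbers, the myopic policy chooses "grab" in state 0, whereas value iteration selects "invest": giving up the immediate reward leads to a state with a higher payoff, which outweighs the loss once future rewards are discounted and summed.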
Contents of the lectures:
1. Model formulation, policies, optimality criteria, the finite horizon.
2. Average rewards: optimality equation and solution methods.
3. Discounted rewards: optimality equation and solution methods.
4. Structural properties.
5. Applications of MDPs.
6. Further topics in MDPs.
Literature:
Lecture notes will be provided.
Prerequisites:
- Elementary knowledge of linear programming (e.g. K.G. Murty, Linear
programming, Wiley, 1983).
- Elementary knowledge of probability theory (e.g. S.M. Ross, A first
course in probability, Macmillan, New York, 1976).
- Elementary knowledge of (numerical) analysis (e.g. Banach spaces;
contraction mappings; Newton’s method; Laurent series).
Examination:
Take home problems.
Address of the lecturers:
Dr. Aleida Braaksma
E-mail: a.braaksma@utwente.nl
Dr. F.M. Spieksma
Mathematical Institute, Leiden University
P.O. Box 9512, 2300 RA Leiden
Phone: 071 - 5277128
E-mail: spieksma@math.leidenuniv.nl