IE8571
IE 8571 - Advanced Reinforcement Learning and Dynamic Programming (4 Cr.)
Industrial and Systems Engineering (11138)
TIOT - College of Science and Engineering
Course description
Markov Decision Processes (MDPs) form a rich class of mathematical models for sequential decision problems under uncertainty and provide a rigorous foundation for Reinforcement Learning (RL). The first part of this course combines techniques from optimization and stochastics to build a modeling, theoretical, and algorithmic foundation for MDPs. Topics include finite- and infinite-horizon MDPs; Bellman’s equations of dynamic programming; value iteration, policy iteration, and linear programming-based solution algorithms; partially observable MDPs; robust MDPs; stochastic games; continuous-time MDPs; semi-Markov decision processes; and continuous-time deterministic control. The second part of the course builds on this foundation to introduce fundamental ideas and solution techniques in RL, including Monte Carlo Policy Iteration, Q-learning, Temporal-Difference Learning, and Neuro-Dynamic Programming.
Prereq: knowledge of optimization and stochastic models at the undergraduate level and familiarity with a computer programming language such as Python.
Minimum credits
4
Maximum credits
4
Is this course repeatable?
No
Grading basis
OPT - Student Option
Lecture
Requirements
000017
Fulfills the writing intensive requirement?
No
Typically offered term(s)
Fall Odd Year