IE8571


IE 8571 - Advanced Reinforcement Learning and Dynamic Programming (4 Cr.)

Industrial and Systems Engineering (11138) TIOT - College of Science and Engineering

Course description

Markov Decision Processes (MDPs) form a rich class of mathematical models for sequential decision problems under uncertainty and provide a rigorous foundation for Reinforcement Learning (RL). The first part of this course combines techniques from optimization and stochastics to build a modeling, theoretical, and algorithmic foundation for MDPs. Topics include finite- and infinite-horizon MDPs; Bellman’s equations of dynamic programming; value iteration, policy iteration, and linear programming-based solution algorithms; partially observable MDPs; robust MDPs; stochastic games; continuous-time MDPs; semi-Markov decision processes; and continuous-time deterministic control. The second part of the course builds on this foundation to introduce fundamental ideas and solution techniques in RL, including Monte Carlo Policy Iteration, Q-learning, Temporal-Difference Learning, and Neuro-Dynamic Programming.

Prereq: knowledge of optimization and stochastic models at the undergraduate level and familiarity with a computer programming language such as Python.
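
As an informal illustration of one topic named in the description, value iteration for a discounted infinite-horizon MDP, the following is a minimal sketch and not course material. The function name `value_iteration`, the array layout (`P[a][s][t]` for transition probabilities, `R[s][a]` for expected rewards), the discount factor, and the two-state toy problem are all assumptions made for this example.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Approximate the optimal value function of a finite discounted MDP.

    P[a][s][t]: probability of moving from state s to state t under action a
                (hypothetical layout chosen for this sketch).
    R[s][a]:    expected one-step reward for taking action a in state s.
    """
    n_states = len(R)
    n_actions = len(R[0])

    def q_value(V, s, a):
        # Right-hand side of the Bellman optimality equation for one (s, a) pair.
        return R[s][a] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman update: take the best action value in every state.
        V_new = [max(q_value(V, s, a) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # Greedy policy with respect to the final value estimate.
    policy = [max(range(n_actions), key=lambda a: q_value(V, s, a))
              for s in range(n_states)]
    return V, policy


# Toy problem (assumed for illustration): action 0 stays put, action 1 switches
# state; only staying in state 1 earns a reward of 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
R = [
    [0.0, 0.0],  # state 0: no reward for either action
    [1.0, 0.0],  # state 1: reward 1 for staying
]
V, policy = value_iteration(P, R, gamma=0.9)
```

In this toy problem, staying in state 1 forever is worth 1 / (1 - 0.9) = 10, and switching from state 0 is worth 0.9 × 10 = 9, so the estimates in `V` should approach 9 and 10 and the greedy policy should switch in state 0 and stay in state 1.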

Minimum credits

Maximum credits

Is this course repeatable?

Grading basis

OPT - Student Option

Course components

Lecture

Requirements

000017

Fulfills the writing intensive requirement?

Typically offered term(s)

Fall Odd Year