Course label : | Theory of Machine Learning |
---|---|
Teaching department : | EEA / Electrotechnics - Electronics - Control Systems |
Teaching manager : | Mister PIERRE-ANTOINE THOUVENIN / Mister PIERRE CHAINAIS |
Education language : | |
Potential ects : | 0 |
Results grid : | |
Code and label (hp) : | MR_DS_S3_TF2 - Theoretical foundations of mac |
Education team
Teachers : Mister PIERRE-ANTOINE THOUVENIN / Mister PIERRE CHAINAIS
External contributors (business, research, secondary education): various temporary teachers
Summary
● The problem of sequential decision making under uncertainty
● Markov decision problems
● The planning problem, and algorithms
● The reinforcement learning problem, and algorithms (incl. deep reinforcement learning)
● The bandit problem, and algorithms
All notions visited during the course are investigated in practical sessions. Course details can be found in: https://debabrota-basu.github.io/course_bandit_rl.html
Educational goals
After successfully taking this course, a student should:
● know what the problem of sequential decision making under uncertainty is
● know the various approaches to solving it, along with the associated hypotheses
● know how to recognize such a problem, and model it accordingly
● know Markov decision problems, and related problems
● know about the main planning algorithms to solve them
● know about reinforcement learning approaches
● know the bandit problem, and the main algorithms
Sustainable development goals
Knowledge control procedures
Continuous Assessment
Comments: Labs, 1.5 credits, grading scale: (min) 0 – 20 (max) - Passing grade = 10/20
Exam, 1.5 credits, grading scale: (min) 0 – 20 (max) - Passing grade = 10/20
Online resources
Bertsekas, Dynamic Programming and Optimal Control, Athena Scientific
Bertsekas, Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific
Puterman, Markov Decision Processes, Wiley
Sutton, Barto, Reinforcement Learning, MIT Press, 2nd edition
Lattimore, Szepesvári, Bandit Algorithms, Cambridge University Press
Pedagogy
24 hours, 12h lectures, 12h labs/tutorial sessions
Sequencing / learning methods
Number of hours - Lectures : | 12 |
---|---|
Number of hours - Tutorial : | 12 |
Number of hours - Practical work : | 0 |
Number of hours - Seminar : | 0 |
Number of hours - Half-group seminar : | 0 |
Number of student hours in TEA (Autonomous learning) : | 0 |
Number of student hours in TNE (Non-supervised activities) : | 0 |
Number of hours in CB (Fixed exams) : | 0 |
Number of student hours in PER (Personal work) : | 0 |
Number of hours - Projects : | 0 |
Prerequisites
The M1 program + Machine learning 3M1 Data science.