
Reinforcement Learning Notes

Contents

Foundations of Deep RL - lecture series by Pieter Abbeel 🔗

Lecture 1 - MDPs, Exact Solution Methods, Max-ent RL
Lecture 2 - Deep Q-Learning
Lecture 3 - Policy Gradients and Advantage Estimation
Lecture 4 - TRPO and PPO
Lecture 5 - DDPG and SAC
Lecture 6 - Model-based RL

OpenAI Spinning Up 🔗

Intro to Policy Optimization and VPG
Trust Region Policy Optimization
Proximal Policy Optimization

Miscellaneous

Notes

Useful links

Generalized Advantage Estimate: Maths and Code 🔗
RL — Trust Region Policy Optimization (TRPO) Explained 🔗
RL — Trust Region Policy Optimization (TRPO) Part 2 🔗

Implementations

Baselines 🔗
Stable Baselines 🔗
Acme 🔗

Reinforcement Learning Notes maintained by dawoz
