A repository that makes Deep Reinforcement Learning code easy to understand
This tutorial presents recent research on policy gradient methods for model-free RL in the following order:
- Advantage Actor Critic (A2C)
- Continuous control with deep reinforcement learning (DDPG)
- Proximal Policy Optimization Algorithms (PPO)
- Trust Region Policy Optimization (TRPO)
- Soft Actor-Critic Algorithm (SAC)
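
As a rough illustration of one building block used by several of these methods: A2C, and commonly PPO and TRPO, weight the policy gradient by an advantage estimate, which measures how much better an action turned out than the critic predicted. The sketch below is plain Python and is not taken from this repository. The function names `discounted_returns` and `advantages`, and the default `gamma`, are assumptions chosen for illustration. It uses the simplest Monte Carlo form of the estimate, whereas real implementations often use bootstrapped or generalized variants.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages(returns: List[float], values: List[float]) -> List[float]:
    """Compute A_t = G_t - V(s_t), given the critic's value estimates."""
    if len(returns) != len(values):
        raise ValueError("returns and values must have the same length")
    return [g - v for g, v in zip(returns, values)]


if __name__ == "__main__":
    # Hypothetical three-step episode with a reward of 1.0 at every step.
    episode_returns = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
    print(episode_returns)
    # Hypothetical critic that predicts a value of 1.0 for every state.
    print(advantages(episode_returns, [1.0, 1.0, 1.0]))
```

A positive advantage pushes the policy toward the action taken, and a negative one pushes it away. In a PyTorch implementation the same arithmetic would typically run on tensors so that the critic's predictions can be trained alongside the policy.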
Here is a training sample for MuJoCo HalfCheetah:
If you want to get the hang of PyTorch, refer to my tutorials or the official tutorials to get started.