
Policy Optimization and PPO #2424

Open

BrianPulfer opened this issue Jan 7, 2023 · 3 comments

Comments

@BrianPulfer
Contributor

Dear all,

While the book currently has a small section on reinforcement learning covering MDPs, value iteration, and the Q-learning algorithm, it does not yet cover an important family of algorithms: policy optimization algorithms.

It'd be great to include an overview of the taxonomy of RL algorithms, such as the one provided by OpenAI's Spinning Up.

For that, I propose that we cover Proximal Policy Optimization (PPO) since:

  • It is very popular in the ML community.
  • It is a state-of-the-art algorithm.
  • It is relatively easy to implement and grasp.

I have already written a Medium post about it. My idea would be to reuse the environment from the Q-learning section to train the PPO model.
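For readers unfamiliar with PPO, its core idea is the clipped surrogate objective from the original paper (Schulman et al., 2017): the probability ratio between the new and old policies is clipped to a small interval so that a single update cannot move the policy too far. A minimal NumPy sketch of that objective (function name and variable names are illustrative, not from the book):

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective (to be maximized).

    ratio:     pi_new(a|s) / pi_old(a|s), one value per sample
    advantage: estimated advantage, one value per sample
    eps:       clip range (0.2 in the original paper)
    """
    unclipped = ratio * advantage
    # Clipping the ratio to [1 - eps, 1 + eps] caps the incentive to
    # push the policy far from the one that collected the data.
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Taking the element-wise minimum gives a pessimistic (lower) bound.
    return np.minimum(unclipped, clipped).mean()

# With a positive advantage, a ratio of 1.5 is capped at 1.2:
ratios = np.array([0.8, 1.0, 1.5])
advs = np.array([1.0, 1.0, 1.0])
print(ppo_clip_objective(ratios, advs))  # → 1.0
```

A full PPO implementation would wrap this objective in an actor-critic training loop with advantage estimation (e.g. GAE), but the clipping trick above is the part that distinguishes PPO from vanilla policy gradient.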

@astonzhang
Member

@rasoolfa FYI

@rasoolfa
Contributor

rasoolfa commented Jan 8, 2023

Hi @BrianPulfer,

Thank you so much for the note and suggestion.
I'd like to note that our goal for the first run of the RL section is to cover the fundamental concepts that are essential for more advanced material, and then move on to advanced topics.
That said, we'll release a couple more RL notebooks in the coming weeks covering deep RL, including both on-policy and off-policy methods, as well as advanced topics.

Rasool

@BrianPulfer
Contributor Author

Dear @rasoolfa,

Thank you for the reply. Please let me know if I can help with anything related to this; I'd love to!

Regards,
Brian

3 participants