Problem Description
A lot of the formatting changes are suggested by @Howuhh.
1. Refactor on next_done
The current code to handle done includes
next_done = torch.Tensor(done).to(device)
which is fine, but when I tried to adapt isaacgym it became an issue. Specifically, I thought the to(device) code was no longer needed, so I just did
next_obs, reward, done, info = envs.step(action)
but this is wrong because I should also have done
next_done = done
The current next_done = torch.Tensor(done).to(device) just does not make a lot of sense. We should refactor it.
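One way the next_done handling could be made explicit is sketched below. This is only an illustration, not the refactor proposed in this issue: the helper name as_device_tensor and the stand-in done values are hypothetical, and it assumes a CPU environment returns a list or array while a GPU simulator such as isaacgym returns a tensor.

```python
import torch


def as_device_tensor(x, device):
    """Hypothetical helper (not part of cleanrl): return x as a float tensor on device."""
    # torch.as_tensor converts lists and NumPy arrays, and avoids a copy when x
    # is already a tensor with the requested dtype and device.
    return torch.as_tensor(x, dtype=torch.float32, device=device)


device = torch.device("cpu")

# Stand-ins for what envs.step(action) might return as `done`.
done_from_cpu_env = [False, True, False]           # e.g. array-like from a CPU env
done_from_gpu_sim = torch.tensor([0.0, 1.0, 0.0])  # e.g. tensor from a GPU simulator

# The same line works in both cases, so next_done is always assigned.
next_done_a = as_device_tensor(done_from_cpu_env, device)
next_done_b = as_device_tensor(done_from_gpu_sim, device)
```

With a single conversion point, dropping the to(device) call can no longer silently drop the next_done assignment as well.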
2. make_env refactor
to
3. flatten batch
to
4.
to
5. Change cleanrl/cleanrl/ppo_atari.py (line 209 in 9a74142) to
global_step += args.num_envs
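As a small illustration of why the counter advances by args.num_envs, here is a sketch of a rollout loop over a vectorized environment. The values in args are made up for the example, and the environment step itself is omitted.

```python
from types import SimpleNamespace

# Illustrative values only; the real script reads these from the command line.
args = SimpleNamespace(num_envs=4, num_steps=8)

global_step = 0
for step in range(args.num_steps):
    # One call to envs.step() advances every sub-environment by one step,
    # so the total number of environment steps grows by num_envs.
    global_step += args.num_envs
```

After one rollout the counter equals args.num_envs * args.num_steps.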
6. Move cleanrl/cleanrl/ppo.py (line 183 in 9a74142) to the argparse.
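The referenced line is not shown here, so the following is only a sketch of the general pattern, under the assumption that the line computes a value derived from the arguments (a batch size is used as a stand-in). The flag names and defaults are illustrative.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--num-envs", type=int, default=4)
    parser.add_argument("--num-steps", type=int, default=128)
    args = parser.parse_args(argv)
    # Assumed example of a derived value: compute it once here so the
    # training loop can read args.batch_size directly.
    args.batch_size = int(args.num_envs * args.num_steps)
    return args


args = parse_args([])
```

Keeping derived values on args means they are logged and saved together with the other hyperparameters.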