Commit

Update cleanrl/ppo_atari_accelerate.py
Co-authored-by: Costa Huang <[email protected]>
edbeeching and vwxyzjn authored Feb 13, 2024
1 parent ba8fbd8 commit 03d1a1c
Showing 1 changed file with 4 additions and 3 deletions.
7 changes: 4 additions & 3 deletions cleanrl/ppo_atari_accelerate.py
@@ -278,12 +278,13 @@ def get_action_and_value(self, x, action=None):
         b_values = values.reshape(-1)
 
         # Optimizing the policy and value network
-        b_inds = np.arange(args.batch_size)
+        b_inds = np.arange(args.local_batch_size)
         clipfracs = []
         for epoch in range(args.update_epochs):
             np.random.shuffle(b_inds)
-            for start in range(0, args.batch_size, args.minibatch_size):
-                end = start + args.minibatch_size
+            for start in range(0, args.local_batch_size, args.local_minibatch_size):
+                end = start + args.local_minibatch_size
+                mb_inds = b_inds[start:end]
                 mb_inds = b_inds[start:end]
 
                 _, newlogprob, entropy, newvalue = agent.get_action_and_value(b_obs[mb_inds], b_actions.long()[mb_inds])
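
The hunk above replaces `args.batch_size` and `args.minibatch_size` with `args.local_batch_size` and `args.local_minibatch_size` when building minibatch indices. A minimal sketch of why that range matters, under stated assumptions: each process flattens only its own rollout, `local_batch_size = local_num_envs * num_steps`, `local_minibatch_size = local_batch_size // num_minibatches`, and `batch_size = local_batch_size * world_size`. These definitions, the concrete numbers, and the helper `minibatch_indices` are illustrative and not taken from this commit; NumPy arrays stand in for the tensors used in the script.

```python
import numpy as np

# Hypothetical sizes for illustration only; not taken from the commit.
world_size = 2        # number of processes (assumed)
local_num_envs = 4    # environments per process (assumed)
num_steps = 8         # rollout length per environment (assumed)
num_minibatches = 4   # minibatches per epoch (assumed)

# Assumed definitions of the per-process and global sizes.
local_batch_size = local_num_envs * num_steps
local_minibatch_size = local_batch_size // num_minibatches
batch_size = local_batch_size * world_size

# Each process flattens only its own rollout, so the buffer has
# local_batch_size rows, not batch_size rows.
b_obs = np.zeros((local_batch_size, 3), dtype=np.float32)


def minibatch_indices(total, step, rng):
    """Yield shuffled index slices covering range(total) in chunks of `step`."""
    inds = np.arange(total)
    rng.shuffle(inds)
    for start in range(0, total, step):
        yield inds[start:start + step]


rng = np.random.default_rng(0)

# With the local sizes, every local row is visited exactly once per epoch.
seen = np.concatenate(
    list(minibatch_indices(local_batch_size, local_minibatch_size, rng))
)
covers_all_rows = sorted(seen.tolist()) == list(range(local_batch_size))

# With the global size, some indices point past the end of the local buffer.
global_inds = np.arange(batch_size)
out_of_range = int((global_inds >= len(b_obs)).sum())

try:
    b_obs[global_inds]
    raised = False
except IndexError:
    raised = True
```

With a single process the two sizes coincide under these assumptions, so the mismatch would only show up when more than one process is used.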