Thanks for sharing your code. However, I find that the implementation differs from the MOReL paper.
This implementation truncates uncertain rollouts instead of assigning them a negative reward.
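To make the distinction concrete, here is a minimal sketch of the two behaviours as I understand them. The function names, the per-step `uncertain` flags, and the penalty value `kappa` are my own assumptions for illustration; they are not taken from this repository, and the penalty variant only reflects my reading of the paper.

```python
from typing import List


def truncate_rollout(rewards: List[float], uncertain: List[bool]) -> List[float]:
    """Stop at the first uncertain step and keep only the rewards before it.

    No penalty is recorded, so the return just gets shorter.
    """
    kept = []
    for reward, is_uncertain in zip(rewards, uncertain):
        if is_uncertain:
            break
        kept.append(reward)
    return kept


def penalize_rollout(
    rewards: List[float], uncertain: List[bool], kappa: float = 100.0
) -> List[float]:
    """Stop at the first uncertain step, but record a reward of -kappa for it.

    kappa is a hypothetical penalty magnitude chosen only for this example.
    """
    kept = []
    for reward, is_uncertain in zip(rewards, uncertain):
        if is_uncertain:
            kept.append(-kappa)
            break
        kept.append(reward)
    return kept


rewards = [1.0, 2.0, 3.0]
uncertain = [False, True, False]

truncated = truncate_rollout(rewards, uncertain)
penalized = penalize_rollout(rewards, uncertain)
```

With these toy inputs, the truncated rollout carries no signal that the policy wandered into an uncertain region, while the penalized one does, so the two variants could lead to different learned policies.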
If I have not misunderstood your code, could you explain why there is this difference? And could you release code that exactly follows the algorithm described in your paper?
I look forward to your reply. Thanks a lot.