This is a TensorFlow implementation of our paper *Learning Long-Term Reward Redistribution via Randomized Return Decomposition*, accepted at ICLR 2022.
- Python 3.6.13
- gym == 0.18.3
- TensorFlow == 1.12.0
- BeautifulTable == 0.8.0
- opencv-python == 4.5.3.56
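The dependencies above can be installed with pip; the pinned command below simply mirrors the version list and is our suggestion rather than an official lock file. Note that the MuJoCo tasks (e.g., Ant-v2) additionally require a working MuJoCo installation with mujoco-py.

pip install gym==0.18.3 tensorflow==1.12.0 beautifultable==0.8.0 opencv-python==4.5.3.56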
Run the following commands to reproduce our main results shown in Section 4.1.
python train.py --tag='RRD Ant-v2' --alg=rrd --basis_alg=sac --env=Ant-v2
python train.py --tag='RRD-L(RD) Ant-v2' --alg=rrd --basis_alg=sac --rrd_bias_correction=True --env=Ant-v2
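For reference, the core of RRD is a least-squares return decomposition evaluated on random subsequences of an episode. The snippet below is a minimal NumPy sketch of that objective, not the code invoked by train.py; the names `rrd_loss` and `reward_model` are hypothetical placeholders for the corresponding components in this repository.

```python
import numpy as np

def rrd_loss(reward_model, states, actions, episodic_return, k, rng):
    """One-sample estimate of the randomized return decomposition loss.

    A subset of k transitions is drawn from the episode, and the mean
    predicted reward is scaled by the episode length T, giving an
    unbiased estimate of the predicted episodic return.
    """
    T = len(states)
    idx = rng.choice(T, size=k, replace=False)  # random subsequence of the episode
    predicted = np.array([reward_model(states[i], actions[i]) for i in idx])
    predicted_return = T * predicted.mean()     # scale subsequence mean to full length
    # Regress the predicted episodic return onto the observed one.
    return (episodic_return - predicted_return) ** 2
```

The `--rrd_bias_correction=True` flag selects the RRD-L(RD) variant, which, as described in the paper, additionally corrects for the variance introduced by subsequence sampling so that the objective is an unbiased estimator of the underlying return-decomposition loss.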
The following commands switch the back-end algorithm of RRD.
python train.py --tag='RRD-TD3 Ant-v2' --alg=rrd --basis_alg=td3 --env=Ant-v2
python train.py --tag='RRD-DDPG Ant-v2' --alg=rrd --basis_alg=ddpg --env=Ant-v2
We include an unofficial implementation of IRCR for ease of baseline comparison.
Please refer to tgangwani/GuidanceRewards for the official implementation of IRCR.
python train.py --tag='IRCR-SAC Ant-v2' --alg=ircr --basis_alg=sac --env=Ant-v2
python train.py --tag='IRCR-TD3 Ant-v2' --alg=ircr --basis_alg=td3 --env=Ant-v2
python train.py --tag='IRCR-DDPG Ant-v2' --alg=ircr --basis_alg=ddpg --env=Ant-v2
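IRCR sidesteps reward learning entirely: every transition in an episode is credited with the episode's (normalized) return. The sketch below reflects our reading of the method rather than the exact code in this repository; `ircr_rewards` and the running `min_return`/`max_return` statistics are hypothetical names.

```python
def ircr_rewards(rewards, min_return, max_return):
    """Uniform reward redistribution in the spirit of IRCR.

    Every transition in the episode receives the same proxy reward:
    the episodic return, normalized by running min/max return
    statistics collected over past episodes.
    """
    episodic_return = sum(rewards)
    proxy = (episodic_return - min_return) / (max_return - min_return + 1e-8)
    return [proxy] * len(rewards)
```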
The following commands run the experiments on Atari games with episodic rewards, i.e., the agent observes the accumulated reward only at the end of each episode (a sketch of this protocol follows the note below).
python train.py --tag='RRD-DQN Assault' --alg=rrd --basis_alg=dqn --env=Assault
python train.py --tag='IRCR-DQN Assault' --alg=ircr --basis_alg=dqn --env=Assault
Note: the implementation of RRD on top of DQN on the Atari benchmark has not been carefully tuned. We release this interface only to facilitate future studies.
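To make the episodic-reward setting concrete, a gym wrapper along the following lines reproduces it. This is a sketch under the gym 0.18 API, not necessarily the preprocessing used in this repository, and `EpisodicRewardWrapper` is a hypothetical name.

```python
import gym

class EpisodicRewardWrapper(gym.Wrapper):
    """Delay the reward signal until the end of the episode."""

    def reset(self, **kwargs):
        self.return_acc = 0.0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.return_acc += reward
        # Intermediate steps yield zero reward; the final step yields
        # the accumulated episodic return.
        return obs, self.return_acc if done else 0.0, done, info
```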