Memory-based implementation of Soft Actor-Critic (SAC) for humanoid locomotion tasks.
Our experiments use the EDU version of CoppeliaSim as the simulation platform. You can install it at no cost by following these instructions.
We also use PyRep as the Python API for CoppeliaSim, which can be installed from their GitHub.
The code was developed on Linux, but should also work on Windows. We recommend using Python virtual environments to isolate this project's dependencies from those of the host system.
This project is implemented as a Python package, and therefore should be installed as a dependency:
python -m venv mem_sac
source mem_sac/bin/activate
git clone git@github.com:larocs/memory-DRL.git
pip install -e memory-DRL
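As a quick sanity check before installing, the following minimal sketch reports whether the current interpreter is running inside a virtual environment. The helper name in_virtualenv is our own and is not part of this project; it relies only on the standard sys module.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

If this prints False after activating mem_sac, the packages would be installed into the host system Python instead of the isolated environment.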
We use a JSON file to describe each experiment (model architectures, environment parameters, etc.) in order to facilitate experiment versioning through git. You can find some examples in the examples folder.
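As an illustration only, the sketch below shows how a JSON experiment description could be written and read back with Python's standard json module. The file name experiment.json, the helper load_experiment, and every key in the example (env, model, hidden_sizes, and so on) are hypothetical placeholders, not this project's actual schema; refer to the examples folder for the real format.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical experiment description; the keys are illustrative only
# and do not reflect this project's real schema.
example_spec = {
    "env": {"name": "cartpole", "max_steps": 200},
    "model": {"type": "fcn", "hidden_sizes": [256, 256]},
}


def load_experiment(exp_path):
    """Read a JSON experiment description from a folder (file name assumed)."""
    with open(Path(exp_path) / "experiment.json", encoding="utf-8") as f:
        return json.load(f)


def roundtrip_example():
    """Write the example spec to a temporary folder and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        spec_file = Path(tmp) / "experiment.json"
        # sort_keys and indent keep the file stable and readable in git diffs.
        text = json.dumps(example_spec, indent=2, sort_keys=True)
        spec_file.write_text(text, encoding="utf-8")
        return load_experiment(tmp)


if __name__ == "__main__":
    print(roundtrip_example()["model"]["hidden_sizes"])
```

Writing the file with sorted keys and fixed indentation keeps diffs small, which suits the goal of versioning experiments through git.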
Training a model can be done through an auxiliary script in this project (preferably, copy this script to another folder):
python mysac/runner.py --exp_path examples/cartpole_fcn
And to execute the trained policy:
python mysac/run_policy.py --exp_path examples/cartpole_fcn
Training statistics can be found in the experiment folder, along with the trained policy:
ls examples/cartpole_fcn/stats/
We also include auxiliary scripts for quickly plotting these results:
# Training evaluation results
python mysac/plot.py examples/cartpole_fcn/stats/eval_stats.csv
# SAC statistics during training
python mysac/plot_sac_stats.py --exp_path examples/cartpole_fcn/
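To inspect a statistics file without plotting, a small sketch like the one below can summarize a CSV using only the standard library. It assumes nothing about the actual columns of eval_stats.csv: the function summarize_csv is our own helper, and the step and reward columns in the sample data are made up for demonstration.

```python
import csv
import io
from statistics import mean


def summarize_csv(stream):
    """Return the mean of every column whose values all parse as floats."""
    rows = list(csv.DictReader(stream))
    summary = {}
    if not rows:
        return summary
    for column in rows[0]:
        try:
            values = [float(row[column]) for row in rows]
        except (TypeError, ValueError):
            continue  # skip non-numeric or incomplete columns
        summary[column] = mean(values)
    return summary


# Made-up data standing in for a stats file; column names are hypothetical.
sample = io.StringIO("step,reward\n0,1.0\n1,3.0\n")
print(summarize_csv(sample))
```

To apply it to a real file, pass an open file handle instead of the in-memory sample, for example open("examples/cartpole_fcn/stats/eval_stats.csv", newline="").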
Since the code is implemented as a framework, all models and environments are modular and can be freely extended to support new experiments. Most of the code is documented through Python docstrings.
Esther Luna Colombini & Samuel Chenatti