This benchmark relies on Mujoco and CommonRoad. These two environments are publicly available.
Make sure you have downloaded & installed conda before proceeding.
mkdir ./save_model
mkdir ./evaluate_model
conda env create -n cn-py37 python=3.7 -f python_environment.yml
conda activate cn-py37
cd ./data
wget https://cs.uwaterloo.ca/~ppoupart/datasets/expert_data.zip
unzip expert_data.zip
rm expert_data.zip
cd ../
To run the virtual environment, you need to set up MuJoCo.
- Download the MuJoCo version 2.1 binaries for Linux or OSX.
- Extract the downloaded mujoco210 directory into ~/.mujoco/mujoco210.
- Install and use mujoco-py.
pip install -U 'mujoco-py<2.2,>=2.1'
pip install -e ./mujuco_environment
We highly recommend you to ensure the MuJoCo is indeed working by running testing examples in mujoco-py. In most case, you need to run:
import mujoco_py
import os
mj_path = mujoco_py.utils.discover_mujoco()
xml_path = os.path.join(mj_path, 'model', 'humanoid.xml')
model = mujoco_py.load_model_from_path(xml_path)
sim = mujoco_py.MjSim(model)
- The Blocked Half-Cheetah Environment
cd ./interface/
# run PPO
python train_ppo.py ../config/mujuco_BlockedHalfCheetah/train_ppo_HCWithPos-v0.yaml -n 5 -s 123
# run PPO-Lag
python train_ppo.py ../config/mujuco_BlockedHalfCheetah/train_ppo_lag_HCWithPos-v0.yaml -n 5 -s 123
# run GACL
python train_gail.py ../config/mujuco_BlockedHalfCheetah/train_GAIL_HCWithPos-v0.yaml -n 5 -s 123
# run BC2L
python train_cirl.py ../config/mujuco_BlockedHalfCheetah/train_Binary_HCWithPos-v0.yaml -n 5 -s 123
# run MECL
python train_cirl.py ../config/mujuco_BlockedHalfCheetah/train_ICRL_HCWithPos-v0.yaml -n 5 -s 123
# run VCIRL
python train_cirl.py ../config/mujuco_BlockedHalfCheetah/train_VCIRL_HCWithPos-v0.yaml -n 5 -s 123
- The Blocked Ant Environment
cd ./interface/
# run PPO
python train_ppo.py ../config/mujoco_BlockedAntWall/train_ppo_AntWall-v0.yaml -n 5 -s 123
# run PPO-Lag
python train_ppo.py ../config/mujoco_BlockedAntWall/train_ppo_lag_AntWall-v0.yaml -n 5 -s 123
# run GACL
python train_gail.py ../config/mujoco_BlockedAntWall/train_GAIL_AntWall-v0.yaml -n 5 -s 123
# run BC2L
python train_cirl.py ../config/mujoco_BlockedAntWall/train_Binary_AntWall-v0.yaml -n 5 -s 123
# run MECL
python train_cirl.py ../config/mujoco_BlockedAntWall/train_ICRL_AntWall-v0.yaml -n 5 -s 123
# run VCIRL
python train_cirl.py ../config/mujoco_BlockedAntWall/train_VCIRL_AntWall-v0.yaml -n 5 -s 123
- The Biased Pendulum Environment
cd ./interface/
# run PPO
python train_ppo.py ../config/mujoco_BiasedPendulum/train_ppo_InvertedPendulumWall-v0.yaml -n 5 -s 123
# run PPO-Lag
python train_ppo.py ../config/mujoco_BiasedPendulum/train_ppo_lag_InvertedPendulumWall-v0.yaml -n 5 -s 123
# run GACL
python train_gail.py ../config/mujoco_BiasedPendulum/train_GAIL_InvertedPendulumWall-v0.yaml -n 5 -s 123
# run BC2L
python train_cirl.py ../config/mujoco_BiasedPendulum/train_Binary_InvertedPendulumWall-v0.yaml -n 5 -s 123
# run MECL
python train_cirl.py ../config/mujoco_BiasedPendulum/train_ICRL_InvertedPendulumWall-v0.yaml -n 5 -s 123
# run VCIRL
python train_cirl.py ../config/mujoco_BiasedPendulum/train_VCIRL_InvertedPendulumWall-v0.yaml -n 5 -s 123
- The Blocked Walker Environment
cd ./interface/
# run PPO
python train_ppo.py ../config/mujoco_BlockedWalker/train_ppo_WalkerWithPos-v0.yaml -n 5 -s 123
# run PPO-Lag
python train_ppo.py ../config/mujoco_BlockedWalker/train_ppo_lag_WalkerWithPos-v0.yaml -n 5 -s 123
# run GACL
python train_gail.py ../config/mujoco_BlockedWalker/train_GAIL_WalkerWithPos-v0.yaml -n 5 -s 123
# run BC2L
python train_cirl.py ../config/mujoco_BlockedWalker/train_Binary_WalkerWithPos-v0.yaml -n 5 -s 123
# run MECL
python train_cirl.py ../config/mujoco_BlockedWalker/train_ICRL_WalkerWithPos-v0.yaml -n 5 -s 123
# run VCIRL
python train_cirl.py ../config/mujoco_BlockedWalker/train_VCIRL_WalkerWithPos-v0.yaml -n 5 -s 123
- The Blocked Walker Environment
cd ./interface/
# run PPO
python train_ppo.py ../config/mujoco_BlockedSwimmer/train_ppo_SwmWithPos-v0.yaml -n 5 -s 123
# run PPO-Lag
python train_ppo.py ../config/mujoco_BlockedSwimmer/train_ppo_lag_SwmWithPos-v0.yaml -n 5 -s 123
# run GACL
python train_gail.py ../config/mujoco_BlockedSwimmer/train_GAIL_SwmWithPos-v0.yaml -n 5 -s 123
# run BC2L
python train_cirl.py ../config/mujoco_BlockedSwimmer/train_Binary_SwmWithPos-v0.yaml -n 5 -s 123
# run MECL
python train_cirl.py ../config/mujoco_BlockedSwimmer/train_ICRL_SwmWithPos-v0.yaml -n 5 -s 123
# run VCIRL
python train_cirl.py ../config/mujoco_BlockedSwimmer/train_VCIRL_SwmWithPos-v0.yaml -n 5 -s 123
sudo apt-get update
sudo apt-get install build-essential make cmake
# option 1: Install with sudo rights (cn-py37 is the name of conda environment).
cd ./commonroad_environment
bash ./scripts/install.sh -e cn-py37
# Option 2: Install without sudo rights
bash ./commonroad_environment/scripts/install.sh -e cn-py37 --no-root
[For Running with the Full HighD Data Only] Get the full dataset and Preprocess.
- Our repository uses some data examples from commonroad-rl tutorial. To build the full environments, you need to apply for the HighD dataset from here. The dataset is free for not non-commercial use.
- After you receive the data, do some preprocess according to Tutorial 01 - Data Preprocessing. We show a brief version as follow:
Once you have downloaded the data, extract all the .csv (e.g., 03_recordingMeta.csv
, 03_tracks.csv
, 03_tracksMeta.csv
) files to the folderCIRL-benchmarks-public/data/highD/raw/data/
, and then
cd ./commonroad_environment/install/
# install python packages
cd ./dataset-converters
pip install -r requirements.txt
# transfer raw data to .xml files
python -m src.main highD ../../../data/highD/raw/ ../../../data/highD/xmls/ --num_time_steps_scenario 1000
# compute the .pickle files
cd $YourProjectDir/CIRL-benchmarks-public/commonroad_environment/commonroad_rl
python -m commonroad_rl.tools.pickle_scenario.xml_to_pickle -i ../../data/highD/xmls -o ../../data/highD/pickles
# split the dataset
python -m commonroad_rl.utils_run.split_dataset -i ../../data/highD/pickles/problem -otrain ../../data/highD/pickles/problem_train -otest ../../data/highD/pickles/problem_test -tr_r 0.7
# scatter dataset for multiple processes
python -m commonroad_rl.tools.pickle_scenario.copy_files -i ../../data/highD/pickles/problem_train -o ../../data/highD/pickles/problem_train_split -f *.pickle -n 5
- The HighD Velocity Constraint
cd ./interface/
# run PPO
python train_ppo.py ../config/highD_velocity_constraint/train_ppo_highD_velocity_constraint.yaml -n 5 -s 123
# run PPO-Lag
python train_ppo.py ../config/highD_velocity_constraint/train_ppo_lag_highD_velocity_constraint.yaml -n 5 -s 123
# run GACL
python train_gail.py ../config/highD_velocity_constraint/train_GAIL_highd_velocity_constraint.yaml -n 5 -s 123
# run BC2L
python train_cirl.py ../config/highD_velocity_constraint/train_Binary_highD_velocity_constraint.yaml -n 5 -s 123
# run MECL
python train_cirl.py ../config/highD_velocity_constraint/train_ICRL_highD_velocity_constraint.yaml -n 5 -s 123
# run VCIRL
python train_cirl.py ../config/highD_velocity_constraint/train_VCIRL_highD_velocity_constraint.yaml -n 5 -s 123
- The HighD Velocity Constraint simplified
cd ./interface/
# run GACL
python train_gail.py ../config/highD_velocity_constraint/train_GAIL_highd_velocity_constraint_simplified.yaml -n 5 -s 123
# run BC2L
python train_cirl.py ../config/highD_velocity_constraint/train_Binary_highD_velocity_constraint_simplified.yaml -n 5 -s 123
# run MECL
python train_cirl.py ../config/highD_velocity_constraint/train_ICRL_highD_velocity_constraint_simplified.yaml -n 5 -s 123
# run VCIRL
python train_cirl.py ../config/highD_velocity_constraint/train_VCIRL_highD_velocity_constraint_simplified.yaml -n 5 -s 123
- The HighD Distance Constraint
cd ./interface/
# run PPO
python train_ppo.py ../config/highD_distance_constraint/train_ppo_highD_distance_constraint.yaml -n 5 -s 123
# run PPO-Lag
python train_ppo.py ../config/highD_distance_constraint/train_ppo_lag_highD_distance_constraint.yaml -n 5 -s 123
# run GACL
python train_gail.py ../config/highD_distance_constraint/train_GAIL_highD_distance_constraint.yaml -n 5 -s 123
# run BC2L
python train_cirl.py ../config/highD_distance_constraint/train_Binary_highD_distance_constraint.yaml -n 5 -s 123
# run MECL
python train_cirl.py ../config/highD_distance_constraint/train_ICRL_highD_distance_constraint.yaml -n 5 -s 123
# run VCIRL
python train_cirl.py ../config/highD_distance_constraint/train_VCIRL_highD_distance_constraint.yaml -n 5 -s 123
- The HighD Distance Constraint simplified
cd ./interface/
# run GACL
python train_gail.py ../config/highD_distance_constraint/train_GAIL_highD_distance_constraint_simplified.yaml -n 5 -s 123
# run BC2L
python train_cirl.py ../config/highD_distance_constraint/train_Binary_highD_distance_constraint_simplified.yaml -n 5 -s 123
# run MECL
python train_cirl.py ../config/highD_distance_constraint/train_ICRL_highD_distance_constraint_simplified.yaml -n 5 -s 123
# run VCIRL
python train_cirl.py ../config/highD_distance_constraint/train_VCIRL_highD_distance_constraint_simplified.yaml -n 5 -s 123
```