Reproduce (performance of) the following reinforcement learning methods:
-
Nature-DQN in: Human-level Control Through Deep Reinforcement Learning
-
Double-DQN in: Deep Reinforcement Learning with Double Q-learning
-
Dueling-DQN in: Dueling Network Architectures for Deep Reinforcement Learning
-
A3C in Asynchronous Methods for Deep Reinforcement Learning. (I used a modified version where each batch contains transitions from different simulators, which I called "Batch-A3C".)
Claimed performance in the paper can be reproduced, on several games I've tested with.
On one TitanX, Double-DQN took 1 day of training to reach a score of 400 on breakout. Batch-A3C implementation only took <2 hours.
Double-DQN with nature paper setting runs at 60 batches (3840 trained frames, 240 seen frames, 960 game frames) per second on (Maxwell) TitanX.
Install ALE and gym.
Download an atari rom, e.g.:
wget https://github.com/openai/atari-py/raw/master/atari_py/atari_roms/breakout.bin
Start Training:
./DQN.py --rom breakout.bin
# use `--algo` to select other DQN algorithms. See `-h` for more options.
Watch the agent play:
# Download pretrained models or use one you trained:
wget http://models.tensorpack.com/DeepQNetwork/DoubleDQN-Breakout.npz
./DQN.py --rom breakout.bin --task play --load DoubleDQN-Breakout.npz
A3C code and models for Atari games in OpenAI Gym are released in examples/A3C-Gym