This is an attempt to apply RL algorithms to the Rock, Paper, Scissors (RPS) Kaggle competition.
```
.
├── dojo
│   ├── black_belt
│   ├── blue_belt
│   ├── white_belt
│   └── my_little_dojo
├── runs
├── eval.py
├── ppo_model.py
├── ppo_train.py
└── test.py
```
We implemented our algorithms and evaluated them against a number of baselines from [1] (the white, blue, and black belt baselines).
Our agents are located in the `dojo/my_little_dojo` folder:
- Simple multi-armed bandits: `jotaro.py` (a minimal sketch of the idea follows this list)
- Bandit of bandits (meta-bandits/banditception): `giorno.py`
- Tabular Q-learning: `saitama.py`
- PPO-LSTM (considering a POMDP setting): `levi.py`
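For intuition, here is a minimal sketch of the multi-armed-bandit idea in the `kaggle-environments` agent format: an epsilon-greedy bandit over the three signs. This is an illustration of the technique, not the actual `jotaro.py`.

```python
import random

# One arm per sign (0 = rock, 1 = paper, 2 = scissors).
counts = [0, 0, 0]
rewards = [0.0, 0.0, 0.0]
last_action = None

def agent(observation, configuration):
    global last_action
    # Credit the previous pull once the opponent's move is revealed:
    # +1 for a win, 0 for a draw, -1 for a loss.
    if observation.step > 0 and last_action is not None:
        opp = observation.lastOpponentAction
        outcome = (last_action - opp) % configuration.signs
        rewards[last_action] += (0.0, 1.0, -1.0)[outcome]
        counts[last_action] += 1

    # Epsilon-greedy selection over the arms' empirical mean rewards.
    if observation.step == 0 or random.random() < 0.1:
        last_action = random.randrange(configuration.signs)
    else:
        means = [rewards[a] / counts[a] if counts[a] else 0.0
                 for a in range(configuration.signs)]
        last_action = max(range(configuration.signs), key=means.__getitem__)
    return last_action
```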
Note: Unlike the other agents, the PPO-LSTM agent does not learn a strategy online from scratch, but is pre-trained offline against a number of white belt baselines. The weights of pre-trained agents are located in the `runs` folder.
Our implementation uses PyTorch; in addition to common Python data libraries, you may need to install the `kaggle-environments` library.
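For example, in a standard Python environment:

```
pip install torch tensorboard kaggle-environments
```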
To evaluate an agent against the baselines, run `eval.py`, specifying the agent name and the baseline against which you want to evaluate it.
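The exact command-line arguments are defined in `eval.py` itself; under the hood, an evaluation amounts to something like the `evaluate` helper from `kaggle-environments`. The sketch below uses illustrative file paths, not the script's actual interface.

```python
from kaggle_environments import evaluate

# Run one 1000-step RPS episode between two agent files and print the
# cumulative reward of each side. Paths are illustrative.
scores = evaluate(
    "rps",
    ["dojo/my_little_dojo/levi.py", "dojo/white_belt/some_baseline.py"],
    configuration={"episodeSteps": 1000},
)
print(scores)  # [[reward_agent_1, reward_agent_2]]
```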
If you want to run an algorithm in debug mode (not possible with the default Kaggle environment), you can use the `test.py` file, which reproduces the RPS game. This can be useful if `eval.py` fails silently (the agent returning None), but please make sure you have all required libraries installed (`torch`, `tensorboard`, etc.).
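If you prefer to step through an agent under a debugger, the game loop itself is short enough to reimplement. Below is a minimal sketch in the spirit of `test.py` (not the file itself), with lightweight stand-ins for Kaggle's observation/configuration objects and an assertion that surfaces the agent-returning-None failure mode:

```python
import random

class Struct(dict):
    """Dict with attribute access, mimicking Kaggle's observation objects."""
    __getattr__ = dict.__getitem__

def play(agent, steps=100):
    config = Struct(signs=3, episodeSteps=steps)
    opp_last = None
    score = 0
    for step in range(steps):
        obs = Struct(step=step, lastOpponentAction=opp_last)
        action = agent(obs, config)
        # Fail loudly instead of silently when the agent misbehaves.
        assert action in (0, 1, 2), f"agent returned {action!r}"
        opp_last = random.randrange(3)  # uniformly random opponent
        score += (0, 1, -1)[(action - opp_last) % 3]
    return score
```

Import an agent function from one of the files in `dojo/my_little_dojo` and call `play(agent)` with a breakpoint set inside it.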
To train the PPO-LSTM agent, run `ppo_train.py`, specifying one or multiple opponents.
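The opponent-passing convention is whatever `ppo_train.py` defines; a hypothetical invocation might look like:

```
python ppo_train.py dojo/white_belt/some_baseline.py
```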
[1] RPS Dojo Kaggle Notebook (https://www.kaggle.com/chankhavu/rps-dojo)