tldr;
A tabular, online Q-learning algorithm that is trained on each program in the test set.
Authors: JD
Results:
Publication:
CompilerGym version: 0.1.7
Open source? Yes, MIT licensed. Source Code.
Did you modify the CompilerGym source code? No.
What parameters does the approach have?
- Episode length during Q-table creation, H.
- Learning rate, λ.
- Discount factor, γ.
- Actions considered by the algorithm, a.
- Features used from the Autophase feature set, f (a state-construction sketch follows the parameter ranges below).
- Number of episodes used during Q-table learning, N.
What range of values were considered for the above parameters?
- H=5, λ=0.1, γ=1.0, 15 selected actions, 3 selected features, N=2000 (short).
- H=10, λ=0.1, γ=1.0, 15 selected actions, 3 selected features, N=5000 (long).
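For concreteness, the discrete state used to index the Q-table can be built from a small subset of the Autophase features. The sketch below assumes a CompilerGym environment exposing the `Autophase` observation; the three feature indices are placeholders, since the specific selected features are not listed here.

```python
def make_state(env, feature_indices=(0, 1, 2)):
    """Build a hashable Q-table state from selected Autophase features.

    The indices (0, 1, 2) are placeholders: the approach uses 3 selected
    features, but they are not specified here.
    """
    autophase = env.observation["Autophase"]  # Autophase feature vector
    return tuple(int(autophase[i]) for i in feature_indices)
```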
Is the policy deterministic? The policy itself is deterministic once trained. However, the training process is non-deterministic, so behavior may differ between training runs.
Tabular Q-learning is a standard reinforcement learning technique that estimates the expected accumulated reward of every state-action pair and stores the estimates in a table. Through interaction with the environment, the algorithm improves these estimates using the step-wise reward and the existing entries of the Q-table.
The implementation is online: at every step taken in the environment, the reward is immediately used to update the current Q-table.
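The per-step update follows the standard tabular Q-learning rule, Q(s, a) ← Q(s, a) + λ · (r + γ · max_a' Q(s', a') - Q(s, a)). The following is a minimal sketch of that online loop, assuming a Gym-style `step()` API and ε-greedy exploration over the restricted action set; the exploration strategy and the `make_state`/`actions` arguments are illustrative assumptions, not the exact implementation.

```python
import random
from collections import defaultdict

# Hyperparameters matching the "short" configuration described above.
LEARNING_RATE = 0.1   # λ
DISCOUNT = 1.0        # γ
EPISODE_LENGTH = 5    # H
NUM_EPISODES = 2000   # N

# Q-table: maps (state, action) pairs to expected accumulated reward.
q_table = defaultdict(float)


def q_update(state, action, reward, next_state, actions):
    """Online tabular Q-learning update for a single environment step."""
    best_next = max(q_table[(next_state, a)] for a in actions)
    target = reward + DISCOUNT * best_next
    q_table[(state, action)] += LEARNING_RATE * (target - q_table[(state, action)])


def train(env, actions, make_state, epsilon=0.2):
    """Train the Q-table on a single program, updating after every step."""
    for _ in range(NUM_EPISODES):
        env.reset()
        state = make_state(env)
        for _ in range(EPISODE_LENGTH):
            # ε-greedy selection over the restricted action set (assumption).
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: q_table[(state, a)])
            _, reward, done, _ = env.step(action)
            next_state = make_state(env)
            q_update(state, action, reward, next_state, actions)
            state = next_state
            if done:
                break
```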
| Hardware Specification | |
|---|---|
| OS | Ubuntu 20.04 |
| CPU | Intel Xeon Gold 6230 CPU @ 2.10GHz (80× core) |
| Memory | 754.5 GiB |
# short
$ python tabular_q_eval.py --episodes=2000 --episode_length=5 --learning_rate=0.1 --discount=1 --log_every=0
# long
$ python tabular_q_eval.py --episodes=5000 --episode_length=10 --learning_rate=0.1 --discount=1 --log_every=0