2021-07-01-bjorck21a.md

---
title: "Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision"
abstract: Low-precision training has become a popular approach to reduce compute requirements, memory footprint, and energy consumption in supervised learning. In contrast, this promising approach has not yet enjoyed similarly widespread adoption within the reinforcement learning (RL) community, partly because RL agents can be notoriously hard to train even in full precision. In this paper we consider continuous control with the state-of-the-art SAC agent and demonstrate that a naïve adaptation of low-precision methods from supervised learning fails. We propose a set of six modifications, all straightforward to implement, that leaves the underlying agent and its hyperparameters unchanged but improves the numerical stability dramatically. The resulting modified SAC agent has lower memory and compute requirements while matching full-precision rewards, demonstrating that low-precision training can substantially accelerate state-of-the-art RL without parameter tuning.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: bjorck21a
month: 0
tex_title: "Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision"
firstpage: 980
lastpage: 991
page: 980-991
order: 980
cycles: false
bibtex_author: Bj{\"o}rck, Johan and Chen, Xiangyu and De Sa, Christopher and Gomes, Carla P and Weinberger, Kilian
author:
- given: Johan
  family: Björck
- given: Xiangyu
  family: Chen
- given: Christopher
  family: De Sa
- given: Carla P
  family: Gomes
- given: Kilian
  family: Weinberger
date: 2021-07-01
address:
container-title: Proceedings of the 38th International Conference on Machine Learning
volume: '139'
genre: inproceedings
issued:
  date-parts:
  - 2021
  - 7
  - 1
pdf:
extras:
---