Efficient-Grad: an efficient and versatile backpropagation-based learning algorithm and architecture for training deep convolutional neural networks on edge devices.
If you find this work useful for your research, please use the following BibTeX entry.
@article{10.1145/3504034,
author = {Hong, Ziyang and Yue, C. Patrick},
title = {Efficient-Grad: Efficient Training Deep Convolutional Neural Networks on Edge Devices with Gradient Optimizations},
year = {2022},
issue_date = {March 2022},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {21},
number = {2},
issn = {1539-9087},
url = {https://doi.org/10.1145/3504034},
doi = {10.1145/3504034},
journal = {ACM Transactions on Embedded Computing Systems},
month = {feb},
articleno = {19},
numpages = {24},
}
Related work on efficient DNN training processors and accelerators:

- DF-LNPU: A Pipelined Direct Feedback Alignment-Based Deep Neural Network Learning Processor for Fast Online Learning. (KAIST)(FA; a minimal DFA sketch follows this list)
- A 40nm 4.81TFLOPS/W 8b Floating-Point Training Processor for Non-Sparse Neural Networks Using Shared Exponent Bias and 24-Way Fused Multiply-Add Tree. (SNU)(Shared-Expo FP)
- TensorDash: Exploiting Sparsity to Accelerate Deep Neural Network Training. (Toronto)(Sparsity)
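The first item above is built around direct feedback alignment (FA/DFA), in which hidden layers receive the output error through a fixed random feedback matrix rather than through the transposed forward weights used by standard backprop. The sketch below contrasts the two hidden-layer updates in plain NumPy; the layer sizes, toy data, and learning rate are illustrative assumptions only and do not reproduce DF-LNPU or any other design listed here.

```python
# Minimal, illustrative contrast between standard backprop and direct
# feedback alignment (DFA) for one hidden layer. Generic sketch only;
# all shapes, names, and the toy data are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hid, n_out, batch = 8, 16, 4, 32
W1 = rng.standard_normal((n_in, n_hid)) * 0.1
W2 = rng.standard_normal((n_hid, n_out)) * 0.1
B  = rng.standard_normal((n_out, n_hid)) * 0.1   # fixed random feedback matrix (DFA)

x = rng.standard_normal((batch, n_in))
y = rng.standard_normal((batch, n_out))          # toy regression targets

# Forward pass: ReLU hidden layer, linear output, mean-squared-error loss.
a1 = x @ W1
h  = np.maximum(a1, 0.0)
y_hat = h @ W2
err = (y_hat - y) / batch                        # dL/dy_hat for 0.5 * MSE

# Standard backprop: hidden error is carried back through W2's transpose.
delta_bp  = (err @ W2.T) * (a1 > 0)

# DFA: the output error is projected through the fixed random matrix B,
# so no transport of W2's weights to the hidden layer is required.
delta_dfa = (err @ B) * (a1 > 0)

# Either delta yields the same form of weight update for W1.
lr = 0.1
grad_W1_bp  = x.T @ delta_bp
grad_W1_dfa = x.T @ delta_dfa
W2 -= lr * (h.T @ err)                           # output-layer update is identical
W1 -= lr * grad_W1_dfa                           # swap in grad_W1_bp for backprop
```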