Octo: INT8 Training with Loss-aware Compensation and Backward Quantization for Tiny On-device Learning

Qihua Zhou; Song Guo; Zhihao Qu; Jingcai Guo; Zhenda Xu; Jiewei Zhang; Tao Guo; Boyuan Luo; Jingren Zhou

Authors:

Qihua Zhou and Song Guo, Hong Kong Polytechnic University; Zhihao Qu, Hohai University; Jingcai Guo, Zhenda Xu, Jiewei Zhang, Tao Guo, and Boyuan Luo, Hong Kong Polytechnic University; Jingren Zhou, Alibaba Group

Abstract:

On-device learning is an emerging technique that paves the last mile toward edge intelligence by removing the limitations of conventional in-cloud computing, which demands large amounts of compute capacity and memory. A high-performance on-device learning system must work within limited resources and keep computational overhead low. In this paper, we show that employing 8-bit fixed-point (INT8) quantization in both the forward and backward passes over a deep model is a promising way to enable tiny on-device learning in practice. The key to an efficient quantization-aware training method is to exploit hardware-level acceleration while preserving the training quality in each layer. However, off-the-shelf quantization methods cannot handle the on-device learning paradigm of fixed-point processing. To overcome these challenges, we propose a novel INT8 training method, which optimizes the computation of the forward and backward passes via the carefully designed Loss-aware Compensation (LAC) and Parameterized Range Clipping (PRC), respectively. Specifically, we build a new network component, the compensation layer, to automatically counteract the quantization error of tensor arithmetic. We implement our method in Octo, a lightweight cross-platform system for tiny on-device learning. Evaluation on commercial AI chips shows that Octo achieves higher training efficiency than state-of-the-art quantization training methods, while achieving adequate processing speedup and memory reduction over full-precision training.
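For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of generic symmetric INT8 quantization with a clipping range, written in plain Python. It is not Octo's LAC or PRC: the function names, the fixed `clip` threshold, and the error metric are all assumptions made for illustration, whereas the abstract describes the clipping range as parameterized and the quantization error as counteracted by a compensation layer.

```python
def quantize_int8(values, clip):
    """Symmetric linear quantization of floats to the INT8 range [-127, 127].

    `clip` is a hypothetical clipping threshold chosen by the caller; values
    outside [-clip, clip] are saturated before being scaled.
    """
    if clip <= 0:
        raise ValueError("clip must be positive")
    scale = clip / 127.0  # real-valued size of one integer step
    quantized = []
    for v in values:
        clipped = max(-clip, min(clip, v))  # saturate to the clipping range
        quantized.append(int(round(clipped / scale)))
    return quantized, scale


def dequantize(quantized, scale):
    """Map INT8 codes back to approximate real values."""
    return [q * scale for q in quantized]


def mean_abs_error(reference, approx):
    """Average absolute difference, used here as a simple quantization-error measure."""
    return sum(abs(r - a) for r, a in zip(reference, approx)) / len(reference)


if __name__ == "__main__":
    # Made-up example tensor with one large outlier.
    grads = [0.02, -0.5, 0.13, 3.0, -0.07]
    for clip in (3.0, 0.5):
        q, scale = quantize_int8(grads, clip)
        err = mean_abs_error(grads, dequantize(q, scale))
        print(f"clip={clip}: codes={q}, mean abs error={err:.4f}")
```

A wide clipping range keeps the outlier but spends the 8-bit resolution coarsely on the small values, while a narrow range does the opposite. That trade-off is one reason a clipping range might be tuned rather than fixed.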

BibTeX

@inproceedings {273875,
author = {Qihua Zhou and Song Guo and Zhihao Qu and Jingcai Guo and Zhenda Xu and Jiewei Zhang and Tao Guo and Boyuan Luo and Jingren Zhou},
title = {Octo: {INT8} Training with Loss-aware Compensation and Backward Quantization for Tiny On-device Learning},
booktitle = {2021 USENIX Annual Technical Conference (USENIX ATC 21)},
year = {2021},
isbn = {978-1-939133-23-6},
pages = {177--191},
url = {https://www.usenix.org/conference/atc21/presentation/zhou-qihua},
publisher = {USENIX Association},
month = jul
}
