Article Abstract
Li Zhen (李震), Zhi Tian, Liu Enhe, Liu Shaoli, Chen Tianshi. MW-DLA: a dynamic bit width deep learning accelerator [J]. High Technology Letters, 2020, 26(2): 145-151
MW-DLA: a dynamic bit width deep learning accelerator
  
DOI: 10.3772/j.issn.1006-6748.2020.02.003
Keywords: deep learning accelerator (DLA), per-layer representation, multiple-precision arithmetic unit
Author Name / Affiliation
Li Zhen(李震)* ** (*Intelligent Processor Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R.China) (**University of Chinese Academy of Sciences, Beijing 100049, P.R.China) 
Zhi Tian* *** (*Intelligent Processor Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R.China) (***Cambricon Technologies Corporation Limited, Beijing 100191, P.R.China) 
Liu Enhe* ** (*Intelligent Processor Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R.China) (**University of Chinese Academy of Sciences, Beijing 100049, P.R.China) 
Liu Shaoli* *** (*Intelligent Processor Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R.China) (***Cambricon Technologies Corporation Limited, Beijing 100191, P.R.China) 
Chen Tianshi* *** (*Intelligent Processor Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R.China) (***Cambricon Technologies Corporation Limited, Beijing 100191, P.R.China) 
Abstract:
      Deep learning algorithms are the basis of many artificial intelligence applications. These algorithms are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems. Thus, various deep learning accelerators (DLAs) have been proposed and applied to achieve better performance and lower power consumption. However, most DLAs are unable to support multiple data formats. This research proposes MW-DLA, a deep learning accelerator supporting dynamically configurable data widths. This work analyzes the data distribution of different data types in different layers and trains a typical network with per-layer representation. As a result, the proposed MW-DLA achieves a 2X speedup and reduces memory requirements by more than 50% for AlexNet, with less than 5.77% area overhead.
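      The abstract does not spell out how the per-layer bit widths are chosen. As a rough illustration of the idea, a per-layer representation can be derived by quantizing each layer's data at several candidate widths and keeping the narrowest width that stays within an error budget. The Python sketch below is an assumption-laden illustration, not the paper's actual procedure: the function names (quantize, choose_bitwidth), the candidate widths, and the error threshold are all hypothetical.

```python
import numpy as np

def quantize(tensor, bits):
    """Symmetric uniform quantization onto a signed integer grid of `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    # Guard against all-zero tensors when computing the scale.
    scale = max(float(np.max(np.abs(tensor))), 1e-12) / qmax
    return np.clip(np.round(tensor / scale), -qmax, qmax) * scale

def choose_bitwidth(tensor, candidate_widths=(4, 8, 16), max_rel_error=0.02):
    """Return the narrowest candidate width whose quantization error stays
    within the relative-error budget; fall back to the widest width."""
    ref = float(np.linalg.norm(tensor)) + 1e-12
    for bits in candidate_widths:
        err = float(np.linalg.norm(tensor - quantize(tensor, bits))) / ref
        if err <= max_rel_error:
            return bits
    return candidate_widths[-1]

# Hypothetical per-layer pass with toy layer shapes: each layer's weights
# get their own width, mimicking a per-layer representation.
layers = {"conv1": np.random.randn(64, 3, 3, 3),
          "fc":    np.random.randn(128, 256) * 0.01}
widths = {name: choose_bitwidth(w) for name, w in layers.items()}
print(widths)  # e.g. {'conv1': 8, 'fc': 8}, depending on the random draw
```

      In a design like MW-DLA, the chosen widths would then configure the multiple-precision arithmetic units layer by layer; the error budget above is a stand-in for whatever accuracy criterion the authors applied when training the network with per-layer representation.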