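TAN Long* **, YAN Mingyu*, WU Xinxin* **, LI Wenming*, WU Haibin* **, FAN Dongrui* **. Research on a CGRA accelerator for sparse convolutional neural networks [J]. Chinese High Technology Letters (高技术通讯, Chinese edition), 2024, 34(2): 173-186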
The research of CGRA accelerator for sparse convolutional neural networks
DOI: 10.3772/j.issn.1002-0470.2024.02.007
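Chinese keywords (translated): sparse convolutional neural network (CNN); dedicated accelerator architecture; coarse-grained reconfigurable architecture (CGRA); dynamic instruction filtering; dynamic load scheduling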
English keywords: sparse convolutional neural network (CNN), dedicated accelerator, coarse-grained reconfigurable architecture (CGRA), dynamic instruction filtering, dynamic workload balance
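TAN Long* ** (*State Key Laboratory of Processors, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190) (**University of Chinese Academy of Sciences, Beijing 100049)
WU Xinxin* **
WU Haibin* **
FAN Dongrui* **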
This paper proposes DyCNN, a novel accelerator for sparse convolutional neural networks (CNNs), whose scale keeps growing and whose structure evolves rapidly. DyCNN is an energy-efficient and flexible accelerator based on a coarse-grained reconfigurable architecture (CGRA). It uses a data-aware dynamic filtering mechanism to eliminate the large number of invalid calculations and memory accesses caused by the static sparsity of filters and the dynamic sparsity of activation values in sparse CNNs, and to increase the on-chip reuse of instructions among processing units. In addition, a dynamic work-stealing strategy combined with a static work distribution scheme is proposed to alleviate the load imbalance caused by the sparsity of filters and activation values. Overall, DyCNN achieves an average 1.69× speedup and 3.04× energy savings when running sparse CNNs compared with running dense CNNs. Compared with the state-of-the-art GPU (cuSPARSE) and Cambricon-X solutions, DyCNN achieves 2.78× and 1.48× speedups and 35.62× and 1.17× energy savings, respectively.
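The two ideas named in the abstract, filtering out work whose operands are zero and letting idle processing units steal work from loaded ones, can be illustrated with a small software sketch. This is not DyCNN's hardware mechanism, which the abstract does not detail; the function names `sparse_dot` and `cycles_with_stealing`, the one-task-per-cycle model, and the zero-cost steal are all assumptions made for illustration.

```python
def sparse_dot(weights, activations):
    """Dot product that skips every pair in which either operand is zero.

    Illustrative only: returns (result, macs_issued) so the amount of
    filtered work is visible. A zero weight models static filter sparsity;
    a zero activation models dynamic activation sparsity.
    """
    result = 0.0
    macs_issued = 0
    for w, a in zip(weights, activations):
        if w == 0 or a == 0:
            continue  # filtered: this multiply-accumulate is never issued
        result += w * a
        macs_issued += 1
    return result, macs_issued


def cycles_with_stealing(task_counts, steal=True):
    """Count cycles until all tasks finish on a toy array of processing units.

    Assumed model (hypothetical): each unit completes one task per cycle;
    an idle unit may take one task from the most loaded unit at no cost,
    provided that unit has more than one task left.
    """
    remaining = list(task_counts)
    cycles = 0
    while sum(remaining) > 0:
        if steal:
            for pe in range(len(remaining)):
                if remaining[pe] == 0:
                    victim = max(range(len(remaining)), key=lambda i: remaining[i])
                    if remaining[victim] > 1:
                        remaining[victim] -= 1
                        remaining[pe] += 1
        # Every unit that now holds work completes one task this cycle.
        for pe in range(len(remaining)):
            if remaining[pe] > 0:
                remaining[pe] -= 1
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(sparse_dot([0.5, 0, 2, 0], [4, 3, 0, 0]))
    print(cycles_with_stealing([6, 0], steal=False))
    print(cycles_with_stealing([6, 0], steal=True))
```

In this toy model, only one of the four operand pairs survives filtering, and stealing halves the run time of the unbalanced `[6, 0]` assignment. Real hardware pays a cost for detecting zeros and for moving work between units, which this sketch ignores.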
