XIA Chunwei(夏春伟), ZHAO Jiacheng, CUI Huimin, FENG Xiaobing. HOPE: a heterogeneity-oriented parallel execution engine for inference on mobiles[J]. High Technology Letters, 2022, 28(4): 363-372
HOPE: a heterogeneity-oriented parallel execution engine for inference on mobiles
DOI: 10.3772/j.issn.1006-6748.2022.04.004
Keywords: deep neural network (DNN), mobile, heterogeneous scheduler, parallel computing
Authors: XIA Chunwei(夏春伟)* **, ZHAO Jiacheng*, CUI Huimin* **, FENG Xiaobing* **
*Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R.China
**School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100190, P.R.China
Abstract:
Efficiently supporting artificial intelligence (AI) applications on heterogeneous mobile platforms is important, especially executing a deep neural network (DNN) model cooperatively across the multiple computing devices of a single mobile platform. This paper proposes HOPE, an end-to-end heterogeneous inference framework running on mobile platforms that distributes the operators in a DNN model to different computing devices. The problem is formalized as an integer linear programming (ILP) problem, and a heuristic algorithm is proposed to determine a near-optimal heterogeneous execution plan. The experimental results demonstrate that HOPE reduces inference latency by up to 36.2% (22.0% on average) compared with MOSAIC, by up to 22.0% (10.2% on average) compared with StarPU, and by up to 41.8% (18.4% on average) compared with μLayer.
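The abstract describes placing DNN operators on different devices via an ILP formulation solved with a heuristic. The paper's own algorithm is not reproduced on this page, so the following is only an illustrative Python sketch of one possible greedy placement heuristic for a toy two-device (CPU/GPU) operator chain; the operator names, per-device latencies, and transfer cost below are invented for illustration and do not represent HOPE's actual cost model or planner.

# Hypothetical sketch (not the authors' HOPE implementation): greedily assign
# each DNN operator to the device that minimizes its estimated finish time,
# accounting for compute cost and cross-device tensor-transfer cost.
from dataclasses import dataclass, field

@dataclass
class Operator:
    name: str
    cost: dict                                   # estimated latency (ms) per device
    preds: list = field(default_factory=list)    # predecessor operator names

# Toy DNN fragment with made-up per-device costs (illustrative only).
ops = [
    Operator("conv", {"CPU": 8.0, "GPU": 3.0}),
    Operator("relu", {"CPU": 1.0, "GPU": 0.5}, preds=["conv"]),
    Operator("fc",   {"CPU": 4.0, "GPU": 6.0}, preds=["relu"]),
]
TRANSFER_MS = 2.0  # assumed cost of moving a tensor between devices

def greedy_plan(operators):
    """Assign operators (given in topological order) to devices so that each
    operator's finish time, including transfers, is locally minimized."""
    placement, finish = {}, {}
    device_free = {"CPU": 0.0, "GPU": 0.0}
    for op in operators:
        best_dev, best_finish = None, float("inf")
        for dev, compute in op.cost.items():
            ready = device_free[dev]
            for p in op.preds:
                arrival = finish[p] + (TRANSFER_MS if placement[p] != dev else 0.0)
                ready = max(ready, arrival)
            if ready + compute < best_finish:
                best_dev, best_finish = dev, ready + compute
        placement[op.name], finish[op.name] = best_dev, best_finish
        device_free[best_dev] = best_finish
    return placement, max(finish.values())

if __name__ == "__main__":
    plan, makespan = greedy_plan(ops)
    print("placement:", plan)                 # e.g. {'conv': 'GPU', 'relu': 'GPU', 'fc': 'CPU'}
    print("estimated latency (ms):", makespan)

Running the sketch prints a per-operator device assignment and the estimated end-to-end latency for the toy graph; HOPE itself, per the abstract, additionally relies on an ILP formulation and profiled device costs to reach a near-optimal execution plan.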
|
|
|