HOPE: a heterogeneity-oriented parallel execution engine for inference on mobiles

XIA Chunwei(夏春伟)* **; ZHAO Jiacheng*; CUI Huimin* **; FENG Xiaobing* **

Article Abstract

XIA Chunwei(夏春伟)* **, ZHAO Jiacheng*, CUI Huimin* **, FENG Xiaobing* **. HOPE: a heterogeneity-oriented parallel execution engine for inference on mobiles[J]. High Technology Letters, 2022, 28(4): 363-372

DOI: 10.3772/j.issn.1006-6748.2022.04.004

Keywords: deep neural network (DNN), mobile, heterogeneous scheduler, parallel computing

Author Name	Affiliation
XIA Chunwei(夏春伟)* **	(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R.China) (*School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100190, P.R.China)
ZHAO Jiacheng*	(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R.China) (*School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100190, P.R.China)
CUI Huimin* **	(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R.China) (*School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100190, P.R.China)
FENG Xiaobing* **	(Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R.China) (*School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100190, P.R.China)

Abstract:

Efficiently supporting artificial intelligence (AI) applications on heterogeneous mobile platforms is important, particularly the coordinated execution of a single deep neural network (DNN) model across the multiple computing devices of one mobile platform. This paper proposes HOPE, an end-to-end heterogeneous inference framework that runs on mobile platforms and distributes the operators of a DNN model to different computing devices. The distribution problem is formalized as an integer linear programming (ILP) problem, and a heuristic algorithm is proposed to determine a near-optimal heterogeneous execution plan. Experimental results show that HOPE reduces inference latency by up to 36.2% (22.0% on average) compared with MOSAIC, by up to 22.0% (10.2% on average) compared with StarPU, and by up to 41.8% (18.4% on average) compared with μLayer.
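To make the idea of distributing operators across devices concrete, the following is a minimal sketch of one possible placement heuristic. It is not the ILP formulation or the heuristic from the paper, whose details are not given in this abstract. It is a plain greedy earliest-finish-time list scheduler. The names `Op`, `greedy_plan`, and `OPS`, the device labels `cpu` and `gpu`, and every latency and transfer cost are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Op:
    """One DNN operator (hypothetical model, not HOPE's data structure)."""
    name: str
    cost: Dict[str, float]                      # assumed latency per device, in ms
    deps: List[str] = field(default_factory=list)  # names of predecessor operators


def greedy_plan(ops: List[Op], devices: List[str],
                transfer_ms: float) -> Tuple[Dict[str, str], float]:
    """Place each operator on the device where it would finish earliest.

    `ops` must already be in topological order. Each device runs one
    operator at a time, and moving a tensor between two different
    devices is assumed to cost a flat `transfer_ms`.
    """
    device_free = {d: 0.0 for d in devices}  # time at which each device is idle
    finish: Dict[str, float] = {}            # finish time of each placed operator
    placement: Dict[str, str] = {}           # operator name -> chosen device

    for op in ops:
        best_end, best_dev = None, None
        for d in devices:
            # Inputs are ready once every predecessor has finished and,
            # if it ran on another device, its output has been transferred.
            ready = 0.0
            for p in op.deps:
                hop = 0.0 if placement[p] == d else transfer_ms
                ready = max(ready, finish[p] + hop)
            end = max(ready, device_free[d]) + op.cost[d]
            if best_end is None or end < best_end:
                best_end, best_dev = end, d
        placement[op.name] = best_dev
        finish[op.name] = best_end
        device_free[best_dev] = best_end

    return placement, max(finish.values())


# A toy graph with two parallel branches (all numbers are made up).
OPS = [
    Op("conv1", {"cpu": 4.0, "gpu": 2.0}),
    Op("branch_a", {"cpu": 3.0, "gpu": 2.0}, ["conv1"]),
    Op("branch_b", {"cpu": 3.0, "gpu": 2.0}, ["conv1"]),
    Op("concat", {"cpu": 1.0, "gpu": 1.0}, ["branch_a", "branch_b"]),
]

if __name__ == "__main__":
    print(greedy_plan(OPS, ["cpu", "gpu"], transfer_ms=0.5))
    print(greedy_plan(OPS, ["gpu"], transfer_ms=0.5))
```

In this toy graph the two branches can overlap on different devices, so the mixed plan finishes slightly earlier than running everything on the faster device alone. A greedy scheduler of this kind gives no optimality guarantee, which is one reason an exact formulation such as ILP is useful as a reference point.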
