Article abstract
XIE Xiaoyan (谢晓燕)*, XU Hao*, ZHU Yun**, HE Wanqi*. [J]. High Technology Letters (English edition), 2023, 29(1): 50-59
IOPS: computational graph optimization based on inter-operators parallel scheduling
  
DOI: 10.3772/j.issn.1006-6748.2023.01.006
Chinese keywords:
English keywords: compile optimization, convolutional neural network (CNN), inter-operator parallelism schedule (IOPS), operator replacement
Funding:
Author Name: Affiliation
XIE Xiaoyan (谢晓燕)*: School of Computer, Xi’an University of Posts and Telecommunications, Xi’an 710121, P.R. China
XU Hao*: School of Computer, Xi’an University of Posts and Telecommunications, Xi’an 710121, P.R. China
ZHU Yun**: School of Electronic Engineering, Xi’an University of Posts and Telecommunications, Xi’an 710121, P.R. China
HE Wanqi*: School of Computer, Xi’an University of Posts and Telecommunications, Xi’an 710121, P.R. China
Chinese abstract:
      
English abstract:
      To improve the inference efficiency of convolutional neural networks (CNN), existing approaches mainly adopt heuristic and dynamic programming algorithms to realize parallel scheduling among operators. Heuristic scheduling algorithms easily fall into local optima, while dynamic programming algorithms take a long time to converge on structurally complex models. This paper studies parallel scheduling between operators and proposes an inter-operator parallelism schedule (IOPS) algorithm that guarantees the minimum similar execution delay. First, a graph partitioning algorithm based on the largest block is designed to split the neural network model into multiple subgraphs. Then, the operators that meet the conditions are replaced according to the defined operator replacement rules. Finally, an optimal scheduling method based on backtracking is used to schedule the computational graph. Network models such as Inception-v3, ResNet-50, and RandWire are selected for testing. The experimental results show that the proposed algorithm can achieve a 1.6× speedup compared with existing sequential execution methods.
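The backtracking step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the cost model (operators in one stage run in parallel, so a stage costs as much as its slowest operator), the `max_parallel` limit, the function name `schedule`, and the toy graph with its latencies are all assumptions made for illustration. The graph partitioning and operator replacement steps are omitted.

```python
from itertools import combinations


def schedule(graph, latency, max_parallel=2):
    """Backtracking search for a staged schedule of a small operator DAG.

    graph:   dict mapping each operator to the list of its predecessors.
    latency: dict mapping each operator to an assumed execution time.
    Assumed cost model: stages run one after another; operators inside a
    stage run in parallel, so a stage costs max(latency) of its operators.
    Returns (total_latency, list_of_stages).
    """
    ops = list(graph)
    best = {"cost": float("inf"), "stages": None}

    def backtrack(done, stages, cost):
        # Prune: this partial schedule cannot beat the best complete one.
        if cost >= best["cost"]:
            return
        if len(done) == len(ops):
            best["cost"], best["stages"] = cost, list(stages)
            return
        # Operators whose predecessors have all been scheduled.
        ready = [op for op in ops
                 if op not in done and all(p in done for p in graph[op])]
        # Try wider stages first, then narrower ones.
        for k in range(min(max_parallel, len(ready)), 0, -1):
            for group in combinations(ready, k):
                stage_cost = max(latency[op] for op in group)
                stages.append(group)
                backtrack(done | set(group), stages, cost + stage_cost)
                stages.pop()

    backtrack(frozenset(), [], 0.0)
    return best["cost"], best["stages"]


# Hypothetical four-operator block; names and latencies are made up.
graph = {
    "conv_a": [],
    "conv_b": [],
    "conv_c": ["conv_a"],
    "concat": ["conv_b", "conv_c"],
}
latency = {"conv_a": 2.0, "conv_b": 3.0, "conv_c": 2.0, "concat": 1.0}

cost, stages = schedule(graph, latency, max_parallel=2)
sequential = sum(latency.values())
print(cost, stages, sequential)
```

Under this toy cost model the search compares every valid grouping of ready operators, which is why it is only practical on small subgraphs; that is one plausible reason to partition a large model before scheduling.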