Article Abstract
WANG Xiaoxi (王晓曦)*, WU Wenjun*, YANG Feng*, SI Pengbo*, ZHANG Xuanyi**, ZHANG Yanhua*. [J]. High Technology Letters (English edition), 2022, 28(2): 172-180
Pseudo-label based semi-supervised learning in the distributed machine learning framework
  
DOI:10.3772/j.issn.1006-6748.2022.02.007
Chinese keywords: 
English keywords: distributed machine learning (DML), semi-supervised, deep neural network (DNN)
Funding:
Author Name    Affiliation
WANG Xiaoxi (王晓曦)*    Faculty of Information Technology, Beijing University of Technology, Beijing 100124, P.R.China
WU Wenjun*    Faculty of Information Technology, Beijing University of Technology, Beijing 100124, P.R.China
YANG Feng*    Faculty of Information Technology, Beijing University of Technology, Beijing 100124, P.R.China
SI Pengbo*    Faculty of Information Technology, Beijing University of Technology, Beijing 100124, P.R.China
ZHANG Xuanyi**    Beijing Capital International Airport Co., Ltd., Beijing 101317, P.R.China
ZHANG Yanhua*    Faculty of Information Technology, Beijing University of Technology, Beijing 100124, P.R.China
Chinese abstract:
      
English abstract:
      With the emergence of various intelligent applications, machine learning technologies face many challenges in practice, including large-scale models, application-oriented real-time datasets, and the limited capabilities of nodes. Distributed machine learning (DML) and semi-supervised learning methods, which help address these problems, have therefore attracted attention in both academia and industry. In this paper, the semi-supervised learning method is combined with the data-parallel DML framework. The pseudo-label based local loss function for each distributed node is studied, and the stochastic gradient descent (SGD) based distributed parameter update principle is derived. A demo implementing pseudo-label based semi-supervised learning in the DML framework is built, and the CIFAR-10 dataset for target classification is used to evaluate its performance. Experimental results confirm the convergence and accuracy of the model trained with pseudo-label based semi-supervised learning in the DML framework. When the proportion of the pseudo-label dataset is 20%, the accuracy of the model is over 90% provided that the number of local parameter update steps between two global aggregations is less than 5. In addition, with the global aggregation interval fixed at 3, the model converges with acceptable performance degradation as the proportion of the pseudo-label dataset varies from 20% to 80%.
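
      The abstract does not give the loss function or the update rule, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes the commonly used pseudo-label loss (cross-entropy on labeled samples plus a weighted cross-entropy on unlabeled samples whose labels are the model's own argmax predictions), replaces the DNN with a softmax-regression model to stay short, and assumes size-weighted averaging as the global aggregation. The names `local_grad`, `train`, `tau`, `alpha` and `warmup` are hypothetical; `tau` stands for the number of local update steps between two global aggregations.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def local_grad(W, X_lab, y_lab, X_unl, alpha):
    """Gradient of the assumed local loss on one node:
    CE(labeled) + alpha * CE(unlabeled, pseudo-labels)."""
    k = W.shape[1]
    p_lab = softmax(X_lab @ W)
    g = X_lab.T @ (p_lab - np.eye(k)[y_lab]) / len(X_lab)
    if len(X_unl) > 0 and alpha > 0:
        p_unl = softmax(X_unl @ W)
        pseudo = p_unl.argmax(axis=1)  # hard pseudo-labels, treated as constants
        g += alpha * X_unl.T @ (p_unl - np.eye(k)[pseudo]) / len(X_unl)
    return g

def train(node_data, dim, k, tau=3, rounds=50, lr=0.1, alpha=0.5, warmup=5):
    """node_data: list of (X_lab, y_lab, X_unl) tuples, one per node.
    Each round, every node runs tau local gradient steps from the global
    parameters, then the results are averaged (weighted by node data size)."""
    W_global = np.zeros((dim, k))
    sizes = np.array([len(d[0]) + len(d[2]) for d in node_data], dtype=float)
    for r in range(rounds):
        # Assumption: ignore pseudo-labels during the first few rounds,
        # while the model's predictions are still unreliable.
        a = alpha if r >= warmup else 0.0
        local_ws = []
        for X_lab, y_lab, X_unl in node_data:
            W = W_global.copy()
            for _ in range(tau):
                W -= lr * local_grad(W, X_lab, y_lab, X_unl, a)
            local_ws.append(W)
        # Global aggregation: size-weighted average of local parameters.
        W_global = sum(w * s for w, s in zip(local_ws, sizes)) / sizes.sum()
    return W_global
```

      In this sketch each node uses full-batch gradients for simplicity; a minibatch sampled at every local step would make it closer to SGD. Setting `tau=1` reduces the scheme to aggregating after every step, and larger values of `tau` trade communication for possible accuracy loss, which is the trade-off the abstract reports on.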