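牛京玉* **, 胡瑜* **, 李玮*, 韩银和* **. Research on a decision-making algorithm for autonomous racing based on continual reinforcement learning[J]. 高技术通讯 (Chinese High Technology Letters, Chinese edition), 2024, 34(1): 1-14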
Decision making based on continual reinforcement learning for autonomous racing
DOI: 10.3772/j.issn.1002-0470.2024.01.001
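Chinese keywords: 强化学习 (reinforcement learning, RL); 持续学习 (continual learning); 行为决策 (behavioral decision making); 自动驾驶赛车 (autonomous racing); 动力学特征提取 (dynamics feature extraction)
English keywords: reinforcement learning (RL); continual learning; decision making; autonomous racing; dynamics feature extraction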
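牛京玉* ** (*Research Center for Intelligent Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190) (**University of Chinese Academy of Sciences, Beijing 100049)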
胡瑜* **  
韩银和* **  
The variety of road shapes and materials poses a serious decision-making challenge for high-speed autonomous racing. To address the gap in vehicle dynamics between different roads, a decision-making algorithm based on continual reinforcement learning (CRL) is proposed, in which each road is treated as a separate task. The first training stage extracts low-dimensional task features that characterize the vehicle dynamics on each road; these features are used to compute task similarity. The second training stage imposes two CRL constraints on policy learning. The first is a weight regularization constraint, which restricts updates to policy weights that are important for old tasks, with the strength of the restriction adapted according to task similarity. The second is a reward constraint, which encourages the policy to avoid performance degradation on old tasks while it learns a new task. Racing experiments with different task sequences and CRL metrics are used to evaluate the algorithm. The results show that the proposed algorithm outperforms the baselines without storing data from old tasks or expanding the policy network.
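The similarity-regulated weight regularization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the form of the penalty, or how similarity changes its strength. The sketch assumes cosine similarity between task feature vectors, a quadratic penalty weighted by per-weight importance, and a strength that shrinks as tasks become more similar. All function and parameter names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two task feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def regularization_strength(similarity, base_strength=1.0):
    """Map task similarity to a penalty strength.

    Assumption: the more similar the new task is to the old one, the weaker
    the restriction. The source only says the restriction is "adaptively
    regulated by task similarity"; the direction chosen here is a guess.
    """
    clipped = min(1.0, max(0.0, similarity))
    return base_strength * (1.0 - clipped)


def weight_penalty(weights, old_weights, importance, strength):
    """Quadratic penalty on moving weights that mattered for the old task."""
    return strength * sum(
        f * (w - w_old) ** 2
        for w, w_old, f in zip(weights, old_weights, importance)
    )


# Hypothetical two-dimensional task features for an old and a new road.
old_task_features = [1.0, 0.0]
new_task_features = [0.0, 1.0]

similarity = cosine_similarity(old_task_features, new_task_features)
strength = regularization_strength(similarity)
penalty = weight_penalty(
    weights=[1.0, 2.0],
    old_weights=[0.0, 2.0],
    importance=[0.5, 2.0],
    strength=strength,
)
```

In a training loop, a penalty of this kind would be added to the policy loss for the new task, so that weights marked as important for old roads move less. The reward constraint mentioned in the abstract is not covered by this sketch.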
