Robot navigation method based on contrastive learning and reinforcement learning in restricted and dense environments

禹鑫燚; 胡加南; 郑万财; 欧林林

Article abstract

禹鑫燚, 胡加南, 郑万财, 欧林林. Robot navigation method based on contrastive learning and reinforcement learning in restricted and dense environments [J]. 高技术通讯(中文), 2024, 34(7): 734-743

Robot navigation method based on contrastive learning and reinforcement learning in restricted and dense environments

DOI: 10.3772/j.issn.1002-0470.2024.07.007

Keywords: deep reinforcement learning (DRL); contrastive learning; robot navigation; human-robot interaction

Author	Affiliation
禹鑫燚	College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023
胡加南
郑万财
欧林林

Abstract:

Robot navigation in dynamic environments is an important but challenging task. For navigation in restricted and dense environments, this paper proposes a method that combines deep reinforcement learning (DRL) with contrastive learning. First, trajectory vectorization is used to obtain the history information of the robot and pedestrians, and a subgraph network is designed to aggregate it, improving the robot's ability to predict future scenes. Second, a graph neural network (GNN) extracts the interaction information among agents (the robot and pedestrians), giving the robot the ability to predict pedestrian intentions. Finally, contrastive learning is integrated into reinforcement learning, and a positive sample augmentation method is proposed based on the properties of stochastic-policy reinforcement learning algorithms. This enables the robot to judge the safety of other positions in the scene and to find more feasible paths, raising its navigation success rate in complex environments. Simulation experiments show that the proposed method outperforms existing methods in restricted and dense environments.
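As a rough illustration of two ideas named in the abstract, the sketch below shows one possible form of trajectory vectorization and an InfoNCE-style contrastive loss. The abstract does not give the paper's actual formulation, so everything here is an assumption made for illustration: the function names `vectorize_trajectory`, `cosine`, and `info_nce`, the segment layout `(x0, y0, x1, y1, step)`, cosine similarity as the score, and the temperature value are all hypothetical.

```python
import math


def vectorize_trajectory(points):
    """Turn a list of (x, y) positions into segment vectors.

    Each vector is (x0, y0, x1, y1, step): the start and end of one
    movement segment plus its index in the history. This layout is an
    assumption, not the encoding used in the paper.
    """
    return [
        (x0, y0, x1, y1, i)
        for i, ((x0, y0), (x1, y1)) in enumerate(zip(points, points[1:]))
    ]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchor, positives, negatives, temperature=0.1):
    """InfoNCE-style loss (a generic form, assumed for illustration).

    The loss is small when the anchor embedding is more similar to the
    positive samples than to the negative samples.
    """
    pos = [math.exp(cosine(anchor, p) / temperature) for p in positives]
    neg = [math.exp(cosine(anchor, n) / temperature) for n in negatives]
    denom = sum(pos) + sum(neg)
    # Average the negative log-probability over all positive samples.
    return -sum(math.log(p / denom) for p in pos) / len(pos)


if __name__ == "__main__":
    history = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
    print(vectorize_trajectory(history))
    print(info_nce([1.0, 0.0], [[1.0, 0.1]], [[-1.0, 0.0]]))
```

In a setup like the one the abstract describes, the segment vectors would presumably feed a learned aggregation network, and the positives and negatives would be embeddings of safe and unsafe positions; here they are plain lists so the sketch stays self-contained.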
