Intelligent game estimation and control of unmanned vehicles based on A3C algorithm

Liu Qinglin (刘青林); Ye Zehua (叶泽华); Zhang Dan (张丹)

Article abstract

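Liu Qinglin, Ye Zehua, Zhang Dan. Intelligent game estimation and control of unmanned vehicles based on A3C algorithm [J]. 高技术通讯 (Chinese edition), 2026, 36(1): 89-100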

Intelligent game estimation and control of unmanned vehicles based on A3C algorithm

DOI: 10.3772/j.issn.1002-0470.2026.01.008

Keywords: jamming; Kalman filter; Stackelberg game; model predictive control; reinforcement learning

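Author	Affiliation
Liu Qinglin (刘青林)	(College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023)
Ye Zehua (叶泽华)
Zhang Dan (张丹)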

Abstract:

Vehicular traffic systems face complex road conditions and frequent network jamming, and the control performance of the conventional Kalman-filter-based model predictive control (MPC) method degrades under such interference. To address this problem, a three-degree-of-freedom kinematic model of the vehicle is first established, and an MPC algorithm is proposed to achieve trajectory tracking under ideal conditions. Second, considering the effect of external interference on trajectory tracking accuracy, a game model based on attack-defence confrontation is constructed: the relationship between the network jammer and the vehicle's defence system is formulated as a game, and the asynchronous advantage actor-critic (A3C) algorithm is used to solve for the Stackelberg equilibrium. Because measurement information may be lost under complex network interference, a Kalman filter that uses the attack-defence information is designed to estimate the vehicle state. Based on the resulting state estimate, the MPC method is then used to achieve trajectory tracking of the unmanned vehicle under complex disturbances. Finally, simulation experiments verify the effectiveness of the proposed method.
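The abstract notes that measurement information may be lost under network interference. As a minimal sketch of that idea, assuming a scalar linear model instead of the three-degree-of-freedom vehicle model, the Python code below runs a Kalman filter that skips the measurement update whenever a sample is missing. The names `ScalarKalman` and `run_demo` and all numeric values are hypothetical. This is a standard intermittent-observation filter, not the attack-defence-informed filter the authors propose.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScalarKalman:
    """Scalar Kalman filter that tolerates dropped measurements.

    Assumed model (illustrative values, not taken from the paper):
        x[k+1] = a * x[k] + w[k],   w ~ N(0, q)
        y[k]   = c * x[k] + v[k],   v ~ N(0, r)
    """
    a: float = 1.0   # state transition coefficient
    c: float = 1.0   # measurement coefficient
    q: float = 0.01  # process noise variance
    r: float = 0.25  # measurement noise variance
    x: float = 0.0   # current state estimate
    p: float = 1.0   # current estimate variance

    def step(self, y: Optional[float]) -> float:
        """Advance one time step; pass y=None when the measurement was lost."""
        # The prediction always runs, whether or not a measurement arrives.
        self.x = self.a * self.x
        self.p = self.a * self.p * self.a + self.q
        if y is not None:
            # The measurement update runs only when a sample got through.
            k = self.p * self.c / (self.c * self.p * self.c + self.r)
            self.x += k * (y - self.c * self.x)
            self.p = (1.0 - k * self.c) * self.p
        return self.x


def run_demo() -> List[float]:
    """Return the estimate variance after each step of a short sequence."""
    kf = ScalarKalman()
    # None marks a time step at which jamming dropped the measurement.
    measurements = [1.1, 0.9, None, None, 1.05]
    variances = []
    for y in measurements:
        kf.step(y)
        variances.append(kf.p)
    return variances


if __name__ == "__main__":
    for k, p in enumerate(run_demo()):
        print(f"step {k}: variance = {p:.4f}")
```

In this sketch the estimate variance grows by the process noise variance at each lost sample and shrinks again once a measurement arrives, which is the uncertainty a downstream MPC controller would have to tolerate during an outage.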
