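Yuan Tingshuai, Feng Yu, Li Yongqiang. Research on multi-agent game confrontation combined with prior knowledge [J]. Chinese High Technology Letters (高技术通讯, Chinese edition), 2024, 34(3): 256-264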
Research on multi-agent game confrontation combined with prior knowledge
DOI: 10.3772/j.issn.1002-0470.2024.03.004
Keywords: intelligent game; prior knowledge; deep reinforcement learning (DRL); threat assessment; task scheduling
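Author | Affiliation
Yuan Tingshuai (袁婷帅) | College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023
Feng Yu (冯宇) |
Li Yongqiang (李永强) |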
Abstract:
Complex adversarial environments without real-time rewards are a current research focus in deep reinforcement learning (DRL). In such environments, agents trained with DRL algorithms alone tend to converge slowly and perform poorly in confrontation. To address this, this paper proposes an intelligent game process framework that combines prior knowledge with DRL. The framework comprises three modules (data processing, enhancement mechanism, and action decision-making) and uses three enhancement mechanisms (threat assessment, task scheduling, and loss ratio) to improve the agent's convergence speed and adversarial performance in complex adversarial environments. Simulation results on the DataCastle (DC) platform show that agents trained with the proposed framework converge faster and achieve a higher win rate than agents trained with DRL alone.