Volume 33 Issue 2
May 2025
GAN Wenhao, PENG Yunfei, QIAO Lei. Multi-Underwater Target Interception Strategy Based on Deep Reinforcement Learning[J]. Journal of Unmanned Undersea Systems, 2025, 33(2): 325-332. doi: 10.11993/j.issn.2096-3920.2025-0004

Multi-Underwater Target Interception Strategy Based on Deep Reinforcement Learning

doi: 10.11993/j.issn.2096-3920.2025-0004
  • Received Date: 2025-01-08
  • Accepted Date: 2025-02-08
  • Rev Recd Date: 2025-02-06
  • Available Online: 2025-03-07
  • When multiple autonomous undersea vehicles (AUVs) execute underwater target interception missions, each AUV must make precise decisions based on both enemy and partner information, facing the dual challenges of competition and cooperation. Most existing research focuses on single-target interception in simple environments and lacks a detailed exploration of collaborative mechanisms for multi-target interception in complex environments. This paper therefore proposes a multi-agent deep reinforcement learning framework with which AUVs learn interception strategies in environments with complex obstacles and time-varying ocean currents, with a focus on cooperation in many-to-many game scenarios. First, a hierarchical maneuvering framework is introduced to improve the decision-making ability of AUVs through a three-layer loop structure. Next, on the basis of the multi-agent proximal policy optimization algorithm, a scalable state and action space is constructed and a compound reward function is designed, enhancing the interception efficiency and cooperation of the AUVs. Finally, a population expansion–curriculum learning approach is incorporated within a centralized training and distributed execution architecture to help AUVs master generalizable cooperation strategies. Training results show rapid convergence and high success rates for the proposed interception strategies. Simulation experiments show that the trained AUVs can use the same set of models across multiple population configurations to intercept multiple intruding targets effectively through cooperation while avoiding obstacles.
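
The abstract names multi-agent proximal policy optimization as the learning algorithm. As a minimal illustration of the clipped surrogate objective that PPO-family methods maximize, the Python sketch below evaluates that objective for a small batch of samples. The function name, the argument names, and the clip value of 0.2 are assumptions chosen for illustration and are not taken from the paper; the paper's state and action spaces, reward function, and training setup are not reproduced here.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (hypothetical helper).

    new_logp, old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    clip_eps: clipping range (0.2 is an assumed, commonly used value).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # One sample whose action became twice as likely, with positive advantage:
    # the clip caps its contribution at (1 + clip_eps) * advantage.
    print(ppo_clip_objective([math.log(2.0)], [0.0], [1.0]))
```

In a multi-agent setting with centralized training, each agent's policy would typically be updated with an objective of this form while the advantage estimates come from a critic that sees shared information; how the paper configures this is not stated in the abstract.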

     



    Figures(10)  / Tables(1)
