Volume 34 Issue 2
Apr  2026
ZHANG Tao, ZENG Xiangguang, LI Min, XIE Dijie, REN Wenzhe, PENG Bei. Dynamic Obstacle Avoidance for Autonomous Undersea Vehicles via VO-PPO[J]. Journal of Unmanned Undersea Systems, 2026, 34(2): 326-337, 362. doi: 10.11993/j.issn.2096-3920.2025-0154

Dynamic Obstacle Avoidance for Autonomous Undersea Vehicles via VO-PPO

doi: 10.11993/j.issn.2096-3920.2025-0154
  • Received Date: 2025-11-11
  • Revision Received Date: 2025-12-12
  • Accepted Date: 2025-12-24
  • Available Online: 2026-03-16
  • Efficient and safe dynamic obstacle avoidance is crucial for autonomous undersea vehicles (AUVs) performing military missions. Conventional reinforcement learning-based approaches suffer from high collision risk and slow convergence during AUV obstacle-avoidance training. To address these problems, this paper proposes VO-PPO, a dynamic obstacle-avoidance algorithm for AUVs that integrates an improved velocity obstacle (VO) method with proximal policy optimization (PPO). The algorithm extends the traditional VO framework with a safety margin and a time-window mechanism to improve the safety and efficiency of obstacle-avoidance decisions. It also constructs a "discrete-check-continuous-execution" safe action mask that embeds geometric safety constraints into the policy optimization process. Combined with state-space decoupling and a multi-objective reward design, the method guides the learned policy to balance safety, efficiency, and trajectory smoothness. Simulation results show that, compared with the traditional VO method, VO-PPO generates smoother obstacle-avoidance paths that better match the motion characteristics of AUVs. Compared with a baseline PPO algorithm, it improves the obstacle-avoidance success rate by 53%, accelerates training convergence by 67.5%, and increases the accumulated reward by 56.7%, effectively mitigating both high collision risk and slow convergence.
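
The abstract does not give implementation details, so the following Python sketch is only one plausible reading of a VO check with a safety margin, a time window, and a safe action mask. It is not the authors' implementation. It assumes a 2D plane, circular obstacles moving at constant velocity, and a fixed set of candidate headings. All function names and parameters (`time_to_collision`, `is_velocity_safe`, `safe_action_mask`, `nearest_safe_heading`, `margin`, `time_window`) are hypothetical. Reading "discrete-check-continuous-execution" as "check discrete candidates, then steer the continuous command toward the nearest safe one" is likewise an assumption.

```python
import math


def time_to_collision(rel_pos, rel_vel, combined_radius):
    """Earliest t >= 0 with |rel_pos + rel_vel * t| == combined_radius, else None."""
    px, py = rel_pos
    vx, vy = rel_vel
    c = px * px + py * py - combined_radius ** 2
    if c <= 0.0:
        return 0.0  # already inside the inflated obstacle
    a = vx * vx + vy * vy
    b = px * vx + py * vy  # half of the linear coefficient of the quadratic
    if a == 0.0 or b >= 0.0:
        return None  # no relative motion, or the gap is not closing
    disc = b * b - a * c
    if disc < 0.0:
        return None  # the relative path misses the inflated obstacle
    return (-b - math.sqrt(disc)) / a


def is_velocity_safe(velocity, own_pos, own_radius, obstacles, margin, time_window):
    """A velocity is unsafe if it leads to a collision within the time window.

    The safety margin (assumed here to be a simple radius inflation) enlarges
    every obstacle; collisions predicted beyond time_window are ignored.
    """
    for obs_pos, obs_vel, obs_radius in obstacles:
        rel_pos = (obs_pos[0] - own_pos[0], obs_pos[1] - own_pos[1])
        rel_vel = (obs_vel[0] - velocity[0], obs_vel[1] - velocity[1])
        ttc = time_to_collision(rel_pos, rel_vel, own_radius + obs_radius + margin)
        if ttc is not None and ttc <= time_window:
            return False
    return True


def safe_action_mask(headings, speed, own_pos, own_radius, obstacles, margin, time_window):
    """Discrete check: one boolean per candidate heading (True means safe)."""
    return [
        is_velocity_safe((speed * math.cos(h), speed * math.sin(h)),
                         own_pos, own_radius, obstacles, margin, time_window)
        for h in headings
    ]


def nearest_safe_heading(desired, headings, mask):
    """Continuous execution (assumed): pick the safe candidate closest to the command."""
    def gap(h):
        d = h - desired
        return abs(math.atan2(math.sin(d), math.cos(d)))  # wrapped angular distance

    safe = [h for h, ok in zip(headings, mask) if ok]
    return min(safe, key=gap) if safe else None


if __name__ == "__main__":
    headings = [0.0, math.pi / 2, math.pi, -math.pi / 2]
    obstacles = [((10.0, 0.0), (0.0, 0.0), 1.0)]  # (position, velocity, radius)
    mask = safe_action_mask(headings, 1.0, (0.0, 0.0), 1.0, obstacles,
                            margin=0.5, time_window=10.0)
    print(mask, nearest_safe_heading(0.1, headings, mask))
```

In this toy setup, a stationary obstacle sits 10 m dead ahead, so only the heading pointing straight at it is masked out. Shrinking `time_window` to 5 s makes that heading admissible again, because the predicted contact at 7.5 s falls outside the window. In a PPO setting, a mask like this would typically restrict which actions the policy may sample or execute. How VO-PPO couples the mask to the policy update is not specified in the abstract.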

     

    Figures(13)  / Tables(5)
