
Adaptive Multi-Objective Optimization-Based Coverage Path Planning Method for UUVs

ZHAO Shaojing, FU Songchen, BAI Letian, GUO Yutong, LI Ta

Citation: ZHAO Shaojing, FU Songchen, BAI Letian, GUO Yutong, LI Ta. Adaptive Multi-Objective Optimization-Based Coverage Path Planning Method for UUVs[J]. Journal of Unmanned Undersea Systems, 2025, 33(3): 459-472. doi: 10.11993/j.issn.2096-3920.2025-0031


doi: 10.11993/j.issn.2096-3920.2025-0031
Details
    Author biography:

    ZHAO Shaojing (1998-), male, Ph.D. candidate; his research focuses on autonomous decision-making algorithms for underwater vehicles and their applications

  • CLC number: TJ630; U674.941


  • Abstract: Coverage path planning is a key task for unmanned underwater vehicles (UUVs) operating in unknown marine environments. Because of environmental uncertainty, motion constraints, and energy limits, traditional path planning methods struggle to adapt to complex scenarios. This paper proposes a coverage path planning method for UUVs based on adaptive multi-objective optimization. The method combines the proximal policy optimization (PPO) reinforcement learning algorithm with a dynamic weight adjustment mechanism: through correlation analysis of the reward objectives and linear regression estimation, it adaptively adjusts the weights of the different optimization objectives, enabling the UUV to autonomously plan efficient coverage paths among unknown obstacles and ocean currents. To validate the method, a UUV motion and sonar detection model was built in a two-dimensional simulation environment, with the UUV motion model simplified from six-degree-of-freedom rigid-body motion to planar motion, and comparative experiments were carried out under various obstacle distributions and random current conditions. The results show that, compared with traditional methods, the proposed method improves coverage while also optimizing key metrics such as task completion rate, trajectory length, energy consumption, and information delay: coverage increases by 4.03%, task completion rate by 10%, and utility by 10.96%, while task completion time is shortened by 14.13%, trajectory length by 26.85%, energy consumption by 10.3%, and information delay by 19.34%. These results demonstrate that the method markedly improves the adaptability and robustness of UUVs in complex environments and offers a new optimization strategy for autonomous underwater survey tasks.
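    The adaptive weighting mechanism sketched in the abstract lends itself to a short illustration. The Python snippet below is not the authors' implementation; it is a minimal sketch, assuming a seven-entry step reward (one entry per objective in Table 1), of how episode utility could be regressed on per-objective episode returns to produce the scalarization weights that feed PPO. The class name AdaptiveWeights, the sliding window, and the weight normalization are all illustrative assumptions.

        import numpy as np

        # One reward component per optimization objective, as listed in Table 1.
        OBJECTIVES = ["time", "sonar", "detection", "boundary",
                      "obstacle", "speed", "turning"]

        class AdaptiveWeights:
            """Sketch: estimate objective weights by regressing episode utility
            on per-objective episode returns, then use them to scalarize the
            multi-objective step reward for a standard PPO learner."""

            def __init__(self, n_objectives: int, window: int = 64):
                self.window = window                    # recent episodes kept
                self.episode_returns = []               # per-objective sums
                self.episode_utilities = []             # scalar utility per episode
                self.w = np.full(n_objectives, 1.0 / n_objectives)  # start uniform

            def record_episode(self, objective_returns, utility: float) -> None:
                self.episode_returns.append(np.asarray(objective_returns, float))
                self.episode_utilities.append(float(utility))
                self.episode_returns = self.episode_returns[-self.window:]
                self.episode_utilities = self.episode_utilities[-self.window:]
                if len(self.episode_returns) > len(self.w):  # enough data to fit
                    X = np.stack(self.episode_returns)
                    y = np.asarray(self.episode_utilities)
                    # Centered least squares: each coefficient measures how
                    # strongly that objective's return moves the episode utility.
                    coef, *_ = np.linalg.lstsq(X - X.mean(0), y - y.mean(),
                                               rcond=None)
                    self.w = coef / (np.abs(coef).sum() + 1e-8)

            def scalarize(self, reward_vector) -> float:
                """Collapse the multi-objective step reward to the PPO reward."""
                return float(self.w @ np.asarray(reward_vector, float))

        # Toy check: if utility is driven by detection and obstacle avoidance,
        # the fitted weights concentrate on those two objectives.
        rng = np.random.default_rng(0)
        aw = AdaptiveWeights(len(OBJECTIVES))
        for _ in range(32):
            g = rng.normal(size=len(OBJECTIVES))
            aw.record_episode(g, utility=0.3 * g[2] + 0.3 * g[4])
        print(dict(zip(OBJECTIVES, np.round(aw.w, 3))))

    Signed coefficients fall out of the regression naturally, which is consistent with the negative average weights that Table 1 reports for the time, sonar, and turning objectives.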

     

  • Figure 1. Building process of simulation environment map

    Figure 2. Overall structure of the model

    Figure 3. Structure of the feature extractor

    Figure 4. Structure of the policy model

    Figure 5. Structure of the value function model

    Figure 6. Curves of indicators during training

    Figure 7. Examples of rectangular obstacle scenarios

    Figure 8. Examples of irregular obstacle scenarios

    Figure 9. Examples of path planning results

    Figure 10. Adaptive weight curves during training

    Table 1. Average weight of each objective

    Objective  | Average weight
    Time       | −0.013 359
    Sonar      | −0.002 343
    Detection  |  0.301 260
    Boundary   |  0.020 247
    Obstacle   |  0.307 023
    Speed      |  0.194 401
    Turning    | −0.007 376
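    As a worked example of how these averages act on a single step, the snippet below scalarizes a hypothetical seven-entry reward vector with the Table 1 weights; the reward values themselves are invented for the demonstration. Detection and obstacle avoidance dominate the scalar reward, while time, sonar, and turning contribute small penalties.

        import numpy as np

        # Average weights from Table 1, in row order:
        # time, sonar, detection, boundary, obstacle, speed, turning.
        w = np.array([-0.013359, -0.002343, 0.301260, 0.020247,
                      0.307023, 0.194401, -0.007376])

        # Hypothetical per-objective step reward (illustrative values only).
        r = np.array([1.0, 0.5, 2.0, 0.0, 1.0, 0.8, 0.3])

        print(f"scalarized step reward: {w @ r:.4f}")  # ≈ 1.0483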

    Table 2. Test results in rectangular obstacle scenarios

    Method           | Obstacles known | Coverage/%↑ | Collision rate/%↓ | Trajectory length/km↓ | Energy consumption/%↓ | Information delay/h↓ | Utility↑ | Task completion rate/%↑ | Task completion time/h↓
    AdaptiveW-MO-PPO | — | 96.66 | 0  | 65.11 | 70.23 | 1.73 | 0.71 | 100 | 4.20
    AdaptiveW-MO-PPO | — | 97.15 | 0  | 62.42 | 67.40 | 1.69 | 0.72 | 100 | 4.02
    AveragedW-MO-PPO | — | 94.30 | 10 | 67.12 | 71.45 | 1.72 | 0.62 | 85  | 4.32
    AveragedW-MO-PPO | — | 94.74 | 10 | 67.37 | 71.89 | 1.75 | 0.61 | 85  | 4.38
    EqualW-MO-PPO    | — | 55.18 | 45 | —     | —     | —    | 0.15 | 0   | —
    EqualW-MO-PPO    | — | 40.86 | 65 | —     | —     | —    | 0.09 | 0   | —
    AveragedW-SO-PPO | — | 79.55 | 10 | 78.36 | 93.46 | 2.38 | 0.44 | 15  | 6.40
    AveragedW-SO-PPO | — | 76.66 | 15 | 75.30 | 89.37 | 2.41 | 0.43 | 30  | 6.06
    PPO              | — | 82.30 | 0  | 69.61 | 80.01 | 2.00 | 0.50 | 40  | 5.11
    PPO              | — | 85.42 | 5  | 73.11 | 84.10 | 2.05 | 0.51 | 45  | 5.20
    A*               | — | 93.20 | 0  | 90.61 | 79.99 | 2.12 | 0.65 | 95  | 5.00

    Table 3. Test results in irregular obstacle scenarios

    Method           | Obstacles known | Coverage/%↑ | Collision rate/%↓ | Trajectory length/km↓ | Energy consumption/%↓ | Information delay/h↓ | Utility↑ | Task completion rate/%↑ | Task completion time/h↓
    AdaptiveW-MO-PPO | — | 96.69 | 0  | 65.92 | 71.62 | 1.65 | 0.71 | 100 | 4.29
    AdaptiveW-MO-PPO | — | 96.84 | 0  | 68.47 | 73.70 | 1.81 | 0.71 | 100 | 4.46
    AveragedW-MO-PPO | — | 94.49 | 0  | 69.65 | 74.43 | 1.83 | 0.67 | 90  | 4.53
    AveragedW-MO-PPO | — | 95.02 | 0  | 68.73 | 73.27 | 1.79 | 0.68 | 90  | 4.48
    EqualW-MO-PPO    | — | 56.11 | 25 | —     | —     | —    | 0.19 | 0   | —
    EqualW-MO-PPO    | — | 38.17 | 75 | —     | —     | —    | 0.06 | 0   | —
    AveragedW-SO-PPO | — | 73.96 | 0  | 83.21 | 98.16 | 2.45 | 0.43 | 5   | 6.58
    AveragedW-SO-PPO | — | 74.17 | 5  | 73.33 | 86.84 | 2.47 | 0.42 | 20  | 5.86
    PPO              | — | 76.88 | 0  | 73.80 | 84.99 | 2.04 | 0.44 | 15  | 5.54
    PPO              | — | 79.92 | 0  | 73.07 | 83.95 | 2.12 | 0.47 | 15  | 5.77
    A*               | — | 92.09 | 0  | 88.55 | 78.18 | 2.07 | 0.63 | 85  | 4.89

    Table 4. Performance indicators of the example

    Method           | Coverage/%↑ | Collision | Trajectory length/km↓ | Energy consumption/%↓ | Information delay/h↓ | Utility↑ | Task completion time/h↓
    AdaptiveW-MO-PPO | 99.06 | — | 76.69 | 71.16 | 1.43 | 0.79 | 4.28
    AveragedW-MO-PPO | 98.83 | — | 94.87 | 83.49 | 2.24 | 0.75 | 5.14
    EqualW-MO-PPO    | 15.38 | — | —     | —     | —    | 0    | —
    AveragedW-SO-PPO | 80.62 | — | —     | 84.36 | —    | 0.46 | —
    PPO              | 76.62 | — | —     | 87.40 | —    | 0.41 | —
    A*               | 86.38 | — | —     | 79.69 | —    | 0.58 | —
Publication history
  • Received: 2025-02-25
  • Revised: 2025-03-19
  • Accepted: 2025-03-25
  • Published online: 2025-06-03
