Citation: Xiao Wenwen, Cai Qianya, Mao Lifu, Lin Yuan, Zhao Yuan, Wang Mianjin. A Double-Layer Autonomous Decision-Making Method Based on Expert Knowledge and Deep Reinforcement Learning[J]. Journal of Unmanned Undersea Systems. doi: 10.11993/j.issn.2096-3920.2025-0098

A Double-Layer Autonomous Decision-Making Method Based on Expert Knowledge and Deep Reinforcement Learning

doi: 10.11993/j.issn.2096-3920.2025-0098
  • Received Date: 2025-07-30
  • Revised Date: 2025-09-23
  • Accepted Date: 2025-10-09
  • Available Online: 2026-01-05
The underwater environment is complex and dynamic, so underwater unmanned systems must cope with unpredictability and incomplete perception, which makes accurate and efficient autonomous decision-making difficult. Traditional methods rely heavily on complete perception data and map information. Because the underwater environment changes dynamically, however, effective maps are difficult to construct in real time, which limits the efficiency of underwater unmanned systems in tasks such as underwater detection, resource exploration, and environmental monitoring. To address these challenges, this paper proposes a double-layer decision-making method based on expert knowledge and deep reinforcement learning, intended to improve both the adaptability of unmanned systems in underwater intelligent decision-making and the efficiency of task execution. The paper makes three contributions. First, an autonomous decision-making strategy generation method is proposed to improve the adaptability of underwater unmanned systems in unknown scenarios and thereby strengthen their autonomous decision-making in complex environments. Second, a double-layer autonomous decision-making method is put forward that improves system robustness and thereby helps ensure navigation safety. Third, a multi-module design method is proposed that decouples the functional modules and improves the research and development efficiency of underwater unmanned systems.
Taking the unmanned underwater vehicle (UUV) as the research object, simulation experiments on UUV autonomous navigation and obstacle avoidance show that the proposed method outperforms several benchmark methods in both success rate and convergence speed of the average reward, providing a basis for autonomous decision-making in real-world scenarios.
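
The abstract does not say how the two layers interact, so the following is only an illustrative sketch under one common assumption: a learned policy proposes a heading change, and an expert-knowledge rule overrides it when an obstacle comes within a safety radius. All names here (`Observation`, `expert_rule`, `double_layer_decide`, `stub_policy`), the sign convention, and the numeric thresholds are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Observation:
    """Hypothetical, simplified UUV observation (not the paper's state space)."""
    obstacle_distance: float  # metres to the nearest detected obstacle
    obstacle_bearing: float   # radians; >= 0 means obstacle to starboard (assumed convention)
    goal_bearing: float       # radians to the goal, relative to current heading


def expert_rule(obs: Observation,
                safety_radius: float = 5.0,
                max_turn: float = 0.5) -> Optional[float]:
    """Expert-knowledge layer: return an avoidance turn if the rule fires, else None."""
    if obs.obstacle_distance < safety_radius:
        # Turn away from the side the obstacle is on.
        return -max_turn if obs.obstacle_bearing >= 0 else max_turn
    return None


def double_layer_decide(obs: Observation,
                        learned_policy: Callable[[Observation], float],
                        safety_radius: float = 5.0) -> Tuple[float, str]:
    """Return (heading change, name of the layer that produced it)."""
    override = expert_rule(obs, safety_radius)
    if override is not None:
        return override, "expert"
    return learned_policy(obs), "policy"


def stub_policy(obs: Observation) -> float:
    """Stand-in for a trained deep reinforcement learning actor: steer toward the goal."""
    return max(-0.5, min(0.5, obs.goal_bearing))


if __name__ == "__main__":
    far = Observation(obstacle_distance=20.0, obstacle_bearing=0.1, goal_bearing=0.3)
    near = Observation(obstacle_distance=2.0, obstacle_bearing=0.1, goal_bearing=0.3)
    print(double_layer_decide(far, stub_policy))
    print(double_layer_decide(near, stub_policy))
```

In this sketch the rule layer acts as a safety filter, so the learned policy only controls the vehicle while no obstacle is inside the safety radius. Other arrangements, such as using expert knowledge to shape the reward or to seed training, would fit the abstract's description equally well.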

     
