Tracking Control for AUV by Combining Q Learning and a PID Controller

Yan Jing; Li Wenbiao; Yang Xian; Li Xinglong; Luo Xiaoyuan

doi:10.11993/j.issn.2096-3920.2021.05.008

School of Electrical Engineering, Yanshan University, Qinhuangdao 066004, Hebei, China

Funding: Key Program of the National Natural Science Foundation of China (Grant No. 62033011)

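Author biography:
Yan Jing (1985-), male, professor and doctoral supervisor. Research interests: cooperative monitoring with underwater robots and sensor networks.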

Chinese Library Classification (CLC) numbers: TJ630.33; TP273.2
Publication history
- Received: 2020-10-27
- Revised: 2020-12-16
- Published: 2021-10-31

Tracking Control for AUV by Combining Q Learning and a PID Controller

School of Electrical Engineering, Yanshan University, Qinhuangdao 066004, China

Abstract: To further improve the tracking control performance of autonomous underwater vehicles (AUVs), this study develops a tracking control algorithm that combines Q learning with a proportional-integral-derivative (PID) controller. First, a PID-based tracking control algorithm is constructed from the tracking error of the AUV. To improve the static and dynamic tracking performance, the adaptive adjustment of the PID controller parameters is then formulated as a Q learning problem, and an action update strategy is employed to iteratively optimize the Q values in different states until the Q value of every state-action pair no longer changes. Compared with a traditional PID controller, the proposed algorithm preserves the simplicity and practicality of PID control while adaptively adjusting the parameters as the environment information changes. Simulation and experimental results both confirm the effectiveness of the proposed algorithm.
- autonomous underwater vehicle (AUV) /
- tracking control /
- Q learning /
- proportional-integral-derivative (PID) controller
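
The idea summarized in the abstract (a PID law driven by the tracking error, with gain adjustment treated as a Q learning problem and iterated until the Q values stop changing) can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the first-order toy plant stands in for the AUV dynamics, only `Kp` is adjusted over a hypothetical grid `KP_GRID` while `KI` and `KD` stay fixed, the reward is the negative accumulated tracking error, and the Q update is applied in exhaustive sweeps over all state-action pairs (possible because the toy environment is deterministic) rather than with an exploration policy.

```python
"""Toy sketch: tabular Q learning that selects a PID gain from a small grid."""

KP_GRID = [0.5, 1.0, 2.0, 4.0, 8.0]  # hypothetical candidate values for Kp
KI, KD = 1.0, 0.05                   # held fixed here to keep the Q table tiny
ACTIONS = (-1, 0, 1)                 # lower Kp, keep Kp, raise Kp (grid index)


def simulate_pid(kp, ki, kd, target=1.0, steps=200, dt=0.05):
    """Track a constant target with a discrete PID loop.

    The plant x' = -x + u is a stand-in, not an AUV model.
    Returns the accumulated absolute tracking error (lower is better).
    """
    x, integral, prev_err, cost = 0.0, 0.0, target, 0.0
    for _ in range(steps):
        err = target - x
        integral += err * dt
        derivative = (err - prev_err) / dt
        u = kp * err + ki * integral + kd * derivative
        x += (-x + u) * dt           # forward-Euler step of the toy plant
        prev_err = err
        cost += abs(err) * dt
    return cost


# The environment is deterministic, so each gain's cost is computed once.
COSTS = [simulate_pid(kp, KI, KD) for kp in KP_GRID]


def step(state, action):
    """Move along the gain grid (clipped at the ends); reward is -cost."""
    next_state = min(max(state + action, 0), len(KP_GRID) - 1)
    return next_state, -COSTS[next_state]


def learn_q(alpha=0.5, gamma=0.9, tol=1e-6, max_sweeps=5000):
    """Apply the Q learning update to every state-action pair, sweep after
    sweep, until no entry changes by more than tol."""
    q = [[0.0] * len(ACTIONS) for _ in KP_GRID]
    sweep = 0
    for sweep in range(1, max_sweeps + 1):
        biggest_change = 0.0
        for s in range(len(KP_GRID)):
            for a_idx, a in enumerate(ACTIONS):
                s_next, r = step(s, a)
                change = alpha * (r + gamma * max(q[s_next]) - q[s][a_idx])
                q[s][a_idx] += change
                biggest_change = max(biggest_change, abs(change))
        if biggest_change < tol:
            break
    return q, sweep


def greedy_gain(q, start=0):
    """Follow the greedy policy from a start index until it chooses 'keep'."""
    state = start
    for _ in range(len(KP_GRID)):
        best = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        if ACTIONS[best] == 0:
            break
        state = min(max(state + ACTIONS[best], 0), len(KP_GRID) - 1)
    return state


if __name__ == "__main__":
    q_table, sweeps = learn_q()
    idx = greedy_gain(q_table)
    print(f"converged after {sweeps} sweeps; selected Kp = {KP_GRID[idx]}")
```

In this sketch the state is simply the current gain index, which keeps the Q table small enough to sweep exhaustively. A controller in the spirit of the abstract would presumably derive the state from the tracking error and adjust all three gains online, so the sketch shows only the update-until-unchanged loop, not the paper's design.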
