Underwater Visual Multi-Target Tracking Algorithm Integrating Re-parameterization and Attention Mechanism

LI Junyi; HE Mingle; LIU Chang; XU Yong

doi:10.11993/j.issn.2096-3920.2025-0012

Volume 33 Issue 2

May 2025

LI Junyi, HE Mingle, LIU Chang, XU Yong. Underwater Visual Multi-Target Tracking Algorithm Integrating Re-parameterization and Attention Mechanism[J]. Journal of Unmanned Undersea Systems, 2025, 33(2): 249-260. doi: 10.11993/j.issn.2096-3920.2025-0012

LI Junyi^{1, 2, 3},
HE Mingle^{1, 2, 3},
LIU Chang^{1, 2, 3},
XU Yong^{1, 2, 3}

1. School of Automation, Guangdong University of Technology, Guangzhou 510006, China
2. Guangdong-Hong Kong Joint Laboratory for Intelligent Decision and Cooperative Control, Guangdong University of Technology, Guangzhou 510006, China
3. Guangdong Provincial Key Laboratory of Intelligent Decision and Cooperative Control, Guangdong University of Technology, Guangzhou 510006, China

Received Date: 2025-01-15
Revised Date: 2025-02-20
Accepted Date: 2025-02-25

Available Online: 2025-03-10

Abstract

The complex underwater environment can severely degrade both the stability of imaging devices and the quality of captured images, which makes visual multi-target tracking difficult for underwater unmanned autonomous systems. To address the problems caused by underwater camera jitter and image degradation, this paper proposes an underwater visual multi-target tracking algorithm that integrates re-parameterization and attention mechanisms and is tailored to underwater unmanned autonomous systems. First, to handle the diversity of underwater targets and image degradation, an improved YOLOv8 algorithm based on re-parameterization and an attention mechanism (RA-YOLOv8) is proposed. By integrating a structurally re-parameterized multi-scale feature extraction convolutional structure (DBB-RFAConv) and the AMSCE attention mechanism, RA-YOLOv8 strengthens the network's multi-scale feature extraction capability and improves detection accuracy. Second, to address the difficulty of real-time target tracking under underwater camera jitter, an Inner-PIoUv2-enhanced ByteTrack algorithm (IP2-ByteTrack) is proposed. Inner-PIoUv2 serves as the similarity measure in the matching step of the tracker, which improves the accuracy of trajectory matching in underwater detection and tracking tasks. Finally, RA-YOLOv8 and IP2-ByteTrack are combined into a complete underwater visual multi-target tracking algorithm for underwater autonomous systems. Experimental results show that the proposed algorithm performs well in complex underwater environments and addresses shortcomings of existing methods in underwater multi-target tracking.
Keywords:
- underwater visual
- multi-target tracking
- YOLO
- ByteTrack
- re-parameterization
- attention mechanism
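
The abstract describes replacing the similarity measure in ByteTrack's matching step with Inner-PIoUv2. As a rough illustration of the idea only, the sketch below computes the plain Inner-IoU term: each box is scaled about its own center by a ratio, and IoU is taken on the scaled auxiliary boxes. It is not the paper's Inner-PIoUv2 and omits the PIoUv2 penalty entirely. The names `inner_iou`, `cost_matrix`, and `ratio`, the default value 0.75, and the `(x1, y1, x2, y2)` box layout are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def _scaled(box: Box, ratio: float) -> Box:
    """Scale a box about its own center by `ratio` (the auxiliary box)."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w, half_h = (x2 - x1) * ratio / 2.0, (y2 - y1) * ratio / 2.0
    return cx - half_w, cy - half_h, cx + half_w, cy + half_h


def inner_iou(a: Box, b: Box, ratio: float = 0.75) -> float:
    """IoU of the two auxiliary boxes; ratio = 1.0 gives the ordinary IoU."""
    ax1, ay1, ax2, ay2 = _scaled(a, ratio)
    bx1, by1, bx2, by2 = _scaled(b, ratio)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def cost_matrix(tracks: List[Box], detections: List[Box],
                ratio: float = 0.75) -> List[List[float]]:
    """Association cost (1 - similarity) for every track/detection pair."""
    return [[1.0 - inner_iou(t, d, ratio) for d in detections] for t in tracks]
```

A ratio above 1 enlarges the auxiliary boxes, so two boxes that are displaced just far enough to stop overlapping can still receive a non-zero similarity. A tracker would then pass a cost matrix of this kind to its assignment step.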
