Underwater Object Detection Method with Wavelet Transform for Optimized Feature Representation

WEI Nan; YANG Wankou; ZHOU Weijie; JIANG Longyu

doi:10.11993/j.issn.2096-3920.2025-0003

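1. College of Software Engineering, Southeast University, Nanjing 211102, China
2. School of Automation, Southeast University, Nanjing 210096, China
3. School of Computer Science and Engineering, Southeast University, Nanjing 211102, China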

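Funding: National Natural Science Foundation of China (61871124, 61876037); National Defense Pre-Research Foundation supported by the Sonar Science and Technology Laboratory (6142109KF201806); National Defense Pre-Research Foundation with stable support from the Acoustic Science and Technology Laboratory (JCKYS2019604SSJSSO12).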

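Corresponding author: JIANG Longyu (born 1982), female, professor and doctoral supervisor. Her main research interests include underwater acoustic signal and image processing, artificial intelligence, and big data.

CLC number: TP391.41; TJ630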
Publication history
- Received: 2025-01-06
- Revised: 2025-02-18
- Accepted: 2025-02-25
- Published online: 2025-03-24

Abstract: The complex underwater environment degrades image quality: underwater images suffer from low contrast, blur, and other forms of underwater degradation, which severely limits the performance of underwater object detection. To address this problem, this paper proposes an underwater object detection method based on wavelet transform feature enhancement. The discrete wavelet transform (DWT) is introduced to decompose the multi-level features extracted by the deep learning framework into high- and low-frequency components. These frequency-domain components are then interactively enhanced by an attention-based frequency-domain interaction module designed in this work, which improves the feature representation. The enhanced features are passed on to the object detection network to improve detection performance. Experiments show that the proposed method holds an advantage over common object detection methods and effectively improves underwater object detection.

Keywords: underwater object detection; deep learning; wavelet transform
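The high/low-frequency decomposition described in the abstract can be illustrated with a single-level 2D Haar transform. The sketch below is only an illustration under stated assumptions: the page does not give the wavelet basis or normalization (the reference list cites Haar, so the Haar basis is assumed here), it operates on one channel of a feature map stored as nested lists, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical. The LH/HL naming convention also varies between libraries.

```python
def haar_dwt2(x):
    """Single-level 2D Haar DWT of an H x W map (H and W must be even).

    Returns four (H/2) x (W/2) subbands: LL, LH, HL, HH.
    """
    h, w = len(x), len(x[0])
    assert h % 2 == 0 and w % 2 == 0, "sketch assumes even height and width"
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, h, 2):
        r_ll, r_lh, r_hl, r_hh = [], [], [], []
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            r_ll.append((a + b + c + d) / 2)  # local average (low frequency)
            r_lh.append((a + b - c - d) / 2)  # difference between the two rows
            r_hl.append((a - b + c - d) / 2)  # difference between the two columns
            r_hh.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(r_ll)
        lh.append(r_lh)
        hl.append(r_hl)
        hh.append(r_hh)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the original H x W map exactly."""
    out = []
    for r_ll, r_lh, r_hl, r_hh in zip(ll, lh, hl, hh):
        top, bottom = [], []
        for s, t, u, v in zip(r_ll, r_lh, r_hl, r_hh):
            top += [(s + t + u + v) / 2, (s + t - u - v) / 2]
            bottom += [(s - t + u - v) / 2, (s - t - u + v) / 2]
        out += [top, bottom]
    return out


feature = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
LL, LH, HL, HH = haar_dwt2(feature)
restored = haar_idwt2(LL, LH, HL, HH)
```

Because the transform is invertible, the four subbands together carry the same information as the input at half the spatial resolution, so each subband can be processed separately and the result mapped back without loss.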

Figure 1. Overall architecture of the algorithm model

Figure 2. Structure of the wavelet transform multi-scale feature decomposition module

Figure 3. Structure of the attention-based frequency-domain enhancement module

Table 1. Performance comparison of object detection methods

Method	GFLOPs	Params (M)	mAP/%	AP₅₀/%	AP₇₅/%	APs/%	APm/%	APl/%
FCOS [16]	51.730	32.155	39.2	62.4	43.6	25.5	36.4	37.8
Faster RCNN [14]	63.583	41.753	46.5	72.0	51.1	30.7	43.4	48.2
Reppoints [18]	48.581	36.845	33.6	54.8	35.8	21.1	31.0	35.6
GFL [17]	52.522	32.300	49.3	71.0	55.9	27.7	44.0	52.9
FSAF [19]	52.693	36.420	34.9	56.3	37.0	28.9	32.1	38.4
ATSS [20]	51.730	32.155	50.0	73.6	56.7	31.5	44.2	54.6
RetinaNet [15]	61.141	37.969	36.5	57.3	40.2	20.3	32.6	36.9
FCOS+WFDE	51.751	32.234	45.3 (+6.1)	72.7	51.9	25.2	42.3	46.1
GFL+WFDE	52.543	32.379	51.0 (+1.7)	74.1	58.4	29.3	44.3	55.3
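The cost of adding WFDE can be read off Table 1 by subtracting each baseline row from its "+WFDE" row. The short script below does that arithmetic; the figures are copied from the table, while the dictionary layout and the function name `wfde_delta` are illustrative choices rather than anything from the paper.

```python
# (GFLOPs, Params in M, mAP in %) copied from Table 1.
baseline = {"FCOS": (51.730, 32.155, 39.2), "GFL": (52.522, 32.300, 49.3)}
with_wfde = {"FCOS": (51.751, 32.234, 45.3), "GFL": (52.543, 32.379, 51.0)}


def wfde_delta(name):
    """Return (extra GFLOPs, extra Params in M, mAP gain) for one detector."""
    g0, p0, m0 = baseline[name]
    g1, p1, m1 = with_wfde[name]
    # Round to the precision used in the table to hide float noise.
    return round(g1 - g0, 3), round(p1 - p0, 3), round(m1 - m0, 1)


deltas = {name: wfde_delta(name) for name in baseline}
for name, (gflops, params, gain) in deltas.items():
    print(f"{name}: +{gflops} GFLOPs, +{params} M params, +{gain} mAP")
```

For both detectors the module adds 0.021 GFLOPs and 0.079 M parameters, while the mAP gain is 6.1 points for FCOS and 1.7 points for GFL.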

Table 2. Experimental results on other underwater datasets

Dataset	Method	mAP/%	AP₅₀/%	AP₇₅/%
UTDAC	RetinaNet [15]	37.4	70.3	35.8
	Sparse RCNN [22]	37.4	70.4	35.9
	Proposed method	42.8	78.6	42.4
UODD [21]	Cascade RCNN [23]	46.3	85.3	43.8
	Faster RCNN [14]	45.5	85.0	43.8
	Proposed method	50.6	87.2	55.5

Table 3. Ablation results for the subband interaction combinations

Baseline model	HH/LL interaction	LH/HL interaction	mAP/%
FCOS [16]	×	×	39.2
	×	√	42.8 (+3.6)
	√	×	41.5 (+2.3)
	√	√	45.3 (+6.1)

Table 4. Ablation results for attention weight generation

Baseline model	K exchanged	V exchanged	mAP/%
FCOS [16]	×	×	39.2
	×	√	41.9 (+2.7)
	√	×	43.6 (+4.4)
	√	√	45.3 (+6.1)
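Table 4 ablates whether the keys (K) and values (V) are exchanged between the two subbands of an interacting pair. One plausible reading is a cross-attention in which each subband queries the other, sketched below. This is an assumption-laden illustration, not the module from the paper: the learned Q/K/V projections, multi-head structure, and any residual connection are omitted, tokens are plain lists of floats, and the name `exchange_attention` and its `swap_k`/`swap_v` flags are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(r, col)) for col in zip(*b)] for r in a]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)


def exchange_attention(a, b, swap_k=True, swap_v=True):
    """a, b: N x d token matrices from two subbands (e.g. HH and LL).

    With swap_k / swap_v set, each subband takes its keys / values from
    the other subband; with both unset this reduces to self-attention.
    """
    out_a = attention(a, b if swap_k else a, b if swap_v else a)
    out_b = attention(b, a if swap_k else b, a if swap_v else b)
    return out_a, out_b


hh_tokens = [[1.0, 0.0], [0.0, 1.0]]
ll_tokens = [[2.0, 3.0], [2.0, 3.0]]
out_hh, out_ll = exchange_attention(hh_tokens, ll_tokens)
```

The four flag combinations correspond to the four rows of Table 4, with both flags off being the configuration without any exchange.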

References (23)

[1]	FU C P, LIU R S, FAN X, et al. Rethinking general underwater object detection: Datasets, challenges, and solutions[J]. Neurocomputing, 2023, 517: 243-256. doi: 10.1016/j.neucom.2022.10.039
[2]	LIN W H, ZHONG J X, LIU S, et al. Roimix: Proposal-fusion among multiple images for underwater object detection[C]//ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Barcelona, Spain: IEEE, 2020: 2588-2592.
[3]	CHEN L, LIU Z, TONG L, et al. Underwater object detection using invert multi-class adaboost with deep learning[C]//2020 International Joint Conference on Neural Networks (IJCNN). IEEE, 2020: 1-8.
[4]	LIANG X, SONG P. Excavating roi attention for underwater object detection[C]//2022 IEEE International Conference on Image Processing (ICIP). Bordeaux, France: IEEE, 2022: 2651-2655.
[5]	SONG P, LI P, DAI L, et al. Boosting R-CNN: Reweighting R-CNN samples by RPN’s error for underwater object detection[J]. Neurocomputing, 2023, 530: 150-164.
[6]	FU C, FAN X, XIAO J, et al. Learning heavily-degraded prior for underwater object detection[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(11): 6887-6896.
[7]	ZHOU J, HE Z, LAM K M, et al. AMSP-UOD: When vortex convolution and stochastic perturbation meet underwater object detection[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Vancouver, Canada: AAAI, 2024, 38(7): 7659-7667.
[8]	HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. Las Vegas, Nevada, USA: IEEE, 2016: 770-778.
[9]	SUN S, REN W, WANG T, et al. Rethinking image restoration for object detection[J]. Advances in Neural Information Processing Systems, 2022, 35: 4461-4474.
[10]	HAAR A. Zur Theorie der orthogonalen Funktionensysteme (Zweite Mitteilung)[J]. Mathematische Annalen, 1912, 71: 38-53.
[11]	LIU C, LI H, WANG S, et al. A dataset and benchmark of underwater object detection for robot picking[C]//2021 IEEE international conference on multimedia & expo workshops (ICMEW). Shenzhen, China: IEEE, 2021: 1-6.
[12]	PEDERSEN M, HAURUM J B, GADE R, et al. Detection of marine animals in a new underwater dataset with varying visibility[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Long Beach, California, USA: IEEE, 2019: 18-26.
[13]	DENG J, DONG W, SOCHER R, et al. Imagenet: A large-scale hierarchical image database[C]//2009 IEEE conference on computer vision and pattern recognition. Florida, USA: IEEE, 2009: 248-255.
[14]	REN S, HE K, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 39(6): 1137-1149.
[15]	LIN T Y, GOYAL P, GIRSHICK R, et al. Focal Loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017: 2980-2988.
[16]	TIAN Z, SHEN C, CHEN H, et al. FCOS: A simple and strong anchor-free object detector[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 44(4): 1922-1933.
[17]	LI X, WANG W, WU L, et al. Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection[J]. Advances in Neural Information Processing Systems, 2020, 33: 21002-21012.
[18]	YANG Z, LIU S, HU H, et al. Reppoints: Point set representation for object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, Korea: IEEE, 2019: 9657-9666.
[19]	ZHU C, HE Y, SAVVIDES M. Feature selective anchor-free module for single-shot object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, California, USA: IEEE, 2019: 840-849.
[20]	ZHANG S, CHI C, YAO Y, et al. Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, Washington, USA: IEEE, 2020: 9759-9768.
[21]	JIANG L, WANG Y, JIA Q, et al. Underwater species detection using channel sharpening attention[C]//Proceedings of the 29th ACM International Conference on Multimedia. Chengdu, China: ACM, 2021: 4259-4267.
[22]	SUN P, ZHANG R, JIANG Y, et al. Sparse R-CNN: End-to-end object detection with learnable proposals[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, Tennessee, USA: IEEE, 2021: 14454-14463.
[23]	CAI Z, VASCONCELOS N. Cascade R-CNN: Delving into high quality object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, Utah, USA: IEEE, 2018: 6154-6162.