Intelligent Perception Algorithms for Sonar Images: A Survey
Abstract: Intelligent perception algorithms for sonar images play a vital role in ocean exploration and underwater rescue. In recent years, deep learning has made remarkable progress on sonar image perception tasks. This paper reviews the field from three perspectives: sonar image datasets and data augmentation, classical sonar image processing algorithms, and deep learning-based sonar image processing methods. First, it summarizes the open-source datasets available for different tasks and the commonly used data augmentation techniques, providing data support for subsequent research. Second, it systematically reviews the application and development of algorithms across tasks, from classical approaches to advanced deep learning-based methods, to give researchers a systematic overview of the field. Finally, drawing on research progress in China and abroad, it outlines future research directions, pointing out that sonar image interpretation can be further improved by acquiring larger-scale sonar image data, designing more robust algorithms, and developing task settings better suited to real underwater scenes.
Table 1. Summary of existing sonar image classification datasets