Style Transfer-Based Augmentation for Side-Scan Sonar Images

BAI Zhongyu, XU Hongli, RU Jingyu, QIU Shaoxiong

Citation: BAI Zhongyu, XU Hongli, RU Jingyu, QIU Shaoxiong. Style Transfer-Based Augmentation for Side-Scan Sonar Images[J]. Journal of Unmanned Undersea Systems, 2025, 33(4): 599-605. doi: 10.11993/j.issn.2096-3920.2025-0045

doi: 10.11993/j.issn.2096-3920.2025-0045
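Funding: Young Scientists Fund of the National Natural Science Foundation of China (62303099).
About the author: BAI Zhongyu (born 1996), male, Ph.D.; main research interest: underwater intelligent perception.

  • CLC number: TJ630; U663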

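  • Abstract: Side-scan sonar (SSS) is widely used in ocean exploration because of its stability and efficiency on autonomous underwater vehicles (AUVs). However, SSS images are difficult to acquire and samples are scarce, which severely limits the performance of SSS image classification based on deep neural networks (DNNs). To address this problem, this paper proposes a multi-scale attention fusion network (MSANet) that augments the training data using optical and acoustic images to improve the generalization of SSS image classification models. First, shallow and deep features of the input images are extracted from different layers of the encoder to capture both content and style information. Next, a multi-scale attention module (MSAM) extracts the local and global context of the style image along the channel dimension and fuses it with the optical features, so that optical features at each spatial position are matched to the corresponding acoustic features. Finally, the fused features from the different layers are aligned in scale and fed into the decoder to generate high-quality simulated SSS image samples, which are then used to train the SSS image classification network. Experiments on a real SSS image dataset show that the proposed style transfer method generates high-quality simulated SSS image samples and thereby improves DNN-based SSS image classification performance.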
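The fusion step described in the abstract can be illustrated with a minimal sketch. This page does not give MSAM's actual layers, so everything below is an assumption made for illustration: the function names, the single linear channel-mixing map per branch, the sigmoid gate, and the convex blend are borrowed in spirit from attentional feature fusion (Ref. [9]) and are not the authors' implementation. Following the abstract's wording, the gate is computed from the acoustic (style) features alone, with a local branch that mixes channels at each pixel and a global branch that mixes channel averages.

```python
import math
import random


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def mix_channels(weights, vector):
    """Apply a C x C matrix to a length-C channel vector (a 1x1 convolution at one pixel)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def fuse_with_style_attention(optical, acoustic, w_local, w_global):
    """Blend two C x H x W feature maps using a gate derived from the acoustic map.

    Hypothetical sketch: the gate combines a local (per-pixel) and a global
    (channel-average) context of the acoustic features.
    """
    channels = len(acoustic)
    height = len(acoustic[0])
    width = len(acoustic[0][0])
    # Global context: average every channel over all positions, then mix channels.
    pooled = [sum(sum(row) for row in acoustic[c]) / (height * width)
              for c in range(channels)]
    global_ctx = mix_channels(w_global, pooled)
    fused = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for i in range(height):
        for j in range(width):
            # Local context: mix channels at this spatial position only.
            local_ctx = mix_channels(
                w_local, [acoustic[c][i][j] for c in range(channels)])
            for c in range(channels):
                gate = sigmoid(local_ctx[c] + global_ctx[c])
                # Convex blend: the gate decides how much optical content is kept.
                fused[c][i][j] = (gate * optical[c][i][j]
                                  + (1.0 - gate) * acoustic[c][i][j])
    return fused


# Toy inputs: random features and random (untrained) weights.
random.seed(0)
C, H, W = 3, 2, 2


def random_map():
    return [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)]
            for _ in range(C)]


optical = random_map()
acoustic = random_map()
w_local = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C)]
w_global = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C)]
fused = fuse_with_style_attention(optical, acoustic, w_local, w_global)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

Because the gate lies strictly between 0 and 1, every fused value stays between the optical and acoustic values at the same position. In a trained network the two weight matrices would be learned, and the published attentional feature fusion design places a bottleneck with normalization and a nonlinearity in each branch rather than a single linear map.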

     

  • Figure 1. Overall framework of the multi-scale attention network (MSANet)

    Figure 2. Schematic diagram of the multi-scale attention module (MSAM)

    Figure 3. Simulated SSS images generated by different methods

    Table 1. Experimental results of SSS image classification with different image augmentation methods

    Method            Number of images           Recognition accuracy/%
                      Aircraft     Shipwreck
    Optical images    8            98            77.37
    WCT               9            101           80.29
    Ref. [9]          13           105           86.13
    SANet             10           109           86.86
    Ref. [4]          12           111           89.78
    Proposed method   13           115           93.43

    Table 2. Classification results of classifiers trained with synthetic images generated by different methods

    Method            Number of images           Recognition accuracy/%
                      Aircraft     Shipwreck
    WCT               22           125           53.65
    Ref. [9]          25           133           57.67
    SANet             19           128           53.65
    Ref. [4]          19           179           72.26
    Proposed method   23           218           87.96

    Table 3. Classification results of different classifiers on real SSS images

    Method            Number of images           Recognition accuracy/%
                      Aircraft     Shipwreck
    VGG-19            21           209           83.94
    ResNet-152        23           218           87.96
    ViT-B/32          23           213           86.13
    EfficientNet-v2s  25           215           87.59
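The relationship between the image counts and the accuracy column in Tables 1 to 3 can be checked with a short calculation. One reading of the tables, which is not stated on this page, is that each count is the number of correctly classified test images for that class, and that accuracy is the sum of the two counts divided by the test-set size. The test-set sizes used below (137 for Table 1, 274 for Tables 2 and 3) are inferred from the reported numbers and should be treated as assumptions.

```python
# Counts and reported accuracies copied from Tables 1-3.
# ASSUMPTION: each count is the number of correctly classified test images,
# and the test-set sizes (137 and 274) are inferred, not given on this page.
TABLES = {
    "Table 1": (137, [
        ("Optical images", 8, 98, 77.37),
        ("WCT", 9, 101, 80.29),
        ("Ref. [9]", 13, 105, 86.13),
        ("SANet", 10, 109, 86.86),
        ("Ref. [4]", 12, 111, 89.78),
        ("Proposed method", 13, 115, 93.43),
    ]),
    "Table 2": (274, [
        ("WCT", 22, 125, 53.65),
        ("Ref. [9]", 25, 133, 57.67),
        ("SANet", 19, 128, 53.65),
        ("Ref. [4]", 19, 179, 72.26),
        ("Proposed method", 23, 218, 87.96),
    ]),
    "Table 3": (274, [
        ("VGG-19", 21, 209, 83.94),
        ("ResNet-152", 23, 218, 87.96),
        ("ViT-B/32", 23, 213, 86.13),
        ("EfficientNet-v2s", 25, 215, 87.59),
    ]),
}


def accuracy_percent(aircraft, shipwreck, total):
    """Share of the test set covered by the two class counts, in percent."""
    return 100.0 * (aircraft + shipwreck) / total


def max_gap(tables):
    """Largest absolute difference between recomputed and reported accuracy."""
    return max(abs(accuracy_percent(a, s, total) - reported)
               for total, rows in tables.values()
               for _, a, s, reported in rows)


for name, (total, rows) in TABLES.items():
    for method, a, s, reported in rows:
        recomputed = accuracy_percent(a, s, total)
        print(f"{name} {method}: {recomputed:.2f} (reported {reported})")
```

Under these assumptions every row agrees with the reported accuracy to within 0.01 percentage points, which supports the reading but does not confirm it.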
  • [1] HAO Z X, WANG Q. Underwater target detection based on sonar image[J]. Journal of Unmanned Undersea Systems, 2023, 31(2): 339-348.
    [2] ZHU Z T, FU X Z, HU Y F. A sonar image recognition method based on convolutional neural network trained through transfer learning[J]. Journal of Unmanned Undersea Systems, 2020, 28(1): 89-96.
    [3] ZHU B, WANG X, CHU Z, et al. Active learning for recognition of shipwreck target in side-scan sonar image[J]. Remote Sensing, 2019, 11(3): 243. doi: 10.3390/rs11030243
    [4] LI C, YE X, CAO D, et al. Zero shot objects classification method of side scan sonar image based on synthesis of pseudo samples[J]. Applied Acoustics, 2021, 173: 107691. doi: 10.1016/j.apacoust.2020.107691
    [5] NAYAK N, NARA M, GAMBIN T, et al. Machine learning techniques for AUV side-scan sonar data feature extraction as applied to intelligent search for underwater archaeological sites[C]//Field and Service Robotics: Results of the 12th International Conference. Singapore: Springer, 2021: 219-233.
    [6] JIANG Y, KU B, KIM W, et al. Side-scan sonar image synthesis based on generative adversarial network for images in multiple frequencies[J]. IEEE Geoscience and Remote Sensing Letters, 2020, 18(9): 1505-1509.
    [7] HUANG C, ZHAO J, YU Y, et al. Comprehensive sample augmentation by fully considering SSS imaging mechanism and environment for shipwreck detection under zero real samples[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 60: 1-14.
    [8] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2015-04-10)[2025-03-10]. https://arxiv.org/abs/1409.1556.
    [9] DAI Y, GIESEKE F, OEHMCKE S, et al. Attentional feature fusion[C]//2021 IEEE Winter Conference on Applications of Computer Vision(WACV). Waikoloa, USA: WACV, 2021: 3559-3568.
    [10] CAO Y, XU J, LIN S, et al. GCNet: Non-local networks meet squeeze-excitation networks and beyond[C]//2019 IEEE/CVF International Conference on Computer Vision Workshop(ICCVW). Seoul, Korea(South): ICCVW, 2019: 1971-1980.
    [11] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition(CVPR). Las Vegas, NV, USA: CVPR, 2016: 770-778.
    [12] PARK D Y, LEE K H. Arbitrary style transfer with style-attentional networks[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR). Long Beach, CA, USA: CVPR, 2019: 5873-5881.
    [13] NOMAN M, STANKOVIC V, TAWFIK A. Object detection techniques: Overview and performance comparison[C]//2019 IEEE International Symposium on Signal Processing and Information Technology(ISSPIT). Ajman, United Arab Emirates: ISSPIT, 2019: 1-5.
    [14] PHILLIPS F, MACKINTOSH B. Wiki Art Gallery, Inc.: A case for critical thinking[J]. Issues in Accounting Education, 2011, 26(3): 593-608. doi: 10.2308/iace-50038
    [15] LI Y, FANG C, YANG J, et al. Universal style transfer via feature transforms[J]. Advances in Neural Information Processing Systems, 2017, 30: 386-396.
    [16] HUANG X, BELONGIE S. Arbitrary style transfer in real-time with adaptive instance normalization[C]//2017 IEEE International Conference on Computer Vision(ICCV). Venice, Italy: ICCV, 2017: 1510-1519.
    [17] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: Transformers for image recognition at scale[EB/OL]. (2021-06-03) [2025-03-10]. https://arxiv.org/abs/2010.11929.
    [18] TAN M, LE Q. EfficientNetV2: Smaller models and faster training[EB/OL]. (2021-06-23) [2025-03-10]. https://arxiv.org/abs/2104.00298.
Publication history
  • Received: 2025-03-13
  • Revised: 2025-04-07
  • Accepted: 2025-04-16
  • Published online: 2025-07-07
