WANG Yudi, YANG Mingzhong, LIU Lixin. Ocean sound separation algorithm based on time-frequency interleaved attention[J]. Journal of Unmanned Undersea Systems. doi: 10.11993/j.issn.2096-3920.2025-0127

Ocean sound separation algorithm based on time-frequency interleaved attention

doi: 10.11993/j.issn.2096-3920.2025-0127
  • Received Date: 2025-09-15
  • Accepted Date: 2025-10-10
  • Revised Date: 2025-09-30
  • Available Online: 2026-01-14
  • Abstract: Ocean sound carries rich information, but the complex ocean acoustic environment and the variable characteristics of underwater target signals seriously challenge the fine-grained perception and discrimination of underwater acoustic features. To faithfully recover the underwater target sound of interest, this paper proposes an ocean sound source separation algorithm based on an integrated filtration module (IFM). The model adopts a band-split strategy: an encoder converts the mixed audio into a time-frequency spectrogram, a multi-scale attention mechanism extracts time-domain and frequency-domain gains in an interleaved manner, and the proposed IFM further improves the separation of underwater sound. Specifically, the IFM uses an adaptive weighting mechanism to efficiently fuse the features extracted by a multi-scale convolutional spatial-filtering pathway and a self-attention feature-dependency pathway with the original features, and feeds the fused features into a decoder to reconstruct high-quality, clean target audio, enhancing the details of the target signal while effectively filtering out background noise and interference. Experimental results on typical ocean sound datasets show that the proposed algorithm significantly improves separation performance for the target of interest: the SDRi reaches 8.56 dB and 10.74 dB in separation experiments on humpback whale vs. passenger ship and killer whale vs. passenger ship mixtures, respectively, and the algorithm also outperforms existing baseline models on a variety of other metrics.
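The three-pathway fusion described above can be sketched numerically. This is a minimal NumPy illustration, not the paper's implementation: the abstract does not give the IFM's internals, so the moving-average filters standing in for learned multi-scale convolutions, the single-head self-attention, the softmax-over-logits weighting, and all function names (`conv_pathway`, `attention_pathway`, `ifm_fuse`) are assumptions for illustration only.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def conv_pathway(feats, kernel_sizes=(3, 5, 7)):
    """Stand-in for the multi-scale convolutional spatial-filtering pathway:
    average several moving-average filters of different widths applied along
    the time axis of a (channels, frames) feature map."""
    outs = []
    for k in kernel_sizes:
        kernel = np.ones(k) / k
        outs.append(np.stack([np.convolve(row, kernel, mode="same")
                              for row in feats]))
    return np.mean(outs, axis=0)

def attention_pathway(feats):
    """Stand-in for the self-attention feature-dependency pathway:
    single-head self-attention over time frames with Q = K = V = feats."""
    d = feats.shape[0]
    scores = feats.T @ feats / np.sqrt(d)           # (T, T) frame similarities
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)              # row-wise softmax
    return feats @ w.T                              # back to (channels, T)

def ifm_fuse(feats, logits=np.zeros(3)):
    """Adaptive weighting: softmax over per-pathway logits (learned in the
    real model), then a weighted sum of the original features and the two
    pathway outputs. Output shape matches the input."""
    w = softmax(logits)
    return (w[0] * feats
            + w[1] * conv_pathway(feats)
            + w[2] * attention_pathway(feats))
```

With all logits equal, the three pathways contribute equally; driving one logit high makes the fusion collapse to that pathway, which is the sense in which the weighting is adaptive.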

  • [1] BAYRAKCI G, KLINGELHOEFER F. An introduction to the ocean soundscape[M]//Noisy oceans: Monitoring seismic and acoustic signals in the marine environment. Hoboken, NJ: Wiley, 2024.
    [2] DUARTE C M, CHAPUIS L, COLLIN S P, et al. The soundscape of the Anthropocene ocean[J]. Science, 2021, 371(6529): eaba4658. doi: 10.1126/science.aba4658
    [3] LI D, WU M, YU L, et al. Single-channel blind source separation of underwater acoustic signals using improved NMF and FastICA[J]. Frontiers in Marine Science, 2023, 9: 1097003. doi: 10.3389/fmars.2022.1097003
    [4] XIE Jiawu. Research on underwater sound source separation technology based on deep learning[D]. Chengdu: University of Electronic Science and Technology of China, 2019. (in Chinese)
    [5] WANG M, ZHANG W, SHAO M, et al. Separation and extraction of compound-fault signal based on multi-constraint non-negative matrix factorization[J]. Entropy, 2024, 26(7): 583. doi: 10.3390/e26070583
    [6] SCHULZE F K, RICHARD G, KELLEY L, et al. Unsupervised music source separation using differentiable parametric source models[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023, 31: 1276-1289. doi: 10.1109/TASLP.2023.3252272
    [7] ANSARI S, ALATRANY A S, ALNAJJAR K A, et al. A survey of artificial intelligence approaches in blind source separation[J]. Neurocomputing, 2023, 561: 126895. doi: 10.1016/j.neucom.2023.126895
    [8] CHANDNA P, CUESTA H, PETERMANN D, et al. A deep-learning based framework for source separation, analysis, and synthesis of choral ensembles[J]. Frontiers in Signal Processing, 2022, 2: 808594. doi: 10.3389/frsip.2022.808594
    [9] LUO Y, MESGARANI N. TasNet: Time-domain audio separation network for real-time, single-channel speech separation[C]//2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Calgary, AB, Canada: IEEE, 2018: 696-700.
    [10] LUO Y, MESGARANI N. Conv-TasNet: Surpassing ideal time-frequency magnitude masking for speech separation[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2019, 27(8): 1256-1266. doi: 10.1109/TASLP.2019.2915167
    [11] LUO Y, CHEN Z, YOSHIOKA T. Dual-path RNN: Efficient long sequence modeling for time-domain single-channel speech separation[C]//ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Barcelona, Spain: IEEE, 2020: 46-50.
    [12] CHEN J, MAO Q, LIU D. Dual-path transformer network: Direct context-aware modeling for end-to-end monaural speech separation[EB/OL]. [2025-11-25]. https://arxiv.org/abs/2007.13975
    [13] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems, 2017(30): 5998-6008.
    [14] ZHANG W, LI X, ZHOU A, et al. Underwater acoustic source separation with deep Bi-LSTM networks[C]//2021 4th International Conference on Information Communication and Signal Processing (ICICSP). Shanghai, China: IEEE, 2021: 254-258.
    [15] HE Q, WANG H, ZENG X, et al. Ship-radiated noise separation in underwater acoustic environments using a deep time-domain network[J]. Journal of Marine Science and Engineering, 2024, 12(6): 885. doi: 10.3390/jmse12060885
    [16] LIU Y, JIANG L. Passive underwater acoustic signal separation based on feature decoupling dual-path network[EB/OL]. [2025-09-25]. https://arxiv.org/abs/2504.08371
    [17] XU M, LI K, CHEN G, et al. TIGER: Time-frequency interleaved gain extraction and reconstruction for efficient speech separation[EB/OL]. [2025-09-25]. https://arxiv.org/abs/2410.01469
    [18] SAYIGH L, DAHER M A, ALLEN J, et al. The Watkins marine mammal sound database: An online, freely accessible resource[C]//Proceedings of Meetings on Acoustics. Dublin, Ireland: Acoustical Society of America, 2016: 040013.
    [19] SANTOS-DOMÍNGUEZ D, TORRES-GUIJARRO S, CARDENAL-LÓPEZ A, et al. ShipsEar: An underwater vessel noise database[J]. Applied Acoustics, 2016, 113: 64-69. doi: 10.1016/j.apacoust.2016.06.008
    [20] YUREN B. Research on music source separation technology based on deep learning[J]. Computer Science and Application, 2022, 12: 2788. doi: 10.12677/CSA.2022.1212283

    Figures(8)  / Tables(2)