Wang Y, Wang H H, Liu S D, et al. Small target detection in sonar images with multilevel feature screening and task dynamic alignment[J]. Opto-Electron Eng, 2024, 51(10): 240196. doi: 10.12086/oee.2024.240196

Small target detection in sonar images with multilevel feature screening and task dynamic alignment

    Fund Project: Project supported by Tianjin Philosophy and Social Science Planning Project (TJGL19xSX-045)
  • Small target detection in sonar images is difficult: accuracy is low, and false and missed detections are common. To address this, this paper proposes an improved small target detection algorithm for sonar images based on YOLOv8s. First, because small targets in sonar images usually have low contrast and are easily overwhelmed by noise, an efficient multi-level screening feature pyramid network (EMS-FPN) is proposed. Second, the classification and localization branches of the decoupled head are independent, which increases the number of model parameters and makes it hard to adapt to targets of different scales, resulting in poor detection of small targets; a task dynamic alignment detection head (TDADH) module is designed to address this. Finally, the model was validated on the URPC2021 and expanded SCTD sonar datasets: compared with YOLOv8s, mAP@0.5 improved by 0.3% and 1.8%, respectively, and the number of parameters was reduced by 22.5%. The results show that the proposed method both improves accuracy and significantly reduces the number of model parameters for target detection in sonar images.
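The mAP@0.5 metric reported in the abstract counts a predicted box as correct when it matches the class of a ground-truth box and their intersection over union (IoU) is at least 0.5. The following is a minimal sketch of that matching rule only, not the paper's evaluation code; the box coordinates and the helper names `iou` and `is_true_positive` are hypothetical and chosen for illustration. It shows why small targets are demanding: a shift of a few pixels is enough to push a 10-pixel box below the threshold.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Matching rule behind mAP@0.5 (class agreement is assumed here)."""
    return iou(pred_box, gt_box) >= threshold


# Hypothetical 10 x 10 pixel target and two predictions shifted sideways.
small_gt = (100, 100, 110, 110)
shift_3px = (103, 100, 113, 110)
shift_4px = (104, 100, 114, 110)

# Hypothetical 100 x 100 pixel target with the same 4-pixel shift.
large_gt = (100, 100, 200, 200)
large_shift_4px = (104, 100, 204, 200)

print(is_true_positive(shift_3px, small_gt))   # 3-pixel shift, small box
print(is_true_positive(shift_4px, small_gt))   # 4-pixel shift, small box
print(round(iou(large_shift_4px, large_gt), 3))  # 4-pixel shift, large box
```

With these assumed boxes, the 3-pixel shift still passes the threshold on the small target, the 4-pixel shift does not, and the same 4-pixel shift barely affects the large target.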

  • Sonar technology has important applications in the marine field and is widely used in seabed geological exploration, marine environmental pollution monitoring, underwater target detection, marine resource development, and other areas. However, detecting small targets in sonar images remains challenging because sonar imaging is affected by many factors, such as the marine environment and the characteristics of underwater targets. Small targets, such as round cages and balls, suffer from weak signals, complex backgrounds, low resolution, and noise interference in sonar images, yet detecting them effectively is crucial for safe underwater navigation and for the development of marine resources. To address the difficulty, low precision, and frequent false and missed detections of small targets in sonar images, a lightweight small target detection algorithm for sonar images is proposed, based on YOLOv8s with efficient multilevel feature fusion. First, because small targets in sonar images usually have low contrast and are easily overwhelmed by noise, an efficient multilevel screening feature pyramid network (EMS-FPN) module is proposed. Its screening mechanism highlights important features and suppresses irrelevant background noise, and it extracts features at different scales and fuses them across levels to improve the detection of small targets. Second, the classification and localization branches of the decoupled head are independent, which increases the number of model parameters, leaves the two tasks without interaction, and makes it difficult to adapt to targets of different scales, resulting in poor detection of small targets. The task dynamic alignment detection head (TDADH) module is therefore designed: a feature extractor learns task-interaction features from multiple convolutional layers to obtain joint features, which lets the head adapt to targets of different scales. Finally, the model is validated on the URPC2021 and SCTD sonar datasets: compared with YOLOv8s, the detection accuracy mAP@0.5 improves by 0.3% and 1.8%, respectively, and the number of parameters is reduced by 22.5%. The results show that the proposed sonar image target detection algorithm improves accuracy and significantly reduces the number of model parameters.
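The parameter argument above can be made concrete with a small counting exercise. The sketch below is not the TDADH architecture; it only compares, under stated assumptions, a head with two independent convolution stacks (one per task) against a head in which both tasks read from one shared stack. The channel width, stack depth, class count, number of regression outputs, and all function names are hypothetical, and the resulting numbers are not expected to reproduce the 22.5% whole-model reduction reported in the paper.

```python
def conv_params(c_in, c_out, kernel, bias=False):
    """Number of learnable weights in one kernel x kernel convolution."""
    return c_in * c_out * kernel * kernel + (c_out if bias else 0)


def decoupled_head_params(channels, num_classes, reg_outputs, depth=2):
    """Two independent 3x3 stacks, each followed by its own 1x1 predictor."""
    stack = depth * conv_params(channels, channels, 3)
    cls_branch = stack + conv_params(channels, num_classes, 1, bias=True)
    reg_branch = stack + conv_params(channels, reg_outputs, 1, bias=True)
    return cls_branch + reg_branch


def shared_head_params(channels, num_classes, reg_outputs, depth=2):
    """One 3x3 stack shared by both tasks, then two 1x1 predictors."""
    stack = depth * conv_params(channels, channels, 3)
    cls_pred = conv_params(channels, num_classes, 1, bias=True)
    reg_pred = conv_params(channels, reg_outputs, 1, bias=True)
    return stack + cls_pred + reg_pred


# Assumed sizes for illustration only: 128 channels, 3 classes, 64 box outputs.
decoupled = decoupled_head_params(channels=128, num_classes=3, reg_outputs=64)
shared = shared_head_params(channels=128, num_classes=3, reg_outputs=64)

print(decoupled, shared)
print(f"relative reduction: {1 - shared / decoupled:.1%}")
```

In this toy setting almost all parameters sit in the 3x3 stacks, so sharing one stack between classification and localization roughly halves the head; the saving in a full detector depends on how much of the model the head accounts for.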

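Corresponding author: Chen Bin, bchen63@163.com
  • 1. 

    School of Materials Science and Engineering, Shenyang University of Chemical Technology, Shenyang 110142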

Figures(11)

Tables(4)
