Citation: Zheng H W, Wang F, Gao J B. DES-YOLO: a more accurate object detection method[J]. Opto-Electron Eng, 2024, 51(11): 240212. doi: 10.12086/oee.2024.240212
In image analysis, accurate object detection remains challenging because backgrounds are complex and targets are small and densely distributed. To address these issues, we propose a detection method named DES-YOLO, which combines four improvements aimed at object detection in remote sensing imagery. First, we introduce a deformable attention module (DAM), which lets the network dynamically adjust its focus so that it concentrates on informative regions and ignores irrelevant background noise, improving both recognition and localization. Second, we adopt the efficient intersection over union (EIoU) loss function to mitigate the influence of low-quality samples, which improves the model's generalization ability and localization accuracy. Third, we add a shallow 160×160 feature map layer to the network head. This layer targets small objects, which are often hard to detect in remote sensing images, by capturing finer detail. Fourth, we employ a stepwise training strategy that refines the model progressively, stabilizing the learning process and improving robustness.

In our experiments, the improved DES-YOLO model raises the mean average precision (mAP@0.5) by 1.4% on a standard remote sensing dataset. To further validate the model, we ran extended experiments on a textile dataset, where mAP@0.5 increased by 1.7%.
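The EIoU loss named above can be illustrated with a minimal sketch. It assumes the commonly published form of EIoU: one minus the IoU, plus normalized penalties for the center distance, the width difference, and the height difference, each scaled by the smallest box enclosing both boxes. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` guard are illustrative choices, not details taken from the paper.

```python
def eiou_loss(box, gt, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative sketch only: the box format and eps are assumptions.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union of the two boxes.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Squared center distance, normalized by the enclosing box diagonal.
    dx = (x1 + x2) / 2 - (gx1 + gx2) / 2
    dy = (y1 + y2) / 2 - (gy1 + gy2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalized by the enclosing box sides.
    width_term = (w - gw) ** 2 / (cw * cw + eps)
    height_term = (h - gh) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


if __name__ == "__main__":
    print(eiou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
    print(eiou_loss((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
    print(eiou_loss((0, 0, 1, 1), (2, 2, 3, 3)))  # no overlap
```

Because the width and height terms are penalized separately, the loss still gives a useful signal when two boxes share a center but differ in shape, a case in which the plain IoU loss gives no direct guidance on which side to adjust.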
The results on both datasets show the improvement brought by our method and confirm that it applies to different types of datasets. In conclusion, DES-YOLO improves the accuracy and reliability of object detection. The deformable attention module, the EIoU loss function, shallow feature enhancement, and stepwise training together account for this gain. Our research demonstrates the potential of DES-YOLO as a reference point for future developments and applications in object detection.
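The effect of the additional 160×160 shallow layer can be made concrete with a little arithmetic. The sketch below assumes a 640×640 input and the usual YOLOv5 head strides of 8, 16, and 32, so that the extra layer corresponds to stride 4. These values are assumptions for illustration, since the abstract states only the 160×160 size. The helper names `head_grid_sizes` and `cells_spanned` are hypothetical.

```python
def head_grid_sizes(input_size, strides):
    """Map each stride to the side length of its square prediction grid."""
    return {stride: input_size // stride for stride in strides}


def cells_spanned(object_px, stride):
    """Approximate number of grid cells covered by one side of an object."""
    return object_px / stride


INPUT_SIZE = 640            # assumed input resolution
BASE_STRIDES = (8, 16, 32)  # assumed default YOLOv5 head strides
EXTRA_STRIDE = 4            # stride that yields a 160x160 grid at 640 input

grids = head_grid_sizes(INPUT_SIZE, (EXTRA_STRIDE,) + BASE_STRIDES)

if __name__ == "__main__":
    for stride, side in grids.items():
        span = cells_spanned(16, stride)
        print(f"stride {stride:>2}: {side}x{side} grid, "
              f"a 16 px object spans {span:.1f} cells per side")
```

Under these assumptions, a 16-pixel object covers four cells per side on the 160×160 grid but only half a cell on the coarsest grid, which is one way to see why a shallower, higher-resolution head helps with small targets.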
YOLOv5 network structure
Deformable attention module
Improved network structure
Sample images from the NWPU VHR-10 dataset
Sample images from the fabric defect dataset
Comparison of detection accuracy of different network models
Comparison of detection results of different models