YOLOv8-GAIS: improved object detection algorithm for UAV aerial photography

Li Kaixuan; Liu Xiaofeng; Chen Qiang; Zhang Zejiang

doi:10.12086/oee.2025.240295

Article navigation > Opto-Electronic Engineering > 2025 Vol. 52 > No. 4 > 240295

Next Article Previous Article

Li K X, Liu X F, Chen Q, et al. YOLOv8-GAIS: improved object detection algorithm for UAV aerial photography[J]. Opto-Electron Eng, 2025, 52(4): 240295. doi: 10.12086/oee.2025.240295

Citation:

Li K X, Liu X F, Chen Q, et al. YOLOv8-GAIS: improved object detection algorithm for UAV aerial photography[J]. Opto-Electron Eng, 2025, 52(4): 240295. doi: 10.12086/oee.2025.240295

YOLOv8-GAIS: improved object detection algorithm for UAV aerial photography

1.
School of Automotive and Transportation, Tianjin University of Technology and Education, Tianjin 300222, China
2.
National & Local Joint Engineering Research Center for Intelligent Vehicle Road Collaboration and Safety Technology, Tianjin 300222, China

Fund Project: Natural Science Key Project of Tianjin Education Committee (2022ZD016)

More Information

^*Corresponding author: xfliu@tute.edu.cn
CSTR: 32245.14.oee.2025.240295

Received Date 17 December 2024

Revised Date 06 March 2025

Accepted Date 07 March 2025

Published Date 25 April 2025

Abstract

Abstract

To address the issue of complex backgrounds in dim scenes, which cause object edge blurring and obscure small objects, leading to misdetection and omission, an improved YOLOv8-GAIS algorithm is proposed. The FAMFF (four-head adaptive multi-dimensional feature fusion) strategy is designed to achieve spatial filtering of conflicting information. A small object detection head is incorporated to address the issue of large object scale variation in aerial views. The SEAM (spatially enhanced attention mechanism) is introduced to enhance the network's attention and capture ability for occluded parts in low illumination situations. The InnerSIoU loss function is adopted to emphasize the core regions, thereby improving the detection performance of occluded objects. Field scenes are collected to expand the VisDrone2021 dataset, and the Gamma and SAHI (slicing aided hyper inference) algorithms are applied for preprocessing. This helps balance the distribution of different object types in low-illumination scenarios, optimizing the model's generalization ability and detection accuracy. Comparative experiments show that the improved model reduces the number of parameters by 1.53 MB, and increases mAP50 by 6.9%, mAP50-95 by 5.6%, and model computation by 7.2 GFLOPs compared to the baseline model. In addition, field experiments were conducted in Dagu South Road, Jinnan District, Tianjin City, China, to determine the optimal altitude for image acquisition by UAVs. The results show that, at a flight altitude of 60 m, the model achieves the detection accuracy of 77.8% mAP50.
- object detection /
- low-brightness image /
- four-head adaptive multi-dimensional feature fusion /
- unmanned aerial vehicle /
- YOLOv8

FullText(HTML)

References

[1]	梁礼明, 陈康泉, 王成斌, 等. 融合视觉中心机制和并行补丁感知的遥感图像检测算法[J]. 光电工程, 2024, 51(7): 240099. doi: 10.12086/oee.2024.240099 CrossRef Google Scholar Liang L M, Chen K Q, Wang C B, et al. Remote sensing image detection algorithm integrating visual center mechanism and parallel patch perception[J]. Opto-Electron Eng, 2024, 51(7): 240099. doi: 10.12086/oee.2024.240099 CrossRef Google Scholar
[2]	张文政, 吴长悦, 满卫东, 等. 双通道对抗网络在建筑物分割中的应用[J]. 计算机工程, 2024, 50(11): 297−307. doi: 10.19678/j.issn.1000-3428.0068527 CrossRef Google Scholar Zhang W Z, Wu C Y, Man W D, et al. Application of dual-channel adversarial network in building segmentation[J]. Comput Eng, 2024, 50(11): 297−307. doi: 10.19678/j.issn.1000-3428.0068527 CrossRef Google Scholar
[3]	赵太飞, 王一琼, 郭佳豪, 等. 无线紫外光中继协作无人机集结编队方法[J]. 激光技术, 2024, 48(4): 477−483. doi: 10.7510/jgjs.issn.1001-3806.2024.04.004 CrossRef Google Scholar Zhao T F, Wang Y Q, Guo J H, et al. The method of wireless ultraviolet relay cooperation UAV assembly formation[J]. Laser Technol, 2024, 48(4): 477−483. doi: 10.7510/jgjs.issn.1001-3806.2024.04.004 CrossRef Google Scholar
[4]	Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014: 580–587. https://doi.org/10.1109/CVPR.2014.81. Google Scholar
[5]	Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Trans Pattern Anal Mach Intell, 2017, 39(6): 1137−1149. doi: 10.1109/TPAMI.2016.2577031 CrossRef Google Scholar
[6]	He K M, Gkioxari G, Dollár P, et al. Mask R-CNN[C]//Proceedings of 2017 IEEE International Conference on Computer Vision, 2017: 2980–2988. https://doi.org/10.1109/ICCV.2017.322. Google Scholar
[7]	Redmon J. Yolov3: an incremental improvement[Z]. arXiv: 1804.02767, 2018. https://arxiv.org/abs/1804.02767. Google Scholar
[8]	Zhang Y, Guo Z Y, Wu J Q, et al. Real-time vehicle detection based on improved YOLO v5[J]. Sustainability, 2022, 14(19): 12274. doi: 10.3390/su141912274 CrossRef Google Scholar
[9]	Liu W, Anguelov D, Erhan D, et al. SSD: single shot MultiBox detector[C]//Proceedings of the 14th European Conference on Computer Vision–ECCV 2016, 2016: 21–37. https://doi.org/10.1007/978-3-319-46448-0_2. Google Scholar
[10]	Liu J L, Liu X F, Chen Q, et al. A traffic parameter extraction model using small vehicle detection and tracking in low-brightness aerial images[J]. Sustainability, 2023, 15(11): 8505. doi: 10.3390/su15118505 CrossRef Google Scholar
[11]	曲立国, 张鑫, 卢自宝, 等. 基于改进YOLOv5的交通标志识别方法[J]. 光电工程, 2024, 51(6): 240055. Google Scholar Qu L G, Zhang X, Lu Z B, et al. A traffic sign recognition method based on improved YOLOv5[J]. Opto-Electron Eng, 2024, 51(6): 240055. Google Scholar
[12]	刘勇, 吕丰顺, 李学琨, 等. 基于YOLOv8-LGA的遥感图像目标检测算法[J]. 光电子·激光, 2024. https://link.cnki.net/urlid/12.1182.o4.20240709.1518.016. Google Scholar Liu Y, Lyu F S, Li X K, et al. Remote sensing image target detection algorithm based on YOLOv8-LGA[J]. J Optoelectron Laser, 2024. https://link.cnki.net/urlid/12.1182.o4.20240709.1518.016. Google Scholar
[13]	Golcarenarenji G, Martinez-Alpiste I, Wang Q, et al. Illumination-aware image fusion for around-the-clock human detection in adverse environments from unmanned aerial vehicle[J]. Expert Syst Appl, 2022, 204: 117413. doi: 10.1016/j.eswa.2022.117413 CrossRef Google Scholar
[14]	Talaat A S, El-Sappagh S. Enhanced aerial vehicle system techniques for detection and tracking in fog, sandstorm, and snow conditions[J]. J Supercomput, 2023, 79(14): 15868−15893. doi: 10.1007/s11227-023-05245-9 CrossRef Google Scholar
[15]	张润梅, 肖钰霏, 贾振楠, 等. 改进YOLOv7的无人机视角下复杂环境目标检测算法[J]. 光电工程, 2024, 51(5): 240051. doi: 10.12086/oee.2024.240051 CrossRef Google Scholar Zhang R M, Xiao Y F, Jia Z N, et al. Improved YOLOv7 algorithm for target detection in complex environments from UAV perspective[J]. Opto-Electron Eng, 2024, 51(5): 240051. doi: 10.12086/oee.2024.240051 CrossRef Google Scholar
[16]	Wu W T, Liu H, Li L L, et al. Application of local fully Convolutional Neural Network combined with YOLO v5 algorithm in small target detection of remote sensing image[J]. PLoS One, 2021, 16(10): e0259283. doi: 10.1371/journal.pone.0259283 CrossRef Google Scholar
[17]	潘玮, 韦超, 钱春雨, 等. 面向无人机视角下小目标检测的YOLOv8s改进模型[J]. 计算机工程与应用, 2024, 60(9): 142−150. doi: 10.3778/j.issn.1002-8331.2312-0043 CrossRef Google Scholar Pan W, Wei C, Qian C Y, et al. Improved YOLOv8s model for small object detection from perspective of drones[J]. Comput Eng Appl, 2024, 60(9): 142−150. doi: 10.3778/j.issn.1002-8331.2312-0043 CrossRef Google Scholar
[18]	范博淦, 王淑青, 陈开元. 基于改进YOLOv8的航拍无人机小目标检测模型[J]. 计算机应用, 2024. https://link.cnki.net/urlid/51.1307.tp.20241017.1040.004. Google Scholar Fan B G, Wang S Q, Chen K Y. Small target detection model for aerial photography UAV based on improved YOLOv8[J]. J Comput Appl, 2024. https://link.cnki.net/urlid/51.1307.tp.20241017.1040.004. Google Scholar
[19]	张泽江, 刘晓锋, 刘军黎. 基于机载激光雷达的低照度交通事故现场重建及实证研究[J]. 国外电子测量技术, 2024, 43(2): 174−182. doi: 10.19652/j.cnki.femt.2305456 CrossRef Google Scholar Zhang Z J, Liu X F, Liu J L. Reconstruction of low-brightness traffic accident scene based on UAV-borne LiDAR and empirical research[J]. Foreign Electron Meas Technol, 2024, 43(2): 174−182. doi: 10.19652/j.cnki.femt.2305456 CrossRef Google Scholar
[20]	Zhang Z J, Liu X F, Chen Q. Improving accident scene monitoring with multisource data fusion under low-brightness and occlusion conditions[J]. IEEE Access, 2024, 12: 136026−136040. doi: 10.1109/ACCESS.2024.3438158 CrossRef Google Scholar
[21]	刘建蓓, 单东辉, 郭忠印, 等. 无人机视频的交通参数提取方法及验证[J]. 公路交通科技, 2021, 38(8): 149−158. doi: 10.3969/j.issn.1002-0268.2021.08.020 CrossRef Google Scholar Liu J B, Shan D H, Guo Z Y, et al. An approach for extracting traffic parameters from UAV video and verification[J]. J Highw Transp Res Dev, 2021, 38(8): 149−158. doi: 10.3969/j.issn.1002-0268.2021.08.020 CrossRef Google Scholar
[22]	Li B Y, Liu X, Hu P, et al. All-in-one image restoration for unknown corruption[C]//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022: 17431–17441. https://doi.org/10.1109/CVPR52688.2022.0169. Google Scholar
[23]	Cao Y R, He Z J, Wang L J, et al. VisDrone-DET2021: the vision meets drone object detection challenge results[C]//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision, 2021: 2847–2854. https://doi.org/10.1109/ICCVW54120.2021.00319. Google Scholar
[24]	Zhang M R, Lucas J, Hinton G, et al. Lookahead optimizer: k steps forward, 1 step back[C]//Proceedings of the 33rd International Conference on Neural Information Processing Systems, 2019: 861. Google Scholar
[25]	Pailla D R, Kollerathu V, Chennamsetty S S. Object detection on aerial imagery using CenterNet[Z]. arXiv: 1908.08244, 2019. https://arxiv.org/abs/1908.08244. Google Scholar

Overview

Overview

With the exponential growth of processor computing power and the continuous breakthroughs of deep learning algorithms, deep learning-based computer vision has become a key component in the field of object detection. In the field of traffic supervision, the use of a UAV remote sensing platform equipped with advanced sensors and deep learning algorithms achieves intelligent vehicle detection. This approach provides technical support for the construction of a new rapid response and accurate determination system for traffic accidents, laying a solid foundation for its implementation. Traditional object detection algorithms primarily depend on extracting distinctive features from the input data. These methods require stable lighting and minimal interference during data acquisition, limiting their effectiveness in dynamic scenarios where lighting, shadows, and weather changes affect detection accuracy. Therefore, this paper proposes an improved YOLOv8-GAIS object detection algorithm to achieve effective object detection in urban expressways under low-brightness scenes. The FAMFF (four-head adaptive multi-dimensional feature fusion) strategy is specifically designed to enhance the spatial filtering process, effectively managing conflicting information within the data. To further enhance performance, particularly in scenarios involving aerial views, a small object detection head is incorporated into the system. The SEAM (spatially enhanced attention mechanism) is incorporated to improve the network's ability to enhance the feature extraction ability, particularly in low-brightness environments and occluded parts. This enhancement enables the network to better detect parts of objects that might otherwise be missed due to occlusion. Additionally, the InnerSIoU loss function is utilized to give greater weight to the central areas of objects, thereby significantly enhancing the model's performance in detecting occluded objects more accurately. To enhance the generalization ability of the algorithm, this paper also expands the VisDrone2021 dataset by incorporating field collection scenarios. Comparative experiments reveal that the enhanced model achieves several improvements over the baseline. It reduces the parameter count by 1.53 MB, and increases mAP50 by 6.9%, and boosts mAP50-95 by 5.6%. Additionally, the model's calculation wreduced by 7.2 GFLOPs, leading to improved efficiency and faster processing times without compromising detection accuracy. To further assess its effectiveness, field experiments were conducted along Dagu South Road in Jinnan District, Tianjin City, China, to determine the optimal UAV flight altitude for image acquisition. The findings indicate that at a flight altitude of 60 m, the model reaches the detection accuracy of 77.8% mAP50. These results highlight the model's superior performance in real-world scenarios, demonstrating its robustness and adaptability to complex environments.