Multi-level refined UAV image target detection

肖振久; 赖思宇; 曲海成

doi:10.12086/oee.2025.240287

Abstract: To address missed and false detections caused by complex backgrounds, variable illumination, target occlusion, and scale variation in UAV images, this paper proposes a multi-level refined object detection algorithm for UAV imagery. First, combining multi-scale feature extraction with a feature fusion enhancement strategy, a CSP-SMSFF (cross stage partial selective multi-scale feature fusion) module is designed; it uses progressively larger convolution kernels and channel fusion to capture multi-scale target features accurately. Second, an AFGCAttention (adaptive fine-grained channel attention) mechanism is introduced, which refines channel feature representations through dynamic tuning. This strengthens the algorithm's perception and discrimination of important multi-scale sample features, preserves fine-grained mapping information, and suppresses background noise, reducing missed detections. Third, an SGCE-Head (shared group convolution efficient head) detection head is designed, which uses EMSPConv (efficient multi-scale convolution) to capture both globally important features and local details across the spatial and channel dimensions, improving the localization and recognition of multi-scale features and reducing false detections. Finally, an Inner-Powerful-IoUv2 loss function is proposed, which balances the localization weights of samples of different quality through dynamic gradient weighting and hierarchical IoU optimization, strengthening the model's ability to detect blurred targets. Experiments on the VisDrone2019 and VisDrone2021 datasets show that the method reaches an mAP@0.5 of 47.5% and 45.3%, respectively, improvements of 5.7% and 4.7% over the baseline model, and outperforms the compared algorithms.
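The abstract does not give the formula for Inner-Powerful-IoUv2, so the following is only a minimal sketch of the general "inner" idea that the name suggests: computing IoU on auxiliary boxes that share each box's center but are scaled by a ratio. The function names and the `ratio` parameter are hypothetical, and the dynamic gradient weighting and hierarchical IoU terms described above are not modeled here.

```python
# Illustrative only: IoU on center-preserving scaled ("inner") boxes.
# Boxes are (x1, y1, x2, y2). The ratio value is an assumed hyperparameter,
# not a setting taken from the paper.

def scale_box(box, ratio):
    """Return a box with the same center and width/height scaled by ratio."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    half_w = (x2 - x1) * ratio / 2
    half_h = (y2 - y1) * ratio / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

def iou(a, b):
    """Plain intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def inner_iou(pred, target, ratio=0.75):
    """IoU of the two boxes after both are scaled about their own centers."""
    return iou(scale_box(pred, ratio), scale_box(target, ratio))
```

With `ratio=1.0` this reduces to ordinary IoU; a ratio below 1 shrinks both boxes, so partially overlapping boxes score lower than they would under ordinary IoU.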
