

      Abstract: To address the missed and false detections caused by large scale variation among foreground targets, uneven spatial sample distribution, and a high proportion of redundant background in UAV aerial images, this paper proposes an adaptive foreground-focused object detection algorithm for UAV aerial imagery. First, a panoramic feature refinement classification layer is constructed, which uses a re-parameterized spatial pixel variance method and a shuffling operation to strengthen the algorithm's focusing capability and improve the representation quality of foreground sample features. Second, an adaptive dual-dimensional feature sampling unit is designed with a separate-learn-merge strategy to strengthen the extraction of foreground focus features and the retention of background detail, reducing false detections and accelerating inference. Third, a multi-path information integration module combining a multi-branch structure with a broadcast self-attention mechanism is constructed to resolve the ambiguous mappings introduced by downsampling, optimize feature interaction and integration, improve the recognition and localization of multi-scale targets, and reduce the model's computational cost. Finally, an adaptive foreground-focused detection head employing a dynamic focusing mechanism is introduced to raise foreground detection accuracy and suppress background interference. Experiments on the public VisDrone2019 and VisDrone2021 datasets show that the proposed method achieves mAP@0.5 values of 45.1% and 43.1%, respectively, improvements of 6.6% and 5.7% over the baseline model, and outperforms the other compared algorithms. These results demonstrate that the algorithm significantly improves detection accuracy while retaining good generality and real-time performance.
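The abstract does not specify the "shuffling operation" used in the classification layer; assuming it is a channel shuffle of the kind popularized by ShuffleNet (an assumption, not a detail confirmed by the paper), a minimal NumPy sketch looks like this:

```python
import numpy as np

def channel_shuffle(x: np.ndarray, groups: int) -> np.ndarray:
    """Interleave channels across groups (ShuffleNet-style shuffle).

    x: feature map of shape (N, C, H, W); C must be divisible by `groups`.
    """
    n, c, h, w = x.shape
    assert c % groups == 0, "channel count must be divisible by groups"
    # Split channels into groups, swap the group and per-group axes,
    # then flatten back so channels from different groups interleave.
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(n, c, h, w)

# Toy example: 8 channels labelled 0..7, shuffled with 2 groups.
x = np.arange(8).reshape(1, 8, 1, 1)
y = channel_shuffle(x, groups=2)
# channel order becomes [0, 4, 1, 5, 2, 6, 3, 7]
```

The operation is parameter-free and cheap, which is consistent with the abstract's emphasis on improving feature representation without adding computational load.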