MSAF-YOLO: A corn pest and disease detection algorithm based on YOLOv11

李志明; 贾改秀

doi:10.12086/oee.2026.250353


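Abstract: To address the low detection accuracy and the missed and false detections that arise in corn pest and disease detection under complex backgrounds and small-target scenarios, this paper proposes an improved YOLOv11 model, MSAF-YOLO. A heterogeneous feature fusion module (HFFM) is designed by introducing residual-style feedback paths and channel attention. It performs cross-layer feature calibration and enhancement, injecting deep semantic features into shallow features through feedback connections, which addresses the lack of semantic information in shallow layers and the loss of small-target features during sampling. An ADRF module is designed by combining spatially adaptive weights with multi-branch dilated convolutions. It adjusts the receptive field at the pixel level, dynamically changing the receptive field size of the convolutional layers, which addresses the difficulty that conventional convolutional neural networks with fixed receptive fields have in adapting to object detection across different environments. Experimental results show that, compared with the original YOLOv11 model, MSAF-YOLO reduces the parameter count while improving mAP50 and mAP50:95 by 3.2 and 2.7 percentage points on the self-built CORN dataset and by 3.7 and 3.1 percentage points on the general-purpose diseases dataset, demonstrating the effectiveness of MSAF-YOLO.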

Abstract:

Objective Corn, as one of the world’s most vital staple crops, plays an indispensable role in agricultural production, industrial applications, and economic development. In China, corn currently accounts for over 40% of total grain planting area and output, a proportion that continues to grow steadily, underscoring its strategic significance for national food security and economic stability. However, throughout the growth cycle, corn is persistently threatened by various pests and diseases, including corn borer, leaf spot, rust, stalk rot, and armyworm, which severely restrict both yield and quality. Traditional pest and disease detection relies heavily on manual field inspection, a labor-intensive, time-consuming, and inefficient approach that struggles to meet the demands of large-scale cultivation. Moreover, manual detection is inherently subjective and discontinuous, often resulting in missed diagnoses and misjudgments. Consequently, an automated, efficient, and accurate detection method for corn pests and diseases is urgently needed.

In recent years, deep learning-based object detection has achieved remarkable success across various domains, with the YOLO (You Only Look Once) series of algorithms gaining particular prominence for its balance of detection speed and accuracy. YOLOv11, a recent iteration in this series, incorporates further optimizations in training strategies, network architecture, and detection precision, providing a solid technical foundation for agricultural pest identification tasks. Nevertheless, when applied to corn pest detection in real-world field scenarios, mainstream models including YOLOv11 continue to exhibit critical limitations: 1) Complex environmental interference: the intricate backgrounds, variable lighting conditions, diverse pest morphologies, and frequent occlusions characteristic of field environments substantially impair the model’s ability to accurately localize and recognize targets; 2) Small-target detection difficulty: early-stage or physically small pests and disease lesions occupy extremely limited pixel regions in images and display indistinct features, leading to elevated rates of missed detections and false detections.

Methods To address these challenges, this paper proposes an improved YOLOv11 model designated as MSAF-YOLO (Multi-Scale Adaptive Feedback YOLO). The core innovation lies in reconstructing the backbone network using a novel MSAF-Net (Multi-Scale Adaptive Feedback Network) architecture, which synergistically integrates two specially designed modules: the Hierarchical Feature Feedback Module (HFFM) and the Adaptive Dynamic Receptive Field Module (ADRF).

The Hierarchical Feature Feedback Module addresses the fundamental problem of semantic information deficiency in shallow feature layers and the progressive loss of small-target features during repeated downsampling operations. Conventional feedforward convolutional networks propagate information unidirectionally from shallow to deep layers; while deep layers acquire rich semantic representations, shallow layers remain dominated by fine-grained spatial details but lack high-level semantic context. HFFM introduces residual-style feedback pathways coupled with channel attention mechanisms to establish cross-layer feature calibration and enhancement. Specifically, deep feature maps containing abstracted semantic information are selectively fed back and fused with corresponding shallow feature maps through learnable gating mechanisms. The channel attention component adaptively recalibrates feature responses, emphasizing informative channels while suppressing less useful ones. This feedback-driven feature refinement significantly enhances the model’s capacity to extract and preserve discriminative characteristics of small pests and incipient disease symptoms, effectively mitigating the information attenuation typically observed across deep convolutional architectures.
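The feedback path described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the squeeze-and-excitation form of the channel attention, the sigmoid gate computed from concatenated features, the nearest-neighbor upsampling, and every name, shape, and random weight below are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style channel attention on a (C, H, W) map (assumed form)."""
    squeezed = x.mean(axis=(1, 2))             # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)    # bottleneck + ReLU -> (C // r,)
    scale = sigmoid(w2 @ hidden)               # per-channel weight in (0, 1)
    return x * scale[:, None, None]

def feedback_fuse(shallow, deep, proj, gate_w, w1, w2):
    """Feed a deep map back into a shallower one through a sigmoid gate (hypothetical helper).

    shallow: (Cs, H, W); deep: (Cd, H // f, W // f) for an integer factor f.
    proj: (Cs, Cd) 1x1 projection; gate_w: (Cs, 2 * Cs) 1x1 gate weights.
    """
    factor = shallow.shape[1] // deep.shape[1]
    # 1x1 projection so the deep map matches the shallow channel count.
    deep_proj = np.einsum("oc,chw->ohw", proj, deep)
    # Nearest-neighbor upsampling back to the shallow resolution.
    deep_up = deep_proj.repeat(factor, axis=1).repeat(factor, axis=2)
    # The gate decides, per channel and per pixel, how much deep semantics to inject.
    stacked = np.concatenate([shallow, deep_up], axis=0)
    gate = sigmoid(np.einsum("oc,chw->ohw", gate_w, stacked))
    fused = shallow + gate * deep_up           # residual-style feedback
    return channel_attention(fused, w1, w2)

rng = np.random.default_rng(0)
shallow = rng.standard_normal((8, 16, 16))     # fine detail, weak semantics
deep = rng.standard_normal((16, 8, 8))         # coarse, semantically rich
out = feedback_fuse(
    shallow, deep,
    proj=0.1 * rng.standard_normal((8, 16)),
    gate_w=0.1 * rng.standard_normal((8, 16)),
    w1=0.1 * rng.standard_normal((2, 8)),
    w2=0.1 * rng.standard_normal((8, 2)),
)
print(out.shape)  # (8, 16, 16)
```

The residual form keeps the shallow map intact when the gate closes, so in this sketch the deep signal can only add to the fine-grained detail, never replace it.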

The Adaptive Dynamic Receptive Field Module addresses the inflexibility of standard convolutional neural networks, whose fixed geometric receptive fields remain constant regardless of input content. Such rigidity is particularly disadvantageous for agricultural images containing objects at vastly different scales, from millimeter-scale rust pustules to centimeter-scale expanded leaf spot lesions, under heterogeneous background conditions. ADRF achieves pixel-level receptive field adaptation by combining spatially adaptive weights with multi-branch dilated convolutions. For each spatial location in the feature map, the module dynamically computes context-dependent weights that govern the relative contributions of parallel convolutional branches employing different dilation rates. This mechanism enables the network to expand its receptive field when processing large contextual regions while maintaining fine resolution for small-target localization, all modulated by the local structural characteristics of the input features. The resulting adaptive receptive field substantially improves model robustness and generalization across diverse pest and disease categories, growth stages, and environmental conditions.
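The per-pixel weighting of dilated branches can be sketched as follows. This is an illustration under stated assumptions rather than the published ADRF design: the depthwise 3x3 kernels, the dilation rates (1, 2, 3), the 1x1 projection that produces the branch logits, and the softmax normalization are all hypothetical choices, as are the function names.

```python
import numpy as np

def dilated_conv3x3(x, kernel, dilation):
    """Depthwise 3x3 dilated cross-correlation with zero 'same' padding.

    x: (C, H, W); kernel: (C, 3, 3).
    """
    _, h, w = x.shape
    d = dilation
    padded = np.pad(x, ((0, 0), (d, d), (d, d)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            # Each tap reads the input shifted by a multiple of the dilation.
            out += kernel[:, i, j, None, None] * padded[:, i * d:i * d + h, j * d:j * d + w]
    return out

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)    # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_receptive_field(x, kernels, dilations, weight_proj):
    """Blend dilated branches with spatially varying weights (hypothetical helper)."""
    branches = np.stack(
        [dilated_conv3x3(x, k, d) for k, d in zip(kernels, dilations)]
    )                                                    # (B, C, H, W)
    logits = np.einsum("bc,chw->bhw", weight_proj, x)    # one logit per branch and pixel
    weights = softmax(logits, axis=0)                    # sums to 1 at every pixel
    out = (weights[:, None] * branches).sum(axis=0)      # (C, H, W)
    return out, weights

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 12, 12))
dilations = (1, 2, 3)
kernels = [0.1 * rng.standard_normal((4, 3, 3)) for _ in dilations]
weight_proj = rng.standard_normal((3, 4))
out, weights = adaptive_receptive_field(x, kernels, dilations, weight_proj)
print(out.shape, weights.shape)  # (4, 12, 12) (3, 12, 12)
```

Because the weights are computed from the features at each location, a pixel inside a large lesion can lean on the widely dilated branch while a pixel on a small pustule favors the dense one, which is the behavior the paragraph above attributes to ADRF.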

Results and Discussions The proposed MSAF-YOLO model was evaluated on two datasets: a self-constructed corn pest and disease dataset (CORN) encompassing representative categories including corn borer, leaf spot, and rust under various field conditions, and the publicly available general plant disease dataset (diseases), used to assess cross-domain generalization. Experimental results show consistent improvements across multiple evaluation metrics. On the CORN dataset, MSAF-YOLO achieved a mean Average Precision at 50% IoU (mAP50) of 85.7% and an mAP50:95 of 68.9%, gains of 3.2 and 2.7 percentage points respectively over the baseline YOLOv11 model. On the diseases dataset, mAP50 and mAP50:95 reached 80.9% and 62.8%, gains of 3.7 and 3.1 percentage points respectively. Notably, these accuracy gains were achieved while reducing the parameter count, indicating that MSAF-YOLO achieves a favorable trade-off between detection precision and computational efficiency.
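For readers unfamiliar with the metrics, the sketch below shows the box-matching criterion that underlies mAP50 and mAP50:95. It covers only the IoU test, not the precision-recall integration that a full mAP computation also requires. The box coordinates are made-up illustrative values, and the threshold grid of 0.50 to 0.95 in steps of 0.05 follows the common COCO-style convention, which is assumed here to be what mAP50:95 denotes.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Hypothetical prediction and ground-truth boxes, in pixels.
predicted = (10, 10, 50, 50)
ground_truth = (15, 15, 55, 55)
overlap = iou(predicted, ground_truth)

# mAP50 counts the prediction as correct when IoU >= 0.5.
print(overlap >= 0.5)  # True

# mAP50:95 averages over ten stricter thresholds: 0.50, 0.55, ..., 0.95.
thresholds = [0.5 + 0.05 * k for k in range(10)]
print(sum(overlap >= t for t in thresholds))  # 3
```

A box that passes at 0.5 but fails at higher thresholds contributes to mAP50 more than to mAP50:95, which is why the second metric is the stricter test of localization quality.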

Conclusions This research makes three principal contributions: 1) it systematically identifies and characterizes the specific challenges confronting corn pest and disease detection in complex field environments, particularly the coupled difficulties of small-target detection and background interference; 2) it proposes two novel modules, HFFM and ADRF, that respectively address semantic feature deficiency and fixed receptive field limitations through feedback mechanisms and dynamic convolution strategies; 3) it delivers a practically deployable model that achieves state-of-the-art detection performance on corn pest and disease tasks while maintaining architectural efficiency suitable for real-time agricultural applications. The proposed MSAF-YOLO framework not only advances object detection methodology in agricultural contexts but also offers a viable technical pathway toward intelligent, automated crop health monitoring systems.
