Abstract:
To address the insufficient accuracy of existing models in detecting infrared road targets, this paper proposes the YOLOv8-AFDP model. First, the model adopts the multi-scale attention feature fusion idea of the ASF-YOLO model to optimize the neck network and adds a small-target detection layer to further enhance detection performance for targets at different scales. Second, a feature pyramid shared convolution (FPSC) module replaces the spatial pyramid pooling fast (SPPF) module to capture more fine-grained information when processing multi-scale feature maps. In addition, the dynamic upsampling (DySample) method is introduced to retain more of the extracted features of infrared road targets while controlling computational cost. Finally, the Powerful-IoU (PIoU) loss function replaces the CIoU loss function to improve detection accuracy and convergence speed. Experimental results show that, compared with YOLOv8, YOLOv8-AFDP improves mAP@0.5 on the M3FD infrared dataset by 8.3 percentage points, reaching 83.8%, while reducing the parameter count by 12.3%. The proposed YOLOv8-AFDP model thus detects infrared road targets accurately, helping to reduce road safety hazards.
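
For readers unfamiliar with the metric, the sketch below illustrates the plain IoU quantity that underlies both the mAP@0.5 criterion and the IoU-based losses named above. It is a minimal illustration, not the paper's implementation: the function names `iou` and `is_true_positive` are hypothetical, boxes are assumed to be in `(x1, y1, x2, y2)` corner format, and the `>=` comparison at the 0.5 threshold is an assumed convention. The additional penalty terms of CIoU and PIoU are not reproduced here.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped to zero if disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Assumed mAP@0.5-style match: a prediction counts if IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold
```

For example, under these assumptions `iou((0, 0, 2, 2), (1, 1, 3, 3))` evaluates to 1/7, so that prediction would not count as a match at the 0.5 threshold.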