Abstract:
To address the detection challenges posed by complex backgrounds and large variations in target scale in road damage images captured from drone aerial perspectives, a road damage detection method called MAS-YOLOv8n, incorporating a multi-branch hybrid attention mechanism, is proposed. First, because the residual structure in the YOLOv8n model is prone to interference and consequent information loss, a multi-branch mixed attention (MBMA) mechanism is introduced. The MBMA structure is integrated into the C2f structure, strengthening feature representation: it captures richer feature information while reducing the impact of noise on detection results. Second, to address the poor detection performance caused by the large variation in road damage morphology, the TaskAlignedAssigner label-assignment algorithm used in YOLOv8n is improved with ShapeIoU (shape-intersection over union), making it better suited to targets of diverse shapes and further improving detection accuracy. Experimental evaluation of MAS-YOLOv8n on the China-Drone dataset of drone-captured road damage shows a 3.1% increase in mean average precision (mAP) over the baseline YOLOv8n without additional computational cost. To further validate the model's generalizability, tests on the RDD2022_Chinese and RDD2022_Japanese datasets also demonstrate improved accuracy. Compared with YOLOv5n, YOLOv8n, YOLOv10n, GOLD-YOLO, Faster R-CNN, TOOD, RTMDet-Tiny, and RT-DETR, the proposed model achieves superior detection accuracy and performance, demonstrating robust generalization.
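The ShapeIoU metric mentioned above penalizes plain IoU with center-distance and width/height-mismatch terms that are weighted by the shape of the ground-truth box, so elongated damage such as longitudinal cracks is matched more fairly than with symmetric IoU variants. The sketch below follows the general Shape-IoU formulation; the specific constants (the `scale` default and the 0.5 shape-cost weight) are illustrative assumptions, not values taken from this paper.

```python
import math

def shape_iou(pred, gt, scale=0.0):
    """Sketch of a Shape-IoU-style score for axis-aligned boxes (x1, y1, x2, y2).

    Plain IoU minus (a) a center-distance penalty and (b) a width/height
    mismatch penalty, both weighted by the ground-truth box's shape.
    Constants here are illustrative assumptions, not the paper's values.
    """
    x1, y1, x2, y2 = pred
    X1, Y1, X2, Y2 = gt
    w1, h1 = x2 - x1, y2 - y1
    w2, h2 = X2 - X1, Y2 - Y1

    # Standard IoU.
    iw = max(0.0, min(x2, X2) - max(x1, X1))
    ih = max(0.0, min(y2, Y2) - max(y1, Y1))
    inter = iw * ih
    union = w1 * h1 + w2 * h2 - inter
    iou = inter / union if union > 0 else 0.0

    # Shape weights derived from the ground-truth box's aspect ratio;
    # scale=0 makes ww = hh = 1 (no shape preference).
    ww = 2.0 * w2**scale / (w2**scale + h2**scale)
    hh = 2.0 * h2**scale / (w2**scale + h2**scale)

    # Center-distance penalty, normalized by the enclosing box diagonal.
    cw = max(x2, X2) - min(x1, X1)
    ch = max(y2, Y2) - min(y1, Y1)
    c2 = cw**2 + ch**2
    dist = (hh * ((x1 + x2) / 2 - (X1 + X2) / 2) ** 2
            + ww * ((y1 + y2) / 2 - (Y1 + Y2) / 2) ** 2) / c2

    # Width/height mismatch penalty.
    ow = hh * abs(w1 - w2) / max(w1, w2)
    oh = ww * abs(h1 - h2) / max(h1, h2)
    shape_cost = (1 - math.exp(-ow)) ** 4 + (1 - math.exp(-oh)) ** 4

    return iou - dist - 0.5 * shape_cost
```

For a perfectly aligned prediction the score is 1.0, and any center offset or shape mismatch pulls it below plain IoU, which is what makes it a sharper ranking signal inside a label assigner.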