Abstract:
To address the high computational cost, large parameter count, and weak facial key-point feature extraction of existing fatigue driving detection models, an algorithm based on an improved YOLOv8n is proposed. First, a dilated reparameterization block (DRB) network replaces the original backbone; by enlarging the receptive field, the DRB module captures key features at different scales and significantly improves feature extraction. Second, a lightweight architecture based on the fully convolutional one-stage object detector (FCOS) is integrated into the detection head. Through shared convolution, this architecture significantly reduces the number of parameters and the computational requirements while improving localization and classification performance. Training runs for 100 epochs. Under the same hardware configuration, the improved model reaches a mean average precision (mAP) of 81.2%, which is 1.9% higher than that of the baseline model; the computation (GFLOPs) and the weight file size (model size) are reduced by 32.1% and 33.3%, respectively, latency is shortened by 71.53%, and detection speed increases by 35 f/s. The improved algorithm offers a useful reference for driver facial target detection tasks.
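
The two mechanisms named above can be illustrated with simple arithmetic. The following is a minimal sketch, not the paper's implementation: it shows how dilation enlarges the spatial extent of a convolution kernel without adding weights, and how sharing one convolution tower across detection scales cuts the head's parameter count. The function names, the channel width of 64, the two-layer tower, and the three detection scales are all hypothetical values chosen for illustration, and the sketch assumes every scale has already been projected to the same channel width so that weights can be shared.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by one dilated convolution kernel.

    A k x k kernel with dilation d spans k + (k - 1) * (d - 1) pixels per side
    while still holding only k * k weights per channel pair.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_params(in_ch: int, out_ch: int, kernel_size: int, bias: bool = True) -> int:
    """Number of learnable parameters in a standard 2D convolution."""
    weights = in_ch * out_ch * kernel_size * kernel_size
    return weights + (out_ch if bias else 0)


def head_params(num_scales: int, channels: int, kernel_size: int,
                num_convs: int, shared: bool) -> int:
    """Parameters of a stack of convolutions applied at each detection scale.

    With shared=True, a single stack is reused for all scales (assumed to have
    equal channel width); otherwise each scale owns its own copy.
    """
    per_tower = num_convs * conv_params(channels, channels, kernel_size)
    return per_tower if shared else num_scales * per_tower


if __name__ == "__main__":
    # Hypothetical settings, not taken from the paper.
    for d in (1, 2, 3):
        print(f"3x3 kernel, dilation {d}: spans {effective_kernel_size(3, d)} pixels")

    separate = head_params(num_scales=3, channels=64, kernel_size=3,
                           num_convs=2, shared=False)
    shared = head_params(num_scales=3, channels=64, kernel_size=3,
                         num_convs=2, shared=True)
    print(f"separate towers: {separate} parameters")
    print(f"shared tower:    {shared} parameters")
```

In this toy setting the shared tower holds one third of the parameters of three separate towers. The reductions reported above depend on the full architecture and will not match this ratio.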