Abstract:
To address the challenges posed by complex interference in retinal microsurgery, this study presents a deep learning-based algorithm for surgical instrument detection. The RET1 dataset was first constructed and meticulously annotated to provide a reliable basis for training and evaluation. Building on the YOLO framework, this study introduces the SGConv and RGSCSP feature extraction modules, designed to enhance the model's ability to capture fine-grained image details, particularly under degraded image quality. Furthermore, to address the slow convergence of the IoU loss and inaccuracies in bounding box regression, a DeltaIoU bounding box loss function is proposed to improve both detection precision and training efficiency. Additionally, the integration of dynamic and decoupled detection heads optimizes feature fusion, further enhancing detection performance. Experimental results demonstrate that the proposed method achieves 72.4% mAP50-95 on the RET1 dataset, a 3.8% improvement over existing algorithms. The method also detects surgical instruments robustly across a variety of complex surgical scenarios, underscoring its potential to support automatic tracking in surgical microscopes and intelligent surgical navigation systems.