    • Abstract: To address the large imbalance in sample counts across quality grades and the limited grading efficiency in retinal image quality grading tasks, a multi-frequency Transformer-guided graph aggregation algorithm for retinal image quality grading is proposed. The algorithm first applies contrast-limited adaptive histogram equalization to the images to highlight key detail features, and uses a ResNet50 network for multi-level feature extraction. A frequency-channel recombination Transformer module is then designed, which introduces frequency-domain information to assist in modeling global features and thereby refine both global and local features. Next, a graph cross-feature aggregation module is constructed, in which a cross-scale cross-attention mechanism guides the graph aggregation, aligning features from different sources and enhancing the model's sensitivity to multi-level features. Finally, a weighted loss function is built to focus the model's attention on minority-class samples. Experiments on the Eye-Quality and RIQA-RFMiD datasets yield accuracies of 88.71% and 84.95% and precisions of 87.78% and 74.22%, respectively. The results show that the proposed algorithm has practical value for retinal image quality assessment.


      Abstract: To address the severe sample imbalance across quality grades and the limited grading efficiency in retinal image quality grading tasks, this paper proposes a multi-frequency Transformer-guided graph feature aggregation method for retinal image quality grading. First, contrast-limited adaptive histogram equalization (CLAHE) is applied to enhance key details in the images, and a ResNet50 network is employed for multi-level feature extraction. Next, a frequency-channel recombination Transformer module is designed, which incorporates frequency-domain information to assist global feature modeling and thereby refine both global and local features. Subsequently, a graph cross-feature aggregation module is introduced, in which a cross-scale cross-attention mechanism guides the graph aggregation, aligning multi-source features and enhancing the model's sensitivity to multi-level features. Finally, a weighted loss function is constructed to increase the model's attention to minority-class samples. Experiments on the Eye-Quality and RIQA-RFMiD datasets achieve accuracies of 88.71% and 84.95% and precisions of 87.78% and 74.22%, respectively. These results demonstrate the practical value of the proposed algorithm for retinal image quality assessment.
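
      Of the pipeline summarized above, the preprocessing and the loss are standard components that can be illustrated concretely. The Python sketch below, assuming OpenCV and PyTorch, applies CLAHE to the luminance channel of a fundus image and builds an inverse-frequency weighted cross-entropy loss; the clip limit, tile size, weighting scheme, and class counts are illustrative assumptions rather than the paper's settings, and the frequency-channel Transformer and graph aggregation modules cannot be reconstructed from the abstract alone.

          # Minimal sketch (not the paper's exact implementation): CLAHE preprocessing
          # on the luminance channel and an inverse-frequency weighted cross-entropy
          # loss for the imbalanced quality grades. Hyperparameters are assumptions.
          import cv2
          import numpy as np
          import torch
          import torch.nn as nn

          def clahe_enhance(bgr_image: np.ndarray) -> np.ndarray:
              """Apply contrast-limited adaptive histogram equalization to the L channel."""
              lab = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2LAB)
              l, a, b = cv2.split(lab)
              clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))  # assumed settings
              lab = cv2.merge((clahe.apply(l), a, b))
              return cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)

          def weighted_ce_loss(class_counts: list) -> nn.CrossEntropyLoss:
              """Weight each class by inverse frequency so minority grades get more attention."""
              counts = torch.tensor(class_counts, dtype=torch.float32)
              weights = counts.sum() / (len(counts) * counts)  # inverse-frequency weights
              return nn.CrossEntropyLoss(weight=weights)

          # Usage with hypothetical per-grade counts (e.g. Good / Usable / Reject):
          # image = clahe_enhance(cv2.imread("fundus.png"))
          # criterion = weighted_ce_loss([1000, 300, 120])

      Inverse class frequency is one common way to realize the "weighted loss focused on minority-class samples" described above; the paper may use a different weighting scheme.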