Crack detection based on multi-scale Faster RCNN with attention

Citation: Chen H Y (陈海永), Zhao P (赵鹏), Yan H W (闫皓炜). Crack detection based on multi-scale Faster RCNN with attention[J]. Opto-Electron Eng (光电工程), 2021, 48(1): 200112. doi: 10.12086/oee.2021.200112

  • Fund Project: National Natural Science Foundation of China (61873315)
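  • Abstract: Electroluminescence (EL) images of photovoltaic cells have a background with complex, non-uniform texture that contains grain pseudo-defects resembling cracks, while the cracks themselves vary widely in shape and scale. These difficulties make the detection task very challenging. This paper therefore proposes a multi-scale Faster-RCNN model with attention. On the one hand, an improved feature pyramid network is used to obtain multi-scale high-level semantic feature maps, which improves the network's ability to represent multi-scale crack defects. On the other hand, an improved attention-based region proposal network, A-RPN, increases the model's focus on crack defects and suppresses the features of the complex background and grain pseudo-defects. In addition, Focal loss is used as the loss function when training the RPN, which reduces the weight of easy samples so that training concentrates on samples that are hard to distinguish. Experimental results show that the improved algorithm raises the accuracy of crack defect detection in EL images to nearly 95%.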

  • Overview: Electroluminescence (EL) images of photovoltaic cells have a complex, non-uniformly textured background that contains grain pseudo-defects closely resembling cracks. At the same time, the cracks vary widely in size and shape. Existing object detection algorithms based on convolutional neural networks do not handle these problems well. To suppress interference from the complex background and improve adaptability to multi-scale crack defects, this paper proposes a multi-scale Faster RCNN model that integrates attention. In photovoltaic cell EL images, crack scale varies greatly and many cracks are small targets. To improve the network's ability to represent multi-scale crack defects, a path aggregation feature pyramid network (PA-FPN) is proposed. Building on the combination of the residual network ResNet50 and the feature pyramid network (FPN), PA-FPN adds a bottom-up path to fuse features. PA-FPN effectively retains shallow feature information, which improves the model's adaptability to multi-scale cracks in EL images, especially the detection of small-scale cracks. To increase the model's attention to crack defects and suppress the features of the complex background and grain pseudo-defects, this paper proposes a region proposal network, A-RPN, that incorporates the convolutional block attention module (CBAM). CBAM is composed of a channel attention module and a spatial attention module. Experiments verify that the RPN fused with CBAM detects better than an RPN using either attention module alone. K-means clustering is applied to the crack sizes in the data set to guide the RPN in setting anchor boxes closer to the actual crack sizes, which improves the speed and accuracy of bounding-box regression during defect detection.
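As a rough illustration of the anchor clustering step, the sketch below runs plain K-means on (width, height) pairs of ground-truth boxes. The box sizes, `k=3`, the Euclidean distance, and the function name `kmeans_anchor_sizes` are assumptions made for illustration; the overview does not state the number of clusters or the distance measure the authors used.

```python
import numpy as np


def kmeans_anchor_sizes(box_wh, k, n_iter=100, seed=0):
    """Cluster (width, height) pairs with plain K-means (Euclidean distance).

    Returns a (k, 2) array of cluster centers sorted by area.
    """
    box_wh = np.asarray(box_wh, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise the centers with k distinct boxes drawn from the data.
    centers = box_wh[rng.choice(len(box_wh), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every box to every center, shape (n, k).
        dists = np.linalg.norm(box_wh[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = box_wh[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers[np.argsort(centers[:, 0] * centers[:, 1])]


# Hypothetical crack box sizes in pixels (not taken from the paper's data set).
boxes = [(12, 40), (14, 44), (10, 38),
         (20, 200), (24, 210), (18, 190),
         (150, 30), (160, 34), (140, 28)]
anchors = kmeans_anchor_sizes(boxes, k=3)
print(anchors)
```

The resulting centers could serve as anchor widths and heights in place of hand-picked scales and aspect ratios. K-means can settle in a local optimum, so in practice one might run several seeds and keep the best result.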
In addition, during RPN training, Focal loss replaces the original cross-entropy loss function to reduce the weight of easy samples and make the model pay more attention to samples that are difficult to distinguish. The entire network can be trained end to end. To verify the effectiveness of the improved algorithm, it is compared with the original Faster RCNN model, RetinaNet, and CenterNet on multi-scale crack detection in EL images. Trained and tested on photovoltaic cell EL images of 1024 pixels×1024 pixels, the improved Faster RCNN is more accurate than these object detection algorithms, is robust to strip-shaped multi-scale cracks, and adapts to EL images with varying complex backgrounds.
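To make the effect of Focal loss concrete, here is a minimal NumPy sketch of the binary form, FL(p_t) = -α_t (1 - p_t)^γ log(p_t), compared with plain cross-entropy. The values `alpha=0.25` and `gamma=2.0` are the defaults reported by Lin et al. [8] and are assumed here; this page does not give the values used for the RPN, and the example probabilities are hypothetical.

```python
import numpy as np


def binary_focal_loss(probs, targets, alpha=0.25, gamma=2.0, eps=1e-7):
    """Per-sample binary focal loss; probs are predicted foreground probabilities."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    targets = np.asarray(targets)
    # p_t is the probability assigned to the true class of each sample.
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Per-sample binary cross-entropy, for comparison."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    targets = np.asarray(targets)
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    return -np.log(p_t)


# An easy foreground anchor (p = 0.95) and a hard one (p = 0.30).
probs = np.array([0.95, 0.30])
targets = np.array([1, 1])
fl = binary_focal_loss(probs, targets)
ce = binary_cross_entropy(probs, targets)
# For foreground samples the ratio equals alpha * (1 - p) ** gamma.
ratio = fl / ce
print(ratio)
```

The ratio is far smaller for the easy sample than for the hard one, which is how the loss shifts the training signal toward hard-to-distinguish anchors.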

  • Figure 1.  EL imaging acquisition system

    Figure 2.  EL image of a random background with a non-uniform texture. The rectangular frame marks a grain, the triangular frame marks grain pseudo-defects that closely resemble cracks, and the ellipse marks a crack

    Figure 3.  Multi-scale Faster-RCNN model with attention

    Figure 4.  Path aggregation feature pyramid PA-FPN

    Figure 5.  Detection model with integrated CBAM

    Figure 6.  Visual comparison of feature maps

    Figure 7.  Feature maps before and after combining the RPN with CBAM attention

    Figure 8.  Comparison of detection results of different algorithms on photovoltaic cell EL images
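Figures 5 and 7 concern the CBAM attention module, which the overview describes as a channel attention module plus a spatial attention module. The NumPy sketch below follows the sequential channel-then-spatial arrangement of the CBAM paper [18] under several assumptions: random weights stand in for learned parameters, the reduction ratio of 2 and the 7×7 kernel are illustrative choices, and the function names are hypothetical. It does not show where the authors place the module inside A-RPN.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """feat: (C, H, W); w1: (C//r, C); w2: (C, C//r). Returns (C, 1, 1) weights."""
    avg = feat.mean(axis=(1, 2))
    mx = feat.max(axis=(1, 2))

    def mlp(v):
        # Shared two-layer MLP with ReLU, applied to both pooled descriptors.
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]


def spatial_attention(feat, kernel):
    """feat: (C, H, W); kernel: (2, k, k) with odd k. Returns (1, H, W) weights."""
    # Pool along the channel axis, then stack the two maps.
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = feat.shape[1:]
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)[None, :, :]


def cbam(feat, w1, w2, kernel):
    """Apply channel attention first, then spatial attention."""
    feat = feat * channel_attention(feat, w1, w2)
    return feat * spatial_attention(feat, kernel)


rng = np.random.default_rng(0)
channels, reduction = 8, 2
feat = rng.standard_normal((channels, 6, 6))          # hypothetical feature map
w1 = rng.standard_normal((channels // reduction, channels))
w2 = rng.standard_normal((channels, channels // reduction))
kernel = rng.standard_normal((2, 7, 7))
refined = cbam(feat, w1, w2, kernel)
print(refined.shape)
```

Both attention maps lie in (0, 1), so the module can only rescale activations downward: it keeps the feature map's shape and damps channels and locations that receive low attention.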

    Table 1.  Photovoltaic cell EL image data set

    Resolution (pixels)  Training set  Test set  Total
    1024×1024            476           236       712

    Table 2.  Parameter configuration of the model

    Parameter                 Value
    Image_resize              1024×1024
    Weight_decay              0.0005
    Learning_rate             0.0001
    Network_batch_size        1
    Momentum                  0.9
    RPN_proposals_train       2000
    RPN_proposals_test        1000
    RPN batch_size            256
    Max_iteration             20000
    ROI_foreground threshold  (0.7, 1)
    ROI_background threshold  (0, 0.3)
    RPN_nms threshold         0.7
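The foreground and background thresholds in Table 2 are IoU ranges. The sketch below shows one way such ranges could be used to label a candidate box against ground-truth crack boxes. The boundary handling (IoU ≥ 0.7 is foreground, IoU < 0.3 is background, anything in between is ignored), the box coordinates, and the function names are assumptions; the table gives only the ranges.

```python
import numpy as np

FG_RANGE = (0.7, 1.0)  # ROI_foreground threshold from Table 2
BG_RANGE = (0.0, 0.3)  # ROI_background threshold from Table 2


def iou(box, boxes):
    """IoU between one box and an (n, 4) array of boxes, format (x1, y1, x2, y2)."""
    boxes = np.asarray(boxes, dtype=float)
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def label_roi(roi, gt_boxes):
    """Return 1 (foreground), 0 (background) or -1 (ignored during training)."""
    best = iou(roi, gt_boxes).max()
    if best >= FG_RANGE[0]:
        return 1
    if best < BG_RANGE[1]:
        return 0
    return -1


gt = [(100, 100, 200, 300)]  # one hypothetical crack box
candidates = [(100, 100, 200, 300),   # identical to the ground truth
              (150, 100, 250, 300),   # overlaps half of it
              (600, 600, 700, 700)]   # no overlap
labels = [label_roi(r, gt) for r in candidates]
print(labels)
```

The second candidate has an IoU of 1/3 with the ground truth, which falls between the two ranges, so under these assumptions it contributes to neither class.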

    Table 3.  EL image detection performance based on Faster-RCNN algorithm

    Faster-RCNN Focal loss Attention PA-FPN AP (%)
    ResNet50 - - - 87.68
    - - 88.93
    - 92.26
    94.75

    Table 4.  Detection performance of different algorithms on photovoltaic cell EL images

    Method                Backbone  AP (%)
    Original Faster RCNN  ResNet50  87.68
    CenterNet[5]          ResNet18  85.07
    CenterNet[5]          DLA       87.25
    RetinaNet[6]          ResNet50  84.53
    Improved Faster RCNN  ResNet50  94.75
  • [1] Anwar S A, Abdullah M Z. Micro-crack detection of multicrystalline solar cells featuring shape analysis and support vector machines[C]//Proceedings of 2012 IEEE International Conference on Control System, Computing and Engineering, 2012: 143-148.

    [2] Su B Y, Chen H Y, Zhu Y F, et al. Classification of manufacturing defects in multicrystalline solar cells with novel feature descriptor[J]. IEEE Trans Instrum Meas, 2019, 68(12): 4675-4688. doi: 10.1109/TIM.2019.2900961

    [3] Luo Q W, Sun Y C, Li P C, et al. Generalized completed local binary patterns for time-efficient steel surface defect classification[J]. IEEE Trans Instrum Meas, 2019, 68(3): 667-679. doi: 10.1109/TIM.2018.2852918

    [4] Tsai D M, Chang C C, Chao S M. Micro-crack inspection in heterogeneously textured solar wafers using anisotropic diffusion[J]. Image Vis Comput, 2010, 28(3): 491-501. doi: 10.1016/j.imavis.2009.08.001

    [5] Cha Y J, Choi W, Büyüköztürk O. Deep learning-based crack damage detection using convolutional neural networks[J]. Comput Aided Civ Inf Eng, 2017, 32(5): 361-378. doi: 10.1111/mice.12263

    [6] Lin H, Li B, Wang X G, et al. Automated defect inspection of LED chip using deep convolutional neural network[J]. J Intell Manuf, 2019, 30(6): 2525-2534. doi: 10.1007/s10845-018-1415-x

    [7] Duan K W, Bai S, Xie L X, et al. CenterNet: keypoint triplets for object detection[C]//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision, 2019: 6568-6577.

    [8] Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]//Proceedings of 2017 IEEE International Conference on Computer Vision, 2017: 2999-3007.

    [9] Girshick R. Fast R-CNN[C]//Proceedings of 2015 IEEE International Conference on Computer Vision, 2015: 1440-1448.

    [10] Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems, 2015: 91-99.

    [11] Cha Y J, Choi W, Suh G, et al. Autonomous structural visual inspection using region-based deep learning for detecting multiple damage types[J]. Comput Aided Civ Inf Eng, 2018, 33(9): 731-747. doi: 10.1111/mice.12334

    [12] Gao L, Chen N N, Fan Y. Vehicle detection based on fusing multi-scale context convolution features[J]. Opto-Electron Eng, 2019, 46(4): 180331. doi: 10.12086/oee.2019.180331

    [13] Liu S, Qi L, Qin H F, et al. Path aggregation network for instance segmentation[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 8759-8768.

    [14] Corbetta M, Shulman G L. Control of goal-directed and stimulus-driven attention in the brain[J]. Nat Rev Neurosci, 2002, 3(3): 201-215. doi: 10.1038/nrn755

    [15] Frazão M, Silva J A, Lobato K, et al. Electroluminescence of silicon solar cells using a consumer grade digital camera[J]. Measurement, 2017, 99: 7-12. doi: 10.1016/j.measurement.2016.12.017

    [16] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 7132-7141.

    [17] Everingham M, Van Gool L, Williams C K I, et al. The PASCAL visual object classes (VOC) challenge[J]. Int J Comput Vis, 2010, 88(2): 303-338. doi: 10.1007/s11263-009-0275-4

    [18] Woo S, Park J, Lee J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision (ECCV), 2018: 3-19.

Publication history
Received:  2020-04-02
Revised:  2020-06-15
Published:  2021-01-15
