Optical tissue image segmentation method for ironmaking coke based on MD-UNet

Abstract

To address the unclear segmentation boundaries caused by multi-component aliasing in coke optical tissue images, this paper proposes an MD-UNet semantic segmentation model. The model uses VGG16 as its backbone network and incorporates the CloAttention module at the deepest level of the encoder. By combining context-aware local enhancement with a global attention mechanism, CloAttention helps the model focus on critical image regions and improves its perception of the complex textures inherent in coke optical tissues. In addition, a multi-branch dilated fusion (MBDF) module is designed to replace the conventional convolution modules in the decoder. This substitution is intended to preserve and integrate multi-scale information, enriching the feature representation and mitigating information loss and detail blurring. Finally, the GELU activation function is adopted in place of ReLU to alleviate the vanishing gradient problem encountered during network training. Comparative experiments against other semantic segmentation models show that the proposed MD-UNet achieves the best segmentation performance on coke optical tissues, reaching an mIoU of 88.72% and an F1-Score of 94.28%. These results exceed those of the traditional semantic segmentation models compared, validating the effectiveness of MD-UNet in improving the segmentation accuracy of coke optical tissues.
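For readers less familiar with the two reported metrics, the following is a minimal sketch of how mIoU and F1-Score are commonly computed from a per-pixel confusion matrix. It is an illustration only: the abstract does not state the averaging scheme, so the sketch assumes an unweighted (macro) average over classes, and the function names `confusion_matrix` and `miou_and_f1` and the toy labels are hypothetical rather than taken from the paper.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def miou_and_f1(cm: List[List[int]]) -> Tuple[float, float]:
    """Return (mean IoU, mean F1), macro-averaged over classes (assumption)."""
    n = len(cm)
    ious, f1s = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # class c pixels missed
        fp = sum(cm[r][c] for r in range(n)) - tp   # other pixels labelled c
        if tp + fp + fn == 0:
            continue  # class absent from both label and prediction: skip it
        ious.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    return sum(ious) / len(ious), sum(f1s) / len(f1s)


# Toy example: six "pixels" flattened to 1-D, three hypothetical classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
miou, f1 = miou_and_f1(confusion_matrix(y_true, y_pred, num_classes=3))
print(f"mIoU = {miou:.4f}, F1 = {f1:.4f}")
```

Per class, F1 and IoU are linked by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU; this is consistent with the reported F1-Score being higher than the reported mIoU.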