Abstract:
With the wide application of point clouds in virtual reality, computer vision, robotics, and other fields, assessing the distortions introduced during point cloud acquisition and processing has become an important research topic. The three-dimensional (3D) information of a point cloud is sensitive to geometric distortion, while its two-dimensional (2D) projections contain rich texture and semantic information. To combine these two kinds of information effectively and improve the accuracy of point cloud quality assessment, a no-reference assessment method based on the fusion of 3D and 2D features is proposed. For 3D feature extraction, farthest point sampling is first applied to the point cloud, and non-overlapping sub-models centered on the selected points are then generated so as to cover the whole model as fully as possible; a multi-scale 3D feature extraction network then extracts features at both the voxel and point levels. For 2D feature extraction, the point cloud is first rendered by orthogonal hexahedral projection, and texture and semantic information are then extracted by a multi-scale 2D feature extraction network. Finally, motivated by the way the human visual system segregates and then interweaves different types of information, a symmetric cross-modal attention module is designed to fuse the 3D and 2D features. Experiments on five public point cloud quality assessment datasets show that the Pearson linear correlation coefficient (PLCC) of the proposed method reaches 0.9203, 0.9463, 0.9125, 0.9164, and 0.9209, respectively, indicating advanced performance compared with existing representative point cloud quality assessment methods.
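The farthest point sampling step mentioned above can be sketched as follows. This is a minimal greedy illustration in NumPy, not the authors' implementation; the function name `farthest_point_sampling` and its parameters are assumptions, and the paper's actual sampling may differ in initialization and distance handling.

```python
import numpy as np

def farthest_point_sampling(points, k, seed=0):
    """Greedily select k indices of `points` (an (N, 3) array) that are
    mutually far apart, so that sub-models centered on the selected
    points cover the whole point cloud as fully as possible.
    NOTE: illustrative sketch, not the paper's implementation."""
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    selected = [int(rng.integers(n))]   # start from a random point
    dist = np.full(n, np.inf)           # distance to nearest selected point
    for _ in range(k - 1):
        # distances from every point to the most recently selected point
        d = np.linalg.norm(points - points[selected[-1]], axis=1)
        dist = np.minimum(dist, d)      # keep nearest-selected distance
        # next center: the point farthest from all current centers
        selected.append(int(np.argmax(dist)))
    return np.asarray(selected)
```

Each selected index would then serve as the center of one non-overlapping sub-model fed to the 3D feature extraction network.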