Real-time semantic segmentation algorithm based on BiLevelNet

吴马靖; 张永爱; 林珊玲; 林志贤; 林坚普

doi:10.12086/oee.2024.240030

Abstract: Semantic segmentation networks often have too many parameters to deploy on memory-constrained edge devices. To address this problem, a lightweight real-time semantic segmentation algorithm based on BiLevelNet is proposed. First, dilated convolutions are used to enlarge the receptive field and are combined with a feature reuse strategy to strengthen the network's regional awareness. Next, a two-stage PBRA (Partial Bi-Level Route Attention) mechanism is embedded to establish dependencies between distant but related objects, strengthening the network's global perception. Finally, the FADE operator, which incorporates shallow features, is introduced to improve image upsampling. Experimental results show that, at an input resolution of 512×1024, the proposed network achieves a mean Intersection over Union (mIoU) of 75.1% on the Cityscapes dataset at 121 frames per second (f/s), with a model size of only 0.7 M. At an input resolution of 360×480, it achieves an mIoU of 68.2% on the CamVid dataset. Compared with other current real-time semantic segmentation methods, the network balances speed and accuracy and meets the real-time requirements of autonomous driving applications.
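The accuracy figures above are reported as mean Intersection over Union. As a minimal sketch of how that metric is commonly computed (not the authors' evaluation code), the example below builds a confusion matrix from flattened label lists and averages the per-class IoU. The function names `confusion_matrix` and `mean_iou` are hypothetical, and the sketch assumes integer class labels with no ignored label.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows index the ground-truth class, columns the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        matrix[t][p] += 1
    return matrix


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average the per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    matrix = confusion_matrix(pred, target, num_classes)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                   # class-c pixels predicted as another class
        fp = sum(row[c] for row in matrix) - tp    # other pixels predicted as class c
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: four pixels, two classes (illustrative data only).
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2)
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so `score` is their average, 7/12. Benchmark evaluations typically accumulate the confusion matrix over the whole validation set before dividing, rather than averaging per-image scores.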
