Light field image super-resolution network based on angular difference enhancement
  • Abstract

    Thanks to its advanced imaging architecture, a light field camera captures the spatial and angular information of a scene simultaneously, achieving a higher-dimensional scene representation at the cost of spatial resolution. To improve the spatial resolution of light field images, this paper builds a light field super-resolution reconstruction network based on angular difference enhancement. In the proposed network, eight multi-branch residual blocks extract shallow features, four enhanced angular deformable alignment modules extract deep features, and six simplified residual feature distillation modules together with pixel shuffle modules complete the data reconstruction. The network exploits the angular differences of the light field to super-resolve the spatial information. To obtain richer feature differences between views, each view's own features are emphasized during feature extraction. The performance of the proposed network is verified on five public light field datasets, where it produces high-resolution light field sub-aperture images with higher PSNR and SSIM.


  • A conventional camera records a 2D projection of 3D space: the intensity of a photograph reflects the energy integral of the light rays. To record spatial information more completely, multi-view and multi-focus images are usually acquired by moving the camera and adjusting the lens focus []. A light field camera images more efficiently: from the raw light field captured in a single exposure, multiple refocused images and multi-view images of the same scene can be computed by image processing []. This gain in imaging efficiency stems from an optimized camera imaging model []; its core is the acquisition of the 4D light field through light splitting, i.e., the simultaneous capture of the positional and angular information of spatial rays, but the angular information is obtained at the cost of image spatial resolution. To raise the spatial resolution of light field images, effective light field super-resolution reconstruction techniques are therefore of great significance [-].

    In recent years, various advanced deep learning frameworks have been applied to image super-resolution and have also shown excellent performance on light field super-resolution. Yeung et al. at the University of Sydney proposed LFSSR (light field super-resolution using deep efficient spatial-angular separable convolution), an hourglass-shaped network that extracts joint spatial-angular features with spatial-angular separable convolutions and runs fast [7]. Zhang Shuo et al. at Beijing Jiaotong University proposed ResLF (residual networks for light field image super-resolution), which implicitly mines the intrinsic correspondences among multi-view images and supplements the high-frequency information of the reconstructed image by learning the disparity information along the horizontal, vertical, and diagonal directions of the sub-aperture image array [8]; however, because ResLF does not use all views of the light field, its reconstruction of occluded edges is somewhat distorted. Jin et al. at City University of Hong Kong proposed a light field super-resolution network combining geometric embedding with structural consistency: an all-to-one module first learns the complementary information between sub-aperture images to obtain preliminary super-resolved images, and a regularization module then reinforces the disparity structure to obtain high-resolution images with enhanced spatial-structure consistency [9]; its strengths are the full use of complementary information between views and the preservation of the light field disparity structure. Taking the macro-pixel image array as input, Wang Yingqian et al. at the National University of Defense Technology proposed LF-InterNet (spatial-angular interaction for light field image super-resolution), in which a spatial feature extractor and an angular feature extractor separately extract the spatial and angular features of the light field image and repeatedly interact and fuse them into a high-resolution light field image [4]; the spatial-angular interaction mechanism is a general framework that processes light field data efficiently. They later proposed LF-DFnet (light field image super-resolution using deformable convolution), which designs an angular deformable alignment module to learn the offsets between the side views and the center view and encodes these offsets into the features of each view to improve super-resolution performance [5]. Although LF-DFnet considers the disparity information among all views, its deep feature extraction neglects the in-depth learning of each view's own features. Liang Zhengyu et al. at the National University of Defense Technology proposed LF-AFnet (angular-flexible network for light field image super-resolution), which connects decoupling and fusion modules in residual form and achieves super-resolution reconstruction of light field images with arbitrary angular resolution [10]; its greatest advantage is that it can handle light field data captured by different types of devices with arbitrary angular resolution. Mo Yu et al. at the National University of Defense Technology proposed DDAN (dense dual-attention network), which designs a view attention module and a channel attention module to extract useful information across views and channels, improving the utilization of sub-aperture image information during super-resolution [6]; it achieves better super-resolution with fewer network parameters. Zhao Yuanyuan et al. at Shanghai Jiao Tong University proposed a light field super-resolution network fusing multi-scale features [11], which adds residual connections to the atrous spatial pyramid pooling (Aspp) structure to form residual atrous spatial pyramid pooling (ResAspp) sub-blocks; however, this algorithm does not fully exploit the difference features among views.

    Light field super-resolution networks take the 2D image array rearranged from the 4D light field as input and mostly learn the spatial and angular features among the sub-aperture images to produce high-resolution images. The high-performing LF-DFnet introduces the learning of difference features between views but neglects each view's own features, whereas in conventional single-image super-resolution, high-quality high-resolution images can be obtained by learning an image's own features. Motivated by this, this paper builds a light field super-resolution reconstruction network that reinforces each view's own features: multi-branch residual blocks extract shallow features; during deep feature extraction, an optimized angular deformable alignment module reinforces the learning of the center-view and side-view features; finally, simplified residual feature distillation modules reconstruct the sub-aperture images.

    In the two-plane representation of the light field, shown in Fig. 1(a), any ray in space is determined by the two points where it crosses the $ (u,v) $ plane and the $ (s,t) $ plane. The 4D light field captured by a light field camera is denoted $ {L_{\rm{F}}}(s,t,u,v) $, where $ s \in [1,S] $, $ t \in [1,T] $, $ u \in [1,U] $, $ v \in [1,V] $; $ U \times V $ is the angular resolution and $ S \times T $ the spatial resolution of the 4D light field, and usually $ U = V = A $. For visualization, the 4D light field is commonly rearranged into a sub-aperture image array or a macro-pixel array, as shown in Figs. 1(b) and 1(c); light field super-resolution networks generally take one of these two data formats as input.
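    As a concrete illustration of the two rearrangements, the following sketch (ours, not from the paper's code; it assumes a (U, V, S, T) tensor layout) tiles a 4D light field into a sub-aperture image array and a macro-pixel array:

```python
import torch

# A minimal sketch of rearranging a 4D light field L_F(s, t, u, v) into the
# two input formats of Fig. 1. Here lf has shape (U, V, S, T): angular
# dimensions first, spatial dimensions last (an assumed layout).

def to_subaperture_array(lf: torch.Tensor) -> torch.Tensor:
    """Tile the U x V sub-aperture views into one (U*S) x (V*T) image."""
    U, V, S, T = lf.shape
    # (U, V, S, T) -> (U, S, V, T) -> (U*S, V*T)
    return lf.permute(0, 2, 1, 3).reshape(U * S, V * T)

def to_macropixel_array(lf: torch.Tensor) -> torch.Tensor:
    """Interleave views so each spatial position holds a U x V macro-pixel."""
    U, V, S, T = lf.shape
    # (U, V, S, T) -> (S, U, T, V) -> (S*U, T*V)
    return lf.permute(2, 0, 3, 1).reshape(S * U, T * V)

lf = torch.rand(5, 5, 32, 32)          # A = 5 views, 32 x 32 spatial patches
print(to_subaperture_array(lf).shape)  # torch.Size([160, 160])
print(to_macropixel_array(lf).shape)   # torch.Size([160, 160])
```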

    Since a light field camera obtains angular information at the cost of image spatial resolution, the network built in this paper super-resolves the spatial resolution of light field images; the overall architecture is shown in Fig. 2. The input is the sub-aperture image array rearranged from the 4D light field, consisting of one center view and $ {A^2} - 1 $ side views. The network has three parts: shallow feature extraction, deep feature extraction, and data reconstruction. Shallow feature extraction uses eight multi-branch residual (MBR) blocks; deep feature extraction uses four enhanced angular deformable alignment (EADA) modules; feature fusion in the reconstruction stage uses six simplified residual feature distillation (SRFD) modules, and feature upsampling is completed by pixel shuffle [-].

    Figure 1. 4D light field acquisition and rearrangement. (a) Biplanar representation model of the light field; (b) Sub-aperture image array; (c) Macro-pixel array

    Figure 2. Overall network structure diagram

    Figure 3. Multi-branch residual block

    For shallow feature extraction, the proposed network uses a 1×1 convolution followed by eight MBR blocks. Each MBR block has three branches, with the data flow shown in Fig. 3: branch 1 contains two 3×3 dilated convolutions with a dilation factor of 12 and two 1×1 convolutions, while branches 2 and 3 pass the block input forward directly through skip connections. The skip connections also fuse low-level features with high-level features, effectively countering network degradation, and the dilated convolutions extract rich hierarchical information while enlarging the receptive field.
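    A minimal sketch of one MBR block under our reading of Fig. 3 follows. The channel width, the activation placement, and the fusion of the three branches by concatenation plus a 1×1 convolution are our assumptions; only the branch contents (two dilation-12 3×3 convolutions and two 1×1 convolutions on branch 1, identity skips on branches 2 and 3) follow the text.

```python
import torch
import torch.nn as nn

class MBRBlock(nn.Module):
    """Sketch of a multi-branch residual (MBR) block (assumed details)."""
    def __init__(self, channels: int = 64):
        super().__init__()
        # Branch 1: 1x1 conv, two 3x3 dilated convs (dilation 12), 1x1 conv.
        self.branch1 = nn.Sequential(
            nn.Conv2d(channels, channels, 1),
            nn.Conv2d(channels, channels, 3, padding=12, dilation=12),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(channels, channels, 3, padding=12, dilation=12),
            nn.Conv2d(channels, channels, 1),
        )
        # Branches 2 and 3 are identity skip connections; we fuse all three
        # branches by concatenation + 1x1 convolution (an assumption).
        self.fuse = nn.Conv2d(3 * channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([self.branch1(x), x, x], dim=1))

print(MBRBlock()(torch.rand(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```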

    Compared with conventional 2D images, 4D light field data contain rich angular information, so exploiting the correlation of this angular information during deep feature extraction can improve super-resolution performance. The LF-DFnet network of Ref. [5] uses an angular deformable alignment (ADA) module for deep feature extraction, with the structure shown in Fig. 4(a); the module performs feature collection and feature distribution with deformable convolution blocks (DCBs). The feature collection part, shown in Fig. 4(b), proceeds as follows:

    1) Align side-view feature 1 with the center-view feature, learn an offset feature from the aligned side-view feature 1 and the center-view feature (Fig. 4(c)), and feed side-view feature 1 and the offset feature into the DCB for feature collection, yielding the side-view-to-center-view aligned feature;

    2) Update the center-view feature (Fig. 4(d)) by aligning the feature of side view 2 with the center-view feature; the subsequent processing is the same as in step 1);

    3) Collect the offset features of all side views in a loop;

    4) Merge the angular information of all side-view features and the updated center-view feature with a 1×1 convolution to obtain the fused feature.

    Figure 4. ADA module details. (a) ADA module data processing; (b) Feature collection; (c) Offset acquisition in feature collection; (d) Center-view update; (e) Feature distribution; (f) Offset acquisition in feature distribution

    The feature distribution part is shown in Fig. 4(e): the fused feature is first concatenated with the side-view feature along the channel dimension, and the offset between the side-view feature and the fused feature is learned through convolution, activation, and residual atrous spatial pyramid pooling (ResAspp) operations (Fig. 4(f)); the offset feature and the fused feature are then fed into the DCB for feature distribution. In this way, the angular information is merged into every side view, so the super-resolution performance of all views improves uniformly.

    To fully exploit the correlation of angular information among the multi-view images of the 4D light field, the deep feature extraction of this paper also uses deformable convolution for feature collection and feature distribution. Since the ADA module does not fully mine each view's own features during deep feature extraction, we improve it into an EADA module that reinforces the learning of each side view's own features, further raising network performance. The feature collection part of the EADA module is shown in Fig. 5(a). Compared with feature collection in the ADA module, the EADA module reinforces each side-view feature before the DCB performs feature collection and distribution. Specifically, in the feature collection part, a 1×1 convolution first extracts the side-view feature, and two branches then continue the collection: branch 1 adds two 3×3 dilated convolutions with dilation factors 2 and 4; branch 2 adds one 1×1 convolution and one 3×3 dilated convolution with dilation factor 2. The reinforced side-view feature 1 is then aligned with the center-view feature to obtain offset feature 1 (the process is the same as in Fig. 4(c)), and offset feature 1 and the reinforced side-view feature 1 are fed into the DCB to obtain aligned feature 1; aligned feature 2 is obtained analogously. Aligned features 1 and 2 are expressed by Eqs. (1) and (2):

    $$ F_k^{{\rm{s}} \to {\rm{c}}} = {H_{{\rm{DCB}}}}([{H_{{\rm{DFE1}}}}(F_{k - 1}^{\rm{s}})],\Delta P_k^{{\rm{s}} \to {\rm{c}}}) \,, \quad (1) $$

    $$ F_k^{'{\rm{s}} \to {\rm{c}}} = {H_{{\rm{DCB}}}}([{H_{{\rm{DFE2}}}}(F_{k - 1}^{'{\rm{s}}})],\Delta P_k^{'{\rm{s}} \to {\rm{c}}}) \,, \quad (2) $$

    where $ F_k^{{\rm{s}} \to {\rm{c}}} $ denotes aligned feature 1 of the $ k $-th ($ k \in [1,{A^2} - 1] $) side view with respect to the center view on branch 1, and $ F_k^{'{\rm{s}} \to {\rm{c}}} $ denotes aligned feature 2 on branch 2; $ {H_{{\rm{DCB}}}} $ is the deformable convolution block; $ {H_{{\rm{DFE1}}}} $ denotes the 1×1 convolution and the two dilated convolutions of branch 1, and $ {H_{{\rm{DFE2}}}} $ the 1×1 convolution and the dilated convolution of branch 2; $ F_{k - 1}^{\rm{s}} $ and $ F_{k - 1}^{'{\rm{s}}} $ are the reinforced side-view features 1 and 2; $ \Delta P_k^{{\rm{s}} \to {\rm{c}}} $ and $ \Delta P_k^{'{\rm{s}} \to {\rm{c}}} $ are the offset features of branches 1 and 2. Finally, the aligned features of the two branches are concatenated along the channel dimension and passed through a 1×1 convolution to obtain aligned feature 3. In the feature distribution part, we add one 1×1 convolution and two 3×3 dilated convolutions with dilation factors 2 and 4 before the DCB to reinforce the side-view features. The final distributed feature of a side view is given by Eq. (3):

    $$ F_k^{\rm{s}} = H_{{\rm{squeeze}}}^k([{H_{{\rm{DCB}}}}(F_{{\rm{fuse}},k}^{\rm{s}},\Delta P_k^{\rm{s}}),F_{k - 1}^{\rm{s}}]) \,, \quad (3) $$

    where $ F_k^{\rm{s}} $ is the side-view feature after feature distribution, $ H_{{\rm{squeeze}}}^k $ denotes channel reduction, $ F_{{\rm{fuse}},k}^{\rm{s}} $ is the fused feature, $ \Delta P_k^{\rm{s}} $ is the offset feature, and $ F_{k - 1}^{\rm{s}} $ is the original side-view feature. The EADA module fully mines the feature information of each side view and merges and encodes the angular information into the features of every view, which helps produce better super-resolved light field images.

    Figure 5. EADA module details. (a) EADA feature collection details; (b) EADA feature distribution details
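    To illustrate the DCB alignment step behind Eqs. (1)–(3), the following PyTorch sketch uses torchvision's DeformConv2d. The offset-branch design and channel widths are our assumptions, not the paper's exact implementation (which, e.g., also uses ResAspp in the distribution branch):

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class FeatureCollection(nn.Module):
    """Sketch of deformable feature collection: predict sampling offsets from
    the side/center feature pair, then align the side view to the center."""
    def __init__(self, channels: int = 64, k: int = 3):
        super().__init__()
        # Offset branch: 2 offsets (dx, dy) per kernel tap (cf. Fig. 4(c)).
        self.offset_conv = nn.Conv2d(2 * channels, 2 * k * k, 3, padding=1)
        self.dcb = DeformConv2d(channels, channels, k, padding=k // 2)

    def forward(self, side_feat, center_feat):
        offset = self.offset_conv(torch.cat([side_feat, center_feat], dim=1))
        return self.dcb(side_feat, offset)  # side-to-center aligned feature

feat = torch.rand(1, 64, 32, 32)
aligned = FeatureCollection()(feat, feat)
print(aligned.shape)  # torch.Size([1, 64, 32, 32])
```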

    After fusion, the reconstructed features are fed to the upsampling module to super-resolve the light field image. Upsampling is completed mainly by pixel shuffle [-]: a 1×1 convolution first expands the depth of the fused data to $ {\alpha ^2}C $ (where $ \alpha $ is the super-resolution factor and $ C $ the number of channels); pixel shuffle then raises the resolution of the reconstructed features to $ \alpha S \times \alpha T $; finally, a 1×1 convolution compresses the features to one channel, producing the super-resolved sub-aperture image array.
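    A minimal sketch of this upsampling stage, with $C = 64$ as an assumed feature width:

```python
import torch
import torch.nn as nn

class Upsampler(nn.Module):
    """1x1 conv to alpha^2*C channels, pixel shuffle, 1x1 conv to 1 channel."""
    def __init__(self, channels: int = 64, alpha: int = 4):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, alpha * alpha * channels, 1),  # depth -> a^2*C
            nn.PixelShuffle(alpha),                            # (S,T) -> (aS,aT)
            nn.Conv2d(channels, 1, 1),                         # squeeze to 1 ch
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)

x = torch.rand(1, 64, 32, 32)
print(Upsampler(alpha=4)(x).shape)  # torch.Size([1, 1, 128, 128])
```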

    Figure 6. RFD module simplification. (a) RFD module details; (b) SRFD module details

    Data reconstruction generally comprises feature fusion and feature upsampling, which together decode the information extracted by the network into the final high-resolution image. For feature fusion, this paper borrows the idea of residual feature distillation. The residual feature distillation (RFD) module [12] is shown in Fig. 6(a): at the first level, a shallow residual block and a parallel 1×1 convolution split the input features into two parts, one retained and the other fed to the next level's shallow residual block and 1×1 convolution for further distillation; the distilled features of all levels are finally concatenated along the channel dimension, and a contrast-aware channel attention (CCA) mechanism completes the extraction of effective features. The RFD module is usually used for feature extraction, where the CCA layer reinforces multi-channel feature learning, whereas our fusion stage focuses on decoding previously learned features rather than learning new ones; we therefore remove the CCA layer, yielding the SRFD module. In the SRFD module, a 3×3 convolution and ReLU activation are added at the front to refine the input features, and in the shallow residual blocks the ReLU activation is placed directly after the convolution, so the features of the current level enter the next distillation stage without activation and shallow and deep information combine more directly.
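    A minimal sketch of the SRFD module as described above; the number of distillation levels and the distillation width are our assumptions:

```python
import torch
import torch.nn as nn

class ShallowResBlock(nn.Module):
    """Shallow residual block with ReLU right after the conv, so the summed
    output flows to the next distillation stage without re-activation."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return x + self.relu(self.conv(x))

class SRFD(nn.Module):
    """Simplified residual feature distillation: input-refining 3x3 conv +
    ReLU, per-level split into a retained part (1x1 conv) and a passed-on
    part (shallow residual block), and 1x1 fusion in place of CCA."""
    def __init__(self, channels=64, distilled=32, levels=3):
        super().__init__()
        self.refine = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1),
                                    nn.ReLU(inplace=True))
        self.keep = nn.ModuleList(nn.Conv2d(channels, distilled, 1)
                                  for _ in range(levels))
        self.pass_on = nn.ModuleList(ShallowResBlock(channels)
                                     for _ in range(levels))
        self.fuse = nn.Conv2d(levels * distilled, channels, 1)

    def forward(self, x):
        x = self.refine(x)
        kept = []
        for keep, pass_on in zip(self.keep, self.pass_on):
            kept.append(keep(x))   # distilled (retained) part
            x = pass_on(x)         # part sent to the next level
        return self.fuse(torch.cat(kept, dim=1))

print(SRFD()(torch.rand(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```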

    We train and evaluate the proposed method on five public light field datasets (EPFL [13], HCInew [14], HCIold [15], INRIA [16], and STFgantry [17]); the numbers of training and test scenes are listed in Table 1. The original light field images are first downsampled by bicubic interpolation to generate low-resolution light field images, the proposed network learns to super-resolve them, and the resulting high-resolution light field images are compared with the originals. During training, each sub-aperture image is cropped and bicubically downsampled into 32×32 low-resolution patches, which are augmented to form the training set. The network is trained with the L1 loss and optimized with Adam [18]. The weights and biases of the last convolutional layer in the offset-generation branch are initialized to zero, and the rest of the network is initialized with the Kaiming method [19]. All experiments are implemented in Python on the PyTorch platform and run on a machine with an NVIDIA GeForce RTX 3080 GPU. The batch size is 2, the initial learning rate is $ 2 \times {10^{ - 4}} $ and is halved every 15 epochs, and training stops after 50 epochs.
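    A minimal sketch of this training configuration (L1 loss, Adam, batch size 2, learning rate 2e-4 halved every 15 epochs, 50 epochs); the tiny placeholder model and random data stand in for the paper's network and its 32×32-patch training set:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1),  # placeholder network
                      nn.PixelShuffle(4))              # 4x SR output
train_loader = [(torch.rand(2, 1, 32, 32),             # LR patches, batch 2
                 torch.rand(2, 1, 128, 128))]          # HR targets

criterion = nn.L1Loss()
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=15, gamma=0.5)

for epoch in range(50):
    for lr_patch, hr_patch in train_loader:
        loss = criterion(model(lr_patch), hr_patch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()  # halves the learning rate every 15 epochs
```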

    Table 1. Five public light field datasets used in our experiment

    | Dataset | EPFL [13] | HCInew [14] | HCIold [15] | INRIA [16] | STFgantry [17] | Total |
    |---|---|---|---|---|---|---|
    | #Training | 70 | 20 | 10 | 35 | 9 | 144 |
    | #Test | 10 | 4 | 2 | 5 | 2 | 23 |

    Table 2. PSNR/SSIM values achieved by different shallow feature extraction modules for 4× SR

    | Model | EPFL [13] | HCInew [14] | HCIold [15] | INRIA [16] | STFgantry [17] |
    |---|---|---|---|---|---|
    | Without residual | 25.26/0.8324 | 27.71/0.8517 | 32.58/0.9344 | 26.95/0.8867 | 26.09/0.8452 |
    | MBR | 28.81/0.9190 | 31.30/0.9206 | 37.39/0.9725 | 30.81/0.9513 | 31.29/0.9511 |

    To demonstrate the effectiveness of the multi-branch connections in the MBR block, we kept the deep feature extraction and data reconstruction modules fixed and compared the MBR block against a variant with the skip connections of branches 2 and 3 removed as the shallow feature extraction module. The experiments perform 4× super-resolution; Table 2 lists the average PSNR and SSIM of the super-resolved sub-aperture images for each variant. The comparison shows that the skip connections of the MBR block extract low-level features and gather richer contextual information, giving the full model clearly better performance and validating the design of our shallow feature extraction module.

    Table 3. PSNR/SSIM values achieved by different deep feature extraction modules for 4× SR

    | Model | EPFL [13] | HCInew [14] | HCIold [15] | INRIA [16] | STFgantry [17] |
    |---|---|---|---|---|---|
    | ADA module | 28.77/0.9172 | 31.26/0.9198 | 37.41/0.9723 | 30.80/0.9507 | 31.17/0.9497 |
    | Single-branch EADA module | 28.78/0.9166 | 31.21/0.9186 | 37.31/0.9717 | 30.85/0.9502 | 31.03/0.9474 |
    | EADA module | 28.81/0.9190 | 31.30/0.9206 | 37.39/0.9725 | 30.81/0.9513 | 31.29/0.9511 |

    To verify the effectiveness of the EADA module designed for deep feature extraction, we kept the shallow feature extraction and data reconstruction modules fixed and compared the EADA module against the ADA module and a single-branch EADA module (the EADA module with branch 2 removed) as the deep feature extraction module. The experiments again perform 4× super-resolution; Table 3 lists the average PSNR and SSIM of the super-resolved sub-aperture images. The comparison shows that the EADA module fully mines the feature information of each side view and merges and encodes the angular information into every view's features, yielding better super-resolved light field images and verifying the effectiveness of the proposed deep feature extraction module.

    Table 4. PSNR/SSIM values achieved by different feature fusion modules for 4× SR

    | Model | EPFL [13] | HCInew [14] | HCIold [15] | INRIA [16] | STFgantry [17] |
    |---|---|---|---|---|---|
    | RFD module | 29.01/0.9183 | 31.32/0.9198 | 37.39/0.9718 | 31.08/0.9509 | 31.14/0.9499 |
    | SRFD module | 28.81/0.9190 | 31.30/0.9206 | 37.39/0.9725 | 30.81/0.9513 | 31.29/0.9511 |

    To verify the effectiveness of the SRFD module designed for feature fusion, we kept the shallow and deep feature extraction modules fixed and compared the RFD and SRFD modules as the feature fusion module. The experiments again perform 4× super-resolution; Table 4 lists the average PSNR and SSIM of the super-resolved sub-aperture images. The comparison shows that simplifying the RFD module into the SRFD module does not noticeably reduce the PSNR and SSIM of the super-resolved light field images, indicating that the SRFD module saves computation while preserving the fusion quality.

    To evaluate the performance of the proposed light field super-resolution network, we performed 2× and 4× super-resolution on the 23 test scenes of the five public light field datasets and analyzed the algorithm qualitatively and quantitatively from three aspects: subjective visual comparison of the super-resolved images, PSNR and SSIM computation, and algorithm complexity, comparing it against other image super-resolution methods to verify its superiority.

    For the qualitative analysis, the super-resolved images of the proposed algorithm are compared visually with those of two state-of-the-art single-image super-resolution methods (EDSR [20] and RCAN [21]) and four light field super-resolution methods (ResLF [8], LFSSR [7], LF-InterNet [4], and LF-DFnet [5]), as shown in Figs. 7–10.

    Figure 7. Visual contrast of the "Origami" scene with 2× SR

    Figure 8. Visual contrast of the "Herbs" scene with 2× SR

    Figure 9. Visual contrast of the "Bee" scene with 4× SR

    Figure 10. Visual contrast of the "Lego Knights" scene with 4× SR

    Figures 7 and 8 show the 2× super-resolution results for the "Origami" and "Herbs" scenes of the HCInew [14] dataset. In the horizontal stripes of the round bucket (dashed ellipse) and the triangular pattern (dashed rectangle) inside the solid red box of Fig. 7, and in the two plant clusters (dashed ellipse and rectangle) inside the solid red box of Fig. 8, the results of ResLF, LFSSR, LF-InterNet, LF-DFnet, and our algorithm are clearly better than those of EDSR and RCAN, and in these regions our algorithm improves slightly over ResLF, LFSSR, LF-InterNet, and LF-DFnet. Figures 9 and 10 show the 4× super-resolution results for the "Bee" scene of the INRIA [16] dataset and the "Lego Knights" scene of the STFgantry [17] dataset. On the leaf stem inside the solid blue box of Fig. 9 (dashed rectangle), LF-InterNet and our algorithm outperform the other methods; on the figurine's shield inside the solid blue box of Fig. 10 (dashed ellipse and rectangle), LF-DFnet and our algorithm clearly outperform the others. These visual comparisons show that the proposed algorithm produces sharper high-resolution light field images.

    Table 5. PSNR/SSIM values achieved by different methods for 2× SR

    | Method | EPFL [13] | HCInew [14] | HCIold [15] | INRIA [16] | STFgantry [17] | Average |
    |---|---|---|---|---|---|---|
    | EDSR [20] | 33.09/0.9631 | 34.83/0.9594 | 41.01/0.9875 | 34.97/0.9765 | 36.29/0.9819 | 36.04/0.9728 |
    | RCAN [21] | 33.16/0.9635 | 34.98/0.9602 | 41.05/0.9875 | 35.01/0.9769 | 36.33/0.9825 | 36.11/0.9741 |
    | ResLF [8] | 32.75/0.9672 | 36.07/0.9715 | 42.61/0.9922 | 34.57/0.9784 | 36.89/0.9873 | 36.58/0.9793 |
    | LFSSR [7] | 33.69/0.9748 | 36.86/0.9753 | 43.75/0.9939 | 35.27/0.9834 | 38.07/0.9902 | 37.53/0.9835 |
    | LF-InterNet [4] | 34.14/0.9761 | 37.28/0.9769 | 44.45/0.9945 | 35.80/0.9846 | 38.72/0.9916 | 38.08/0.9847 |
    | LF-DFnet [5] | 34.44/0.9766 | 37.44/0.9786 | 44.23/0.9943 | 36.36/0.9841 | 39.61/0.9935 | 38.42/0.9854 |
    | Ours | 34.58/0.9772 | 37.92/0.9796 | 44.84/0.9948 | 36.59/0.9854 | 40.11/0.9939 | 38.81/0.9862 |

    Table 6. PSNR/SSIM values achieved by different methods for 4× SR

    | Method | EPFL [13] | HCInew [14] | HCIold [15] | INRIA [16] | STFgantry [17] | Average |
    |---|---|---|---|---|---|---|
    | EDSR [20] | 27.84/0.8858 | 29.60/0.8874 | 35.18/0.9538 | 29.66/0.9259 | 28.70/0.9075 | 30.20/0.9121 |
    | RCAN [21] | 27.88/0.8863 | 29.63/0.8880 | 35.20/0.9540 | 29.76/0.9273 | 28.90/0.9110 | 30.27/0.9133 |
    | ResLF [8] | 27.46/0.8899 | 29.92/0.9011 | 36.12/0.9651 | 29.64/0.9339 | 28.99/0.9214 | 30.43/0.9223 |
    | LFSSR [7] | 28.27/0.9080 | 30.72/0.9124 | 36.70/0.9690 | 30.31/0.9446 | 30.15/0.9385 | 31.23/0.9345 |
    | LF-InterNet [4] | 28.67/0.9143 | 30.98/0.9165 | 37.11/0.9715 | 30.64/0.9486 | 30.53/0.9426 | 31.59/0.9387 |
    | LF-DFnet [5] | 28.77/0.9165 | 31.23/0.9196 | 37.32/0.9718 | 30.83/0.9503 | 31.15/0.9494 | 31.86/0.9415 |
    | Ours | 28.81/0.9190 | 31.30/0.9206 | 37.39/0.9725 | 30.81/0.9513 | 31.29/0.9511 | 31.92/0.9429 |

    Qualitative evaluation is limited by the characteristics of human visual perception and does not reflect image quality objectively. To assess the super-resolution performance of the proposed method more precisely, we computed the PSNR and SSIM of every test scene in the five public datasets after 2× and 4× super-resolution, averaged them per dataset, and finally averaged over all datasets; the results for the different methods are listed in Tables 5 and 6, where in each column the largest and second-largest values indicate the best and second-best method under the corresponding metric. As Tables 5 and 6 show, apart from the PSNR for 4× super-resolution on the INRIA dataset, where our method is second best, the PSNR and SSIM of the high-resolution light field images produced by our method surpass those of all other methods, further verifying the effectiveness of the proposed network.

    Analyzing the reasons: EDSR and RCAN super-resolve only a single view of the light field and ignore its 4D information, so their performance is the worst; ResLF, LFSSR, and LF-InterNet focus on learning spatial and angular features independently without mining the differences between views, which limits their performance; LF-DFnet introduces a disparity mechanism that fully mines the disparity information between views, improving performance; our method additionally reinforces the learning of each side view's own features before mining the inter-view disparity information and therefore performs best.
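    A minimal sketch of how the per-scene scores behind Tables 5 and 6 can be computed: PSNR and SSIM between each super-resolved sub-aperture image and its ground truth, averaged over the A×A views of a scene. The random arrays stand in for real images (values in [0, 1]):

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def scene_psnr_ssim(sr_views: np.ndarray, gt_views: np.ndarray):
    """sr_views, gt_views: (A, A, H, W) arrays of gray-scale views."""
    psnrs, ssims = [], []
    for sr, gt in zip(sr_views.reshape(-1, *sr_views.shape[2:]),
                      gt_views.reshape(-1, *gt_views.shape[2:])):
        psnrs.append(peak_signal_noise_ratio(gt, sr, data_range=1.0))
        ssims.append(structural_similarity(gt, sr, data_range=1.0))
    return float(np.mean(psnrs)), float(np.mean(ssims))

gt = np.random.rand(5, 5, 128, 128)
sr = np.clip(gt + 0.01 * np.random.randn(*gt.shape), 0, 1)
print(scene_psnr_ssim(sr, gt))  # average (PSNR, SSIM) over the 25 views
```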

    Table 7. Comparison of the number of parameters and FLOPs of different methods for 2× SR / 4× SR

    | Method | Parameters/M | FLOPs/G |
    |---|---|---|
    | EDSR [20] | 38.62/38.89 | 39.56×25/40.66×25 |
    | RCAN [21] | 15.31/15.36 | 15.59×25/15.65×25 |
    | ResLF [8] | 6.35/6.79 | 37.06/39.70 |
    | LFSSR [7] | 0.81/1.61 | 25.70/128.44 |
    | LF-InterNet [4] | 4.80/5.23 | 47.46/50.10 |
    | LF-DFnet [5] | 3.94/3.99 | 57.22/57.31 |
    | Ours | 12.74/12.80 | 238.92/240.51 |

    Having verified the super-resolution performance of the proposed algorithm, we further analyzed its complexity, comparing its number of parameters and floating-point operations (FLOPs) with those of the other algorithms, as listed in Table 7. Our algorithm has fewer parameters and FLOPs than EDSR and RCAN but more than ResLF, LFSSR, LF-InterNet, and LF-DFnet. Compared with the high-performing LF-InterNet and LF-DFnet, its parameters and FLOPs increase roughly three- to five-fold; under this reasonably controlled complexity, it achieves better light field super-resolution.
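    A minimal sketch of how a "Parameters/M" figure like those in Table 7 can be obtained for any PyTorch model; the small placeholder model is ours, not the paper's network:

```python
import torch.nn as nn

def count_params_m(model: nn.Module) -> float:
    """Total number of trainable parameters, in millions."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

model = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1),
                      nn.Conv2d(64, 64, 3, padding=1),
                      nn.Conv2d(64, 1, 3, padding=1))
print(f"{count_params_m(model):.4f} M")  # parameter count in millions
```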

    This paper proposed a light field image super-resolution network based on angular difference enhancement to raise the spatial resolution of light field sub-aperture images. In shallow feature extraction, multi-branch residual (MBR) blocks extract the structural information inherent in the 4D light field; in deep feature extraction, the enhanced angular deformable alignment (EADA) module reinforces the mining of each side view's own features, so the angular differences between views are learned better and the angular information is merged and encoded into each view's features; in data reconstruction, the structurally simpler simplified residual feature distillation (SRFD) module lets shallow and deep information combine more directly during fusion. Ablation experiments verified the effectiveness of each module. Compared with other light field super-resolution algorithms, the proposed method performs well in both subjective visual comparison and quantitative evaluation while keeping the parameter count reasonable. To address the relatively high parameter count of the model, we will further simplify the residual feature distillation module of the reconstruction stage to raise the network's speed without sacrificing super-resolution performance.

    All authors declare no conflicts of interest.

  • References

    [1] Ng R, Levoy M, Brédif M, et al. Light field photography with a hand-held plenoptic camera[J]. Comput Sci Tech Rep CSTR, 2005, 2(11): 1−11.
    [2] Tao M W, Hadap S, Malik J, et al. Depth from combining defocus and correspondence using light-field cameras[C]//Proceedings of 2013 IEEE International Conference on Computer Vision, Sydney, 2013: 673–680. https://doi.org/10.1109/ICCV.2013.89
    [3] Levoy M, Hanrahan P. Light field rendering[C]//Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, New York, 1996: 31–42. https://doi.org/10.1145/237170.237199
    [4] Wang Y Q, Wang L G, Yang J G, et al. Spatial-angular interaction for light field image super-resolution[C]//Proceedings of the 16th European Conference on Computer Vision, Glasgow, 2020: 290–308. https://doi.org/10.1007/978-3-030-58592-1_18
    [5] Wang Y Q, Yang J G, Wang L G, et al. Light field image super-resolution using deformable convolution[J]. IEEE Trans Image Process, 2020, 30: 1057−1071. https://doi.org/10.1109/TIP.2020.3042059
    [6] Mo Y, Wang Y Q, Xiao C, et al. Dense dual-attention network for light field image super-resolution[J]. IEEE Trans Circuits Syst Video Technol, 2022, 32(7): 4431−4443. https://doi.org/10.1109/TCSVT.2021.3121679
    [7] Yeung H W F, Hou J H, Chen X M, et al. Light field spatial super-resolution using deep efficient spatial-angular separable convolution[J]. IEEE Trans Image Process, 2019, 28(5): 2319−2330. https://doi.org/10.1109/TIP.2018.2885236
    [8] Zhang S, Lin Y F, Sheng H. Residual networks for light field image super-resolution[C]//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, 2019: 11046–11055. https://doi.org/10.1109/CVPR.2019.01130
    [9] Jin J, Hou J H, Chen J, et al. Light field spatial super-resolution via deep combinatorial geometry embedding and structural consistency regularization[C]//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, 2020: 2260−2269. https://doi.org/10.1109/CVPR42600.2020.00233
    [10] Liang Z Y, Wang Y Q, Wang L G, et al. Angular-flexible network for light field image super-resolution[J]. Electron Lett, 2021, 57(24): 921−924. https://doi.org/10.1049/ell2.12312
    [11] Zhao Y Y, Shi S X. Light-field image super-resolution based on multi-scale feature fusion[J]. Opto-Electron Eng, 2020, 47(12): 200007. https://doi.org/10.12086/oee.2020.200007
    [12] Liu J, Tang J, Wu G S. Residual feature distillation network for lightweight image super-resolution[C]//Proceedings of the European Conference on Computer Vision, Glasgow, 2020: 41–55. https://doi.org/10.1007/978-3-030-67070-2_2
    [13] Rerabek M, Ebrahimi T. New light field image dataset[C]//Proceedings of the 8th International Conference on Quality of Multimedia Experience, Lisbon, 2016.
    [14] Honauer K, Johannsen O, Kondermann D, et al. A dataset and evaluation methodology for depth estimation on 4D light fields[C]//Proceedings of the 13th Asian Conference on Computer Vision, Cham, 2016: 19–34. https://doi.org/10.1007/978-3-319-54187-7_2
    [15] Wanner S, Meister S, Goldluecke B. Datasets and benchmarks for densely sampled 4D light fields[M]//Bronstein M, Favre J, Hormann K. Vision, Modeling and Visualization. Eurographics Association, 2013: 225–226. https://doi.org/10.2312/PE.VMV.VMV13.225-226
    [16] Le Pendu M, Jiang X R, Guillemot C. Light field inpainting propagation via low rank matrix completion[J]. IEEE Trans Image Process, 2018, 27(4): 1981−1993. https://doi.org/10.1109/TIP.2018.2791864
    [17] Vaish V, Adams A. The (new) Stanford light field archive[EB/OL]. Computer Graphics Laboratory, Stanford University, 2008. http://lightfield.stanford.edu
    [18] Kingma D P, Ba J. Adam: a method for stochastic optimization[C]//Proceedings of the 3rd International Conference on Learning Representations, San Diego, 2015.
    [19] He K M, Zhang X Y, Ren S Q, et al. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification[C]//Proceedings of 2015 International Conference on Computer Vision, Santiago, 2015: 1026–1034. https://doi.org/10.1109/ICCV.2015.123
    [20] Lim B, Son S, Kim H, et al. Enhanced deep residual networks for single image super-resolution[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, 2017: 136−144. https://doi.org/10.1109/CVPRW.2017.151
    [21] Zhang Y L, Li K P, Li K, et al. Image super-resolution using very deep residual channel attention networks[C]//Proceedings of the 15th European Conference on Computer Vision, Munich, 2018: 294–310. https://doi.org/10.1007/978-3-030-01234-2_18

  • Copyright

    The copyright belongs to the Institute of Optics and Electronics, Chinese Academy of Sciences, but the article content can be freely downloaded from this website and used for free in academic and research work.
  • About this Article

    DOI: 10.12086/oee.2023.220185
    Cite this article: Lv Tianqi, Wu Yingchun, Zhao Xianling. Light field image super-resolution network based on angular difference enhancement. Opto-Electronic Engineering 50, 220185 (2023). DOI: 10.12086/oee.2023.220185
    Article History
    • Received Date July 27, 2022
    • Revised Date October 16, 2022
    • Accepted Date October 20, 2022
    • Available Online February 15, 2023
    • Published Date February 24, 2023
