Citation: Wu F, Chen J C, Yang J, et al. Remote-sensing images reconstruction based on adaptive dual-domain attention network[J]. Opto-Electron Eng, 2025, 52(4): 240297. doi: 10.12086/oee.2025.240297
With the rapid development of convolutional neural networks (CNNs) and Transformer models, remote sensing image super-resolution reconstruction (RSISR) has made significant progress. However, existing methods struggle to handle object features at different scales and fail to fully exploit the implicit relationships between the channel and spatial dimensions, which limits further gains in reconstruction quality. To address these issues, an adaptive dual-domain attention network (ADAN) is proposed that strengthens feature extraction by integrating self-attention information from both the channel and spatial domains, and combines multi-scale feature mining with local feature representation to improve RSISR performance.
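The dual-domain idea — self-attention computed once over spatial positions and once over channels — can be sketched in plain NumPy. This is a minimal illustration only: the additive fusion, toy feature sizes, and absence of learned projections are simplifying assumptions, not the paper's exact modules.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(feat):
    # feat: (N, C) with N = H*W flattened spatial positions.
    # The attention map is (N, N): every position attends to every other.
    scores = feat @ feat.T / np.sqrt(feat.shape[1])
    return softmax(scores, axis=-1) @ feat

def channel_attention(feat):
    # Transposed view: the attention map is (C, C), capturing
    # inter-channel dependencies instead of inter-pixel ones.
    scores = feat.T @ feat / np.sqrt(feat.shape[0])
    return feat @ softmax(scores, axis=-1)

# Toy feature map: a 4x4 spatial grid with 8 channels.
rng = np.random.default_rng(0)
feat = rng.standard_normal((16, 8))
fused = spatial_attention(feat) + channel_attention(feat)  # dual-domain fusion
print(fused.shape)  # (16, 8)
```

Running both branches on the same feature map and fusing them is what lets the two attention domains complement each other: the spatial map relates pixels, the channel map relates feature types.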
This work targets the shortcomings of existing RSISR methods, namely limited multi-scale feature extraction and insufficient exploration of channel-spatial relationships. To this end, ADAN designs a multi-scale feed-forward network (MSFFN) to capture rich multi-scale features and incorporates a novel gate information selective module (GISM) to enhance local feature representation. The network further adopts a U-shaped architecture for efficient multi-level feature fusion. Specifically, ADAN introduces a convolutionally enhanced spatial-wise transformer module (CESTM) and a convolutionally enhanced channel-wise transformer module (CECTM) that extract spatial and channel features in parallel, comprehensively modeling the interactions and dependencies between them.
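The MSFFN idea of probing several receptive fields in parallel can be sketched as follows. The mean filters below stand in for learned depthwise convolutions, and the branch kernel sizes (3, 5, 7) and averaging fusion are illustrative assumptions rather than the paper's exact design:

```python
import numpy as np

def depthwise_blur(x, k):
    # k-by-k mean filter applied per channel: a stand-in for a learned
    # depthwise convolution (hypothetical weights omitted for brevity).
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros_like(x)
    H, W, _ = x.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = xp[i:i + k, j:j + k].mean(axis=(0, 1))
    return out

def msffn(x):
    # Parallel branches with growing receptive fields, fused by a toy
    # per-channel average (standing in for a learned 1x1 projection),
    # plus a residual connection back to the input.
    branches = [depthwise_blur(x, k) for k in (3, 5, 7)]
    fused = np.stack(branches, axis=-1).mean(axis=-1)
    return x + fused

feat = np.arange(4 * 4 * 2, dtype=float).reshape(4, 4, 2)
out = msffn(feat)
print(out.shape)  # (4, 4, 2)
```

Each branch sees the same input at a different scale, so small and large structures both contribute to the fused output — the motivation behind multi-scale feature mining.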
Experimental results demonstrate that ADAN significantly outperforms state-of-the-art algorithms on multiple public remote sensing datasets in terms of quantitative metrics (e.g., PSNR and SSIM) and visual quality, validating its effectiveness and superiority. The main contributions are as follows: 1) Proposing a novel method, ADAN, tailored for remote sensing image super-resolution tasks; 2) Designing parallel channel and spatial feature extraction modules along with a gated convolution module to comprehensively explore features across channel, spatial, and convolutional dimensions; 3) Introducing a multi-scale feed-forward network (MSFFN) to effectively explore potential scale relationships and enhance global representation capabilities; 4) Experimentally validating the superior performance of ADAN in remote sensing image super-resolution reconstruction. This research provides new insights and technical pathways for remote sensing image super-resolution reconstruction.
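PSNR, one of the quantitative metrics reported above, is computed from the mean squared error between the reference and the reconstruction. A minimal sketch, assuming 8-bit images with peak value 255:

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    # Peak signal-to-noise ratio in dB; higher means the reconstruction
    # is closer to the reference.
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

ref = np.zeros((8, 8))
noisy = ref + 1.0                  # uniform error of one gray level
print(round(psnr(ref, noisy), 2))  # 48.13
```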
Overall architecture and module structures of ADAN
Convolution-enhanced spatial-wise self-attention module
Convolution-enhanced channel-wise self-attention module
Multi-scale feed-forward network (MSFFN)
Loss function analysis results on UCMerced LandUse with an upscaling factor of ×2
PSNR analysis results on UCMerced LandUse with an upscaling factor of ×2
Visual comparison on UCMerced LandUse with an upscaling factor of ×2
Visual comparison on UCMerced LandUse with an upscaling factor of ×3
Visual comparison on UCMerced LandUse with an upscaling factor of ×4
Residual comparison on UCMerced LandUse with an upscaling factor of ×2