Abstract:
Objective Existing low-light image enhancement methods generally suffer from high model complexity, considerable deployment cost, and semantic coupling between luminance and chrominance information, which often produces fusion artifacts and restricts their application on resource-constrained devices. To address these issues, this paper proposes a lightweight low-light image enhancement network that combines multi-channel parallel attention with a cross-fusion mechanism. Luminance and chrominance information are modeled separately in the YCbCr color space: a lightweight denoising module and a multi-head self-attention mechanism extract illumination-structure features, a multi-channel parallel attention (MCPA) module strengthens color-context modeling, and a luminance-guided cross-fusion (LCA) module achieves collaborative enhancement of structure and color under joint channel-spatial attention optimization.
Methods The input image is first converted to the YCbCr color space and decomposed into the Y, Cb, and Cr channels. The Y channel is denoised by the DenoiseY module and then passed through a multi-head self-attention (MSA) module to extract structural features; to improve the efficiency of contextual modeling, a pooling operation inserted before the MSA reduces the number of tokens. In parallel, the Cb and Cr channels are enhanced by the MCPA module, whose multi-scale attention modeling extracts richer color-context information.
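The first two steps of this pipeline can be illustrated with a minimal NumPy sketch. The YCbCr conversion below uses the standard BT.601 coefficients; the single-head attention over average-pooled tokens is a hypothetical simplification (identity query/key/value projections, one head) of the pooled MSA described above, not the paper's exact implementation.

```python
import numpy as np

def rgb_to_ycbcr(img):
    """Convert an HxWx3 RGB image in [0, 1] to Y, Cb, Cr channels (BT.601)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 0.5
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 0.5
    return y, cb, cr

def pooled_self_attention(feat, pool=2):
    """Single-head self-attention over average-pooled tokens.

    `feat` is an HxWxC feature map; pooling before attention cuts the
    token count by pool**2, which is the efficiency idea described for
    the Y branch. Projections are omitted for brevity (hypothetical).
    """
    h, w, c = feat.shape
    # average-pool to reduce the number of tokens before attention
    pooled = feat[:h - h % pool, :w - w % pool].reshape(
        h // pool, pool, w // pool, pool, c).mean(axis=(1, 3))
    tokens = pooled.reshape(-1, c)               # (N, C) token matrix
    scores = tokens @ tokens.T / np.sqrt(c)      # scaled dot-product
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # softmax over tokens
    return (attn @ tokens).reshape(h // pool, w // pool, c)
```

With `pool=2`, an 8x8 feature map is attended over 16 tokens instead of 64, a 4x reduction in the quadratic attention cost.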
Subsequently, in the LCA module, the denoised Y channel serves as a guidance feature and is fused with the Cb and Cr channels. The module combines channel attention based on the squeeze-and-excitation (SE) mechanism with element-wise interaction, enabling luminance-guided structure-color fusion.
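The following sketch illustrates the LCA idea under stated assumptions: an SE gate (global average pool, two fully connected layers, sigmoid) reweights the chrominance channels, and the Y feature then interacts element-wise with the gated result. The weights `w1`/`w2` and the specific interaction form are hypothetical placeholders, not the paper's exact design.

```python
import numpy as np

def se_channel_attention(feat, w1, w2):
    """Squeeze-and-excitation: global average pool -> FC -> ReLU -> FC -> sigmoid gate."""
    squeeze = feat.mean(axis=(0, 1))             # (C,) per-channel descriptor
    excite = np.maximum(squeeze @ w1, 0.0) @ w2  # bottleneck FC layers
    gate = 1.0 / (1.0 + np.exp(-excite))         # sigmoid gate in (0, 1)
    return feat * gate                           # reweight channels

def luminance_guided_fusion(y_feat, c_feat, w1, w2):
    """Hypothetical LCA-style fusion: the denoised Y feature guides the
    SE-gated chrominance feature via element-wise interaction."""
    gated = se_channel_attention(c_feat, w1, w2)
    return gated + y_feat * gated  # element-wise interaction with the Y guide
```

Because the gate is in (0, 1), channels that the SE branch deems uninformative are suppressed before the luminance guide modulates them spatially.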
Finally, the fused features are mapped back to the RGB space through convolutional layers to generate the enhanced image. By modeling luminance and chrominance separately and introducing cross-guided fusion between the two branches, the proposed method effectively improves the overall quality of low-light images in terms of detail preservation, illumination enhancement, and color restoration.
Results and Discussions The proposed network achieves 25.66 dB PSNR / 0.84 SSIM on LOLv1, 24.61 dB PSNR / 0.85 SSIM on LOLv2, and 20.26 dB PSNR / 0.57 SSIM on LSRW, outperforming recent methods such as Retinexformer. Meanwhile, the model contains only 0.059 million parameters and requires 10.06 GFLOPs, demonstrating its lightweight nature and computational efficiency. In addition, low-light face detection experiments conducted on the DARK FACE dataset show that all three detection metrics exceed 50%, indicating good generalization capability under low-light conditions.
Ablation results further show that embedding MCPA in both the luminance and chrominance branches allows the network to extract contextual information at different scales and reweight features accordingly, improving its adaptive modeling of multi-scale structural patterns and color variations. The LCA module enables joint modeling of structure and color, where structural information from the Y branch dynamically guides the adjustment of chrominance features, effectively improving detail fidelity and color consistency in the enhanced images.
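The multi-scale reweighting behavior attributed to MCPA can be sketched as follows. This is an illustrative simplification under assumed details (three parallel pooling scales, a sigmoid gate per branch, uniform averaging of branches); the paper's actual MCPA design may differ.

```python
import numpy as np

def _sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def multi_scale_reweight(feat, scales=(1, 2, 4)):
    """Pool context at several window sizes in parallel, turn each pooled
    map into a gate, and average the gated branches (hypothetical MCPA-style
    reweighting). Assumes H and W are divisible by the largest scale."""
    h, w, c = feat.shape
    branches = []
    for s in scales:
        ph, pw = h // s, w // s
        # average-pool with an s x s window to capture context at this scale
        pooled = feat[:ph * s, :pw * s].reshape(ph, s, pw, s, c).mean(axis=(1, 3))
        # upsample the pooled context back to full resolution by repetition
        ctx = pooled.repeat(s, axis=0).repeat(s, axis=1)
        branches.append(feat * _sigmoid(ctx))  # context-dependent gating
    return sum(branches) / len(scales)
```

Each branch emphasizes features consistent with context at its own scale, so the averaged output adapts to both fine structure and broad color regions.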
Conclusions The proposed method achieves high-quality low-light image enhancement under resource-constrained conditions. Experimental results on multiple public datasets confirm its superior performance, particularly the favorable balance it achieves among structural fidelity, color naturalness, and computational efficiency. This study demonstrates that separate modeling of luminance and chrominance information, together with a luminance-guided cross-fusion strategy, can effectively alleviate the fusion artifacts caused by luminance-color coupling in conventional methods.