• Abstract: To address the insufficient cross-model generalization and low stego-image quality of existing adversarial image steganography methods, this paper proposes a dual-strategy image steganography model that combines passive defense with active attack. The passive-defense module is built on a generative adversarial network: its generator adopts a dual-stream U-Net that processes the cover image and its edge information in parallel, and an SE-Net attention mechanism dynamically reweights features to produce a cover image better suited for information embedding. The active-attack module uses neuron-attribution results to locate and perturb the common critical features that diverse steganalysis models rely on for discrimination, optimizing the embedding scheme accordingly so as to mislead those models. By dynamically adjusting the loss weight coefficients, the two modules improve in stages and jointly reach a global optimum. Experiments show that the generated stego images achieve an average PSNR of 40.89 dB and an average SSIM of 0.9783, while detection accuracy (ACC) is reduced by 3.69% and 1.91% relative to CR-AIS and Natias, respectively, achieving a joint improvement in cross-model generalization and image quality.

       

      Abstract:
      Objective As a core technology in information security, image steganography holds irreplaceable strategic value in areas such as covert communication and digital copyright protection. Existing adversarial image steganography methods, however, commonly suffer from insufficient cross-model generalization and low stego-image quality. The generalization problem stems from over-reliance on the decision boundary of a single target steganalysis model: the adversarial perturbations become tightly coupled to that boundary and therefore transfer poorly to unseen models. The quality problem arises because, when adversarial perturbations are generated, the secret information often blends insufficiently naturally with the cover texture; statistical anomalies emerge especially in low-texture regions, compromising both the visual quality and the statistical naturalness of the stego images.
      Methods To address these issues, this paper proposes a dual-strategy image steganography model that integrates passive defense and active attack. Through end-to-end joint optimization of cover-image enhancement and adversarial embedding, the model significantly improves cross-model generalization and stego-image quality. At the passive-defense level, the cover image is optimized: a dual-stream U-Net is introduced into the generator of a generative adversarial network (GAN). The two branches simultaneously process the original image and edge information derived from conformal monogenic phase congruency (CMPC), while an SE-Net attention mechanism reweights the fused features, achieving texture enhancement and detail preservation during adversarial training. Jointly optimizing the discriminator loss and the pixel-space mean squared error loss yields an optimized cover image with high visual quality and statistical naturalness, providing a secure and robust cover environment for embedding the secret information. At the active-attack level, adversarial embedding is conducted: integrated gradients are used for neuron attribution to locate the critical discriminative features commonly relied upon by diverse steganalysis models, which are then dynamically perturbed. The neuron-attribution loss, together with the discriminator loss and the pixel-space mean squared error loss, jointly optimizes the embedding strategy, ensuring that the stego image remains indistinguishable from the original cover while markedly reducing the detection accuracy of various steganalysis models. Because the model's optimization objectives differ across training stages, the cover-image optimization network dominates the initial stage, since a high-quality cover image is the foundation of steganography.
In the later stage, the adversarial embedding network takes the lead, continuously adjusting the embedding distortion to interfere with the discriminative logic of steganalysis models and thereby enhance the concealment of the stego image. A linear dynamic adjustment strategy is therefore adopted, letting the weights of the two components transition smoothly and uniformly over the first 100 training epochs. This matches the required stage-wise switch of dominance between the two networks, driving the model's performance forward stage by stage and achieving a global optimization of concealment, usability, and security.
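The linear weight schedule above can be sketched as follows. The 100-epoch transition window and the endpoint values (k1: 0.8→0.2, k2: 0.2→0.8) come from the experimental configuration described in this abstract; the function name and signature are illustrative, not the paper's implementation.

```python
def loss_weights(epoch, transition_epochs=100, k1_start=0.8, k1_end=0.2):
    """Linear schedule for the two loss-weight coefficients.

    k1 weights the cover-optimization (passive-defense) branch and decays
    from 0.8 to 0.2 over the first `transition_epochs` epochs; k2 weights
    the adversarial-embedding (active-attack) branch and rises symmetrically
    so that k1 + k2 == 1 throughout. After the transition, weights are held.
    """
    t = min(max(epoch / transition_epochs, 0.0), 1.0)  # progress clamped to [0, 1]
    k1 = k1_start + (k1_end - k1_start) * t
    return k1, 1.0 - k1
```

At epoch 0 this yields (0.8, 0.2), at the midpoint (0.5, 0.5), and from epoch 100 onward the weights stay at (0.2, 0.8), matching the stage-wise handover from cover optimization to adversarial embedding.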
      Results and Discussions In the loss-function weight configuration experiment, a linear dynamic adjustment strategy was adopted for the weight coefficients k1 and k2 in view of the stage-wise differences in the model's optimization objectives, while the effects of different values of λ1 and different ratios λ2:λ3:λ4 were investigated systematically. By comparing stego-image quality and the detection accuracy (ACC) of non-targeted steganalysis models, the optimal configuration was determined as k1 decreasing from 0.8 to 0.2, k2 increasing from 0.2 to 0.8, λ1 = 0.2, and λ2:λ3:λ4 = 8:3:1. The ablation study, comparing the ACC against non-targeted steganalysis models as well as the average PSNR and SSIM of stego images under different architectures, shows that the proposed configuration achieves the best overall performance. Comparing positions for the SE-Net attention mechanism within the dual-stream U-Net further showed that adding it only in the encoding stage yields the best results. In the image-quality and efficiency analysis, the average PSNR and SSIM over 1,000 randomly selected stego images were 40.89 dB and 0.9783, respectively. Three pairs of cover and stego images were randomly selected for comparison; the two are not only visually indistinguishable but also exhibit highly similar histogram distributions. To compare generation efficiency, the average time to generate a single stego image was measured. Owing to its more complex network architecture, the proposed method is slightly slower than the other schemes, but it substantially improves cover-image quality, showing that the method secures improved security and image quality at an acceptable efficiency cost.
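The PSNR figure reported above follows the standard peak-signal definition. A minimal sketch, assuming 8-bit images (the function name is illustrative; the paper's own evaluation code is not shown here):

```python
import numpy as np

def psnr(cover, stego, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-shape images."""
    diff = cover.astype(np.float64) - stego.astype(np.float64)
    mse = np.mean(diff ** 2)          # pixel-space mean squared error
    if mse == 0.0:
        return float("inf")           # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

An average PSNR of 40.89 dB corresponds to a pixel-space MSE of roughly 5.3 on the 0–255 scale, i.e. very small embedding distortion; SSIM additionally accounts for structural similarity and is typically computed with a windowed estimator.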
In the cross-dataset generalization analysis, the proposed method achieved the lowest average ACC, indicating good cross-dataset generalization. This is because the dual-strategy model generates optimized covers with consistent structural properties and perturbs discriminative features common to different detectors, reducing dependence on statistical features specific to any particular training dataset; it therefore exhibits stable anti-detection performance across datasets. In the anti-steganalysis experiment, the method was compared with two state-of-the-art adversarial embedding methods, CR-AIS and Natias. Against non-targeted steganalysis models, the proposed method reduced the average detection accuracy (ACC) by a further 3.69% and 1.91% relative to CR-AIS and Natias, respectively, indicating strong resilience against non-targeted steganalysis models and good cross-model generalization.
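The common-feature perturbation discussed above rests on integrated-gradients neuron attribution. The following is a minimal sketch of that attribution step, substituting a toy differentiable score for a real steganalysis network; all names and the toy score are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=256):
    """Midpoint Riemann-sum approximation of integrated gradients:
    IG_i = (x_i - x'_i) * integral_0^1 dF/dx_i(x' + a * (x - x')) da
    """
    diff = x - baseline
    total = np.zeros_like(x, dtype=np.float64)
    for k in range(steps):
        alpha = (k + 0.5) / steps             # midpoint of the k-th subinterval
        total += grad_f(baseline + alpha * diff)
    return diff * total / steps

# Toy "steganalysis score" standing in for a detector logit:
# f(x) = w . x^2, with analytic gradient 2 * w * x (illustrative only).
w = np.array([1.0, 0.5, 2.0])
f = lambda x: float(w @ (x ** 2))
grad_f = lambda x: 2.0 * w * x

x = np.array([1.0, -2.0, 0.5])                # "stego" feature vector
baseline = np.zeros_like(x)                   # reference input
attr = integrated_gradients(grad_f, x, baseline)
# Completeness axiom: attributions sum to f(x) - f(baseline); the
# highest-attribution components mark the features an attack would perturb.
```

Ranking neurons by attribution magnitude identifies which features most influence the detector's decision; perturbing the features that rank highly across several detectors is what gives the attack its cross-model character.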
      Conclusions Through the deep integration of passive defense (dual-stream U-Net-guided cover optimization) and active attack (neuron attribution-driven adversarial embedding), the proposed image steganography model significantly reduces the detection accuracy of various steganalysis models, enhances cross-model generalization capability, and maintains high image quality, thereby achieving synergistic improvement in steganographic security and visual fidelity.