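• Summary: To meet the real-time and fusion-quality requirements of composite endoscopic imaging in fluorescence-guided surgery, this paper proposes a composite near-infrared fluorescence and white-light imaging technique based on a virtual dual-camera mode. Precise FPGA-based synchronization of the light source and camera, a hardware video-stream demultiplexing architecture, and a dual-modal joint automatic exposure adjustment algorithm together enable real-time separation of the heterogeneous image streams and balanced exposure across the two modalities. Experiments show that the system achieves frame-level synchronization control precision (16.67 ms), an automatic exposure convergence time below 0.8 s, a structural similarity (SSIM) of 0.9994, and a peak signal-to-noise ratio (PSNR) of 46.23 dB for the captured images. While keeping the system compact and low-cost, the scheme achieves imaging performance close to that of a dual-camera mode, providing effective support for real-time intraoperative image navigation.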

       

      Abstract:
      Objective Fluorescence-guided surgery (FGS) has emerged as a critical technique in modern medical procedures, utilizing fluorophores such as indocyanine green (ICG) to visualize anatomical structures and pathological tissues that are invisible to the naked eye. While effective, the clinical utility of fluorescence imaging relies heavily on its integration with white-light (WL) imaging to provide necessary anatomical context. Conventional dual-camera systems, which employ beam splitters and separate sensors, offer high performance but suffer from bulky form factors, high costs, and physical registration errors. Conversely, single-camera solutions that utilize time-division multiplexing often face significant technical bottlenecks. These include data congestion when processing high-bandwidth 4K video streams via software, and, more critically, the inability to balance exposure between high-dynamic-range white-light scenes and weak fluorescence signals using a single sensor. The primary objective of this study was to overcome these limitations by developing a "virtual dual-camera" architecture. This system aimed to achieve the performance and independent image processing capabilities of a physical dual-camera setup within a single-sensor hardware framework, specifically targeting real-time performance, hardware-level synchronization, and optimized automatic exposure control (AEC) for composite endoscopic imaging.
      Methods The proposed system integrated a custom hardware platform with a novel control architecture. The optical front-end utilized a high-resolution CMOS image sensor (Sony IMX334) paired with a dual-bandpass filter capable of blocking excitation light (780 nm) while transmitting visible light and fluorescence signals (above 820 nm). Illumination was provided by a synchronized dual-LED source delivering high-power white light and 780 nm near-infrared excitation.
      The core innovation lay in the field-programmable gate array (FPGA)-based control logic. The FPGA functioned as the central timing master, generating precise frame synchronization signals. It coordinated the CMOS exposure window with the alternating strobe of white and near-infrared LEDs, achieving frame-level synchronization accuracy. To resolve the bottleneck of software-based frame separation, the system implemented a hardware-level video stream demultiplexing architecture. The FPGA intercepted the raw MIPI CSI-2 video stream from the sensor. Based on the current lighting state (odd frames for white light, even frames for fluorescence), the FPGA routed the image data to two independent direct memory access (DMA) channels and image signal processor (ISP) pipelines. This hardware separation allowed for the decoupling of image processing parameters; the white-light channel utilized standard color correction and gamma settings, while the fluorescence channel employed aggressive black level correction, higher gamma values for contrast enhancement, and temporal noise reduction, which relied on the frame continuity preserved by the hardware splitting.
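      The odd/even routing rule described above can be illustrated in software. The following is only a minimal sketch of the routing logic, not the FPGA implementation: the function name `demux_frames`, the use of byte strings as frames, and the assumption that frame numbering starts at 1 are all hypothetical choices made for illustration.

```python
from typing import Iterable, List, Tuple


def demux_frames(frames: Iterable[bytes]) -> Tuple[List[bytes], List[bytes]]:
    """Split an interleaved stream into white-light and fluorescence streams.

    Assumption: frames are numbered from 1, so the 1st, 3rd, ... frames were
    exposed under white light and the 2nd, 4th, ... under NIR excitation.
    """
    white_light: List[bytes] = []
    fluorescence: List[bytes] = []
    for number, frame in enumerate(frames, start=1):
        if number % 2 == 1:
            white_light.append(frame)   # odd frame -> white-light channel
        else:
            fluorescence.append(frame)  # even frame -> fluorescence channel
    return white_light, fluorescence


# Hypothetical interleaved stream: W, F, W, F, W
stream = [b"w0", b"f0", b"w1", b"f1", b"w2"]
white, fluo = demux_frames(stream)
```

      Because each output list contains only frames of one modality, a per-channel step such as temporal noise reduction sees consecutive frames of the same kind, which is the frame continuity the text refers to.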
      Furthermore, a dual-modal joint automatic exposure control (AEC) algorithm was developed to address the drastic intensity disparity between the two imaging modes. The algorithm operated in two distinct phases separated by a single frame delay. In Phase 1, the system calculated the luminance of the white-light frame and adjusted the white LED intensity and the sensor’s global analog gain. Crucially, a constraint strategy prioritized the prevention of white-light overexposure. In Phase 2, utilizing the analog gain fixed during Phase 1, the system optimized the fluorescence image brightness by independently adjusting the near-infrared excitation power and applying digital gain. This decoupled approach ensured that the shared analog gain did not compromise the white-light image quality while still allowing sufficient amplification for the weaker fluorescence signal.
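      The two-phase structure can be sketched as a pair of update functions. This is a simplified illustration under stated assumptions, not the authors' algorithm: the proportional update, the luminance targets, the gain limits, the damping factor used to model the overexposure constraint, and every name in the code (`ExposureState`, `phase1_white`, `phase2_fluorescence`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ExposureState:
    white_led: float = 0.5     # normalized white LED drive (assumed range 0.01..1)
    analog_gain: float = 1.0   # sensor analog gain, shared by both modes
    nir_led: float = 0.5       # normalized 780 nm excitation drive
    digital_gain: float = 1.0  # digital gain applied to fluorescence frames only


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def phase1_white(state: ExposureState, white_luma: float,
                 target: float = 120.0, max_analog_gain: float = 8.0) -> ExposureState:
    """Phase 1: set white LED drive and shared analog gain from a white-light frame."""
    ratio = target / max(white_luma, 1e-6)
    if ratio > 1.0:
        # Assumed constraint: brighten cautiously (damped step), but darken
        # in a single step so that overexposure is removed first.
        ratio = 1.0 + 0.5 * (ratio - 1.0)
    wanted = state.white_led * state.analog_gain * ratio
    state.white_led = clamp(wanted, 0.01, 1.0)               # use the LED first
    state.analog_gain = clamp(wanted, 1.0, max_analog_gain)  # gain only if LED saturates
    return state


def phase2_fluorescence(state: ExposureState, fluo_luma: float,
                        target: float = 80.0, max_digital_gain: float = 16.0) -> ExposureState:
    """Phase 2: analog gain stays fixed; adjust NIR power, then digital gain."""
    ratio = target / max(fluo_luma, 1e-6)
    wanted = state.nir_led * state.digital_gain * ratio
    state.nir_led = clamp(wanted, 0.01, 1.0)
    state.digital_gain = clamp(wanted, 1.0, max_digital_gain)
    return state


state = ExposureState()
phase1_white(state, white_luma=240.0)        # overexposed white-light frame
phase2_fluorescence(state, fluo_luma=10.0)   # weak fluorescence frame, one frame later
```

      In this sketch the second function never writes `analog_gain`, which mirrors the decoupling described above: the fluorescence channel can only raise its brightness through excitation power and digital gain.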
      Results and Discussions Experimental validation was conducted using a porcine liver model injected with ICG, imaged at working distances of 10 mm, 25 mm, and 40 mm to simulate clinical conditions. The hardware synchronization mechanism achieved frame-level control precision of 16.67 ms (one frame period at 60 frames per second), strictly aligning the exposure windows with the respective light sources and eliminating channel crosstalk.
      The proposed AEC algorithm was more stable and converged faster than conventional ISP-based exposure algorithms. Quantitative analysis of convergence time showed that the system recovered to a stable brightness state within 0.8 seconds (approximately 23 frames for white light and an additional 15–20 frames for fluorescence), even under extreme lighting changes. In contrast, standard AEC algorithms exhibited significant oscillation and "flashing" artifacts due to the coupled feedback loop between the alternating bright and dark frames. The proposed method maintained the white-light luminance within a tight tolerance (50 luma) without the overshoot observed in conventional methods.
      Image quality was assessed using structural similarity index (SSIM) and peak signal-to-noise ratio (PSNR) metrics. At a standard working distance of 10 mm, the system achieved an SSIM of 0.9994 and a PSNR of 46.23 dB for white-light images, indicating negligible information loss compared to reference images. Fluorescence images at the same distance yielded an SSIM of 0.9893 and a PSNR of 42.25 dB. As the working distance increased to 40 mm, the fluorescence image quality naturally degraded due to the reliance on higher digital gain to compensate for signal attenuation, resulting in an SSIM of 0.9224. However, the system successfully maintained diagnostic visibility without introducing significant noise artifacts or false positives, a benefit attributed to the independent ISP pipeline tuning enabled by the virtual dual-camera architecture. Visual inspection confirmed that the fused images retained clear anatomical textures from the white-light channel and distinct, high-contrast tumor margins from the fluorescence channel.
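      For reference, PSNR follows the standard definition PSNR = 10 log10(MAX^2 / MSE). The sketch below computes it for two equally sized pixel sequences; the function name `psnr` and the 8-bit peak value of 255 are assumptions, since the bit depth used in the evaluation is not stated here.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], test: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: every pixel differs by one grey level, so MSE = 1
value_db = psnr([0, 0, 0, 0], [1, 1, 1, 1])
```

      Under the same 8-bit assumption, a PSNR of 46.23 dB would correspond to a mean squared error of roughly 1.5 grey levels squared.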
      Conclusions The study successfully implemented a virtual dual-camera imaging system that resolves the fundamental conflicts inherent in single-sensor fluorescence endoscopy. By shifting the frame synchronization and video demultiplexing tasks to the FPGA hardware layer, the design eliminated the latency and bandwidth limitations associated with software processing, enabling smooth 4K real-time imaging. The novel two-stage automatic exposure control algorithm effectively decoupled the brightness regulation of white-light and fluorescence modes, allowing for optimal dynamic range utilization for both anatomy and pathology without physical filter switching or dual-sensor alignment.
      The experimental results confirmed that this architecture delivers image quality comparable to dual-camera systems, with high structural similarity and signal-to-noise ratios, while retaining the cost and size advantages of a single-chip solution. The fast convergence time of the exposure algorithm ensures seamless transitions during intraoperative maneuvering. Consequently, this technology provides a robust, low-cost, and high-performance solution for fluorescence-guided surgery, offering surgeons reliable real-time navigation aids with precise anatomical and functional information fusion. The proposed architecture establishes a viable technical pathway for the next generation of miniaturized, multi-modal endoscopic platforms.