Abstract:
Objective The manufacturing of printed circuit boards (PCBs) constitutes the foundational infrastructure of the contemporary electronics industry. As the sector evolves in response to consumer demands for increasingly miniaturized, sophisticated, and high-performance devices, the physical characteristics of PCBs have undergone significant transformation. Modern boards are characterized by high-density interconnects, multi-layered architectures, and microscopic component placement. In this context, the role of quality control has shifted from a supplementary process to a critical determinant of product viability and yield. Traditionally, defect detection has relied on manual inspection or standard automated optical inspection (AOI) systems. However, these conventional methods are increasingly challenged by the escalating complexity of modern PCB designs. Manual inspection is labor-intensive, subjective, and prone to fatigue-induced errors, rendering it unsuitable for mass production. Similarly, traditional AOI systems based on pattern matching or simple image processing often struggle with the subtle variations and complex backgrounds found in high-density PCBs. Consequently, there is a growing necessity for more intelligent and adaptive inspection solutions capable of discerning defects with high precision. Current state-of-the-art detection methods, particularly those leveraging deep learning paradigms, encounter a distinct set of challenges when applied to the specific constraints of industrial manufacturing environments. The first challenge is the significant computational burden associated with high-precision models. Accurately detecting minute defects typically necessitates high-resolution imagery; however, processing such data with deep convolutional neural networks (CNNs) or complex Transformer models consumes substantial computational resources and memory. 
This computational demand often translates to prohibitive hardware costs for widespread deployment on factory floors. The second challenge is scale variation. PCB defects manifest in widely varying sizes, from tiny pinholes barely visible to the naked eye to larger structural errors such as open circuits. A detection model must therefore be able to identify anomalies across these scales with uniform accuracy. The third challenge is the imperative for real-time performance. On high-speed production lines, inspection algorithms must keep pace with manufacturing throughput to avoid creating bottlenecks. Unfortunately, many existing high-accuracy models are too computationally intensive for real-time inference, forcing manufacturers into an undesirable trade-off between inspection speed and detection precision. These limitations highlight the need for an optimized architecture that balances resource efficiency with robust detection capabilities.
Methods To address these challenges, this study proposes LightRT-DETR, a lightweight real-time detection Transformer specifically optimized for PCB defect detection. Building upon the architectural principles of the real-time detection Transformer (RT-DETR), our approach introduces a series of modifications designed to significantly reduce computational cost while maintaining, and in some aspects enhancing, detection accuracy. The design of LightRT-DETR is characterized by three primary technical innovations: the EfficientFastGLU-18 backbone, a cascaded group attention (CGA) mechanism, and an efficient selective feature pyramid network (ES-FPN). These components work in concert to refine the model's ability to extract relevant features from complex backgrounds without incurring the heavy computational penalties associated with standard Transformer architectures. The overarching goal is a model that is both light enough for edge deployment and accurate enough to meet rigorous quality standards. To resolve the inefficiency of traditional backbones such as ResNet-18 on edge computing devices, we propose the EfficientFastGLU-18 backbone. This structure combines partial convolution (PConv) with convolutional gated linear units (GLU). PConv applies convolution to only a subset of input channels while passing the remaining channels unchanged to the next layer. This strategy reduces memory access and floating-point operations (FLOPs) while preserving the flow of information. By coupling this efficient convolution strategy with GLU, which provides robust non-linear transformation capabilities, the backbone maintains strong feature extraction abilities.
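The partial convolution described above can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: it assumes the convolved subset is simply the first `conv_ch` channels, and the names `conv2d_same`, `pconv`, and `conv_macs`, as well as the one-quarter channel split used in the cost comparison, are hypothetical choices made for illustration only (the source does not state a split ratio).

```python
def conv2d_same(x, w):
    """Dense stride-1 convolution with zero padding.

    x: feature map as nested lists, shape [in_ch][H][W].
    w: kernels as nested lists, shape [out_ch][in_ch][k][k], with odd k.
    """
    in_ch, height, width = len(x), len(x[0]), len(x[0][0])
    k = len(w[0][0])
    pad = k // 2
    out = []
    for w_o in w:  # one output plane per output channel
        plane = [[0.0] * width for _ in range(height)]
        for i in range(in_ch):
            for r in range(height):
                for c in range(width):
                    acc = 0.0
                    for dr in range(k):
                        for dc in range(k):
                            rr, cc = r + dr - pad, c + dc - pad
                            # Positions outside the map count as zeros.
                            if 0 <= rr < height and 0 <= cc < width:
                                acc += w_o[i][dr][dc] * x[i][rr][cc]
                    plane[r][c] += acc
        out.append(plane)
    return out


def pconv(x, w, conv_ch):
    """Partial convolution (assumed form): convolve the first conv_ch
    channels and pass the remaining channels through unchanged."""
    untouched = [[row[:] for row in ch] for ch in x[conv_ch:]]
    return conv2d_same(x[:conv_ch], w) + untouched


def conv_macs(height, width, k, in_ch, out_ch):
    """Multiply-accumulate count of a dense stride-1 k x k convolution."""
    return height * width * k * k * in_ch * out_ch


if __name__ == "__main__":
    dense = conv_macs(80, 80, 3, 64, 64)    # all 64 channels convolved
    partial = conv_macs(80, 80, 3, 16, 16)  # assumed split: 16 of 64 channels
    print(f"partial / dense cost ratio: {partial / dense}")
```

Under these assumptions, convolving 16 of 64 channels costs one sixteenth of the multiply-accumulates of a dense 64-to-64 convolution with the same kernel size, because the cost scales with the product of the input and output channel counts.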
Strong feature extraction is particularly crucial for identifying defects against the complex, texture-rich backgrounds of PCBs, where differentiating between a valid trace and a defect requires discerning subtle feature variations. Furthermore, to address the computational expense of standard multi-head attention mechanisms, we introduce a cascaded group attention (CGA) mechanism. Instead of processing all feature channels simultaneously, CGA splits feature maps into distinct groups and processes them through a cascaded structure. This approach allows the network to refine features progressively, as the output of one group's attention computation can inform the next, creating a hierarchical refinement process. This mechanism appears particularly beneficial for defects possessing intricate or fine-grained details, such as "mouse bites" or "spurs", which are frequently overlooked by coarser, global attention mechanisms. Finally, we tackle multi-scale defect detection through the efficient selective feature pyramid network (ES-FPN). This redesigned module employs selective feature integration and incorporates RetBlockC3 modules enhanced with Manhattan self-attention (MaSA). Unlike standard self-attention, MaSA is specifically designed to capture long-range dependencies in horizontal and vertical directions. Given that PCB layouts consist largely of orthogonal lines, this domain-specific adaptation enables more effective recognition of defects across widely varying scales by aligning the attention mechanism with the geometric priors of the PCB domain.
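One way to picture attention that follows horizontal and vertical structure is a decay mask based on Manhattan distance between token positions. The sketch below is a minimal illustration under stated assumptions, not the MaSA module used in LightRT-DETR: it assumes the mask takes the form gamma ** (|row difference| + |column difference|) and is multiplied element-wise with softmax attention weights. The function names and the decay rate `gamma` are hypothetical.

```python
import math


def manhattan_decay_mask(height, width, gamma=0.9):
    """Return an (H*W) x (H*W) mask for tokens laid out row-major on a grid.

    Entry [n][m] is gamma ** (|r_n - r_m| + |c_n - c_m|), so the weight
    decays with the Manhattan distance between the two grid positions.
    gamma is an assumed decay rate in (0, 1).
    """
    coords = [(r, c) for r in range(height) for c in range(width)]
    return [[gamma ** (abs(rn - rm) + abs(cn - cm)) for rm, cm in coords]
            for rn, cn in coords]


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def decayed_attention_weights(scores, mask):
    """Apply row-wise softmax to the scores, then scale each weight by the
    distance-decay mask (assumed combination rule)."""
    return [[p * d for p, d in zip(softmax(s_row), m_row)]
            for s_row, m_row in zip(scores, mask)]


if __name__ == "__main__":
    mask = manhattan_decay_mask(2, 2, gamma=0.5)
    uniform_scores = [[0.0] * 4 for _ in range(4)]
    for row in decayed_attention_weights(uniform_scores, mask):
        print(row)
```

With such a mask, a token attends most strongly to itself and to neighbours reachable in few horizontal or vertical steps, which matches the orthogonal routing of PCB traces in spirit.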
Results and Discussions The efficacy of LightRT-DETR was validated through extensive experiments on a public PCB defect dataset, revealing a significant improvement in both efficiency and accuracy compared to current state-of-the-art models, including the widely used YOLO series. In terms of quantitative performance, LightRT-DETR achieved a mean average precision at an intersection-over-union (IoU) threshold of 0.5 (mAP50) of 98.3% and, on the stricter metric averaged over IoU thresholds from 0.5 to 0.95 (mAP50-95), 52.9%. When benchmarked against the advanced YOLOv9e model, our approach demonstrated an improvement of 3.0% in mAP50 and 1.8% in mAP50-95. These metrics suggest that the model not only detects defects but also localizes them accurately, minimizing false positives and false negatives, which is a critical requirement for industrial quality control. The experimental setup ensured fair comparisons by using consistent hyperparameters and training environments, further reinforcing the validity of the observed performance gains.
Conclusions Perhaps the most distinct advantage of LightRT-DETR lies in its computational efficiency. The model requires approximately 37.2 GFLOPs, representing an 80.3% reduction in computational cost compared to the baseline RT-DETR model. Similarly, the parameter count is reduced to 12.8 million, a 77.7% decrease. This substantial reduction has practical implications: it facilitates deployment on cost-effective hardware with limited computing power or allows the system to operate at significantly higher frame rates on standard hardware.
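As a quick consistency check on the figures above, the baseline values implied by the stated reductions can be back-calculated. The short sketch below is plain arithmetic, assuming the percentages are measured relative to the baseline; the resulting baseline numbers are implied values, not figures reported in the source, and the helper name `implied_baseline` is hypothetical.

```python
def implied_baseline(reduced_value, reduction_percent):
    """Back-calculate a baseline from a reduced value and its percentage
    reduction, assuming: reduced = baseline * (1 - reduction / 100)."""
    return reduced_value / (1.0 - reduction_percent / 100.0)


# Figures stated in the text: 37.2 GFLOPs (80.3% reduction) and
# 12.8 million parameters (77.7% reduction).
baseline_gflops = implied_baseline(37.2, 80.3)
baseline_params_m = implied_baseline(12.8, 77.7)

print(f"implied baseline compute: {baseline_gflops:.1f} GFLOPs")
print(f"implied baseline size: {baseline_params_m:.1f} million parameters")
```

Read this way, the reductions correspond to a baseline of roughly 189 GFLOPs and roughly 57 million parameters, that is, about a fivefold drop in compute and a bit more than a fourfold drop in model size.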