Research on fault injection system of FPGA in irradiation environment

Citation: Xue Xiaoliang, Su Haibing, Shu Huailiang, et al. Research on fault injection system of FPGA in irradiation environment[J]. Opto-Electronic Engineering, 2019, 46(12): 180549. doi: 10.12086/oee.2019.180549

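    Corresponding author: Su Haibing (born 1969), male, Ph.D., research professor, mainly engaged in research on electronic system design and simulation testing. E-mail: suhaibing@msn.com
  • CLC number: V302.8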

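  • The physical frame organization of the configuration RAM in Xilinx SRAM-based FPGAs is studied; a method for extracting the frame structure is given, together with the order in which frames are arranged in the bitstream. The structure of the intermediate file of the SEM IP core is analyzed and a method for extracting the essential bits is given; flipping essential bits between 0 and 1 simulates the single event upsets to which FPGAs are prone in a radiation environment. A PC-side interface is designed to provide complete human-machine interaction. The fault injection system is implemented on the FPGA itself and reads and writes configuration data through the internal ICAP interface, with no processor involved. The essential bits of the circuit under test are flipped and repaired one by one, and each bit is classified according to the test result; the classification can be used to give focused protection to special bits in subsequent fault repair.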

  • Overview: SRAM-based FPGAs have attracted increasing attention in aerospace applications because of their low cost, rich logic resources, and reconfigurability. However, SRAM cells are highly susceptible to radiation, which manifests as single event upsets (SEUs) and limits the applicability of FPGAs in the aerospace field. The configuration RAM (CRAM) accounts for the largest number of memory cells in an FPGA chip and directly determines the user circuit logic, so CRAM is the research object of this paper. To measure the failure rate of CRAM in a radiation environment, the FPGA can be irradiated under an accelerator beam, which simulates the space environment more realistically, but such tests are expensive and the test period is long. A purpose-built fault injection system that simulates SEUs can therefore test the reliability of a design on the FPGA quickly and inexpensively. Faults can be injected into CRAM through an external interface (JTAG or SelectMAP) or through the internal interface (internal configuration access port, ICAP). For internal fault injection, most designs use on-board processors. Starting from Virtex-6/Spartan-6, Xilinx provides a PicoBlaze-based SEM (soft error mitigation) IP core, which implements fault injection, fault repair, fault classification, and other functions. Because the PicoBlaze processor has no official C compiler and its instruction space is extremely small (1024 words), the SEM controller cannot be flexibly reprogrammed to implement a different fault repair mechanism. The Virtex-5 series FPGAs studied in this paper have no officially provided SEM IP core, so a self-designed fault injection system is required.

    This paper studies the frame structure of Xilinx FPGA CRAM, gives a method for extracting the frame structure, and provides the order of frames in the bitstream file. The structure of the intermediate file of the SEM IP core is also analyzed to obtain the positions of the essential bits. Flipping the essential bits between 0 and 1 simulates SEUs. A PC-side interface is designed for human-machine interaction. The fault injection system is implemented on the FPGA chip, and CRAM data are read and written through ICAP without the need for a processor. The fault injection system is placed on resources that are not used by the circuit under test and occupies about one percent of the FPGA resources, which keeps the resource overhead low. Flipping each essential bit, testing the circuit, and repairing the bit classifies the essential bits into the following categories: non-critical and repairable, non-critical and unrepairable, critical and repairable, critical and unrepairable, and residual bits that affect other non-masked bits in the same frame. The classification results can be used to protect critical bits in subsequent fault repair. In addition, a fault injection test on a triple modular redundancy (TMR) circuit is performed to verify the effectiveness of TMR for SEU protection. For a TMR circuit, the proportion of critical bits is greatly reduced but does not reach zero, which indicates that TMR lowers the failure rate caused by SEUs but cannot completely avoid such faults. Since TMR cannot eliminate the accumulation of SEU faults, practical engineering applications need to supplement it with other fault-tolerant measures such as internal scrubbing and external scrubbing.
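The flip, test, repair, and classify cycle described above can be sketched in software. This is only an illustrative model, not the authors' FPGA logic: the CRAM is a plain dictionary standing in for ICAP read and write access, `run_campaign`, `classify`, and the `dut_fails` callback are hypothetical names, the frame size of 41 32-bit words is an assumption taken from the Virtex-5 family rather than from the text, and checking the residual case first is a choice made for this sketch.

```python
from enum import Enum

WORDS_PER_FRAME = 41  # assumption: a Virtex-5 frame holds 41 32-bit words


class BitClass(Enum):
    NON_CRITICAL_REPAIRABLE = "non-critical, repairable"
    NON_CRITICAL_UNREPAIRABLE = "non-critical, unrepairable"
    CRITICAL_REPAIRABLE = "critical, repairable"
    CRITICAL_UNREPAIRABLE = "critical, unrepairable"
    RESIDUAL = "affects other non-masked bits in the same frame"


def flip_bit(frame, word, bit):
    """Return a copy of `frame` with one bit inverted (the simulated SEU)."""
    upset = list(frame)
    upset[word] ^= 1 << bit
    return upset


def classify(dut_failed, repaired, disturbed_others):
    """Map the observations of one injection to a category.

    Checking the residual case first is an assumption of this sketch.
    """
    if disturbed_others:
        return BitClass.RESIDUAL
    if dut_failed:
        return BitClass.CRITICAL_REPAIRABLE if repaired else BitClass.CRITICAL_UNREPAIRABLE
    return BitClass.NON_CRITICAL_REPAIRABLE if repaired else BitClass.NON_CRITICAL_UNREPAIRABLE


def run_campaign(cram, essential_bits, dut_fails):
    """Flip, test, repair, and classify every essential bit in turn.

    cram           -- dict: frame index -> list of 32-bit words (stand-in for ICAP)
    essential_bits -- iterable of (frame index, word index, bit index)
    dut_fails      -- hypothetical callback: does the DUT misbehave with this CRAM?
    """
    results = {}
    for frame_idx, word, bit in essential_bits:
        golden = list(cram[frame_idx])
        cram[frame_idx] = flip_bit(golden, word, bit)         # inject the upset
        failed = dut_fails(cram)                              # functional test
        readback = cram[frame_idx]                            # frame read-back
        disturbed = readback != flip_bit(golden, word, bit)   # did any other bit change?
        cram[frame_idx] = list(golden)                        # repair: rewrite the golden frame
        repaired = cram[frame_idx] == golden                  # verify the repair by read-back
        results[(frame_idx, word, bit)] = classify(failed, repaired, disturbed)
    return results


# Toy usage: one frame, and a DUT that only depends on bit 0 of word 0.
cram = {0: [0] * WORDS_PER_FRAME}
bits = [(0, 0, 0), (0, 1, 5)]
outcome = run_campaign(cram, bits, lambda c: bool(c[0][0] & 1))
for position, category in outcome.items():
    print(position, category.value)
```

In this in-memory model a flip never disturbs neighbouring bits and a rewrite always succeeds, so only the two repairable categories can occur; on real hardware the `disturbed` and `repaired` observations would come from frame read-back.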

  • Figure 1.  The process of parsing the frame structure

    Figure 2.  Distribution of the fault injection system and the DUT-related essential bits on the FPGA: (a) plotted in Matlab; (b) in the PlanAhead tool; (c) in the FPGA Editor tool

    Figure 3.  Architecture diagram of the fault injection system

    Figure 4.  Block diagram of the FPGA-side program of the fault injection system

    Figure 5.  Block diagram of the PC-side program of the fault injection system

    Figure 6.  PC-side interface of the fault injection system

    Table 1.  Frame structure of each row of the XC5VFX70T device

    Column type          Frames per column   Number of columns   Major addresses
    IOB                  54                  3                   0, 24, 44
    BRAM configuration   30                  6                   5, 12, 19, 30, 39, 49
    DSP                  28                  2                   33, 36
    Clock                4                   1                   25
    PPC                  32                  1                   50
    CLB                  36                  38                  all others
    BRAM content         128                 6                   5, 12, 19, 30, 39, 49
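    The column layout in Table 1 can be turned into a small lookup that maps each major address to its column type and totals the frames in one row. This is a minimal sketch: it assumes that the major addresses run contiguously from 0 to 50 (which the column counts in Table 1 imply but the table does not state outright), and all names are illustrative.

```python
# Frames per column and major addresses, copied from Table 1.
SPECIAL_COLUMNS = {
    "IOB": (54, [0, 24, 44]),
    "BRAM": (30, [5, 12, 19, 30, 39, 49]),
    "DSP": (28, [33, 36]),
    "Clock": (4, [25]),
    "PPC": (32, [50]),
}
CLB_FRAMES = 36           # frames in one CLB column
CLB_COLUMNS = 38          # CLB columns occupy every other major address
BRAM_CONTENT_FRAMES = 128 # frames per BRAM content column


def column_layout():
    """Return a list indexed by major address: (column type, frames per column)."""
    n_columns = CLB_COLUMNS + sum(len(majors) for _, majors in SPECIAL_COLUMNS.values())
    layout = [("CLB", CLB_FRAMES)] * n_columns   # start with CLB everywhere
    for name, (frames, majors) in SPECIAL_COLUMNS.items():
        for major in majors:
            layout[major] = (name, frames)       # overwrite the special columns
    return layout


layout = column_layout()
frames_per_row = sum(frames for _, frames in layout)
bram_content_per_row = BRAM_CONTENT_FRAMES * len(SPECIAL_COLUMNS["BRAM"][1])

print(len(layout), frames_per_row, bram_content_per_row)  # prints: 51 1802 768
```

    Under these assumptions each row holds 1802 configuration frames across 51 columns, plus 768 BRAM content frames.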

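    Table 2.  Results of fault injection testing

    DUT           Essential bits (1)   Non-critical repairable bits (2) /%   Critical repairable bits (3) /%   Bits affecting other non-masked bits in the same frame /%
    ADDER4B       969                  58.204                                41.589                            0.206
    ADDER4B_TMR   3006                 99.201                                0.749                             0.050
    ADDER8B       4170                 57.530                                42.422                            0.048
    ADDER8B_TMR   12287                98.836                                1.139                             0.024
    MUL8B         17759                44.316                                55.183                            0.501
    MUL8B_TMR     55309                97.273                                2.238                             0.488
    Notes: (1) Essential bits are the bits related to the DUT design. (2) A non-critical repairable bit is a bit whose flip does not affect the functional logic of the DUT and which can be repaired after being flipped; its proportion is the ratio of the number of non-critical repairable bits to the number of essential bits. (3) A critical repairable bit is a bit whose flip causes errors during DUT functional testing but which can be repaired after being flipped; its proportion is the ratio of the number of critical repairable bits to the number of essential bits.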
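    The effect of TMR reported in Table 2 can be made explicit with a little arithmetic. The sketch below uses only the figures from the table; the function name `tmr_effect` and the two derived ratios are illustrative and are not quantities defined in the paper.

```python
# (essential bits, non-critical repairable %, critical repairable %, residual %)
TABLE_2 = {
    "ADDER4B": (969, 58.204, 41.589, 0.206),
    "ADDER4B_TMR": (3006, 99.201, 0.749, 0.050),
    "ADDER8B": (4170, 57.530, 42.422, 0.048),
    "ADDER8B_TMR": (12287, 98.836, 1.139, 0.024),
    "MUL8B": (17759, 44.316, 55.183, 0.501),
    "MUL8B_TMR": (55309, 97.273, 2.238, 0.488),
}


def tmr_effect(name):
    """Compare a plain design with its TMR version.

    Returns (growth in essential bits, reduction factor of the critical share).
    """
    bits, _, critical, _ = TABLE_2[name]
    bits_tmr, _, critical_tmr, _ = TABLE_2[name + "_TMR"]
    return bits_tmr / bits, critical / critical_tmr


for name in ("ADDER4B", "ADDER8B", "MUL8B"):
    growth, reduction = tmr_effect(name)
    print(f"{name}: essential bits x{growth:.2f}, critical share reduced {reduction:.1f}-fold")
```

    For all three designs the number of essential bits roughly triples under TMR, while the share of critical repairable bits falls by a factor of about 25 to 56 without reaching zero.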

Publication history
Received:  2018-10-26
Revised:  2019-01-15
Published:  2019-12-01
