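Performance analysis of a sum-table-based method for computing cross-correlation in GPU-accelerated ultrasound strain elastography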

Citation: Peng Bo, Luo Shasha, Yang Feng, et al. Performance analysis of a sum-table-based method for computing cross-correlation in GPU-accelerated ultrasound strain elastography[J]. Opto-Electronic Engineering, 2019, 46(6): 180437. doi: 10.12086/oee.2019.180437

  • CLC number: TB872

  • Fund Project: Supported by the Scientific Innovation Program of Sichuan Province (Major Engineering Project: 2018RZ0093) and the Nanchong Scientific Council (Strategic Cooperation Program Between University and City: NC17SY4020)
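  • Abstract: The performance of the cross-correlation calculation largely determines the computational efficiency of motion tracking in ultrasound elastography. In a serial computing environment, the sum-table-based fast cross-correlation algorithm achieves higher computational efficiency while maintaining computational accuracy. However, its implementation and performance in a parallel computing environment, particularly on a GPU platform, have not yet been reported. In this study, targeting motion tracking for 2-D ultrasound elastography, the sum-table-based fast cross-correlation algorithm (ST-NCC) was implemented on a GPU platform and compared in detail with the conventional cross-correlation algorithm in terms of computational efficiency and computational accuracy. Preliminary results show that although ST-NCC achieves better computational efficiency in a serial computing environment, the two methods show no substantial difference in computational efficiency on the GPU.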
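To make the sum-table idea concrete, the following is a minimal 1-D sketch in plain Python, not the authors' implementation. It assumes the non-zero-mean form of the normalized cross-correlation, and the names `sum_table`, `window_sum`, `ncc_direct`, `ncc_sum_table`, and the synthetic `signal` are all hypothetical. The point it illustrates is that, once running sums of the squared signals and of the lagged products are stored, every windowed sum in the correlation coefficient reduces to two table lookups instead of a loop over the window.

```python
import math
from itertools import accumulate


def sum_table(values):
    """Running-sum table with a leading zero: table[k] = sum(values[:k])."""
    return [0.0] + list(accumulate(values))


def window_sum(table, start, length):
    """Sum of `length` consecutive entries beginning at `start`, from two lookups."""
    return table[start + length] - table[start]


def ncc_direct(ref, tgt, start, lag, w):
    """Conventional NCC: every windowed sum is recomputed from scratch."""
    a = ref[start:start + w]
    b = tgt[start + lag:start + lag + w]
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den


def ncc_sum_table(ref, tgt, starts, lags, w):
    """Sum-table NCC for every (window start, lag) pair.

    Assumes each window and its lagged counterpart lie fully inside the signals.
    """
    ref_sq = sum_table([x * x for x in ref])
    tgt_sq = sum_table([y * y for y in tgt])
    result = {}
    for lag in lags:
        # One table of lagged products per lag, shared by all window positions.
        lo = max(0, -lag)
        hi = min(len(ref), len(tgt) - lag)
        prod = sum_table([ref[i] * tgt[i + lag] for i in range(lo, hi)])
        for s in starts:
            num = window_sum(prod, s - lo, w)
            den = math.sqrt(window_sum(ref_sq, s, w) * window_sum(tgt_sq, s + lag, w))
            result[(s, lag)] = num / den
    return result


# Synthetic demo (illustrative data only): the target is the reference delayed by SHIFT samples.
N, W, SHIFT = 200, 32, 3


def signal(i):
    return math.sin(0.37 * i) + 0.6 * math.sin(0.11 * i + 1.0)


ref = [signal(i) for i in range(N)]
tgt = [signal(i - SHIFT) for i in range(N)]
starts = range(10, 151, 20)
lags = range(-5, 6)

fast = ncc_sum_table(ref, tgt, starts, lags, W)
best_lag = {s: max(lags, key=lambda lag: fast[(s, lag)]) for s in starts}
```

In this sketch the direct version costs on the order of the window length for every (window, lag) pair, whereas the table version pays once per lag to build the product table and then a constant cost per window. That is the kind of saving reported for serial code; on a GPU the direct sums are already spread over many threads, which is consistent with the abstract's observation that the gap narrows.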

  • Overview: In our ultrasound strain elastography system, a modified block-matching algorithm is used to estimate tissue motion, and the resulting local strains serve as surrogates of tissue elasticity. Calculating the correlation within the block-matching framework is a critical and computationally intensive step. Because the correlation calculations are largely independent of one another, graphics processing units (GPUs) have been used to improve computational efficiency through massively parallel programming. The literature shows that, in a serial computing environment, the sum-table-based method (abbreviated ST-NCC below) can greatly reduce the computational burden of calculating the normalized correlation coefficient (NCC). However, the performance of ST-NCC on a parallel computing platform, particularly in a GPU environment, has yet to be investigated. The objective of this study is therefore to investigate the performance of the ST-NCC method for GPU-accelerated ultrasound strain elastography. Specifically, a published ST-NCC method by Luo et al. and the conventional NCC method were both implemented in CUDA (Version 9.0, NVIDIA Inc., CA, USA) and tested on an NVIDIA GeForce GTX TITAN X card. Two basic CUDA programming strategies were applied to all implementations. First, to increase the memory bandwidth of the GPU, the 2-D RF signals were stored in texture memory before the cross-correlation was calculated. Second, variables that require frequent access (e.g., the axial and lateral search ranges) were kept in read-only memory for rapid access. In terms of advanced CUDA programming strategies, a classic parallel scan method was adopted to generate the sum-table data for the ST-NCC method, while several on-chip memory optimization strategies were used to implement the classic NCC method and were compared against each other. Only the most computationally efficient NCC implementation was then compared with the GPU-accelerated ST-NCC method. Finally, performance was assessed using simulated ultrasound data, which were generated through both finite element modeling and acoustic simulation. Both displacement tracking accuracy and computational efficiency were evaluated. Based on the data investigated, we found that, on the GPU platform, the ST-NCC method did not further improve computational efficiency compared with the classic NCC method implemented on the same platform. Both methods achieved comparable displacement tracking accuracy.
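The parallel scan mentioned in the overview can be illustrated with a short sketch. The code below simulates, in plain Python, a work-efficient exclusive scan of the up-sweep/down-sweep kind described in the scan literature; it is an assumption that this resembles the scan used for the sum tables, and the authors' CUDA kernel may differ. The function name `blelloch_exclusive_scan` and the sample data are hypothetical. Each inner `for` loop stands for one parallel step: its iterations touch disjoint elements, so on a GPU they could be executed by separate threads at the same time.

```python
def blelloch_exclusive_scan(values):
    """Exclusive prefix sum via up-sweep and down-sweep (sequential simulation).

    out[k] = values[0] + ... + values[k-1], with out[0] = 0.
    """
    n = len(values)
    size = 1
    while size < n:          # pad to the next power of two
        size *= 2
    data = list(values) + [0] * (size - n)

    # Up-sweep (reduce): build partial sums in place, level by level.
    stride = 1
    while stride < size:
        for i in range(2 * stride - 1, size, 2 * stride):  # one "thread" per i
            data[i] += data[i - stride]
        stride *= 2

    # Down-sweep: clear the root, then push prefix sums back down the tree.
    data[size - 1] = 0
    stride = size // 2
    while stride >= 1:
        for i in range(2 * stride - 1, size, 2 * stride):  # one "thread" per i
            left = data[i - stride]
            data[i - stride] = data[i]
            data[i] += left
        stride //= 2

    return data[:n]


# Illustrative data only. For ST-NCC the inputs would be squared RF samples
# or lagged products rather than these small integers.
samples = [3, 1, 7, 0, 4, 1, 6, 3]
scanned = blelloch_exclusive_scan(samples)

# Appending the grand total gives a sum table with a leading zero,
# so any window sum is the difference of two entries.
table = scanned + [scanned[-1] + samples[-1]]
window_total = table[5] - table[2]   # sum of samples[2:5]
```

The scan needs a number of parallel steps proportional to the logarithm of the input length, but it still adds synchronization points and extra memory traffic before any correlation value can be computed. This overhead is one plausible reason, not one stated in the source, why a sum-table method may gain less on a GPU than in serial code.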

  • Figure 1.  Diagram of the ultrasonic speckle motion tracking process

    Figure 2.  Illustration of the parallel scan method for constructing the reference-window sum-table

    Figure 3.  Comparison of computation time for method 1 under different on-chip optimization strategies

    Figure 4.  Comparison of the motion tracking performance of the two methods. (a) and (d) are the lateral and axial displacements obtained using the serial CPU implementation of method 1; (b) and (e) are the corresponding lateral and axial strain images computed from (a) and (d); (c) and (f) are the lateral and axial displacement differences between the serial CPU implementations of the two methods; (g) and (i) are the lateral and axial displacement differences between the CPU and GPU implementations of method 1; (h) and (j) are the lateral and axial displacement differences between the GPU implementations of the two methods

    Figure 5.  Comparison of computation time of the two methods under different cross-correlation tracking windows. (a) Computation time of the CPU implementations of the two methods; (b) Computation time of the GPU implementations of the two methods

    Figure 6.  Comparison of computation time of the two methods under different search ranges. (a) Computation time of the CPU implementations of the two methods; (b) Computation time of the GPU implementations of the two methods

  • [1] Jiang J, Hall T J. A parallelizable real-time motion tracking algorithm with applications to ultrasonic strain imaging[J]. Physics in Medicine & Biology, 2007, 52(13): 3773-3790. doi: 10.1088/0031-9155/52/13/008

    [2] Chen L J, Treece G M, Lindop J E, et al. A quality-guided displacement tracking algorithm for ultrasonic elasticity imaging[J]. Medical Image Analysis, 2009, 13(2): 286-296. doi: 10.1016/j.media.2008.10.007

    [3] Peng B, Wang Y Q, Hall T J, et al. A GPU-accelerated 3-D coupled subsample estimation algorithm for volumetric breast strain elastography[J]. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2017, 64(4): 694-705. doi: 10.1109/TUFFC.2017.2661821

    [4] Zhou Y J, Zheng Y P. A motion estimation refinement framework for real-time tissue axial strain estimation with freehand ultrasound[J]. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2010, 57(9): 1943-1951. doi: 10.1109/TUFFC.2010.1642

    [5] Luo J W, Konofagou E E. A fast normalized cross-correlation calculation method for motion estimation[J]. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2010, 57(6): 1347-1357. doi: 10.1109/TUFFC.2010.1554

    [6] Zhu Y N, Hall T J. A modified block matching method for real-time freehand strain imaging[J]. Ultrasonic Imaging, 2002, 24(3): 161-176. doi: 10.1177/016173460202400303

    [7] D'Hooge J, Bijnens B, Thoen J, et al. Echocardiographic strain and strain-rate imaging: a new tool to study regional myocardial function[J]. IEEE Transactions on Medical Imaging, 2002, 21(9): 1022-1030. doi: 10.1109/TMI.2002.804440

    [8] Konofagou E E, D'Hooge J, Ophir J. Myocardial elastography--a feasibility study in vivo[J]. Ultrasound in Medicine & Biology, 2002, 28(4): 475-482. doi: 10.1016/S0301-5629(02)00488-X

    [9] Lewis J P. Fast template matching[J]. Proceeding of Vision Interface, 1995, 32(4): 351-361. http://d.old.wanfangdata.com.cn/Periodical/shjtdxxb200005031

    [10] Yang X, Deka S, Righetti R. A hybrid CPU-GPGPU approach for real-time elastography[J]. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2011, 58(12): 2631-2645. doi: 10.1109/TUFFC.2011.2126

    [11] Peng B, Huang L. GPU-accelerated sub-sample displacement estimation method for real-time ultrasound elastography[J]. Opto-Electronic Engineering, 2016, 43(6): 83-88 (in Chinese). doi: 10.3969/j.issn.1003-501X.2016.06.014

    [12] Peng B, Chen Y, Liu D Q. Investigation of GPU-based ultrasound elastography[J]. Opto-Electronic Engineering, 2013, 40(5): 97-105 (in Chinese). doi: 10.3969/j.issn.1003-501X.2013.05.014

    [13] Rosenzweig S, Palmeri M, Nightingale K. GPU-based real-time small displacement estimation with ultrasound[J]. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2011, 58(2): 399-405. doi: 10.1109/TUFFC.2011.1817

    [14] Chang L W, Hsu K H, Li P C. GPU-based color Doppler ultrasound processing[C]//2009 IEEE International Ultrasonics Symposium. Rome, Italy, 2009.

    [15] Sun X, Wang S S, Song J J, et al. Toward parallel optimal computation of ultrasound computed tomography using GPU[J]. Proceedings of SPIE, 2018, 10580: 105800R.

    [16] Sengupta S, Harris M, Garland M, et al. Efficient parallel scan algorithms for GPUs[M]//Kurzak J, Bader D A, Dongarra J. Scientific Computing with Multicore and Accelerators. Boca Raton: Taylor & Francis, 2008.

    [17] Blelloch G E. Scans as primitive parallel operations[J]. IEEE Transactions on Computers, 1989, 38(11): 1526-1538. doi: 10.1109/12.42122

    [18] Jensen J A. Field: A program for simulating ultrasound systems[J]. Medical & Biological Engineering & Computing, 1996, 34(1): 351-352. http://d.old.wanfangdata.com.cn/Periodical/nygcxb201624016

    [19] Luo J W, Bai J, He P, et al. Axial strain calculation using a low-pass digital differentiator in ultrasound elastography[J]. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2004, 51(9): 1119-1127. doi: 10.1109/TUFFC.2004.1334844

    [20] Du H N, Liu J, Pellot-Barakat C, et al. Optimizing multicompression approaches to elasticity imaging[J]. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2006, 53(1): 90-99. doi: 10.1109/TUFFC.2006.1588394

Publication history
Received:  2018-05-17
Revised:  2018-12-25
Published:  2019-06-01
