Duan Bo (段勃), Wang Wendi, Tan Guangming, Meng Dan. [J]. High Technology Letters (高技术通讯, English edition), 2014, 20(4): 333-345
Single-particle 3D reconstruction on specialized stream architecture and comparison with GPGPUs
English keywords: stream architecture, general-purpose graphics processing unit (GPGPU), field-programmable gate array (FPGA), cryo-EM
Author Name    Affiliation
Duan Bo (段勃)  
Wang Wendi  
Tan Guangming  
Meng Dan  
The wide adoption of medical image processing and the accompanying data deluge demand faster and more efficient systems. With recent advances in heterogeneous architectures, there has been a resurgence of research on FPGA-based and GPGPU-based accelerator design. This paper quantitatively analyzes the workload, computational intensity and memory performance of a single-particle 3D reconstruction application called EMAN. It parallelizes the application on CUDA GPGPU architectures, decoupling the memory operations from the computing flow and orchestrating the thread-data mapping to reduce the overhead of off-chip memory operations. It then follows the trend towards FPGA-based accelerator design by offloading computing-intensive kernels to dedicated hardware modules. Furthermore, a customized memory subsystem is designed to facilitate the decoupling and optimization of computing-dominated data access patterns. The proposed accelerator design strategies are evaluated by comparing them with a parallelized program on a 4-core CPU. The CUDA version on a GTX480 shows a speedup of about 6 times. The stream architecture implemented on a Xilinx Virtex LX330 FPGA achieves a speedup of 2.54 times. Measured in terms of power efficiency, the FPGA-based accelerator outperforms the 4-core CPU and the GTX480 by 7.3 times and 3.4 times, respectively.
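The speedup and power-efficiency figures above can be related with a little arithmetic. The following is a minimal sketch, not the authors' method: it assumes that power efficiency means throughput per watt and that both speedups are measured against the same 4-core CPU baseline. Neither assumption is stated in the abstract, and the function and variable names are illustrative only.

```python
# Figures quoted in the abstract, relative to the parallelized 4-core CPU run.
GPU_SPEEDUP = 6.0        # GTX480 ("about 6 times")
FPGA_SPEEDUP = 2.54      # stream architecture on the Virtex LX330
FPGA_EFF_VS_CPU = 7.3    # FPGA power-efficiency gain over the CPU
FPGA_EFF_VS_GPU = 3.4    # FPGA power-efficiency gain over the GTX480


def implied_power_ratio(eff_gain, fpga_speedup, other_speedup):
    """Return P_other / P_fpga implied by a reported efficiency gain.

    Assumption (not from the paper): efficiency = throughput / power, so
    eff_gain = (fpga_speedup / other_speedup) * (P_other / P_fpga).
    """
    return eff_gain * other_speedup / fpga_speedup


# The CPU is the baseline, so its speedup is 1.0 by definition.
cpu_over_fpga = implied_power_ratio(FPGA_EFF_VS_CPU, FPGA_SPEEDUP, 1.0)
gpu_over_fpga = implied_power_ratio(FPGA_EFF_VS_GPU, FPGA_SPEEDUP, GPU_SPEEDUP)

print(f"Implied CPU power relative to FPGA: {cpu_over_fpga:.2f}x")
print(f"Implied GPU power relative to FPGA: {gpu_over_fpga:.2f}x")
```

Under these assumptions, the FPGA is slower than the GTX480 in absolute terms yet wins on efficiency because the implied power draw of the GPU is several times higher. The derived ratios are only as reliable as the rounded figures in the abstract.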
