Hiroshi Inoue

Ph.D., Research Staff Member
IBM Research - Tokyo


Professional Associations:  ACM SIGPLAN  |  Information Processing Society of Japan (IPSJ)

"Efficient Tomographic Reconstruction For Commodity Processors with Limited Memory Bandwidth"
Hiroshi Inoue
The 2016 IEEE International Symposium on Biomedical Imaging (ISBI 2016), Prague, Czech Republic, April 13-16, 2016, pp. 747-750.

Full text [PDF]: ISBI2016_RabbitCT.pdf
Poster [PDF]: ISBI2016_poster.pdf

Three-dimensional (3D) computed tomography (CT) is a key component of many clinical workflows. Because CT reconstruction is known to be compute-intensive, accelerating it with special-purpose accelerators, such as GPUs and FPGAs, or with multi-socket server-grade processors has been widely studied. Thanks to recent advances in semiconductor technologies, even commodity processors, such as those used in PCs, can provide sufficient computing power for CT reconstruction through multiple cores with vector processing units. Despite their substantial computing power, commodity processors often provide less system memory bandwidth than server-grade processors because of cost and energy constraints. In this paper, we describe our memory-optimization technique and its implementation targeting general-purpose processors with limited memory bandwidth. By reducing the memory-bandwidth requirement through batch processing, the memory optimization achieved up to an 80% performance improvement in RabbitCT, a widely used CT benchmark, on a quad-core processor with limited memory bandwidth. Without the memory optimization, performance did not scale beyond two cores. The implementation can process about 40 projection images per second for the most common problem size of 512^3 using only four cores. It is therefore practical to use such commodity processors in real CT systems without additional accelerators, which trade greatly increased cost and energy consumption for higher throughput.
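
To illustrate why batch processing lowers the memory-bandwidth requirement, the following is a minimal sketch and not the paper's implementation. It assumes a toy back-projection in which each projection contributes one value per voxel (a hypothetical stand-in for the real geometric projection and interpolation), and a simple traffic model in which every pass over the volume reads and writes each voxel once, voxels are 4 bytes wide, and caches and projection-image traffic are ignored. The names `backproject` and `volume_traffic_gib_per_s` and the batch size of 8 are illustrative assumptions; only the 512^3 problem size and the rate of about 40 projections per second come from the abstract.

```python
def backproject(volume, projections, batch):
    """Toy back-projection: add every projection's contribution to each voxel.

    `batch` projections are applied per pass over the volume, so each voxel is
    loaded and stored once per batch instead of once per projection.
    Returns the number of voxel load/store pairs performed.
    Assumption: projections[i][v] is the contribution of projection i to
    voxel v (a stand-in for real projection geometry and interpolation).
    """
    voxel_updates = 0
    for start in range(0, len(projections), batch):
        group = projections[start:start + batch]
        for v in range(len(volume)):
            acc = volume[v]          # one load per voxel per pass
            for proj in group:
                acc += proj[v]       # contributions from the whole batch
            volume[v] = acc          # one store per voxel per pass
            voxel_updates += 1
    return voxel_updates


def volume_traffic_gib_per_s(n, proj_per_s, batch, bytes_per_voxel=4):
    """Estimated volume read+write traffic in GiB/s under the stated model."""
    volume_bytes = n ** 3 * bytes_per_voxel
    passes_per_s = proj_per_s / batch
    return 2 * volume_bytes * passes_per_s / 2 ** 30


if __name__ == "__main__":
    projs = [[float(i + j) for j in range(8)] for i in range(4)]
    one_by_one, batched = [0.0] * 8, [0.0] * 8
    print(backproject(one_by_one, projs, 1), backproject(batched, projs, 4))
    print(one_by_one == batched)
    print(volume_traffic_gib_per_s(512, 40, 1))
    print(volume_traffic_gib_per_s(512, 40, 8))
```

Under this model, a 512^3 volume of 4-byte voxels occupies 0.5 GiB, so updating it once per projection at 40 projections per second would move about 40 GiB/s of volume data, while a batch of 8 projections per pass would move about 5 GiB/s for the same result. Real figures depend on cache behavior and the actual data layout, which this sketch does not model.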

Copyright (c) 2016 by IEEE. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee.