Advanced Compiler Technologies Publications
2015
Data Access Optimization in a Processing-in-Memory System
Z Sura, A Jacob, T Chen, B Rosenburg, O Sallenave, C Bertolli, S Antao, J Brunheroto, Y Park, K O'Brien, R Nair
Proceedings of the 12th ACM Conference on Computing Frontiers (CF), 2015
2014
Using Multiple Threads to Accelerate Single Thread Performance
Zehra Sura, Kevin O'Brien, Jose Brunheroto
2014 IEEE 28th International Parallel and Distributed Processing Symposium, pp. 985--994
Coordinating GPU Threads for OpenMP 4.0 in LLVM
C Bertolli, S Antao, A Eichenberger, K O'Brien, Z Sura, A Jacob, T Chen, O Sallenave
Proceedings of the Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC), 2014
2011
Loop Transformations: Convexity, Pruning, and Optimization
Louis-Noel Pouchet, Uday Bondhugula, Cedric Bastoul, Albert Cohen, J Ramanujam, P Sadayappan, and Nicolas Vasilache
ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), 2011
2010
DMATiler: Revisiting Loop Tiling for Direct Memory Access (Extended Abstract)
HaiBo Lin, Tao Liu, Huoding Li, Tong Chen, Lakshminarayanan Renganarayana, Kevin O'Brien, Ling Shao
International Conference on Parallel Architectures and Compilation Techniques (PACT), 2010
Can CPUs Match GPUs on Performance with Productivity?: Experiences with Optimizing a FLOP-intensive Application on CPUs and GPU
Rajesh Bordawekar, Uday Bondhugula, Ravi Rao
Research Report RC25033, IBM T.J. Watson Research Center, Yorktown Heights, New York, 2010
COMIC++: A software SVM system for heterogeneous multicore accelerator clusters
J Lee, J Lee, S Seo, J Kim, S Kim, Z Sura
2010 IEEE 16th International Symposium on High Performance Computer Architecture (HPCA), pp. 1--12
Combined Iterative and Model-driven Optimization in an Automatic Parallelization Framework
Louis-Noel Pouchet, Uday Bondhugula, Cedric Bastoul, Albert Cohen, J Ramanujam, P Sadayappan
Supercomputing (SC) 2010
A Model for Fusion and Code Motion in an Automatic Parallelizing Compiler
Uday Bondhugula, Oktay Gunluk, Sanjeeb Dash, Lakshminarayanan Renganarayana
International Conference on Parallel Architectures and Compilation Techniques (PACT), 2010
Believe it or Not! Multicore CPUs can Match GPUs for FLOP-intensive Applications!
Rajesh Bordawekar, Uday Bondhugula, Ravi Rao
Research Report RC24982, IBM T.J. Watson Research Center, Yorktown Heights, New York, 2010
Automatic Creation of Tile Size Selection Models
Tomofumi Yuki, Lakshminarayanan Renganarayana, Sanjay Rajopadhye, Charles Anderson, Alexandre Eichenberger and Kevin O'Brien
International Symposium on Code Generation and Optimization (CGO), 2010
2009
Design and implementation of software-managed caches for multicores with local memory
S Seo, J Lee, Z Sura
High Performance Computer Architecture, 2009, pp. 55--66
DBDB: optimizing DMATransfer for the cell be architecture
Tao Liu, Haibo Lin, Tong Chen, Kevin O'Brien, Ling Shao
ICS, 2009
Data Layout Transformation for Enhancing Data Locality on NUCA Chip Multiprocessors
Q Lu, C Alias, U Bondhugula, T Henretty, S Krishnamoorthy, J Ramanujam, A Rountev, P Sadayappan, Y Chen, H Lin, others
Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques, 2009
Hybrid Iterative and Model-Driven Optimization in the Polyhedral Model
L N Pouchet, U Bondhugula, C Bastoul, A Cohen, J Ramanujam, P Sadayappan
INRIA Research Report 6269, INRIA Saclay, France, 2009
Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors
M Baskaran, N Vydyanathan, Uday Bondhugula, J Ramanujam, A Rountev, P Sadayappan
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 219--228, 2009
Compact Multi-Dimensional Kernel Extraction for Register Tiling
L Renganarayana, Uday Bondhugula, Salem Derisavi, Alexandre E. Eichenberger, Kevin O'Brien
Proceedings of the 22nd International Conference on High Performance Networking and Computing (SC), pp. 1--12, 2009
Exploiting Parallelism with Dependence-Aware Scheduling
Xiaotong Zhuang, Alexandre E Eichenberger, Yangchun Luo, Kevin O'Brien, Kathryn O'Brien
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT), 2009
2008
Positivity, posynomials and tile size selection
L Renganarayana, S Rajopadhye
Proceedings of the 2008 ACM/IEEE conference on Supercomputing (SC), pp. 1--12
Orchestrating data transfer for the cell/BE processor
T Chen, H Lin, T Zhang
Proceedings of the 22nd annual international conference on Supercomputing, pp. 289--298, 2008
Supporting OpenMP on cell
K O’Brien, K O’Brien, Z Sura, T Chen, T Zhang
International Journal of Parallel Programming 36(3), 289--311, Springer, 2008
COMIC: a coherent shared memory interface for Cell BE
J Lee, S Seo, C Kim, J Kim, P Chun, Z Sura, J Kim, S Y Han
Proceedings of the 17th international conference on Parallel architectures and compilation techniques, pp. 303--314, 2008
Prefetching irregular references for software cache on cell
T Chen, T Zhang, Z Sura, M G Tallada
Proceedings of the sixth annual IEEE/ACM international symposium on Code generation and optimization, pp. 155--164, 2008
A novel asynchronous software cache implementation for the Cell-BE processor
J Balart, M Gonzalez, X Martorell, E Ayguade, Z Sura, T Chen, T Zhang, K O’Brien, K O’Brien
Languages and Compilers for Parallel Computing, 125--140, Springer, 2008
Hybrid Access-Specific Software Cache Techniques for the Cell BE Architecture
Marc Gonzalez, Nikola Vujic, Alexandre E. Eichenberger, Xavier Martorell, Eduard Ayguade, Tong Chen, Zehra Sura, Tao Zhang, Kevin O'Brien, Kathryn O'Brien
Proceedings of the 17th international conference on Parallel architectures and compilation techniques (PACT), pp. 292--302, 2008
2007
Optimizing the Use of Static Buffers for DMA on a CELL Chip
T Chen, Z Sura, K O’Brien, J O’Brien
Languages and Compilers for Parallel Computing, 314--329, Springer, 2007
2006
Using Advanced Compiler Technology to Exploit the Performance of the Cell Broadband Engine Architecture
A.E. Eichenberger, J. K. O'Brien, K. M. O'Brien, P. Wu, T. Chen, P. H. Oden, D. A. Prener, J. C. Shepherd, B. So, Z. Sura, A. Wang, T. Zhang, P. Zhao, M. K. Gschwind, R. Archambault, Y. Gao, and R. Koo
IBM Systems Journal 45(1), 2006
2005
Efficient SIMD Code Generation for Runtime Alignment and Length Conversion
Peng Wu, Alexandre E. Eichenberger, Amy Wang
Proceedings of the International Symposium on Code Generation and Optimization (CGO), pp. 153--164, IEEE Computer Society, 2005