Advanced Compiler Technologies

Advanced Compiler Technologies Publications



2015

Data Access Optimization in a Processing-in-Memory System
Z Sura, A Jacob, T Chen, B Rosenburg, O Sallenave, C Bertolli, S Antao, J Brunheroto, Y Park, K O'Brien, R Nair
Proceedings of the 12th ACM Conference on Computing Frontiers (CF), 2015


2014

Using Multiple Threads to Accelerate Single Thread Performance
Zehra Sura, Kevin O'Brien, Jose Brunheroto
Parallel and Distributed Processing Symposium, 2014 IEEE 28th International, pp. 985--994

Coordinating GPU Threads for OpenMP 4.0 in LLVM
C Bertolli, S Antao, A Eichenberger, K O'Brien, Z Sura, A Jacob, T Chen, O Sallenave
Proceedings of the Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC), 2014


2011

Loop Transformations: Convexity, Pruning, and Optimization
Louis-Noel Pouchet, Uday Bondhugula, Cedric Bastoul, Albert Cohen, J Ramanujam, P Sadayappan, and Nicolas Vasilache
ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL), 2011


2010

DMATiler: Revisiting Loop Tiling for Direct Memory Access (Extended Abstract)
HaiBo Lin, Tao Liu, Huoding Li, Tong Chen, Lakshminarayanan Renganarayana, Kevin O'Brien, Ling Shao
International Conference on Parallel Architectures and Compilation Techniques (PACT), 2010

Can CPUs Match GPUs on Performance with Productivity?: Experiences with Optimizing a FLOP-intensive Application on CPUs and GPU
Rajesh Bordawekar, Uday Bondhugula, Ravi Rao
Research Report RC25033, IBM T.J. Watson Research Center, Yorktown Heights, New York, 2010

COMIC++: A software SVM system for heterogeneous multicore accelerator clusters
J Lee, J Lee, S Seo, J Kim, S Kim, Z Sura
High Performance Computer Architecture (HPCA), 2010 IEEE 16th International Symposium on, pp. 1--12

Combined Iterative and Model-driven Optimization in an Automatic Parallelization Framework
Louis-Noel Pouchet, Uday Bondhugula, Cedric Bastoul, Albert Cohen, J Ramanujam, P Sadayappan
Supercomputing (SC) 2010

A Model for Fusion and Code Motion in an Automatic Parallelizing Compiler
Uday Bondhugula, Oktay Gunluk, Sanjeeb Dash, Lakshminarayanan Renganarayana
International Conference on Parallel Architectures and Compilation Techniques (PACT), 2010

Believe it or Not! Multicore CPUs can Match GPUs for FLOP-intensive Applications!
Rajesh Bordawekar, Uday Bondhugula, Ravi Rao
Research Report RC24982, IBM T.J. Watson Research Center, Yorktown Heights, New York, 2010

Automatic Creation of Tile Size Selection Models
Tomofumi Yuki, Lakshminarayanan Renganarayana, Sanjay Rajopadhye, Charles Anderson, Alexandre Eichenberger and Kevin O'Brien
International Symposium on Code Generation and Optimization (CGO), 2010


2009

Design and implementation of software-managed caches for multicores with local memory
S Seo, J Lee, Z Sura
High Performance Computer Architecture, 2009, pp. 55--66

DBDB: optimizing DMATransfer for the cell be architecture
Tao Liu, Haibo Lin, Tong Chen, Kevin O'Brien, Ling Shao
ICS, 2009

Data Layout Transformation for Enhancing Data Locality on NUCA Chip Multiprocessors
Q Lu, C Alias, U Bondhugula, T Henretty, S Krishnamoorthy, J Ramanujam, A Rountev, P Sadayappan, Y Chen, H Lin, others
Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques, 2009

Hybrid Iterative and Model-Driven Optimization in the Polyhedral Model
L N Pouchet, U Bondhugula, C Bastoul, A Cohen, J Ramanujam, P Sadayappan
INRIA Research Report 6269, INRIA Saclay, France, 2009

Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors
M Baskaran, N Vydyanathan, Uday Bondhugula, J Ramanujam, A Rountev, P Sadayappan
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 219--228, 2009

Compact Multi-Dimensional Kernel Extraction for Register Tiling
L Renganarayana, Uday Bondhugula, Salem Derisavi, Alexandre E. Eichenberger, Kevin O'Brien
Proceedings of the 22nd International Conference on High Performance Networking and Computing (SC), pp. 1--12, 2009

Exploiting Parallelism with Dependence-Aware Scheduling
Xiaotong Zhuang, Alexandre E Eichenberger, Yangchun Luo, Kevin O'Brien, Kathryn O'Brien
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT), 2009


2008

Positivity, posynomials and tile size selection
L Renganarayana, S Rajopadhye
Proceedings of the 2008 ACM/IEEE conference on Supercomputing (SC), pp. 1--12

Orchestrating data transfer for the cell/BE processor
T Chen, H Lin, T Zhang
Proceedings of the 22nd annual international conference on Supercomputing, pp. 289--298, 2008

Supporting OpenMP on cell
K O’Brien, K O’Brien, Z Sura, T Chen, T Zhang
International Journal of Parallel Programming 36(3), 289--311, Springer, 2008

COMIC: a coherent shared memory interface for cell be
J Lee, S Seo, C Kim, J Kim, P Chun, Z Sura, J Kim, S Y Han
Proceedings of the 17th international conference on Parallel architectures and compilation techniques, pp. 303--314, 2008

Prefetching irregular references for software cache on cell
T Chen, T Zhang, Z Sura, M G Tallada
Proceedings of the sixth annual IEEE/ACM international symposium on Code generation and optimization, pp. 155--164, 2008

A novel asynchronous software cache implementation for the Cell-BE processor
J Balart, M Gonzalez, X Martorell, E Ayguade, Z Sura, T Chen, T Zhang, K O’Brien, K O’Brien
Languages and Compilers for Parallel Computing, 125--140, Springer, 2008

Hybrid Access-Specific Software Cache Techniques for the Cell BE Architecture
Marc Gonzalez, Nikola Vujic, Alexandre E. Eichenberger, Xavier Martorell, Eduard Ayguade, Tong Chen, Zehra Sura, Tao Zhang, Kevin O'Brien, Kathryn O'Brien
Proceedings of the 17th international conference on Parallel architectures and compilation techniques (PACT), pp. 292--302, 2008


2007

Optimizing the Use of Static Buffers for DMA on a CELL Chip
T Chen, Z Sura, K O’Brien, J O’Brien
Languages and Compilers for Parallel Computing, 314--329, Springer, 2007


2006

Using Advanced Compiler Technology to Exploit the Performance of the Cell Broadband Engine Architecture
A.E. Eichenberger, J. K. O'Brien, K. M. O'Brien, P. Wu, T. Chen, P. H. Oden, D. A. Prener, J. C. Shepherd, B. So, Z. Sura, A. Wang, T. Zhang, P. Zhao, M. K. Gschwind, R. Archambault, Y. Gao, and R. Koo
IBM Systems Journal 45(1), 2006


2005

Efficient SIMD Code Generation for Runtime Alignment and Length Conversion
Peng Wu, Alexandre E. Eichenberger, Amy Wang
Proceedings of the International Symposium on Code Generation and Optimization (CGO), pp. 153--164, IEEE Computer Society, 2005

