Contact Information

Peng Wu
Research Staff Member
IBM T. J. Watson Research Center


Optimizing R VM: Allocation Removal and Path Length Reduction via Interpreter-level Specialization

Haichuan Wang, Peng Wu, and David Padua
to appear in International Symposium on Code Generation and Optimization (CGO'14), 2014

Software Support and Evaluation of Hardware Transactional Memory on Blue Gene/Q

Amy Wang, Matthew Gaudet, Peng Wu, Martin Ohmacht, Jose Nelson Amaral, Christopher Barton, Raul Silvera, and Maged M. Michael
to appear in IEEE Transactions on Computers, 2014


Optimizing IBM Algorithmics' Mark-to-future Aggregation Engine for Real-time Counterparty Credit Risk Scoring

Amy Wang, Jan Treibig, Bob Blainey, Peng Wu, Yaoqing Gao, Barnaby Dalton, Danny Gupta, Fahham Khan, Neil Bartlett, Lior Velichover, James Sedgwick, and Louis Ly
Sixth Workshop on High Performance Computational Finance (WHPCF'13), 2013


Evaluation of Blue Gene/Q hardware support for transactional memories

Amy Wang, Matthew Gaudet, Peng Wu, José Nelson Amaral, Martin Ohmacht, Christopher Barton, Raul Silvera, Maged Michael
Proceedings of the 21st international conference on Parallel architectures and compilation techniques, pp. 127--136, 2012

Adaptive Multi-Level Compilation in a Trace-based Java JIT Compiler

Hiroshi Inoue, Hiroshige Hayashizaki, Peng Wu, and Toshio Nakatani
Proceedings of the ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA), pp. 179--194, ACM, 2012


Reducing trace selection footprint for large-scale Java applications without performance loss

P. Wu, H. Hayashizaki, H. Inoue, T. Nakatani
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications, pp. 789--804

A trace-based Java JIT compiler retrofitted from a method-based compiler

H Inoue, H Hayashizaki, P Wu, T Nakatani
Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pp. 246--256, 2011

Improving the performance of trace-based systems by false loop filtering

Hiroshige Hayashizaki, Peng Wu, Hiroshi Inoue, Mauricio J Serrano, Toshio Nakatani
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems, pp. 405--418, ACM, 2011


Fastpath speculative parallelization

Michael F Spear, Kirk Kelsey, Tongxin Bai, Luke Dalessandro, Michael L Scott, Chen Ding, Peng Wu
Languages and Compilers for Parallel Computing, pp. 338--352, Springer, 2010


Reducing memory ordering overheads in software transactional memory

M.F. Spear, M.M. Michael, M.L. Scott, P. Wu
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization, pp. 13--24, IEEE Computer Society, 2009


Compiler-driven dependence profiling to guide program parallelization

Peng Wu, Arun Kejariwal, Călin Caşcaval
Languages and Compilers for Parallel Computing, pp. 232--248, Springer, 2008

Overview of the IBM Blue Gene/P project

IBM Blue Gene Team
IBM Journal of Research and Development 52(1/2), 199--220, IBM, 2008

Compiler and Runtime Techniques for Software Transactional Memory Optimization

Peng Wu, Maged M. Michael, Christoph Von Praun, Takuya Nakaike, Rajesh Bordawekar, Harold W. Cain, Gheorghe Cascaval, Siddhartha Chatterjee, Stefanie R. Chiras, Rui Hou, Mark F. Mergen, Xiaowei Shen, Hua Yong Wang, Kun Wang and Michael Spear
Concurrency and Computation: Practice and Experience 21(1), 7--23, John Wiley & Sons, 2008

Software transactional memory: Why is it only a research toy?

Calin Cascaval, Colin Blundell, Maged M. Michael, Harold W. Cain, Peng Wu, Stefanie Chiras, Siddhartha Chatterjee
Communications of the ACM (CACM) 51(11), 40--46, ACM, 2008


Optimizing data permutations for SIMD devices

G. Ren, P. Wu, D. Padua
ACM SIGPLAN Notices, pp. 118--131, ACM, 2006

Using advanced compiler technology to exploit the performance of the Cell Broadband Engine

A. Eichenberger, J. K. O'Brien, K. M. O'Brien, P. Wu, T. Chen, P. H. Oden, D. A. Prener, J. C. Shepherd, Z. Sura, A. Wang, T. Zhang, P. Zhao, M. K. Gschwind, R. Archambault, Y. Gao, R. Koo
IBM Systems Journal 45(1), 59--84, IBM, 2006


Optimizing Compiler for the CELL Processor

J. K. O'Brien, K. M. O'Brien, P. Wu, T. Chen, P. H. Oden, D. A. Prener, J. C. Shepherd, B. So, Z. Sura, A. Wang, T. Zhang, P. Zhao, M. K. Gschwind
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 161--172, IEEE Computer Society, 2005

Efficient SIMD Code Generation for Runtime Alignment and Length Conversion

Peng Wu, Alexandre E. Eichenberger, Amy Wang
Proceedings of the International Symposium on Code Generation and Optimization (CGO), pp. 153--164, IEEE Computer Society, 2005

An Integrated Simdization Framework using Virtual Vectors

Peng Wu, Alexandre E Eichenberger, Amy Wang, Peng Zhao
Proceedings of the International Conference on Supercomputing (ICS), pp. 169--178, ACM, 2005


Vectorization for SIMD architectures with Alignment Constraints

Alexandre E. Eichenberger, Peng Wu, Kevin O'Brien
Proceedings of the Conference on Programming Language Design and Implementation (PLDI), pp. 82--93, ACM, 2004


Induction variable analysis without idiom recognition: Beyond monotonicity

Peng Wu, Albert Cohen, David Padua
Languages and Compilers for Parallel Computing, pp. 427--441, Springer, 2003

A comparison of empirical and model-driven optimization

K. Yotov, X. Li, G. Ren, M. Cibulskis, G. DeJong, M. Garzaran, D. Padua, K. Pingali, P. Stodghill, P. Wu
ACM SIGPLAN Notices, pp. 63--76, ACM, 2003


Instance-wise points-to analysis for loop-based dependence testing

P. Wu, P. Feautrier, D. Padua, Z. Sura
Proceedings of the 16th international conference on Supercomputing, pp. 262--273, ACM, 2002

NINJA: Java for high performance numerical computing

J.E. Moreira, S.P. Midkiff, M. Gupta, P. Wu, G. Almasi, P. Artigas
Scientific Programming 10(1), 19--33, IOS Press, 2002


Monotonic evolution: an alternative to induction variable substitution for dependence analysis

Peng Wu, Albert Cohen, Jay Hoeflinger, David Padua
Proceedings of the 15th international conference on Supercomputing, pp. 78--91, ACM, 2001


Containers on the parallelization of general-purpose Java programs

P Wu, D Padua
Parallel Architectures and Compilation Techniques, 1999, pp. 84--90

Beyond Arrays—A Container-Centric Approach for Parallelization of Real-World Symbolic Applications

P Wu, D Padua
Languages and Compilers for Parallel Computing, pp. 197--212, Springer, 1999

Year Unknown

Dependence Testing without Induction Variable Substitution

A Cohen, P Wu
Citeseer