Additional Information
What's new?
- The IBM paper "Active Management of Timing Guardband to Save Energy in POWER7" won the Best Paper award at the MICRO-44 conference in 2011.
- Architecture Highlights 2010: IBM Architecture PIC Student Workshop, October 7-8, 2010.
Upcoming seminar
- June 6, 2013
Virtualization and Data Center Energy Management, Canturk Isci, IBM Research.
Seminar in YKT 20-001 at 10:00 AM EST.
- June 17, 2013
Title to be announced, Balaji Veeraraghavan, IBM Software Group (Netezza).
Seminar in YKT 20-001 at 11:00 AM EST.
Recent Seminars
- May 10, 2013
"Disciplined Approximate Computing: From Language to Hardware," by Prof. Luis Ceze, University of Washington
Seminar in 20-043 at 9:30 AM EST.
- April 18, 2013
"Efficient and Easily Programmable Accelerator Architectures," by Prof. Tor Aamodt, University of British Columbia
Seminar in 20-001 at 10:30 AM EST.
A complete list of recent seminars can be found here.
Project Name
Computer Architecture
2012
A Reconfigurable Optical/Electrical Interconnect Architecture for Large-scale Clusters and Datacenters
Diego Lugones, Kostas Katrinis, Martin Collier
ACM Computing Frontiers (Best Paper Award), pp. 13-22, ACM, 2012
Power Management of Multi-Core Chips: Challenges and Pitfalls
Pradip Bose, Alper Buyuktosunoglu, John A. Darringer, Meeta S. Gupta, Michael B. Healy, Hans Jacobson, Indira Nair, Jude A. Rivers, Jeonghee Shin, Augusto Vega, Alan J. Weger
Design, Automation & Test in Europe Conference & Exhibition, pp. 977--982, 2012
Cache Restoration for Highly Partitioned Virtualized Systems
David M. Daly, Harold W. Cain
Proceedings of the 18th International Symposium on High Performance Computer Architecture (HPCA), 2012
2011
Exploring the architecture of a stream register-based snoop filter
M Blumrich, V Salapura, A Gara
Transactions on high-performance embedded architectures and compilers III, 93--114, Springer, 2011
The IBM Blue Gene/Q interconnection network and message unit
D Chen, N A Eisley, P Heidelberger, R M Senger, Y Sugawara, S Kumar, V Salapura, D L Satterfield, B Steinmacher-Burow, J J Parker
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 26
SoftBeam: Precise Tracking of Transient Faults and Vulnerability Analysis at Processor Design Time
M Gschwind, V Salapura, C Trammell, S A McKee
ICCD 2011 - International Conference on Computer Design, pp. 404--410, IEEE
Guest Editorial: Parallel Systems and Compilers
V Salapura, M Gschwind, J Knoop
International Journal of Parallel Programming, 1--3, Springer, 2011
A Hybrid Approach for Large Cache Performance Studies
D Daly, P Dube, K El Maghraoui, D Poff, Li Zhang
Quantitative Evaluation of Systems (QEST), 2011 Eighth International Conference on, pp. 47--56
A Tool for Scalable Profiling and Tracing of Java and Native Code Interactions
Parijat Dube, Seetharami Seelam, Yanbin Liu, Megumi Ito, Michel Hack, Liana Fong, Graeme Johnson, Michael Dawson, Li Zhang and Yuqing Gao
QEST, pp. 37--46, 2011
Error Tolerance in Server Class Processors
J A Rivers, M S Gupta, J Shin, P N Kudva, P Bose
Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on 30(7), 945--959, IEEE, 2011
Scalable Proximity-Aware Cache Replication in Chip Multiprocessors (short paper)
Chongmin Li, Haixia Wang, Yibo Xue, Dongsheng Wang (Tsinghua University), Jian Li
The Twentieth International Conference on Parallel Architectures and Compilation Techniques (PACT), 2011
TAPO: Thermal-Aware Power Optimization Techniques for Servers and Data Centers (Best Paper Award)
Wei Huang, Malcolm Allen-Ware, John Carter, Elmootazbellah Elnozahy, Hendrik Hamann, Tom Keller, Charles Lefurgy, Jian Li, Karthick Rajamani and Juan Rubio
Second International Green Computing Conference (IGCC'11), 2011
Multi-Channel Wireless Network-on-Chip: A New Approach to Improving On-Chip Communication Capacity
Dan Zhao, Yi Wang, Jian Li and Takamaro Kikkawa
International Symposium on Networks-on-Chip (NOCS), 2011
A Cost-Efficient L1-L2 Multicore Interconnect: Performance, Power and Area Considerations
Amit Golander, Nadav Levison, Omer Heymann, Alexander Briskman, Mark J. Wolski and Eric F. Robinson
Circuits and Systems I, IEEE Transactions on 58(3), IEEE, 2011
Coordinating DRAM and Last-Level-Cache Policies with the Virtual Write Queue
J. Stuecheli, D. Kaseridis, L.K. John, D. Daly, H.C. Hunter
IEEE Micro, Top Picks v. 31, 90--98, Published by the IEEE Computer Society, 2011
Loop Transformations: Convexity, Pruning, and Optimization
Louis-Noel Pouchet, Uday Bondhugula, Cedric Bastoul, Albert Cohen, J Ramanujam, P Sadayappan, and Nicolas Vasilache
ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL), 2011
Power Shifting in Thrifty Interconnection Networks
Jian Li, Wei Huang, Lixin Zhang, Charles Lefurgy, Wolfgang Denzel, Richard Treumann and Kun Wang
To appear in International Symposium on High Performance Computer Architecture (HPCA), 2011
2010
DACO: A High-Performance Disk Architecture Designed Specially for Large-Scale Erasure-Coded Storage Systems
M. Li, J. Shu
Computers, IEEE Transactions on 59(10), 1350--1362, IEEE, 2010
SAFER: Stuck-at-fault error recovery for memories
N H Seong, D H Woo, V Srinivasan, J A Rivers, H H S Lee
2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 115--124
Power and Thermal Characterization of the POWER6 System
Victor Jimenez, Fran J. Cazorla, Roberto Gioiosa, Eren Kursun, Chen Yong Cher, Alper Buyuktosunoglu, Pradip Bose, and Mateo Valero
In Proceedings of ACM International Conference on Parallel Architectures and Compilation Techniques, PACT, pp. 7-18, Vienna, Austria, 2010
Power-efficient, Reliable Microprocessor Architectures: Modeling and Design Methods
Pradip Bose, Alper Buyuktosunoglu, Chen Yong Cher, John Darringer, Meeta Sharma Gupta, Hendrik Hamann, Hans Jacobson, Prabhakar Kudva, Eren Kursun, Niti Madan, Indira Nair, Jude A. Rivers, Jeonghee Shin, Alan J. Weger, Victor Zyuban
In Proceedings of IEEE/ACM GLSVLSI Symposium, pp. 299-304, 2010
Architecting for power management: The IBM POWER7 approach
Malcolm Ware, Karthick Rajamani, Michael Floyd, Bishop Brock, Juan C Rubio, Freeman Rawson, John B Carter
HPCA, pp. 1--11, 2010
A Wire-Speed Power (TM) Processor: 2.3GHz 45nm SOI with 16 Cores and 64 Threads
Charles Johnson, David H. Allen, Jeff Brown, Steve Vanderwiel, Russ Hoover, Heather Achilles, Chen-Yong Cher, George A. May, Hubertus Franke, Jimi Xenedis, Claude Basso
2010 IEEE International Solid-State Circuits Conference (ISSCC)
Lime: a Java-compatible and synthesizable language for heterogeneous architectures
Joshua Auerbach, David F Bacon, Perry Cheng, Rodric Rabbah
Proceedings of the ACM International Conference on Object-Oriented Programming Systems, Languages, and Applications, pp. 89--108, ACM, 2010
Performance Analysis Of Idle Programs
Erik Altman, Matthew Arnold, Stephen Fink, Nick Mitchell
Proceedings of OOPSLA/SPLASH 2010, pp. 739--753, ACM
Memory Latency Reduction via Thread Throttling
Hsiang-Yun Cheng, Jian Li, Chung-Hsiang Lin and Chia-Lin Yang
To appear in the 43rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2010
The PERCS High-Performance Interconnect
Baba Arimilli, Ravi Arimilli, Vicente Chung, Scott Clark, Wolfgang Denzel, Ben Drerup, Torsten Hoefler, Jody Joyner, Jerry Lewis, Jian Li, Nan Ni, Ram Rajamony
Hot Interconnects (HOTI) 2010: Proceedings of the 18th Annual Symposium on High-Performance Interconnects
Design Exploration of Hybrid cache architecture with disparate memory technologies
Xiaoxia Wu, Jian Li, Lixin Zhang, Evan Speight, Ram Rajamony and Yuan Xie
ACM Transactions on Architecture and Code Optimization (TACO), 2010
Power-efficient, Reliable Microprocessor Architectures: Modeling and Design Methods
Pradip Bose, Alper Buyuktosunoglu, Chen-Yong Cher, John A. Darringer, Meeta Sharma Gupta, Hendrik F. Hamann, Hans M. Jacobson, Prabhakar Kudva, Eren Kursun, Niti Madan, Indira Nair, Jude A. Rivers, Jeonghee Shin, Alan J. Weger and Victor V. Zyuban
In Proceedings of IEEE International Symposium, GLSVLSI, pp. 299--304, IEEE, ACM, 2010
Can CPUs Match GPUs on Performance with Productivity?: Experiences with Optimizing a FLOP-intensive Application on CPUs and GPU
Rajesh Bordawekar, Uday Bondhugula, Ravi Rao
Research Report RC25033, IBM T.J. Watson Research Center, Yorktown Heights, New York, 2010
A Model for Fusion and Code Motion in an Automatic Parallelizing Compiler
Uday Bondhugula, Oktay Gunluk, Sanjeeb Dash, Lakshminarayanan Renganarayana
International Conference on Parallel Architectures and Compilation Techniques (PACT), 2010
DMATiler: Revisiting Loop Tiling for Direct Memory Access (Extended Abstract)
HaiBo Lin, Tao Liu, Huoding Li, Tong Chen, Lakshminarayanan Renganarayana, Kevin O'Brien, Ling Shao
International Conference on Parallel Architectures and Compilation Techniques (PACT), 2010
MemX: Virtualization of Cluster-wide Memory
U Deshpande, B Wang, S Haque, M Hines, K Gopalan
Proc. of 39th International Conference on Parallel Processing (ICPP), 2010
Workload performance characterization of DARPA HPCS benchmarks (Outstanding Paper Award)
S Seelam, I H Chung, G Cong, H F Wen, D Klepacki
Concurrency and Computation: Practice and Experience 22(4), 441--461, John Wiley & Sons, 2010
Trends and Techniques for Energy Efficient Architectures
Victor Jimenez, Roberto Gioiosa, Eren Kursun, Francisco J. Cazorla, Chen-Yong Cher, Alper Buyuktosunoglu, Pradip Bose, Mateo Valero
In Proceedings of the 18th IEEE/IFIP International Conference on VLSI and System-on-Chip, VLSI-SoC, pp. 276--279, 2010
Power-efficient, reliable microprocessor architectures: modeling and design methods
Pradip Bose, Alper Buyuktosunoglu, Chen-Yong Cher, John A. Darringer, Meeta S. Gupta, Hendrik Hamann, Hans Jacobson, Prabhakar N. Kudva, Eren Kursun, Niti Madan, Indira Nair, Jude A. Rivers, Jeonghee Shin, Alan J. Weger, Victor Zyuban
Proceedings of the 20th symposium on Great lakes symposium on VLSI (GLVLSI), 2010
A Wire-Speed Power (TM) Processor: 2.3GHz 45nm SOI with 16 Cores and 64 Threads
C. Johnson, D. H. Allen, J. Brown, S. Vanderwiel, R. Hoover, H. Achilles, C-Y. Cher, G. A. May, H. Franke, J. Xenedis, C. Basso
2010 IEEE International Solid-State Circuits Conference (ISSCC)
Performance and power evaluation of an in-line accelerator
A. Rico, J.H. Derby, R.K. Montoye, T.H. Heil, C.Y. Cher, P. Bose
Proceedings of the 7th ACM international conference on Computing frontiers, pp. 81--82, 2010
Combined Iterative and Model-driven Optimization in an Automatic Parallelization Framework
Louis-Noel Pouchet, Uday Bondhugula, Cedric Bastoul, Albert Cohen, J Ramanujam, P Sadayappan
Supercomputing (SC) 2010
Power-efficient, reliable microprocessor architectures: modeling and design methods
P Bose, A Buyuktosunoglu, C Y Cher, J A Darringer, M S Gupta, H Hamann, H Jacobson, P N Kudva, E Kursun, N Madan, others
Proceedings of the 20th symposium on Great lakes symposium on VLSI, pp. 299--304, 2010
Design Exploration of Hybrid cache architecture with disparate memory technologies
X. Wu, J. Li, L. Zhang, E. Speight, R. Rajamony and Y. Xie
ACM Transactions on Architecture and Code Optimization (TACO), ACM, 2010
Enigma: Architectural Support and Operating System Support for Reducing the Impact of Address Translation
Lixin Zhang, Evan Speight, Ram Rajamony, Jiang Lin
International Conference on Supercomputing (ICS), ACM/SIGARCH, 2010
Power and Thermal Characterization of POWER6 System
Victor Jimenez, Francisco J. Cazorla, Roberto Gioiosa, Eren Kursun, Canturk Isci, Chen-Yong Cher, Alper Buyuktosunoglu, Pradip Bose, and Mateo Valero
In the International Conference on Parallel Architectures and Compilation Techniques (PACT), Vienna, Austria, Sep. 2010
FPGA-based combined architecture for stream categorization and intrusion detection
M Vorbach, S Shukla, R Rabbah
MEMOCODE '10: Proceedings of the 8th IEEE/ACM International Conference on Formal Methods and Models for Codesign, pp. 77--80, 2010
A Unified Execution Model for Cloud Computing
Eric van Hensbergen, Noah Evans, Phillip Stanley-Marbell
Large Scale Distributed Systems and Middleware, (LADIS 2009), Co-located with the 22nd ACM Symposium on Operating Systems Principles (SOSP 2009), pp. 12--17, ACM, 2010
Performance of large low-associativity caches
Parijat Dube, Li Zhang, David Daly, Alan Bivens
SIGMETRICS Perform. Eval. Rev. 37(4), 11--18, ACM, 2010
The Virtual Write Queue: Coordinating DRAM and Last-Level Cache Policies
J Stuecheli, D Kaseridis, D Daly, H C Hunter, L K John
Proceedings of the ACM/IEEE International Symposium on Computer Architecture (ISCA 2010)
Believe it or Not! Multicore CPUs can Match GPUs for FLOP-intensive Applications!
Rajesh Bordawekar, Uday Bondhugula, Ravi Rao
Research Report RC24982, IBM TJ Watson Research Center, Yorktown Heights, New York, 2010
GRAPHITE Two Years After
K Trifunovic, A Cohen, D Edelsohn, L Feng, T Grosser, H Jagasia, R Ladelsky, S Pop
GCC Research Opportunities (GROW'10), 2010
Trace-Based Data Layout Optimizations for Multi-core Processors
O Golovanevsky, A Dayan, A Zaks, D Edelsohn
High Performance Embedded Architectures and Compilers, 81--95, Springer, 2010
Compiling for Reduced Bit-Width Queue Processors
Arquimedes Canedo, Ben A Abderazek, Masahiro Sowa
Journal of Signal Processing Systems 59(1), 45--55, Springer, 2010
Natural Instruction Level Parallelism-aware Compiler for High-Performance QueueCore Processor Architecture
Ben A Abderazek, Masashi Masuda, Arquimedes Canedo, Kenichi Kuroda
Journal of Supercomputing, Springer, 2010
Automatic Parallelization of Simulink Applications
Arquimedes Canedo, Takeo Yoshizawa, Hideaki Komatsu
International Symposium on Code Generation and Optimization 2010
Skewed Pipelining for Parallel Simulink Simulations
Arquimedes Canedo, Takeo Yoshizawa, Hideaki Komatsu
Design, Automation and Test in Europe 2010
Automatic Creation of Tile Size Selection Models
Tomofumi Yuki, Lakshminarayanan Renganarayana, Sanjay Rajopadhye, Charles Anderson, Alexandre Eichenberger and Kevin O'Brien
International Symposium on Code Generation and Optimization (CGO), 2010
Observations on Tuning a Java Enterprise Application for Performance and Scalability
Erik Altman, Matthew Arnold, Rajesh Bordawekar, Robert Delmonico, Nick Mitchell, Peter F. Sweeney
IBM Journal of Research and Development 54(5), 2, IBM, 2010
MacroSS: macro-SIMDization of streaming applications
Amir H Hormati, Yoonseo Choi, Mark Woh, Manjunath Kudlur, Rodric Rabbah, Trevor Mudge, Scott Mahlke
ASPLOS '10: Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems, pp. 285--296, ACM, 2010
2009
A Computing Origami: Folding Streams in FPGAs
Andrei Hagiescu, Weng-Fai Wong, David F. Bacon, Rodric Rabbah
Proceedings of the 46th Annual Design Automation Conference, pp. 282--287, ACM, 2009
A tagless coherence directory
J Zebchuk, V Srinivasan, M K Qureshi, A Moshovos
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 423--434, 2009
Enhancing lifetime and security of pcm-based main memory with start-gap wear leveling
M K Qureshi, J Karidis, M Franceschini, V Srinivasan, L Lastras, B Abali
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 14--23, 2009
Scalable high performance main memory system using phase-change memory technology
M K Qureshi, V Srinivasan, J A Rivers
Proceedings of the 36th annual international symposium on Computer architecture, pp. 24--33, 2009
Scalable data center provisioning and control
M Silva, M Banikazemi, M Butrico, D Daly, S Guthridge, J E Moreira, W V Ruggiero
IBM Journal of Research and Development 53(4), 2009
Design and performance of speculative flow control for high-radix datacenter interconnect switches
C Minkenberg, M Gusat
Journal of Parallel and Distributed Computing 69(8), 680--695, Elsevier, 2009
Adaptive Routing in Data Center Bridges
C Minkenberg, M Gusat, G Rodriguez
High Performance Interconnects, 2009, pp. 33--41
Benchmarking the Ethernet-federated datacenter
M Gusat, C DeCusatis, C Minkenberg, L McKenna, K Bhardwaj, GJ Paljak, A Pataricza, I Kocsis
Proc. First Workshop on Data Center - Converged and Virtual Ethernet Switching (DC-CAVES 2009)
Compiling Effectively for Cell with GCC
Ira Rosen, David Edelsohn, Ben Elliston, Revital Eres, Alan Modra, Dorit Nuzman, Ulrich Weigand, and Ayal Zaks
CPC 2009
Compiler Support for Code Size Reduction using a Queue-based Processor
Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
Lecture Notes in Computer Science, pp. 269--285, Springer Berlin, 2009
Design and implementation of a queue compiler
Arquimedes Canedo, Ben A Abderazek, Masahiro Sowa
Microprocess. Microsyst. 33(2), 129--138, Elsevier Science Publishers B. V., 2009
Efficient Compilation for Queue Size Constrained Queue Processors
Arquimedes Canedo, Ben A Abderazek, Masahiro Sowa
Parallel Comput. 35(4), 213--225, Elsevier Science Publishers B. V., 2009
Software and Hardware Design Issues for Low Complexity High Performance Processor Architecture
Masashi Masuda, Abderazek Ben Abdallah, Arquimedes Canedo
ICPP Workshops, pp. 558-565, 2009
System Resilience at Extreme Scale -- White Paper
T El-Ghazawi, A Fox, B F Godfrey, M D Cray, A Hoisie, J Plank, J Simons, E N M Elnozahy, A IBM
U.S. Department of Defense -- Defense Advanced Research Projects Agency, 2009
ITCA: Inter-task Conflict-Aware CPU Accounting for CMPs
C Luque, M Moreto, F J Cazorla, R Gioiosa, A Buyuktosunoglu, M Valero
Proceedings of the 2009 18th International Conference on Parallel Architectures and Compilation Techniques-Volume 00, pp. 203--213
CPU Accounting in CMP Processors
C Luque, M Moreto, F J Cazorla, R Gioiosa, A Buyuktosunoglu, M Valero
IEEE Computer Architecture Letters 8(1), 17--20, IEEE Computer Society, 2009
Dynamic power gating with quality guarantees
A Lungu, P Bose, A Buyuktosunoglu, D J Sorin
Proceedings of the 14th ACM/IEEE international symposium on Low power electronics and design, pp. 377--382, 2009
Temperature Variation Characterization and Thermal Management of Multicore Architectures
Eren Kursun, Chen Yong Cher
IEEE Micro Top Picks, 2009
Data Layout Transformation for Enhancing Data Locality on NUCA Chip Multiprocessors
Q Lu, C Alias, U Bondhugula, T Henretty, S Krishnamoorthy, J Ramanujam, A Rountev, P Sadayappan, Y Chen, H Lin, others
Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques, 2009
Hybrid Iterative and Model-Driven Optimization in the Polyhedral Model
L N Pouchet, U Bondhugula, C Bastoul, A Cohen, R Ramanujam, P Sadayappan
INRIA Research Report 6269, INRIA Saclay, France, 2009
Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors
M Baskaran, N Vydyanathan, Uday Bondhugula, J Ramanujam, A Rountev, P Sadayappan
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 219--228, 2009
Compact Multi-Dimensional Kernel Extraction for Register Tiling
L Renganarayana, Uday Bondhugula, Salem Dersavi, Alexandre E. Eichenberger, Kevin O'Brien
Proceedings of the 22nd International Conference on High Performance Networking and Computing (SC), pp. 1--12, 2009
Formal verification of correctness and performance of random priority-based arbiters
Krishnan Kailas, Viresh Paruthi, Brian Monwai
Proc. of 9th International Conference on Formal Methods in Computer-Aided Design (FMCAD 2009), pp. 101-107, IEEE
Polyhedral-Model Guided Loop-Nest Auto-Vectorization
K Trifunovic, D Nuzman, A Cohen, A Zaks, I Rosen
Proceedings of the 2009 18th International Conference on Parallel Architectures and Compilation Techniques-Volume 00, pp. 327--337
Encoding Efficiency of Digital Number Representations under Deviation Constraints
Phillip Stanley-Marbell
IEEE Information Theory Workshop, ITW 2009, Taormina, Sicily
Writing Compilers and Interpreters, 3rd ed.: A Software Engineering Approach
Ronald Mak
John Wiley & Sons, 2009
Snugglebug: A Powerful Approach to Weakest Preconditions
S. Chandra, S. Fink, M. Sridharan
ACM Conference on Programming Language Design and Implementation (PLDI), pp. 363--374, ACM, 2009
Securing virtual machine monitors: what is needed?
PA Karger
ACM Symposium on Information, Computer & Communication Security (ASIACCS), pp. xiv-xv, Association for Computing Machinery, 2009
Multicore Power Management: Ensuring Robustness via Early-Stage Formal Verification
A Lungu, P Bose, D J Sorin, S German, G Janssen
152.3.140.5, 2009
Exploiting Parallelism with Dependence - Aware Scheduling
Xiaotong Zhuang, Alexandre E Eichenberger, Yangchun Luo, Kevin O'Brien, Kathryn O'Brien
Proceeding of the International Conference on Parallel Architectures and Compilation Techniques (PACT), 2009
True value: assessing and optimizing the cost of computing at the data center level
J Karidis, JE Moreira, Jaime H Moreno
6th ACM Conference on Computing Frontiers, 2009
Flextream: Adaptive Compilation of Streaming Applications for Heterogeneous Architectures
A H Hormati, Y Choi, M Kudlur, R Rabbah, T Mudge, S Mahlke
PACT '09: Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques, pp. 214--223, 2009
2008
A Proactive Wearout Recovery Approach for Exploiting Microarchitectural Redundancy to Extend Cache SRAM Lifetime
Jeonghee Shin, Victor Zyuban, Pradip Bose, Timothy M Pinkston
SIGARCH Comput. Archit. News 36, 353--362, ACM, 2008
Optimus: Efficient Realization of Streaming Applications on FPGAs
Amir Hormati, Manjunath Kudlur, Scott Mahlke, David F. Bacon, Rodric Rabbah
Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems, pp. 41--50, ACM, 2008
Quantitative Evaluation of Common Subexpression Elimination on Queue Machines
Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
Proceedings of the International Symposium on Parallel Architectures, Algorithms, and Networks, pp. 25--30, 2008
Software-controlled priority characterization of POWER5 processor
C Boneti, F J Cazorla, R Gioiosa, A Buyuktosunoglu, C Y Cher, M Valero
Proceedings of the 35th International Symposium on Computer Architecture (ISCA), pp. 415--426, 2008
Performance and Security Lessons Learned from Virtualizing the Alpha Processor
P A Karger
Citeseer, 2008
A practical automatic polyhedral parallelizer and locality optimizer
Uday Bondhugula, A Hartono, J Ramanujam, P Sadayappan
Proceedings of the ACM SIGPLAN conference on Programming Language Design and Implementation, pp. 101--113, 2008
A compiler framework for optimization of affine loop nests for GPGPUs
M Baskaran, U Bondhugula, S Krishnamoorthy, J Ramanujam, A Rountev, P Sadayappan
Proceedings of the 22nd International conference on Supercomputing, pp. 225--234, 2008
Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories
M Baskaran, U Bondhugula, S Krishnamoorthy, J Ramanujam, A Rountev, P Sadayappan
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 1--10, 2008
Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model
Uday Bondhugula, M Baskaran, S Krishnamoorthy, J Ramanujam, A Rountev, P Sadayappan
International Conference on Compiler Construction (CC), pp. 132--146, 2008
Scalable and Efficient Tools for Multi-level Tiling
Lakshminarayanan Renganarayana
PhD Dissertation -- Colorado State University, 2008
A domain specific interconnect for reconfigurable computing
S Rajopadhye, G Gupta, L Renganarayana
ACM Conference on Languages, compilers, and tools for embedded systems (LCTES), pp. 79--88, ACM, 2008
Positivity, posynomials and tile size selection
L Renganarayana, S Rajopadhye
Proceedings of the 2008 ACM/IEEE conference on Supercomputing (SC), pp. 1--12
MILEPOST GCC: machine learning based research compiler
G Fursin, C Miranda, O Temam, E Yom-Tov
Proceedings of the GCC Developers' Summit, 2008
Outer-loop vectorization: revisited for short SIMD architectures
D Nuzman, A Zaks
Proceedings of the 17th international conference on Parallel architectures and compilation techniques, pp. 2--11, 2008
Compiling for an indirect vector register architecture
D. Nuzman, M. Namolaru, A. Zaks, J.H. Derby
Proceedings of the 5th conference on Computing frontiers, pp. 199--208, 2008
On Architecture-Measuring Architectural Complexity
G Booch
IEEE Software 25(4), 14--15, IEEE Computer Society, 2008
Thousand-Core Chips
David Yeh, Li-Shiuan Peh, Shekhar Borkar, John A. Darringer, Anant Agarwal, Wen-mei Hwu
Design and Test Roundtable, 2008
Providing platform heterogeneity-awareness for data center power management
R Nathuji, C Isci, E Gorbatov, K Schwan
Cluster Computing 11(3), 259--271, Springer, 2008
Power management solutions for computer systems and datacenters
K Rajamani, C Lefurgy, S Ghiasi, J Rubio, H Hanson, T Keller
ISLPED, 2008
Exploring power management in multi-core systems
Reinaldo A. Bergamaschi, Guoling Han, Alper Buyuktosunoglu, Hiren D. Patel, Indira Nair, Gero Dittmann, Geert Janssen, Nagu R. Dhanwada, Zhigang Hu, Pradip Bose, John A. Darringer
Proceedings of the 13th Asia and South Pacific Design Automation Conference (ASP-DAC 2008), pp. 708--713
Keeping hot chips cool: are IC thermal problems hot air?
R Puri, D Varma, D Edwards, A J Weger, P Franzon, A Yang, S Kosonocky
Proceedings of the 45th annual Design Automation Conference, pp. 634--635, 2008
Tapping into the fountain of CPUs: on operating system support for programmable devices
Yaron Weinsberg, Danny Dolev, Tal Anker, Muli Ben-Yehuda, Pete Wyckoff
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems, pp. 179--188, 2008
Hybrid Access-Specific Software Cache Techniques for the Cell BE Architecture
Marc Gonzalez, Nikola Vujic, Alexandre E. Eichenberger, Xavier Martorell, Eduard Ayguade, Tong Chen, Zehra Sura, Tao Zhang, Kevin O'Brien, Kathryn O'Brien
Proceedings of the 17th international conference on Parallel architectures and compilation techniques (PACT), 2008
A lightweight streaming layer for multicore execution
David Zhang, Qiuyuan J Li, Rodric Rabbah, Saman Amarasinghe
SIGARCH Comput. Archit. News 36(2), 18--27, ACM, 2008
Liquid Metal: Object-Oriented Programming Across the Hardware/Software Boundary
Shan Shan Huang, Amir Hormati, David F. Bacon, Rodric Rabbah
Proceedings of the 22nd European Conference on Object-Oriented Programming, pp. 76--103, Springer-Verlag, 2008
2007
A framework for architecture-level lifetime reliability modeling
J Shin, V Zyuban, Z Hu, J A Rivers, P Bose
IEEE Computer Society, 2007
On Flow Completion Time Benchmarking in Datacenters
M Gusat, C Minkenberg, R Luijten
2007 - ieee802.org
Leveraging and Reinforcing FOSS - Experiences from GCC
Nuzman, D. and Edelsohn, D.
IBM Academy of Technology Open Source Software Experiences and Best Practices Conference, 2007
Optimizing Reaching Definitions Overhead in Queue Processors
Yuki Nakanishi, Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
Journal of Convergence Information Technology 2(4), 36--40, 2007
Compiler Framework for an Embedded 32-bit Queue Processor
Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
Proceedings of the International Conference on Convergence Information Technology, pp. 877-884, 2007
An Efficient Code Generation Algorithm for Code Size Reduction using 1-offset P-Code Queue Computation Model
Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
Proceedings of the IFIP International Conference on Embedded and Ubiquitous Computing EUC2007, pp. 196--208
Queue Register File Optimization Algorithm for QueueCore Processor
Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
19th International Symposium on Computer Architecture and High Performance Computing, pp. 169--176, 2007
New Code Generation Algorithm for QueueCore - An Embedded Processor with High ILP
Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
Proceedings of the Eighth International Conference on Parallel and Distributed Computing, Applications and Technologies, pp. 185--192, 2007
Five Years with the High Productivity Computing Systems Program A Perspective
E Elnozahy
2007 IEEE International Parallel and Distributed Processing Symposium, pp. 434
Predicting program behavior based on objective function minimization
R Sarikaya, A Buyuktosunoglu
IEEE 10th International Symposium on Workload Characterization, 2007, pp. 25--34
Evaluating design tradeoffs in on-chip power management for CMPs
J Sharkey, A Buyuktosunoglu, P Bose
Proceedings of the 2007 international symposium on Low power electronics and design, pp. 49
EnergyScale for IBM POWER6 microprocessor-based systems
H Y McCreary, MA Broyles, MS Floyd, AJ Geissler, SP Hartman, FL Rawson, TJ Rosedahl, JC Rubio, MS Ware
IBM Journal of Research and Development 51(6), 775--786, IBM Corp., 2007
System power management support in the IBM POWER6 microprocessor
MS Floyd, S Ghiasi, TW Keller, K Rajamani, FL Rawson, JC Rubio, MS Ware
IBM Journal of Research and Development 51(6), 733--746, IBM Corp., 2007
Affine transformations for communication minimal parallelization and locality optimization of arbitrarily-nested loop sequences
U Bondhugula, M Baskaran, S Krishnamoorthy, J Ramanujam, A Rountev, P Sadayappan
OSU-CISRC-TR43, The Ohio State University, 2007
Automatic mapping of nested loops to FPGAs
Uday Bondhugula, J Ramanujam, P Sadayappan
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming, pp. 111, 2007
Effective automatic parallelization of stencil computations
S Krishnamoorthy, M Baskaran, U Bondhugula, J Ramanujam, A Rountev, P Sadayappan
ACM SIGPLAN PLDI, pp. 244, 2007
Towards optimal multi-level tiling for stencil computations
L Renganarayanan, M Harthikote-Matha, R Dewri, SV Rajopadhye
21st IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2007
Multi-level tiling: M for the price of one
D G Kim, L Renganarayanan, D Rostron, S Rajopadhye, M M Strout
Proceedings of the 2007 ACM/IEEE conference on Supercomputing (SC)
Parameterized tiled loops for free
L Renganarayanan, D G Kim, S Rajopadhye, M M Strout
ACM SIGPLAN conference on Programming language design and implementation (PLDI), pp. 414, 2007
Multi-core design automation challenges
JA Darringer
Design Automation Conference, pp. 760--764, 2007
Exploiting platform heterogeneity for power efficient data centers
R Nathuji, C Isci, E Gorbatov
Fourth International Conference on Autonomic Computing, 2007, pp. 5--5
Workload Adaptive Power Management with Live Phase Monitoring and Prediction
C Isci, Princeton University, Dept. of Electrical …
2007 - Citeseer
OS-Aware Branch Prediction: Improving Microprocessor Control Flow Prediction for Operating Systems
T Li, LK John, A Sivasubramaniam, N Vijaykrishnan, …
IEEE Transactions on Computers, 2007
Thermal response to DVFS: analysis with an Intel Pentium M
H Hanson, SW Keckler, S Ghiasi, K Rajamani, F Rawson, J Rubio
Proceedings of the 2007 international symposium on Low power …, 2007
Architecture and Implementation of the TRIPS Processor
MS Govindan, P Gratz, D Gulati, H Hanson, C Kim, H …
Unique Chips and Systems, 2007
Implementation and Evaluation of a Dynamically Routed Processor Operand Network
P Gratz, K Sankaralingam, H Hanson, P Shivakumar, R …
Proc. 1st Intl Symp. Networks-on-Chip (NOCS 07), 2007
Power, Performance, and Thermal Management for High-Performance Systems
H Hanson, S W Keckler, K Rajamani, S Ghiasi, F Rawson, J Rubio
2007
Novel Addressing Method for Aggregate Types in Queue Processors
Teruhisa Yuki, Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
Proceedings of the International Conference on Convergence Information Technology, pp. 1793-1796, 2007
Performance and security lessons learned from virtualizing the alpha processor
P A Karger
ACM SIGARCH Computer Architecture News 35(2), 401, ACM, 2007
Performance modeling for early analysis of multi-core systems
Reinaldo A. Bergamaschi, Indira Nair, Gero Dittmann, Hiren D. Patel, Geert Janssen, Nagu R. Dhanwada, Alper Buyuktosunoglu, Emrah Acar, Gi-Joon Nam, Dorothy Kucar, Pradip Bose, John A. Darringer, Guoling Han
Proceedings of the 5th IEEE/ACM international conference on hardware/software codesign and system synthesis (CODES+ISSS), pp. 209--214, 2007
Characterizing a complex J2EE workload: A comprehensive analysis and opportunities for optimizations
Y Shuf, IM Steiner
International Symposium on Performance Analysis of Systems …, 2007
Temperature-limited microprocessors: Measurements and design implications
H F Hamann, A Weger, J Lacey, Z Hu, P Bose, E Cohen, J Wakil
IEEE Computer Society, 2007
Hotspot-limited microprocessors: Direct temperature and power distribution measurements
Hendrik F. Hamann, Alan J. Weger, James A. Lacey, Zhigang Hu, Pradip Bose, Erwin Cohen, and Jamil Wakil
IEEE Journal of Solid-State Circuits 42(1), 56--65, IEEE, January 2007
2006
Designing a crossbar scheduler for HPC applications
C Minkenberg, F Abel, P Muller, R Krishnamurthy, M Gusat, P Dill, I Iliadis, R Luijten, BR Hemenway, R Grzybowski, others
IEEE Micro 26(3), 58--71, IEEE Computer Society, 2006
Temperature Management: Investigating the Effects of Task Scheduling on Thermal Behavior
Eren Kursun, Chen Yong Cher, Alper Buyuktosunoglu, Pradip Bose
Proceedings of the International Symposium on Computer Architecture, Temperature-Aware Computer Systems, 2006
Software/Hardware Co-managed Cache Optimizations
Rajiv Ravindran, Krishnan Kailas, Zehra Sura
IBM Research Report RC23998, IBM Research Division, 2006
Hardware/software integration for FPGA-based all-pairs shortest-paths
Uday Bondhugula, A Devulapalli, J Dinan, J Fernando, P Wyckoff, E Stahlberg, P Sadayappan
Proceedings of the 14th IEEE Symposium on Field-Programmable Custom Computing Machines, pp. 152--164, 2006
Parallel FPGA-based all-pairs shortest-paths in a directed graph
Uday Bondhugula, A Devulapalli, J Fernando, P Wyckoff, P Sadayappan
Proceedings of the 20th IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2006
Auto-vectorization of interleaved data for SIMD
D Nuzman, I Rosen, A Zaks
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation, pp. 143
From single core to multi-core: preparing for a new exponential
J Parkhurst, J Darringer, B Grundmann
ICCAD, pp. 67-72, 2006
Live, runtime phase monitoring and prediction on real systems with application to dynamic power management
C Isci, G Contreras, M Martonosi
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 359--370, 2006
Phase characterization for power: Evaluating control-flow-based and event-counter-based techniques
C Isci, M Martonosi
HPCA’06: Proceedings of the Twelfth International Symposium on High-Performance Computer Architecture, 2006
An analysis of efficient multi-core global power management policies: Maximizing performance for a given power budget
C Isci, A Buyuktosunoglu, C Y Cher, P Bose, M Martonosi
39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 347--358, 2006
Benchmarking for Power and Performance
H Hanson, K Rajamani, J Rubio, S Ghiasi, F Rawson
2006 SPEC Benchmark Workshop
Application-Aware Power Management
K Rajamani, H Hanson, J Rubio, S Ghiasi, F Rawson
Proceedings of the 2006 IEEE International Symposium on …
Online Power and Performance Estimation for Dynamic Power Management
K Rajamani, H Hanson, JC Rubio, S Ghiasi, FL …
Research Report RC, 2006
Application of full-system simulation in exploratory system design and development
JL Peterson, PJ Bohrer, L Chen, EN Elnozahy, A Gheith, RH Jewell, MD Kistler, TR Maeurer, SA Malone, DB Murrell, others
IBM Journal of Research and Development 50(2/3), 321--332, IBM, 2006
https://www.research.ibm.com/journal/rd/502/agerwala.html
T Agerwala, M Gupta
IBM Journal of Research and Development, 2006
Performance and environment monitoring for continuous program optimization
C Cascaval, E Duesterwald, P F Sweeney, R W Wisniewski
IBM Journal of Research and Development 50(2/3), 239--248, IBM, 2006
HPC-Colony: services and interfaces for very large systems
Kalé, T Jones, A Tauferner, T Inglett, J Moreira
ACM SIGOPS Operating Systems Review, 2006
Braids and Fibers: Language Constructs with Architectural Support for Adaptive Responses to Memory Latencies
David F. Bacon, Xiaowei Shen
IBM Journal of Research and Development 50(2/3), 209--221, IBM Corp., 2006
Chip-level integration: the new frontier for microprocessor architecture
Jaime H Moreno
ACM Symposium on Parallel Algorithms and Architectures, Boston, 2006
Systems on a chip: The new generation of microprocessors (in Spanish)
Jaime H Moreno
Informatica (Chile), 2006
Online performance auditing: using hot optimizations without getting burned
J Lau, M Arnold, M Hind, B Calder
Proceedings of the 2006 ACM SIGPLAN conference on …, 2006
Online phase detection algorithms
P. Nagpurkar, M. Hind, C. Krintz, P.F. Sweeney, V.T. Rajan
International Symposium on Code Generation and Optimization (CGO), 2006, pp. 13
Using Advanced Compiler Technology to Exploit the Performance of the Cell Broadband Engine Architecture
A.E. Eichenberger, J. K. O'Brien, K. M. O'Brien, P. Wu, T. Chen, P. H. Oden, D. A. Prener, J. C. Shepherd, B. So, Z. Sura, A. Wang, T. Zhang, P. Zhao, M. K. Gschwind, R. Archambault, Y. Gao, and R. Koo
IBM Systems Journal 45(1), 2006
Loosely coupled TCP acceleration architecture
L Shalev, V Makhervaks, Z Machulsky, G Biran, J Satran, M Ben-Yehuda, I Shimony
14th IEEE Symposium on High-Performance Interconnects, pp. 3--8, IEEE Computer Society, 2006
A characterization of a Java-based commercial workload on a high-end enterprise server
IM Steiner, Y Shuf
ACM SIGMETRICS Performance Evaluation Review, 2006
Power Distribution Measurements of the Dual Core PowerPC 970MP Microprocessor
HF Hamann, A Weger, J Lacey, E Cohen, C Atherton
IEEE International Solid-State Circuits Conference, 2006
Guest Editor Introduction
Erik Altman and Sumedh Sathaye
IBM Journal of Research and Development 50(2/3), 169, IBM, 2006
A tutorial example of a cache memory protocol and RTL implementation
S German, G Janssen
IBM Research Report RC23958, 2006
Delivering Teraflops: An Account of how Blue Gene was Brought to Life
JE Moreira, IBM T. J. Watson Research Center, Yorktown Heights
IEEE John Vincent Atanasoff 2006 International Symposium on …, 2006
Design and implementation of a one-sided communication interface for the IBM eServer Blue Gene …
Smith, G Almási, J Castaños, D Lieber, J Moreira …
Proceedings of the 2006 ACM/IEEE conference on …, 2006
2005
Reducing Server Data Traffic Using a Hierarchical Computation Model
J Rubio, LK John
IEEE Transactions on Parallel and Distributed Systems, 2005