Computer Architecture (Computer Systems Design sub-discipline) Publications



2014

Characterization of Call-Graph Profiles in Java Workloads
Takuya Nakaike, Hiroshi Inoue, Toshio Suganuma, Moriyoshi Ohara
Proceedings of the IEEE International Symposium on Workload Characterization (IISWC 2014), IEEE, pp. 85--102

PSS: A prototype storage subsystem based on PCM
Ioannis Koltsidas, Peter Mueller, Roman Pletka, Thomas Weigold, Evangelos Eleftheriou, Maria Varsamou, Athina Ntalla, Elina Bougioukou, Aspasia Palli, Theodore Antonakopoulos
5th Non-Volatile Memories Workshop (NVMW2014)

Eliminating Global Interpreter Locks in Ruby through Hardware Transactional Memory
Rei Odaira, Jose G. Castanos, and Hisanobu Tomari
Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), pp. 131--142, 2014

3D stacking of high-performance processors
Philip Emma, Alper Buyuktosunoglu, Michael Healy, Krishnan Kailas, Valentin Puente, Roy Yu, Allan Hartstein, Pradip Bose, Jaime Moreno
High Performance Computer Architecture (HPCA), 2014 IEEE 20th International Symposium on, pp. 500--511


Memory-driven near-data acceleration
J. van Lunteren, H. Giefers, C. Hagleitner, R. Jongerius
HPC User Forum, Seattle, September 2014

Using Multiple Threads to Accelerate Single Thread Performance
Zehra Sura, Kevin O'Brien, Jose Brunheroto
Parallel and Distributed Processing Symposium, 2014 IEEE 28th International, pp. 985--994

Soft Error Resiliency Characterization on IBM BlueGene/Q Processor
Chen-Yong Cher, K Paul Muller, Ruud A Haring, David L Satterfield, Thomas E Musta, Thomas Gooding, Kristan D Davis, Marc Boris Dombrowa, Gerard V Kopcsay, Robert M Senger, others
ASP-DAC, pp. 385--387, 2014

The resilience wall: Cross-layer solution strategies
Subhasish Mitra, Pradip Bose, Eric Cheng, Chen-Yong Cher, Hyungmin Cho, Rajiv Joshi, Young Moon Kim, Charles R Lefurgy, Yanjing Li, Kenneth P Rodbell, others
VLSI Technology, Systems and Application (VLSI-TSA), Proceedings of Technical Program-2014 International Symposium on, pp. 1--11

Near-Data Processing: Insights from a MICRO-46 Workshop
Rajeev Balasubramonian, Jichuan Chang, Troy Manning, Jaime H Moreno, Richard Murphy, Ravi Nair, Steven Swanson
Micro, IEEE 34(4), 36--42, IEEE, 2014

A million spiking-neuron integrated circuit with a scalable communication network and interface
Paul A. Merolla, John V. Arthur, Rodrigo Alvarez-Icaza, Andrew S. Cassidy, Jun Sawada, Filipp Akopyan, Bryan L. Jackson, Nabil Imam, Chen Guo, Yutaka Nakamura, Bernard Brezzo, Ivan Vo, Steven K. Esser, Rathinakumar Appuswamy, Brian Taba, Arnon Amir, Myron
Science 345(6197), 668--673, American Association for the Advancement of Science, 2014


2013

Scaling analytics applications with OpenCL for loosely coupled heterogeneous clusters
Toshio Suganuma, Rajaram B Krishnamurthy, Moriyoshi Ohara, Toshio Nakatani
Proceedings of the ACM International Conference on Computing Frontiers, pp. 35, 2013

The Liquid Metal Blokus Duo Design
Erik R. Altman, Joshua S. Auerbach, David F. Bacon, Ioana Baldini, Perry Cheng, Stephen J. Fink, Rodric M. Rabbah
International Conference on Field-Programmable Technology, pp. 490-493, 2013

The Liquid Metal IP bridge
Perry Cheng, Stephen J Fink, Rodric M Rabbah, Sunil Shukla
Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 313--319, 2013

A systematic methodology to generate decomposable and responsive power models for CMPs
Ramon Bertran, Marc Gonzàlez, Xavier Martorell, Nacho Navarro, Eduard Ayguadé
Computers, IEEE Transactions On 62(7), 1289--1302, IEEE, 2013

Counter-Based Power Modeling Methods: Top-Down vs. Bottom-Up
Ramon Bertran, Marc Gonzàlez, Xavier Martorell, Nacho Navarro, Eduard Ayguadé
The Computer Journal 56(2), 198--213, Br Computer Soc, 2013


2012

Systematic Energy Characterization of CMP/SMT Processor Systems via Automated Micro-Benchmarks
Ramon Bertran, Alper Buyuktosunoglu, Meeta S Gupta, Marc Gonzalez, Pradip Bose, Barcelona Supercomputing Center
International Symposium on Microarchitecture (MICRO), pp. 199--211, 2012

A Reconfigurable Optical/Electrical Interconnect Architecture for Large-scale Clusters and Datacenters
Diego Lugones, Kostas Katrinis, Martin Collier
ACM Computing Frontiers (Best Paper Award), pp. 13-22, ACM, 2012

Power Management of Multi-Core Chips: Challenges and Pitfalls
Pradip Bose, Alper Buyuktosunoglu, John A. Darringer, Meeta S. Gupta, Michael B. Healy, Hans Jacobson, Indira Nair, Jude A. Rivers, Jeonghee Shin, Augusto Vega, Alan J. Weger
Design, Automation & Test in Europe Conference & Exhibition, pp. 977--982, 2012

Guest Editorial: Parallel Systems and Compilers
Valentina Salapura, Michael Gschwind, Jens Knoop
International Journal of Parallel Programming 40(1), 1--3, Springer Netherlands, 2012

Cache Restoration for Highly Partitioned Virtualized Systems
David M. Daly, Harold W. Cain
Proceedings of the 18th International Symposium on High Performance Computer Architecture (HPCA), 2012


2011

Local memory design space exploration for high-performance computing
Ramon Bertran, Marc Gonzàlez, Xavier Martorell, Nacho Navarro, Eduard Ayguadé
The Computer Journal 54(5), 786--799, Br Computer Soc, 2011

Design space exploration for aggressive core replication schemes in CMPs
Lluc Alvarez, Ramon Bertran, Marc Gonzàlez, Xavier Martorell, Nacho Navarro, Eduard Ayguadé
Proceedings of the 20th international symposium on High performance distributed computing, pp. 269--270, 2011

Exploring the architecture of a stream register-based snoop filter
Matthias Blumrich, Valentina Salapura, Alan Gara
Transactions on high-performance embedded architectures and compilers III, pp. 93--114, Springer Berlin Heidelberg, 2011

The IBM Blue Gene/Q interconnection network and message unit
Dong Chen, Noel A Eisley, Philip Heidelberger, Robert M Senger, Yutaka Sugawara, Sameer Kumar, Valentina Salapura, David L Satterfield, Burkhard Steinmacher-Burow, Jeffrey J Parker
High Performance Computing, Networking, Storage and Analysis (SC), 2011 International Conference for, pp. 1--10

SoftBeam: Precise tracking of transient faults and vulnerability analysis at processor design time
Michael Gschwind, Valentina Salapura, Catherine Trammell, Sally A McKee
IEEE 29th International Conference on Computer Design (ICCD), pp. 404--410, IEEE, 2011

A Hybrid Approach for Large Cache Performance Studies
D Daly, P Dube, K El Maghraoui, D Poff, Li Zhang
International Conference on Quantitative Evaluation of Systems (QEST), pp. 47--56, 2011

A Tool for Scalable Profiling and Tracing of Java and Native Code Interactions
Parijat Dube, Seetharami Seelam, Yanbin Liu, Megumi Ito, Michel Hack, Liana Fong, Graeme Johnson, Michael Dawson, Li Zhang and Yuqing Gao
QEST, pp. 37--46, 2011

Error Tolerance in Server Class Processors
J A Rivers, M S Gupta, J Shin, P N Kudva, P Bose
Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on 30(7), 945--959, IEEE, 2011

Scalable Proximity-Aware Cache Replication in Chip Multiprocessors (short paper)
Chongmin Li, Haixia Wang, Yibo Xue, Dongsheng Wang (Tsinghua University), Jian Li
The Twentieth International Conference on Parallel Architectures and Compilation Techniques (PACT), 2011

TAPO: Thermal-Aware Power Optimization Techniques for Servers and Data Centers (Best Paper Award)
Wei Huang, Malcolm Allen-Ware, John Carter, Elmootazbellah Elnozahy, Hendrik Hamann, Tom Keller, Charles Lefurgy, Jian Li, Karthick Rajamani and Juan Rubio
Second International Green Computing Conference (IGCC'11), 2011

Multi-Channel Wireless Network-on-Chip: A New Approach to Improving On-Chip Communication Capacity
Dan Zhao, Yi Wang, Jian Li and Takamaro Kikkawa
International Symposium on Networks-on-Chip (NOCS), 2011

A Cost-Efficient L1-L2 Multicore Interconnect: Performance, Power and Area Considerations
Amit Golander, Nadav Levison, Omer Heymann, Alexander Briskman, Mark J. Wolski and Eric F. Robinson
Circuits and Systems I, IEEE Transactions on 58(3), IEEE, 2011

Coordinating DRAM and Last-Level-Cache Policies with the Virtual Write Queue
J. Stuecheli, D. Kaseridis, L.K. John, D. Daly, H.C. Hunter
IEEE Micro, Top Picks v. 31, 90--98, Published by the IEEE Computer Society, 2011

Loop Transformations: Convexity, Pruning, and Optimization
Louis-Noel Pouchet, Uday Bondhugula, Cedric Bastoul, Albert Cohen, J Ramanujam, P Sadayappan, and Nicolas Vasilache
ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL), 2011

Power Shifting in Thrifty Interconnection Networks
Jian Li, Wei Huang, Lixin Zhang, Charles Lefurgy, Wolfgang Denzel, Richard Treumann and Kun Wang
To appear in International Symposium on High Performance Computer Architecture (HPCA), 2011


2010

Decomposable and responsive power models for multicore processors using performance counters
Ramon Bertran, Marc Gonzalez, Xavier Martorell, Nacho Navarro, Eduard Ayguade
Proceedings of the 24th ACM International Conference on Supercomputing, pp. 147--158, 2010

DACO: A High-Performance Disk Architecture Designed Specially for Large-Scale Erasure-Coded Storage Systems
M. Li, J. Shu
Computers, IEEE Transactions on 59(10), 1350--1362, IEEE, 2010

SAFER: Stuck-at-fault error recovery for memories
N H Seong, D H Woo, V Srinivasan, J A Rivers, H H S Lee
2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 115--124

Power and Thermal Characterization of the POWER6 System
Victor Jimenez, Fran J. Cazorla, Roberto Gioiosa, Eren Kursun, Chen Yong Cher, Alper Buyuktosunoglu, Pradip Bose, and Mateo Valero
In Proceedings of ACM International Conference on Parallel Architectures and Compilation Techniques, PACT, pp. 7-18, Vienna, Austria, 2010

Power-efficient, Reliable Microprocessor Architectures: Modeling and Design Methods
Pradip Bose, Alper Buyuktosunoglu, Chen Yong Cher, John Darringer, Meeta Sharma Gupta, Hendrik Hamann, Hans Jacobson, Prabhakar Kudva, Eren Kursun, Niti Madan, Indira Nair, Jude A. Rivers, Jeonghee Shin, Alan J. Weger, Victor Zyuban
In Proceedings of IEEE/ACM GLSVLSI Symposium, pp. 299-304, 2010

Architecting for power management: The IBM POWER7 approach
Malcolm Ware, Karthick Rajamani, Michael Floyd, Bishop Brock, Juan C Rubio, Freeman Rawson, John B Carter
HPCA, pp. 1--11, 2010

A Wire-Speed Power™ Processor: 2.3GHz 45nm SOI with 16 Cores and 64 Threads
Charles Johnson, David H. Allen, Jeff Brown, Steve Vanderwiel, Russ Hoover, Heather Achilles, Chen-Yong Cher, George A. May, Hubertus Franke, Jimi Xenedis, Claude Basso
2010 IEEE International Solid-State Circuits Conference (ISSCC)

Lime: a Java-compatible and synthesizable language for heterogeneous architectures
Joshua Auerbach, David F Bacon, Perry Cheng, Rodric Rabbah
Proceedings of the ACM International Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA), pp. 89--108, ACM, 2010

Performance Analysis Of Idle Programs
Erik Altman, Matthew Arnold, Stephen Fink, Nick Mitchell
Proceedings of OOPSLA/SPLASH 2010, pp. 739--753, ACM

Memory Latency Reduction via Thread Throttling
Hsiang-Yun Cheng, Jian Li, Chung-Hsiang Lin and Chia-Lin Yang
To appear in the 43rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2010

The PERCS High-Performance Interconnect
Baba Arimilli, Ravi Arimilli, Vicente Chung, Scott Clark, Wolfgang Denzel, Ben Drerup, Torsten Hoefler, Jody Joyner, Jerry Lewis, Jian Li, Nan Ni, Ram Rajamony
HotInterconnect (Hoti) '2010: Proceedings of the 18th Annual Symposium on High-Performance Interconnects, pp. 75--82

Design Exploration of Hybrid cache architecture with disparate memory technologies
Xiaoxia Wu, Jian Li, Lixin Zhang, Evan Speight, Ram Rajamony and Yuan Xie
ACM Transactions on Architecture and Code Optimization (TACO), 2010

Power-efficient, Reliable Microprocessor Architectures: Modeling and Design Methods
Pradip Bose, Alper Buyuktosunoglu, Chen-Yong Cher, John A. Darringer, Meeta Sharma Gupta, Hendrik F. Hamann, Hans M. Jacobson, Prabhakar Kudva, Eren Kursun, Niti Madan, Indira Nair, Jude A. Rivers, Jeonghee Shin, Alan J. Weger and Victor V. Zyuban
In Proceedings of IEEE International Symposium, GLSVLSI, pp. 299--304, IEEE, ACM, 2010

Can CPUs Match GPUs on Performance with Productivity?: Experiences with Optimizing a FLOP-intensive Application on CPUs and GPU
Rajesh Bordawekar, Uday Bondhugula, Ravi Rao
Research Report RC25033, IBM T.J. Watson Research Center, Yorktown Heights, New York, 2010

A Model for Fusion and Code Motion in an Automatic Parallelizing Compiler
Uday Bondhugula, Oktay Gunluk, Sanjeeb Dash, Lakshminarayanan Renganarayana
International Conference on Parallel Architectures and Compilation Techniques (PACT), 2010

DMATiler: Revisiting Loop Tiling for Direct Memory Access (Extended Abstract)
HaiBo Lin, Tao Liu, Huoding Li, Tong Chen, Lakshminarayanan Renganarayana, Kevin O'Brien, Ling Shao
International Conference on Parallel Architectures and Compilation Techniques (PACT), 2010

Memx: Virtualization of cluster-wide memory
Umesh Deshpande, Beilan Wang, Shafee Haque, Michael Hines, Kartik Gopalan
2010 39th International Conference on Parallel Processing, pp. 663--672

Workload performance characterization of DARPA HPCS benchmarks (Outstanding Paper Award)
S Seelam, I H Chung, G Cong, H F Wen, D Klepacki
Concurrency and Computation: Practice and Experience 22(4), 441--461, John Wiley & Sons, 2010

Trends and Techniques for Energy Efficient Architectures
Victor Jimenez, Roberto Gioiosa, Eren Kursun, Francisco J. Cazorla, Chen-Yong Cher, Alper Buyuktosunoglu, Pradip Bose, Mateo Valero
In Proceedings of the 18th IEEE/IFIP International Conference on VLSI and System-on-Chip, VLSI-SoC, pp. 276--279, 2010

Power-efficient, reliable microprocessor architectures: modeling and design methods
Pradip Bose, Alper Buyuktosunoglu, Chen-Yong Cher, John A. Darringer, Meeta S. Gupta, Hendrik Hamann, Hans Jacobson, Prabhakar N. Kudva, Eren Kursun, Niti Madan, Indira Nair, Jude A. Rivers, Jeonghee Shin, Alan J. Weger, Victor Zyuban
Proceedings of the 20th symposium on Great lakes symposium on VLSI (GLVLSI), 2010

A Wire-Speed Power (TM) Processor: 2.3GHz 45nm SOI with 16 Cores and 64 Threads
C. Johnson, D. H. Allen, J. Brown, S. Vanderwiel, R. Hoover, H. Achilles, C-Y. Cher, G. A. May, H. Franke, J. Xenedis, C. Basso
2010 IEEE International Solid-State Circuits Conference (ISSCC)

Performance and power evaluation of an in-line accelerator
Alejandro Rico, Jeff H Derby, Robert K Montoye, Timothy H Heil, Chen-Yong Cher, Pradip Bose
Proceedings of the 7th ACM international conference on Computing frontiers, pp. 81--82, 2010

Combined Iterative and Model-driven Optimization in an Automatic Parallelization Framework
Louis-Noel Pouchet, Uday Bondhugula, Cedric Bastoul, Albert Cohen, J Ramanujam, P Sadayappan
Supercomputing (SC) 2010

Power-efficient, reliable microprocessor architectures: modeling and design methods
P Bose, A Buyuktosunoglu, C Y Cher, J A Darringer, M S Gupta, H Hamann, H Jacobson, P N Kudva, E Kursun, N Madan, others
Proceedings of the 20th symposium on Great lakes symposium on VLSI, pp. 299--304, 2010

Design Exploration of Hybrid cache architecture with disparate memory technologies
X. Wu, J. Li, L. Zhang, E. Speight, R. Rajamony and Y. Xie
ACM Transactions on Architecture and Code Optimization (TACO), ACM, 2010

Enigma: Architectural Support and Operating System Support for Reducing the Impact of Address Translation
Lixin Zhang, Evan Speight, Ram Rajamony, Jiang Lin
International Conference on Supercomputing (ICS), ACM/SIGARCH, 2010

Power and Thermal Characterization of POWER6 System
Victor Jimenez, Francisco J. Cazorla, Roberto Gioiosa, Eren Kursun, Canturk Isci, Chen-Yong Cher, Alper Buyuktosunoglu, Pradip Bose, and Mateo Valero
In the International Conference on Parallel Architectures and Compilation Techniques (PACT), Vienna, Austria, Sep. 2010

FPGA-based combined architecture for stream categorization and intrusion detection
S Shukla, R Rabbah, M Vorbach
Proceedings of the 8th IEEE/ACM International Conference on Formal Methods and Models for Codesign (MEMOCODE), pp. 77--80, 2010

A Unified Execution Model for Cloud Computing
Eric van Hensbergen, Noah Evans, Phillip Stanley-Marbell
Large Scale Distributed Systems and Middleware, (LADIS 2009), Co-located with the 22nd ACM Symposium on Operating Systems Principles (SOSP 2009), pp. 12--17, ACM, 2010

Performance of large low-associativity caches
Parijat Dube, Li Zhang, David Daly, Alan Bivens
SIGMETRICS Perform. Eval. Rev. 37(4), 11--18, ACM, 2010

The Virtual Write Queue: Coordinating DRAM and Last-Level Cache Policies
J Stuecheli, D Kaseridis, D Daly, H C Hunter, L K John
Proceedings of the ACM/IEEE International Symposium on Computer Architecture (ISCA 2010)

Believe it or Not! Multicore CPUs can Match GPUs for FLOP-intensive Applications!
Rajesh Bordawekar, Uday Bondhugula, Ravi Rao
Research Report RC24982, IBM TJ Watson Research Center, Yorktown Heights, New York, 2010

GRAPHITE Two Years After
K Trifunovic, A Cohen, D Edelsohn, L Feng, T Grosser, H Jagasia, R Ladelsky, S Pop
GCC Research Opportunities (GROW'10), 2010

Trace-Based Data Layout Optimizations for Multi-core Processors
O Golovanevsky, A Dayan, A Zaks, D Edelsohn
High Performance Embedded Architectures and Compilers, 81--95, Springer, 2010

Compiling for Reduced Bit-Width Queue Processors
Arquimedes Canedo, Ben A Abderazek, Masahiro Sowa
Journal of Signal Processing Systems 59(1), 45--55, Springer, 2010

Natural Instruction Level Parallelism-aware Compiler for High-Performance QueueCore Processor Architecture
Ben A Abderazek, Masashi Masuda, Arquimedes Canedo, Kenichi Kuroda
Journal of Supercomputing, Springer, 2010

Automatic Parallelization of Simulink Applications
Arquimedes Canedo, Takeo Yoshizawa, Hideaki Komatsu
International Symposium on Code Generation and Optimization 2010

Skewed Pipelining for Parallel Simulink Simulations
Arquimedes Canedo, Takeo Yoshizawa, Hideaki Komatsu
Design, Automation and Test in Europe 2010

Automatic Creation of Tile Size Selection Models
Tomofumi Yuki, Lakshminarayanan Renganarayana, Sanjay Rajopadhye, Charles Anderson, Alexandre Eichenberger and Kevin O'Brien
International Symposium on Code Generation and Optimization (CGO), 2010

Observations on Tuning a Java Enterprise Application for Performance and Scalability
Erik Altman, Matthew Arnold, Rajesh Bordawekar, Robert Delmonico, Nick Mitchell, Peter F. Sweeney
IBM Journal of Research and Development 54(5), 2, IBM, 2010

MacroSS: macro-SIMDization of streaming applications
Amir H Hormati, Yoonseo Choi, Mark Woh, Manjunath Kudlur, Rodric Rabbah, Trevor Mudge, Scott Mahlke
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems (ASPLOS), pp. 285--296, ACM, 2010


2009

The data-centricity of Web 2.0 workloads and its impact on server performance
Moriyoshi Ohara, Priya Nagpurkar, Yohei Ueda, Kazuaki Ishizaki
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2009), pp. 133-142, IEEE

A Computing Origami: Folding Streams in FPGAs
Andrei Hagiescu, Weng-Fai Wong, David F. Bacon, Rodric Rabbah
Proceedings of the 46th Annual Design Automation Conference (DAC), pp. 282--287, ACM, 2009

A tagless coherence directory
J Zebchuk, V Srinivasan, M K Qureshi, A Moshovos
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 423--434, 2009

Enhancing lifetime and security of pcm-based main memory with start-gap wear leveling
M K Qureshi, J Karidis, M Franceschini, V Srinivasan, L Lastras, B Abali
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 14--23, 2009

Scalable high performance main memory system using phase-change memory technology
M K Qureshi, V Srinivasan, J A Rivers
Proceedings of the 36th annual international symposium on Computer architecture, pp. 24--33, 2009

Scalable data center provisioning and control
M Silva, M Banikazemi, M Butrico, D Daly, S Guthridge, J E Moreira, W V Ruggiero
IBM Journal of Research and Development 53(4), 2009

Design and performance of speculative flow control for high-radix datacenter interconnect switches
Cyriel Minkenberg, Mitchell Gusat
Journal of Parallel and Distributed Computing 69(8), 680--695, Elsevier, 2009

Adaptive Routing in Data Center Bridges
Cyriel Minkenberg, Mitchell Gusat, German Rodriguez
High Performance Interconnects, 2009. HOTI 2009. 17th IEEE Symposium on, pp. 33--41

Benchmarking the Ethernet-federated datacenter
M Gusat, C DeCusatis, C Minkenberg, L McKenna, K Bhardwaj, GJ Paljak, A Pataricza, I Kocsis
Proc. First Workshop on Data Center - Converged and Virtual Ethernet Switching (DC-CAVES 2009)

Compiling Effectively for Cell with GCC
Ira Rosen, David Edelsohn, Ben Elliston, Revital Eres, Alan Modra, Dorit Nuzman, Ulrich Weigand, and Ayal Zaks
CPC 2009

Compiler Support for Code Size Reduction using a Queue-based Processor
Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
Lecture Notes in Computer Science, pp. 269--285, Springer Berlin, 2009

Design and implementation of a queue compiler
Arquimedes Canedo, Ben A Abderazek, Masahiro Sowa
Microprocess. Microsyst. 33(2), 129--138, Elsevier Science Publishers B. V., 2009

Efficient Compilation for Queue Size Constrained Queue Processors
Arquimedes Canedo, Ben A Abderazek, Masahiro Sowa
Parallel Comput. 35(4), 213--225, Elsevier Science Publishers B. V., 2009

Software and Hardware Design Issues for Low Complexity High Performance Processor Architecture
Masashi Masuda, Abderazek Ben Abdallah, Arquimedes Canedo
ICPP Workshops, pp. 558-565, 2009

System Resilience at Extreme Scale -- White Paper
T El-Ghazawi, A Fox, B F Godfrey, M D Cray, A Hoisie, J Plank, J Simons, E N M Elnozahy, A IBM
U.S. Department of Defense -- Defense Advanced Research Projects Agency, 2009

ITCA: Inter-task Conflict-Aware CPU Accounting for CMPs
C Luque, M Moreto, F J Cazorla, R Gioiosa, A Buyuktosunoglu, M Valero
Proceedings of the 2009 18th International Conference on Parallel Architectures and Compilation Techniques-Volume 00, pp. 203--213

CPU Accounting in CMP Processors
C Luque, M Moreto, F J Cazorla, R Gioiosa, A Buyuktosunoglu, M Valero
IEEE Computer Architecture Letters 8(1), 17--20, IEEE Computer Society, 2009

Dynamic power gating with quality guarantees
Anita Lungu, Pradip Bose, Alper Buyuktosunoglu, Daniel J Sorin
Proceedings of the 2009 ACM/IEEE international symposium on Low power electronics and design, pp. 377--382

Temperature Variation Characterization and Thermal Management of Multicore Architectures
Eren Kursun, Chen Yong Cher
IEEE Micro Top Picks, 2009

Data Layout Transformation for Enhancing Data Locality on NUCA Chip Multiprocessors
Q Lu, C Alias, U Bondhugula, T Henretty, S Krishnamoorthy, J Ramanujam, A Rountev, P Sadayappan, Y Chen, H Lin, others
Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques, 2009

Hybrid Iterative and Model-Driven Optimization in the Polyhedral Model
L N Pouchet, U Bondhugula, C Bastoul, A Cohen, J Ramanujam, P Sadayappan
INRIA Research Report 6269, INRIA Saclay, France, 2009

Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors
M Baskaran, N Vydyanathan, Uday Bondhugula, J Ramanujam, A Rountev, P Sadayappan
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 219--228, 2009

Compact Multi-Dimensional Kernel Extraction for Register Tiling
L Renganarayana, Uday Bondhugula, Salem Dersavi, Alexandre E. Eichenberger, Kevin O'Brien
Proceedings of the 22nd International Conference on High Performance Networking and Computing (SC), pp. 1--12, 2009

Formal verification of correctness and performance of random priority-based arbiters
Krishnan Kailas, Viresh Paruthi, Brian Monwai
Proc. of 9th International Conference on Formal Methods in Computer-Aided Design (FMCAD 2009), pp. 101-107, IEEE

Polyhedral-model guided loop-nest auto-vectorization
Konrad Trifunovic, Dorit Nuzman, Albert Cohen, Ayal Zaks, Ira Rosen
Parallel Architectures and Compilation Techniques, 2009. PACT'09. 18th International Conference on, pp. 327--337

Encoding Efficiency of Digital Number Representations under Deviation Constraints
Phillip Stanley-Marbell
IEEE Information Theory Workshop, ITW 2009, Taormina, Sicily


Snugglebug: A Powerful Approach to Weakest Preconditions
S. Chandra, S. Fink, M. Sridharan
ACM Conference on Programming Language Design and Implementation (PLDI), pp. 363--374, ACM, 2009

Securing virtual machine monitors: what is needed?
PA Karger
ACM Symposium on Information, Computer & Communication Security (ASIACCS), pp. xiv-xv, Association for Computing Machinery, 2009

Multicore power management: Ensuring robustness via early-stage formal verification
Anita Lungu, Pradip Bose, Daniel J Sorin, Steven German, Geert Janssen
Formal Methods and Models for Co-Design, 2009. MEMOCODE'09. 7th IEEE/ACM International Conference on, pp. 78--87

Exploiting Parallelism with Dependence-Aware Scheduling
Xiaotong Zhuang, Alexandre E Eichenberger, Yangchun Luo, Kevin O'Brien, Kathryn O'Brien
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT), 2009

Transformational Hybrid Systems
Jaime H Moreno
IEEE-INDUSTRY DAY, 2009

True value: assessing and optimizing the cost of computing at the data center level
J Karidis, JE Moreira, Jaime H Moreno
6th ACM Conference on Computing Frontiers, 2009

Flextream: Adaptive Compilation of Streaming Applications for Heterogeneous Architectures
A H Hormati, Y Choi, M Kudlur, R Rabbah, T Mudge, S Mahlke
Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 214--223, 2009


2008

A proactive wearout recovery approach for exploiting microarchitectural redundancy to extend cache SRAM lifetime
Jeonghee Shin, Victor Zyuban, Pradip Bose, Timothy M Pinkston
ACM SIGARCH Computer Architecture News, pp. 353--362, 2008

Optimus: Efficient Realization of Streaming Applications on FPGAs
Amir Hormati, Manjunath Kudlur, Scott Mahlke, David F. Bacon, Rodric Rabbah
Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES), pp. 41--50, ACM, 2008

Quantitative Evaluation of Common Subexpression Elimination on Queue Machines
Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
Proceedings of the International Symposium on Parallel Architectures, Algorithms, and Networks, pp. 25--30, 2008

Software-controlled priority characterization of POWER5 processor
C Boneti, F J Cazorla, R Gioiosa, A Buyuktosunoglu, C Y Cher, M Valero
Proceedings of the 35th International Symposium on Computer Architecture (ISCA), pp. 415--426, 2008


A practical automatic polyhedral parallelizer and locality optimizer
Uday Bondhugula, A Hartono, J Ramanujam, P Sadayappan
Proceedings of the ACM SIGPLAN conference on Programming Language Design and Implementation, pp. 101--113, 2008

A compiler framework for optimization of affine loop nests for GPGPUs
M Baskaran, U Bondhugula, S Krishnamoorthy, J Ramanujam, A Rountev, P Sadayappan
Proceedings of the 22nd International conference on Supercomputing, pp. 225--234, 2008

Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories
M Baskaran, U Bondhugula, S Krishnamoorthy, J Ramanujam, A Rountev, P Sadayappan
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 1--10, 2008

Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model
Uday Bondhugula, M Baskaran, S Krishnamoorthy, J Ramanujam, A Rountev, P Sadayappan
International Conference on Compiler Construction (CC), pp. 132--146, 2008

Scalable and Efficient Tools for Multi-level Tiling
Lakshminarayanan Renganarayana
PhD Dissertation -- Colorado State University, 2008

A domain specific interconnect for reconfigurable computing
S Rajopadhye, G Gupta, L Renganarayana
ACM Conference on Languages, compilers, and tools for embedded systems (LCTES), pp. 79--88, ACM, 2008

Positivity, posynomials and tile size selection
L Renganarayana, S Rajopadhye
Proceedings of the 2008 ACM/IEEE conference on Supercomputing (SC), pp. 1--12

MILEPOST GCC: machine learning based research compiler
G Fursin, C Miranda, O Temam, E Yom-Tov
Proceedings of the GCC Developers' Summit, 2008

Outer-loop vectorization: revisited for short simd architectures
Dorit Nuzman, Ayal Zaks
Proceedings of the 17th international conference on Parallel architectures and compilation techniques, pp. 2--11, 2008

Compiling for an indirect vector register architecture
Dorit Nuzman, Mircea Namolaru, Ayal Zaks, Jeff H Derby
Proceedings of the 5th Conference on Computing Frontiers, pp. 199--208, 2008

On Architecture-Measuring Architectural Complexity
G Booch
IEEE Software 25(4), 14--15, IEEE Computer Society, 2008

Thousand-Core Chips
David Yeh, Li-Shiuan Peh, Shekhar Borkar, John A. Darringer, Anant Agarwal, Wen-mei Hwu
Design and Test Roundtable, 2008

Providing platform heterogeneity-awareness for data center power management
R Nathuji, C Isci, E Gorbatov, K Schwan
Cluster Computing 11(3), 259--271, Springer, 2008

Power management solutions for computer systems and datacenters
K Rajamani, C Lefurgy, S Ghiasi, J Rubio, H Hanson, T Keller
ISLPED, 2008

Exploring power management in multi-core systems
Reinaldo A. Bergamaschi, Guoling Han, Alper Buyuktosunoglu, Hiren D. Patel, Indira Nair, Gero Dittmann, Geert Janssen, Nagu R. Dhanwada, Zhigang Hu, Pradip Bose, John A. Darringer
Proceedings of the 13th IEEE/ACM Asia and South Pacific Design Automation Conference (ASP-DAC 2008), pp. 708--713

Keeping hot chips cool: are IC thermal problems hot air?
R Puri, D Varma, D Edwards, A J Weger, P Franzon, A Yang, S Kosonocky
Proceedings of the 45th annual Design Automation Conference, pp. 634--635, 2008

Tapping into the fountain of CPUs: on operating system support for programmable devices
Yaron Weinsberg, Danny Dolev, Tal Anker, Muli Ben-Yehuda, Pete Wyckoff
ASPLOS '08: Thirteenth international conference on Architectural Support for Programming Languages and Operating Systems, pp. 179--188, 2008

Hybrid Access-Specific Software Cache Techniques for the Cell BE Architecture
Marc Gonzalez, Nikola Vujic, Alexandre E. Eichenberger, Xavier Martorell, Eduard Ayguade, Tong Chen, Zehra Sura, Tao Zhang, Kevin O'Brien, Kathryn O'Brien
Proceedings of the 17th international conference on Parallel architectures and compilation techniques (PACT), pp. 292--302, 2008

A lightweight streaming layer for multicore execution
David Zhang, Qiuyuan J Li, Rodric Rabbah, Saman Amarasinghe
SIGARCH Comput. Archit. News 36(2), 18--27, ACM, 2008

Liquid Metal: Object-Oriented Programming Across the Hardware/Software Boundary
Shan Shan Huang, Amir Hormati, David F. Bacon, Rodric Rabbah
Proceedings of the 22nd European Conference on Object-Oriented Programming (ECOOP), pp. 76--103, Springer-Verlag, 2008


2007

A framework for architecture-level lifetime reliability modeling
Jeonghee Shin, Victor Zyuban, Zhigang Hu, Jude A Rivers, Pradip Bose
Dependable Systems and Networks, 2007. DSN'07. 37th Annual IEEE/IFIP International Conference on, pp. 534--543

On Flow Completion Time Benchmarking in Datacenters
M Gusat, C Minkenberg, R Luijten
2007 - sb.tmit.bme.hu

Leveraging and Reinforcing FOSS - Experiences from GCC
D. Nuzman, D. Edelsohn
IBM Academy of Technology Open Source Software Experiences and Best Practices Conference, 2007

Optimizing Reaching Definitions Overhead in Queue Processors
Yuki Nakanishi, Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
Journal of Convergence Information Technology 2(4), 36--40, 2007

Compiler Framework for an Embedded 32-bit Queue Processor
Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
Proceedings of the International Conference on Convergence Information Technology, pp. 877--884, 2007

An Efficient Code Generation Algorithm for Code Size Reduction using 1-offset P-Code Queue Computation Model
Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
Proceedings of the IFIP International Conference on Embedded and Ubiquitous Computing EUC2007, pp. 196--208

Queue Register File Optimization Algorithm for QueueCore Processor
Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
19th International Symposium on Computer Architecture and High Performance Computing, pp. 169--176, 2007

New Code Generation Algorithm for QueueCore - An Embedded Processor with High ILP
Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
Proceedings of the Eighth International Conference on Parallel and Distributed Computing, Applications and Technologies, pp. 185--192, 2007

Five Years with the High Productivity Computing Systems Program: A Perspective
E Elnozahy
2007 IEEE International Parallel and Distributed Processing Symposium, pp. 434

Predicting program behavior based on objective function minimization
R Sarikaya, A Buyuktosunoglu
IEEE 10th International Symposium on Workload Characterization, 2007, pp. 25--34

Evaluating design tradeoffs in on-chip power management for CMPs
J Sharkey, A Buyuktosunoglu, P Bose
Proceedings of the 2007 international symposium on Low power electronics and design, pp. 49

EnergyScale for IBM POWER6 microprocessor-based systems
H Y McCreary, MA Broyles, MS Floyd, AJ Geissler, SP Hartman, FL Rawson, TJ Rosedahl, JC Rubio, MS Ware
IBM Journal of Research and Development 51(6), 775--786, IBM Corp., 2007

System power management support in the IBM POWER6 microprocessor
MS Floyd, S Ghiasi, TW Keller, K Rajamani, FL Rawson, JC Rubio, MS Ware
IBM Journal of Research and Development 51(6), 733--746, IBM Corp., 2007

Affine transformations for communication minimal parallelization and locality optimization of arbitrarily-nested loop sequences
U Bondhugula, M Baskaran, S Krishnamoorthy, J Ramanujam, A Rountev, P Sadayappan
OSU-CISRC-TR43, The Ohio State University, 2007

Automatic mapping of nested loops to FPGAs
Uday Bondhugula, J Ramanujam, P Sadayappan
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming, pp. 111, 2007

Effective automatic parallelization of stencil computations
S Krishnamoorthy, M Baskaran, U Bondhugula, J Ramanujam, A Rountev, P Sadayappan
ACM SIGPLAN PLDI, pp. 244, 2007

Towards optimal multi-level tiling for stencil computations
L Renganarayanan, M Harthikote-Matha, R Dewri, SV Rajopadhye
21st IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2007

Multi-level tiling: M for the price of one
D G Kim, L Renganarayanan, D Rostron, S Rajopadhye, M M Strout
Proceedings of the 2007 ACM/IEEE conference on Supercomputing (SC)

Parameterized tiled loops for free
L Renganarayanan, D G Kim, S Rajopadhye, M M Strout
ACM SIGPLAN conference on Programming language design and implementation (PLDI), pp. 414, 2007

The economics of architecture-first
G Booch
IEEE SOFTWARE, 18--20, IEEE Computer Society, 2007

Multi-core design automation challenges
JA Darringer
Design Automation Conference, pp. 760--764, 2007

Exploiting platform heterogeneity for power efficient data centers
R Nathuji, C Isci, E Gorbatov
Fourth International Conference on Autonomic Computing, 2007, pp. 5--5

Workload Adaptive Power Management with Live Phase Monitoring and Prediction
C Isci, Princeton University, Dept. of Electrical …
2007 - Citeseer

Unique Chips and Systems
E John, J Rubio
2007 - books.google.com

OS-Aware Branch Prediction: Improving Microprocessor Control Flow Prediction for Operating Systems
T Li, LK John, A Sivasubramaniam, N Vijaykrishnan, …
IEEE TRANSACTIONS ON COMPUTERS, 2007 - doi.ieeecomputersociety.org

Thermal response to DVFS: analysis with an Intel Pentium M
H Hanson, SW Keckler, S Ghiasi, K Rajamani, F Rawson, J Rubio
Proceedings of the 2007 international symposium on Low power …, 2007 - portal.acm.org

Architecture and Implementation of the TRIPS Processor
MS Govindan, P Gratz, D Gulati, H Hanson, C Kim, H …
Unique Chips and Systems, 2007 - books.google.com

Implementation and Evaluation of a Dynamically Routed Processor Operand Network
P Gratz, K Sankaralingam, H Hanson, P Shivakumar, R …
Proc. 1st Intl Symp. Networks-on-Chip (NOCS 07), 2007 - doi.ieeecomputersociety.org

Power, Performance, and Thermal Management for High-Performance Systems
H Hanson, S W Keckler, K Rajamani, S Ghiasi, F Rawson, J Rubio
2007

Novel Addressing Method for Aggregate Types in Queue Processors
Teruhisa Yuki, Arquimedes Canedo, Ben Abderazek, Masahiro Sowa
Proceedings of the International Conference on Convergence Information Technology, pp. 1793--1796, 2007

Performance and security lessons learned from virtualizing the alpha processor
P A Karger
ACM SIGARCH Computer Architecture News 35(2), 401, ACM, 2007

Performance modeling for early analysis of multi-core systems
Reinaldo A. Bergamaschi, Indira Nair, Gero Dittmann, Hiren D. Patel, Geert Janssen, Nagu R. Dhanwada, Alper Buyuktosunoglu, Emrah Acar, Gi-Joon Nam, Dorothy Kucar, Pradip Bose, John A. Darringer, Guoling Han
Proceedings of the 5th IEEE/ACM international conference on hardware/software codesign and system synthesis (CODES+ISSS), pp. 209--214, 2007

Characterizing a complex J2EE workload: A comprehensive analysis and opportunities for optimizations
Y Shuf, IM Steiner
International Symposium on Performance Analysis of Systems …, 2007 - doi.ieeecomputersociety.org

Temperature-limited microprocessors: Measurements and design implications
H F Hamann, A Weger, J Lacey, Z Hu, P Bose, E Cohen, J Wakil
2007 - computer.org, IEEE Computer Society

Hotspot-limited microprocessors: Direct temperature and power distribution measurements
Hendrik F. Hamann, Alan J. Weger, James A. Lacey, Zhigang Hu, Pradip Bose, Erwin Cohen, and Jamil Wakil
IEEE Journal of Solid-State Circuits 42(1), 56--65, IEEE, 2007


2006

MPI microtask for programming the Cell Broadband Engine processor
M. Ohara, H. Inoue, Y. Sohda, H. Komatsu, T. Nakatani
IBM Systems Journal 45(1), 85--102, IBM, 2006

Designing a crossbar scheduler for HPC applications
C Minkenberg, F Abel, P Muller, R Krishnamurthy, M Gusat, P Dill, I Iliadis, R Luijten, BR Hemenway, R Grzybowski, others
IEEE Micro 26(3), 58--71, 2006

Temperature Management: Investigating the Effects of Task Scheduling on Thermal Behavior
Eren Kursun, Chen Yong Cher, Alper Buyuktosunoglu, Pradip Bose
In Proceedings of International Symposium on Computer Architecture, Temperature-Aware Computer Systems, 2006

Software/Hardware Co-managed Cache Optimizations
Rajiv Ravindran, Krishnan Kailas, Zehra Sura
IBM Research Report RC23998, IBM Research Division, 2006

Hardware/software integration for FPGA-based all-pairs shortest-paths
Uday Bondhugula, A Devulapalli, J Dinan, J Fernando, P Wyckoff, E Stahlberg, P Sadayappan
Proceedings of the 14th IEEE Symposium on Field-Programmable Custom Computing Machines, pp. 152--164, 2006

Parallel FPGA-based all-pairs shortest-paths in a directed graph
Uday Bondhugula, A Devulapalli, J Fernando, P Wyckoff, P Sadayappan
Proceedings of the 20th IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2006

Auto-vectorization of interleaved data for SIMD
Dorit Nuzman, Ira Rosen, Ayal Zaks
ACM SIGPLAN Notices, pp. 132--143, 2006

From single core to multi-core: preparing for a new exponential
J Parkhurst, J Darringer, B Grundmann
ICCAD, pp. 67--72, 2006

Live, runtime phase monitoring and prediction on real systems with application to dynamic power management
C Isci, G Contreras, M Martonosi
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 359--370, 2006

Phase characterization for power: Evaluating control-flow-based and event-counter-based techniques
C Isci, M Martonosi
HPCA’06: Proceedings of the Twelfth International Symposium on High-Performance Computer Architecture, 2006

An analysis of efficient multi-core global power management policies: Maximizing performance for a given power budget
C Isci, A Buyuktosunoglu, C Y Cher, P Bose, M Martonosi
39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 347--358, 2006

Benchmarking for Power and Performance
H Hanson, K Rajamani, J Rubio, S Ghiasi, F Rawson
2006 SPEC Benchmark Workshop

Application-Aware Power Management
K Rajamani, H Hanson, J Rubio, S Ghiasi, F Rawson
Proceedings of the 2006 IEEE International Symposium on … - doi.ieeecomputersociety.org

Online Power and Performance Estimation for Dynamic Power Management
K Rajamani, H Hanson, JC Rubio, S Ghiasi, FL …
Research Report RC - domino.research.ibm.com, 2006


https://www.research.ibm.com/journal/rd/502/agerwala.html
T Agerwala, M Gupta
IBM Journal of Research and Development, 2006 - research.ibm.com

Performance and environment monitoring for continuous program optimization
C Cascaval, E Duesterwald, P F Sweeney, R W Wisniewski
IBM Journal of Research and Development 50(2-3), 239--248, IBM, 2006

HPC-Colony: services and interfaces for very large systems
Kalé, T Jones, A Tauferner, T Inglett, J Moreira
ACM SIGOPS Operating Systems Review, 2006 - portal.acm.org

Braids and Fibers: Language Constructs with Architectural Support for Adaptive Responses to Memory Latencies
David F. Bacon, Xiaowei Shen
IBM Journal of Research and Development 50(2/3), 209--221, IBM Corp., 2006

Chip-level integration: the new frontier for microprocessor architecture
Jaime H Moreno
ACM Symposium on Parallel Algorithms and Architectures, Boston, 2006

Systems on a chip: The new generation of microprocessors (in Spanish)
Jaime H Moreno
Informatica (Chile), 2006

Online performance auditing: using hot optimizations without getting burned
J Lau, M Arnold, M Hind, B Calder
Proceedings of the 2006 ACM SIGPLAN conference on …, 2006 - portal.acm.org

Online phase detection algorithms
Priya Nagpurkar, Michael Hind, Chandra Krintz, Peter F Sweeney, VT Rajan
Code Generation and Optimization, 2006. CGO 2006. International Symposium on, 13 pp.

Using Advanced Compiler Technology to Exploit the Performance of the Cell Broadband Engine Architecture
A.E. Eichenberger, J. K. O'Brien, K. M. O'Brien, P. Wu, T. Chen, P. H. Oden, D. A. Prener, J. C. Shepherd, B. So, Z. Sura, A. Wang, T. Zhang, P. Zhao, M. K. Gschwind, R. Archambault, Y. Gao, and R. Koo
IBM Systems Journal 45(1), 2006

Loosely coupled tcp acceleration architecture
L Shalev, V Makhervaks, Z Machulsky, G Biran, J Satran, M Ben-Yehuda, I Shimony
14th IEEE Symposium on High-Performance Interconnects, pp. 3--8, IEEE Computer Society, 2006

A characterization of a java-based commercial workload on a high-end enterprise server
IM Steiner, Y Shuf
ACM SIGMETRICS Performance Evaluation Review, 2006 - portal.acm.org

Power Distribution Measurements of the Dual Core PowerPC 970MP Microprocessor
HF Hamann, A Weger, J Lacey, E Cohen, C Atherton
IEEE International Solid-State Circuits Conference, 2006. …, 2006 - ieeexplore.ieee.org

Guest Editor Introduction
Erik Altman and Sumedh Sathaye
IBM Journal of Research and Development 50(2/3), 169, IBM, 2006

A tutorial example of a cache memory protocol and RTL implementation
S German, G Janssen
IBM Research Report, RC23958, Tech. Rep, 2006

Delivering Teraflops: An Account of how Blue Gene was Brought to Life
JE Moreira, IBM T.J. Watson Research Center, Yorktown Heights
IEEE John Vincent Atanasoff 2006 International Symposium on …, 2006 - ieeexplore.ieee.org

Design and implementation of a one-sided communication interface for the IBM eServer Blue Gene …
Smith, G Almási, J Castaños, D Lieber, J Moreira …
Proceedings of the 2006 ACM/IEEE conference on …, 2006 - portal.acm.org


2005

Reducing Server Data Traffic Using a Hierarchical Computation Model
J Rubio, LK John
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2005 - doi.ieeecomputersociety.org


1996



1995

The SPLASH-2 programs: Characterization and methodological considerations
Steven Cameron Woo, Moriyoshi Ohara, Evan Torrie, Jaswinder Pal Singh, Anoop Gupta
Proceedings of the 22nd annual international symposium on Computer architecture, pp. 24--36, ACM, 1995



