A taxonomy of task-based parallel programming technologies for high-performance computing
Peter Thoman, Kiril Dichev, Thomas Heller, Roman Iakymchuk, Xavier Aguilar, Khalid Hasanov, Philipp Gschwandtner, Pierre Lemarinier, Stefano Markidis, Herbert Jordan, Thomas Fahringer, Kostas Katrinis, Erwin Laure, Dimitrios S. Nikolopoulos
The Journal of Supercomputing, Springer, 2018

Exploiting the non-linear current-voltage characteristics for resistive memory readout
Nikolaos Papandreou, Abu Sebastian, Haralampos Pozidis
The International Symposium on Circuits and Systems (ISCAS), 2018

Signal and noise extraction from non-volatile memory for neuromorphic computing: A machine learning based approach
T. Ando, N. Gong, T. Ide, S. Kim, I. Boybat, A. Sebastian, V. Narayanan
Neuro Inspired Computational Elements Workshop (NICE), 2018

A Software-defined SoC Memory Bus Bridge Architecture for Disaggregated Computing
Syrivelis, Dimitris and Reale, Andrea and Katrinis, Kostas and Pinto, Christian
3rd International Workshop on Advanced Interconnect Solutions and Technologies for Emerging Computing Systems (AISTECS 2018, part of HiPEAC 2018)

VELOUR: Very Low Voltage Operation Under Resilience Constraints
Schuyler Eldridge, Alec Roelke, Xinfei Guo, Vaibhav Verma, Karthik Swaminathan, Nandhini Chandramoorthy, Martin Cochet, Alper Buyuktosunoglu, Christos Vezyrtzis, Rajiv Joshi, Matt Ziegler, Mircea Stan, Pradip Bose
GOMACTech, 2018

Cascaded and Resonant SRAM Supply Boosting for Ultra-Low Voltage Cognitive IoT Applications
Rajiv V. Joshi, Matthew M. Ziegler, Karthik Swaminathan, Nandhini Chandramoorthy
IEEE Custom Integrated Circuits Conference (CICC), 2018

Impact of Software Approximations on the Resiliency of a Video Summarization System
Radha Venkatagiri, Karthik Swaminathan, Chung-Ching Lin, Liang Wang, Alper Buyuktosunoglu, Pradip Bose
48th IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2018), IEEE

Mixed-precision in-memory computing
Manuel Le Gallo*, Abu Sebastian*, Roland Mathis, Matteo Manica, Heiner Giefers, Tomas Tuma, Costas Bekas, Alessandro Curioni, Evangelos Eleftheriou
Nature Electronics 1, 246-253, 2018

Spiking neural networks enable two-dimensional neurons and unsupervised multi-timescale learning
Timoleon Moraitis*, Abu Sebastian*, Evangelos Eleftheriou
International Joint Conference on Neural Networks (IJCNN), 2018

Analytic multi-core processor model for fast design-space exploration
Rik Jongerius, Andreea Anghel, Gero Dittmann, Giovanni Mariani, Erik Vermij, Henk Corporaal
Transactions on Computers 67(6), 755-770, IEEE, 2018

Shadow Puppets: Cloud-level Accurate AI Inference at the Speed and Economy of Edge
Venugopal, Srikumar and Gazzetti, Michele and Gkoufas, Yiannis and Katrinis, Kostas
USENIX Workshop on Hot Topics in Edge Computing (HotEdge), 2018

Enabling 3D-TLC NAND Flash in Enterprise Storage Systems
Pletka, Roman and Ioannou, Nikolas and Papandreou, Nikolaos and Parnell, Thomas and Tomic, Sasa

Signal and noise extraction from analog memory elements for neuromorphic computing
N. Gong, T. Ide, S. Kim, I. Boybat, A. Sebastian, V. Narayanan, T. Ando
Nature Communications 9(2102), 2018

Extending the POWER Architecture with Transprecision Co-Processors
Giefers, Heiner and Diamantopoulos, Dionysios
Circuits and Systems (ISCAS), 2018 IEEE International Symposium on, pp. 1--5

Equivalent-accuracy accelerated neural-network training using analogue memory
S. Ambrogio, P. Narayanan, H. Tsai, R. M. Shelby, I. Boybat, C. di Nolfo, S. Sidler, M. Giordano, M. Bodini, N. C. P. Farinha, B. Killeen, C. Cheng, Y. Jaoudi and G. W. Burr
Nature, 2018

Neuromorphic computing with multi-memristive synapses
Irem Boybat*, Manuel Le Gallo, S.R. Nandakumar, Timoleon Moraitis, Thomas Parnell, Tomas Tuma, Bipin Rajendran, Yusuf Leblebici, Abu Sebastian*, Evangelos Eleftheriou
Nature Communications 9(2514), 2018

Provisioning a bare-metal server
Abali, Bulent and Crumley, Paul G and Franke, Hubertus and Li, Chung-Sheng
US Patent App. 15/337,791

Performance Characterization of State-Of-The-Art Deep Learning Workloads on an IBM "Minsky" Platform
Guignard, Mauricio and Schild, Marcelo and Bederián, Carlos S and Wolovick, Nicolás and Vega, Augusto J
Proceedings of the 51st Hawaii International Conference on System Sciences, 2018

Towards a Composable Computer System
Chung, I and Abali, Bulent and Crumley, Paul and others
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, pp. 137--147, 2018

Cache Management in RDMA Distributed Key/Value Stores Based on Atomic Operations
Michel Hack, Yufei Ren, Yandong Wang, Li Zhang
US Patent 10,031,883; US Patent 10,037,302

Energy-secure swarm power management
Vega, Augusto and Buyuktosunoglu, Alper and Bose, Pradip
Design, Automation & Test in Europe Conference & Exhibition (DATE), 2018, pp. 1652--1657

Towards "Smarter" Vehicles through Cloud-Backed Swarm Cognition
A. Vega, A. Buyuktosunoglu, P. Bose
IEEE Intelligent Vehicles Symposium (IV 2018)


Performance Characteristics of Linux for Java Workloads Oversubscribing Memory
Takuya Nakaike Yohei Ueda Takanori and Ohara, Moriyoshi
Conference proceedings Japan Society for Software Science and Technology, 2017

SHIVA: An Integrated Toolset for Cross-Layer Modeling in Support of Resilient, Low-Power Embedded Processor Design
Karthik Swaminathan, Ramon Bertran, Schuyler Eldridge, Chen-Yong Cher, Hans Jacobson, Augusto Vega, Alper Buyuktosunoglu, John-David Wellman, Robert Montoye, Pradip Bose
GOMACTech-17, 2017

Adaptive Clocking in the POWER9™ Processor for Voltage Droop Protection
Pierce I-Jen Chuang, Christos Vezyrtzis, Divya Pathak, Richard Rizzolo, Tobias Webel, Thomas Strach, Otto Torreiter, Preetham Lobo, Alper Buyuktosunoglu, Ramon Bertran, Michael Floyd, Malcolm Ware, Gerard Salem, Sean Carey, Phillip Restle
IEEE ISSCC Technical Digest, 2017

Power Supply Noise in a 22nm z13™ Microprocessor
P. I. Chuang, C. Vezyrtzis, D. Pathak, R. Rizzolo, T. Webel, T. Strach, O. Torreiter, P. Lobo, A. Buyuktosunoglu, R. Bertran, M. Floyd, M. Ware, G. Salem, S. Carey, P. Restle
2017 IEEE International Solid-State Circuits Conference (ISSCC), pp. 438-439

MicroProbe: An Open Source Microbenchmark Generator Ported to the RISC-V ISA
Eldridge, Schuyler and Bertran, Ramon and Buyuktosunoglu, Alper and Bose, Pradip
7th RISC-V Workshop, 2017

BRAVO: Balanced Reliability-Aware Voltage Optimization
K. Swaminathan, N. Chandramoorthy, C. Cher, R. Bertran, A. Buyuktosunoglu, P. Bose
International Symposium on High-Performance Computer Architecture (HPCA), 2017

LORE: A Loop Repository for the Evaluation of Compilers
Chen, Zhi and Gong, Zhangxiaowen and Szaday, Justin Josef and Wong, David C and Padua, David and Nicolau, Alexandru and Veidenbaum, Alexander V and Watkinson, Neftali and Sura, Zehra and Maleki, Saeed and others
2017 IEEE International Symposium on Workload Characterization (IISWC), pp. 219--228

26.2 Power supply noise in a 22nm z13™ microprocessor
Chuang, Pierce I-Jen and Vezyrtzis, Christos and Pathak, Divya and Rizzolo, Richard and Webel, Tobias and Strach, Thomas and Torreiter, Otto and Lobo, Preetham and Buyuktosunoglu, Alper and others
Solid-State Circuits Conference (ISSCC), 2017 IEEE International, pp. 438--439

A 32Gb/s, 4.7pJ/bit Optical Link with -11.7dBm Sensitivity in 14nm FinFET CMOS
J. Proesel, Z. Toprak-Deniz, A. Cevrero, I. Ozkaya, S. Kim, D. Kuchta, S. Lee, S. Rylov, H. Ainspan, T. Dickson, J. Bulzacchelli, and M. Meghelli
Symposium on VLSI Circuits, pp. 318-319, 2017

A 1.8pJ/bit 56Gb/s PAM4 Transmitter with Fractionally-Spaced FFE in 14nm CMOS
T. O. Dickson, H. A. Ainspan, and M. Meghelli
IEEE International Solid-State Circuits Conference (ISSCC), pp. 118-119, 2017

Very Low Voltage (VLV) Design
Bertran, Ramon and Bose, Pradip and Brooks, David and Burns, Jeff and Buyuktosunoglu, Alper and Chandramoorthy, Nandhini and Cheng, Eric and Cochet, Martin and Eldridge, Schuyler and Friedman, Daniel and others
Computer Design (ICCD), 2017 IEEE International Conference on, pp. 601--604

libPRISM: an intelligent adaptation of prefetch and SMT levels
Ortega, Cristobal and Moreto, Miquel and Casas, Marc and Bertran, Ramon and Buyuktosunoglu, Alper and Eichenberger, Alexandre E and Bose, Pradip
Proceedings of the International Conference on Supercomputing, pp. 28, 2017

A Single-Node Datastore for High-Velocity Multidimensional Sensor Data
Juan Colmenares, Reza Dorrigiv, and Daniel Waddington
IEEE International Conference on Big Data (Big Data), 2017

Distributed learning of deep feature embeddings for visual recognition tasks
Bhattacharjee, B and Hill, ML and Wu, H and Chandakkar, PS and Smith, JR and Wegman, MN
IBM Journal of Research and Development 61(4), 4--1, IBM, 2017

Scalable Linux Container Provisioning in Fog and Edge Computing Platforms
Michele Gazzetti, Andrea Reale, Kostas Katrinis, Antonio Corradi
Euro-Par 2017: Parallel Processing Workshops - Euro-Par 2017 International Workshops, Santiago de Compostela, Spain, August 28-29, 2017, Revised Selected Papers, pp. 304--315

A Low Voltage RISC-V Heterogeneous System
Schuyler Eldridge, Karthik Swaminathan, Nandhini Chandramoorthy, Alper Buyuktosunoglu, Alec Roelke, Xinfei Guo, Vaibhav Verma, Rajiv Joshi, Mircea Stan, Pradip Bose
First Workshop on Computer Architecture Research with RISC-V (CARRV 2017)

Very Low Voltage (VLV) Design
Ramon Bertran, Karthik Swaminathan, et al.
IEEE International Conference on Computer Design (ICCD), 2017

Understanding and optimizing the performance of distributed machine learning applications on Apache Spark
C. Duenner, T. Parnell, K. Atasu, M. Sifalakis, H. Pozidis
Big Data (Big Data), 2017 IEEE International Conference on, pp. 331--338

An FPGA-based integrated MapReduce accelerator platform
Kachris, Christoforos and Diamantopoulos, Dionysios and Sirakoulis, Georgios Ch and Soudris, Dimitrios
Journal of Signal Processing Systems 87(3), 357--369, Springer, 2017

Efficient Fork-Join on GPUs Through Warp Specialization
Arpith Chacko Jacob, Alexandre E. Eichenberger, Hyojin Sung, Samuel F. Antão, Gheorghe-Teodor Bercea, Carlo Bertolli, Alexey Bataev, Tian Jin, Tong Chen, Zehra Sura, Georgios Rokos, Kevin O'Brien
24th IEEE International Conference on High Performance Computing, HiPC 2017, Jaipur, India, December 18-21, 2017, pp. 358--367

Implementing implicit OpenMP data sharing on GPUs
Gheorghe-Teodor Bercea, Carlo Bertolli, Arpith C. Jacob, Alexandre E. Eichenberger, Alexey Bataev, Georgios Rokos, Hyojin Sung, Tong Chen, Kevin O'Brien
Proceedings of the Fourth Workshop on the LLVM Compiler Infrastructure in HPC, LLVM-HPC@SC 2017, Denver, CO, USA, November 13, 2017, pp. 5:1--5:12

Multilayer Perceptron Algorithm: Impact of Nonideal Conductance and Area-Efficient Peripheral Circuits
L. L. Sanches, A. Fumarola, S. Sidler, P. Narayanan, I. Boybat, J. Jang, K. Moon, R. M. Shelby, Y. Leblebici, H. Hwang and G. W. Burr
Neuro-inspired Computing Using Resistive Synaptic Devices, 2017

A Low Power, Fully Event-Based Gesture Recognition System
Arnon Amir, Brian Taba, David Berg, Timothy Melano, Jeffrey McKinstry, Carmelo Di Nolfo, Tapan Nayak, Alexander Andreopoulos, Guillaume Garreau, Marcela Mendoza, Jeff Kusnitz, Michael Debole, Steve Esser, Tobi Delbruck, Myron Flickner, Dharmendra Modha
IEEE Computer Vision and Pattern Recognition, 2017

Nonvolatile Memory Crossbar Arrays for Non-von Neumann Computing
Sidler, Severin and Jang, Jun-Woo and Burr, Geoffrey W and Shelby, Robert M and Boybat, Irem and Di Nolfo, Carmelo and Narayanan, Pritish and Virwani, Kumar and Hwang, Hyunsang
Advances in Neuromorphic Hardware Exploiting Emerging Nanoscale Devices, pp. 129--149, Springer, 2017

Scalable Agentless Cloud Network Monitoring
Morgan Brattstrom, Patricia Morreale
2017 IEEE 4th International Conference on Cyber Security and Cloud Computing (CSCloud), IEEE

NAMD: Scalable Molecular Dynamics Based on the Charm++ Parallel Runtime System
Bilge Acun, Ronak Buch, Laxmikant Kale, James C Phillips
Exascale Scientific Applications: Scalability and Performance Portability, pp. 119, CRC Press, 2017

Support for Power Efficient Proactive Cooling Mechanisms
Bilge Acun, Eun Kyung Lee, Yoonho Park, Laxmikant V Kale
High Performance Computing (HiPC), 2017 IEEE 24th International Conference on, pp. 94--103

Sharing ownership of an input/output device using a device driver partition
Alvarez, Juan J and Arroyo, Jesse P and Crumley, Paul G and Graham, Charles S and Jann, Joefon and Schimke, Timothy J and Wu, Ching-Farn E
US Patent App. 15/001,874


Approximate computing: Challenges and opportunities
Agrawal, Ankur and Choi, Jungwook and Gopalakrishnan, Kailash and Gupta, Suyog and Nair, Ravi and Oh, Jinwook and Prener, Daniel A and Shukla, Sunil and Srinivasan, Vijayalakshmi and Sura, Zehra
Rebooting Computing (ICRC), IEEE International Conference on, pp. 1--8, 2016

Automatic Copying of Pointer-Based Data Structures
Chen, Tong and Sura, Zehra and Sung, Hyojin
International Workshop on Languages and Compilers for Parallel Computing, pp. 265--281, 2016

A 1.8-pJ/b 16x16-Gb/s Source Synchronous Parallel Interface in 32nm SOI CMOS with Receiver Redundancy for Link Recalibration
T. O. Dickson, Y. Liu, A. Agrawal, J. F. Bulzacchelli, H. Ainspan, Z. Toprak-Deniz, B. D. Parker, M.P. Beakes, M. Meghelli, and D.J. Friedman
IEEE Journal of Solid-State Circuits 51(8), 1744-1755, 2016

Efficient tomographic reconstruction for commodity processors with limited memory bandwidth
Inoue, Hiroshi
Biomedical Imaging (ISBI), 2016 IEEE 13th International Symposium on, pp. 747--750

Fragmented BWT: An Extended BWT for Full-Text Indexing
Ito, Masaru and Inoue, Hiroshi and Taura, Kenjiro
International Symposium on String Processing and Information Retrieval, pp. 97--109, 2016

An Effective Page Padding Method for RAM Buffer Algorithms to Enhance the SSD Endurance
E. Ogawa, K. Kise
2016 Fourth International Symposium on Computing and Networking (CANDAR), pp. 133-139

Automatic Copying of Pointer-Based Data Structures
Tong Chen, Zehra Sura, Hyojin Sung
Languages and Compilers for Parallel Computing - 29th International Workshop, LCPC 2016, Rochester, NY, USA, September 28-30, 2016, Revised Papers, pp. 265--281

DeltaSherlock: Identifying changes in the cloud
Turk, Ata and Chen, Hao and Byrne, Anthony and Knollmeyer, John and Duri, Sastry S and Isci, Canturk and Coskun, Ayse K
Big Data (Big Data), 2016 IEEE International Conference on, pp. 763--772

TrueNorth Ecosystem for Brain-Inspired Computing: Scalable Systems, Software, and Applications
J Sawada, F Akopyan, A Cassidy, B Taba, M Debole, P Datta, R Alvarez-Icaza, A Amir, J Arthur, A Andreopoulos, et al.
Proceedings of Supercomputing 2016

Convolutional networks for fast, energy-efficient neuromorphic computing
Steven K. Esser, Paul A. Merolla, John V. Arthur, Andrew S. Cassidy, Rathinakumar Appuswamy, Alexander Andreopoulos, David J. Berg, Jeffrey L. McKinstry, Timothy Melano, Davis R. Barch, Carmelo di Nolfo, Pallab Datta, Arnon Amir, Brian Taba, Myron D. Flic
Proceedings of the National Academy of Sciences 113(41), 11441--11446, National Academy of Sciences, 2016

Power, Reliability, and Performance: One System to Rule Them All
Bilge Acun, Akhil Langer, Esteban Meneses, Harshitha Menon, Osman Sarood, Ehsan Totoni, Laxmikant V Kale
Computer 49(10), 30--37, IEEE, 2016

Neural Network-Based Task Scheduling with Preemptive Fan Control
Bilge Acun, Eun Kyung Lee, Yoonho Park, Laxmikant V Kale
Proceedings of the 4th International Workshop on Energy Efficient Supercomputing, pp. 77--84, 2016


Review of Advances in Neural Networks: Neural Design Technology Stack
Stanislaw Woźniak, Adela-Diana Almási, Valentin Cristea, Yusuf Leblebici, Ton Engbersen
Proceedings of ELM-2014 Volume 1, pp. 367--376, Springer, 2015

A Case for Approximate Computing in Real-Time Mobile Cognition
K. Swaminathan, C. Lin, A. Vega, A. Buyuktosunoglu, P. Bose, S. Pankanti
Workshop on Approximate Computing (WACAS), in conjunction with ASPLOS 2015

Robust power management in the IBM z13
T. Webel, P.M. Lobo, R. Bertran, G.M. Salem, M. Allen-Ware, R. Rizzolo, S.M. Carey, T. Strach, A. Buyuktosunoglu, C. Lefurgy, P. Bose, R. Nigaglioni, T. Slegel, M.S. Floyd, B.W. Curran
IBM Journal of Research and Development 59(4), 16--16, 2015

Safe Limits on Voltage Reduction Efficiency in GPUs: a Direct Measurement Approach
Jingwen Leng, Alper Buyuktosunoglu, Ramon Bertran, Pradip Bose, Vijay Janapa Reddi
MICRO '15: Proceedings of the 48th Annual IEEE/ACM International Symposium on Microarchitecture, IEEE Computer Society, 2015

A 1.4-pJ/b, Power-Scalable 16x12-Gb/s Source-Synchronous I/O with DFE Receiver in 32nm SOI CMOS Technology
T. O. Dickson, Y. Liu, S. V. Rylov, A. Agrawal, S. Kim, P.-H. Hsieh, J. F. Bulzacchelli, M. Ferriss, H. A. Ainspan, A. Rylyakov, B. D. Parker, C. Baks, L. Shan, Y. Kwark, J. Tierno, and D.J. Friedman
IEEE Journal of Solid-State Circuits 50(8), 1917-1931, 2015

LabBook: Metadata-driven social collaborative data analysis
Kandogan, Eser and Roth, Mary and Schwarz, Peter and Hui, Joshua and Terrizzano, Ignacio and Christodoulakis, Christina and Miller, Renée J
Big Data (Big Data), 2015 IEEE International Conference on, pp. 431--440

Mitigating memory-induced dark silicon in many-accelerator architectures
Diamantopoulos, Dionysios and Xydis, Sotirios and Siozios, Kostas and Soudris, Dimitrios
IEEE Computer Architecture Letters 14(2), 136--139, IEEE, 2015

High-Level-Synthesis Extensions for Scalable Single-Chip Many-Accelerators on FPGAs
Diamantopoulos, Dionysios and Xydis, Sotirios and Siozios, Kostas and Soudris, Dimitrios
Field Programmable Logic and Applications (FPL), 2015 25th International Conference on, pp. 1--2

GENESIS: Parallel application placement onto reconfigurable architectures (invited for the special issue on runtime management)
Diamantopoulos, Dionysios and Siozios, Kostas and Xydis, Sotirios and Soudris, Dimitrios
ACM Transactions on Embedded Computing Systems (TECS) 14(1), 18, ACM, 2015

Evaluation of High-Level Synthesis Techniques for Memory and Datapath Tradeoffs in FPGA Based SoC Architectures
Sotiriou-Xanthopoulos, Efstathios and Diamantopoulos, Dionysios and Economakos, George
International Symposium on Applied Reconfigurable Computing, pp. 321--330, 2015

TEAChER: TEach AdvanCEd Reconfigurable Architectures and Tools
Siozios, Kostas and Figuli, Peter and Sidiropoulos, Harry and Tradowsky, Carsten and Diamantopoulos, Dionysios and Maragos, Konstantinos and Delicia, Shalina Percy and Soudris, Dimitrios and Becker, Jürgen
International Symposium on Applied Reconfigurable Computing, pp. 103--114, 2015

Experimental demonstration and tolerancing of a large-scale neural network (165 000 synapses) using phase-change memory as the synaptic weight element
Burr, Geoffrey W and Shelby, Robert M and Sidler, Severin and Di Nolfo, Carmelo and Jang, Junwoo and Boybat, Irem and Shenoy, Rohit S and Narayanan, Pritish and Virwani, Kumar and Giacometti, Emanuele U and others
IEEE Transactions on Electron Devices 62(11), 3498--3507, IEEE, 2015

Non-volatile memory as hardware synapse in neuromorphic computing: A first look at reliability issues
Shelby, Robert M and Burr, Geoffrey W and Boybat, Irem and Di Nolfo, Carmelo
Reliability Physics Symposium (IRPS), 2015 IEEE International, pp. 6A--1

Non-volatile memory crossbar arrays for non-Von Neumann computing
Burr, Geoffrey W and Shelby, Robert M and Boybat, Irem and Sidler, Severin and di Nolfo, Carmelo
Electronic Materials Conference (EMC), 2015

Crossbar arrays for Storage Class Memory and non-Von Neumann computing
Burr, Geoffrey W and Shelby, Robert M and Sidler, Severin and Narayanan, Pritish and Boybat, Irem and di Nolfo, Carmelo
European Phase-Change and Ovonic Symposium (E\PCOS), 2015

CORONET: Testbeds, demonstration, and lessons learned
Von Lehmen, Ann and Doverspike, Robert and Clapp, George and Freimuth, Douglas M and Gannett, Joel and Kolarov, Aleksandar and Kobrinski, Haim and Makaya, Christian and Mavrogiorgis, Emmanuil and Pastor, Jorge and others
IEEE/OSA Journal of Optical Communications and Networking 7(3), A447--A458, IEEE, 2015

Preliminary Evaluation of a Parallel Trace Replay Tool for HPC Network Simulations
Bilge Acun, Nikhil Jain, Abhinav Bhatele, Misbah Mubarak, Christopher D Carothers, Laxmikant V Kale
European Conference on Parallel Processing, pp. 417--429, 2015