Subhankar Pal

contact information

Postdoctoral Researcher, Efficient and Reliable Systems
Thomas J. Watson Research Center, Yorktown Heights, NY USA
  

links

Professional Associations:  ACM  |  ACM SIGARCH  |  ACM SIGMICRO  |  IEEE


2022

Versa: A 36-Core Systolic Multiprocessor With Dynamically Reconfigurable Interconnect and Memory
Sung Kim, Morteza Fayazi, Alhad Daftardar, Kuan-Yu Chen, Jielun Tan, Subhankar Pal, Tutu Ajayi, Yan Xiong, Trevor Mudge, Chaitali Chakrabarti, David Blaauw, Ronald Dreslinski, Hun-Seok Kim
IEEE Journal of Solid-State Circuits 57(4), 986-998, 2022

HetSched: Quality-of-Mission Aware Scheduling for Autonomous Vehicle SoCs
Aporva Amarnath, Subhankar Pal, Hiwot Kassa, Augusto Vega, Alper Buyuktosunoglu, Hubertus Franke, John-David Wellman, Ronald Dreslinski, Pradip Bose
arXiv, 2022

OnSRAM: Efficient Inter-Node On-Chip Scratchpad Management in Deep Learning Accelerators
Subhankar Pal, Swagath Venkataramani, Viji Srinivasan, Kailash Gopalakrishnan
ACM Trans. Embed. Comput. Syst., Association for Computing Machinery, 2022
Just Accepted


2021

A Holistic Solution for Reliability of 3D Parallel Systems
Javad Bagherzadeh, Aporva Amarnath, Jielun Tan, Subhankar Pal, Ronald G. Dreslinski
J. Emerg. Technol. Comput. Syst. 18(1), Association for Computing Machinery, 2021

Towards Closing the Programmability-Efficiency Gap using Software-Defined Hardware
Subhankar Pal
2021
closing, software, computer hardware, computer science, adaptive hardware

SparseAdapt: Runtime Control for Sparse Linear Algebra on a Reconfigurable Accelerator
Subhankar Pal, Aporva Amarnath, Siying Feng, Michael O'Boyle, Ronald Dreslinski, Christophe Dubach
MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 1005-1021, ACM, 2021
control reconfiguration, cache, synchronization, oracle, linear algebra, parallel computing, multiplication, granularity, shared resource, computer science

Efficient Management of Scratch-Pad Memories in Deep Learning Accelerators
Subhankar Pal, Swagath Venkataramani, Viji Srinivasan, Kailash Gopalakrishnan
2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 240-242, IEEE

Heterogeneity-Aware Scheduling on SoCs for Autonomous Vehicles
Aporva Amarnath, Subhankar Pal, Hiwot Tadese Kassa, Augusto Vega, Alper Buyuktosunoglu, Hubertus Franke, John-David Wellman, Ronald Dreslinski, Pradip Bose
IEEE Computer Architecture Letters 20(2), 82-85, Institute of Electrical and Electronics Engineers (IEEE), 2021
scheduling, design space exploration, dynamic priority scheduling, distributed computing, space exploration, reduction, computer science, mission time, multiple applications, processor scheduling

CoSPARSE: A Software and Hardware Reconfigurable SpMV Framework for Graph Analytics
Siying Feng, Jiawen Sun, Subhankar Pal, Xin He, Kuba Kaszyk, Dong-hyeon Park, Magnus Morton, Trevor Mudge, Murray Cole, Michael F. P. O'Boyle, Chaitali Chakrabarti, Ronald Dreslinski
58th Design Automation Conference, ACM Association for Computing Machinery, 2021
software, computer architecture, computer science, graph analytics

Versa: A Dataflow-Centric Multiprocessor with 36 Systolic ARM Cortex-M4F Cores and a Reconfigurable Crossbar-Memory Hierarchy in 28nm
Sung Kim, Morteza Fayazi, Alhad Daftardar, Kuan-Yu Chen, Jielun Tan, Subhankar Pal, Tutu Ajayi, Yan Xiong, Trevor Mudge, Chaitali Chakrabarti, David Blaauw, Ronald Dreslinski, Hun-Seok Kim
2021 Symposium on VLSI Circuits, pp. 1-2, IEEE
memory hierarchy, versa, central processing unit, dataflow, mobile processor, arm architecture, synchronization, control reconfiguration, parallel computing, computer science


2020

R2D3: A Reliability Engine for 3D Parallel Systems
Javad Bagherzadeh, Aporva Amarnath, Jielun Tan, Subhankar Pal, Ronald G. Dreslinski
2020 57th ACM/IEEE Design Automation Conference (DAC), pp. 1-6, IEEE
failure rate, reliability, performance improvement, controller, reliability engineering, baseline, computer science, degradation, reliability management

STOMP: A Tool for Evaluation of Scheduling Policies in Heterogeneous Multi-Processors
Augusto Vega, Aporva Amarnath, John-David Wellman, Hiwot Kassa, Subhankar Pal, Hubertus Franke, Alper Buyuktosunoglu, Ronald G. Dreslinski, Pradip Bose
arXiv preprint arXiv:2007.14371, 2020
scheduling, distributed computing, chip, computer science, criticality, general purpose, homogeneous

HETSIM: Simulating Large-Scale Heterogeneous Systems using a Trace-driven, Synchronization and Dependency-Aware Framework
Subhankar Pal, Kuba Kaszyk, Siying Feng, Bjorn Franke, Murray Cole, Michael O'Boyle, Trevor Mudge, Ronald G. Dreslinski
2020 IEEE International Symposium on Workload Characterization (IISWC), pp. 13-24, IEEE
emulation, synchronization, trace, software, distributed computing, dependency, chip, computer science, power, scale

Accelerating Deep Neural Network Computation on a Low Power Reconfigurable Architecture
Yan Xiong, Jian Zhou, Subhankar Pal, David Blaauw, Hun-Seok Kim, Trevor Mudge, Ronald Dreslinski, Chaitali Chakrabarti
2020 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1-5, IEEE
cache, recurrent neural network, node, artificial neural network, bridging, control reconfiguration, computer architecture, transformer, kernel, computer science

Accelerating Linear Algebra Kernels on a Massively Parallel Reconfigurable Architecture
Anuraag Soorishetty, Jian Zhou, Subhankar Pal, David Blaauw, Hun-Seok Kim, Trevor Mudge, Ronald Dreslinski, Chaitali Chakrabarti
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1558-1562, IEEE
lu decomposition, cache, qr decomposition, linear algebra, control reconfiguration, massively parallel, triangular matrix, transformer, matrix, solver, parallel computing, computer science

A 7.3 M Output Non-Zeros/J, 11.7 M Output Non-Zeros/GB Reconfigurable Sparse Matrix-Matrix Multiplication Accelerator
Dong-Hyeon Park, Subhankar Pal, Siying Feng, Paul Gao, Jielun Tan, Austin Rovinski, Shaolin Xie, Chun Zhao, Aporva Amarnath, Timothy Wesley, Jonathan Beaumont, Kuan-Yu Chen, Chaitali Chakrabarti, Michael Bedford Taylor, Trevor Mudge, David Blaauw, Hun-Seok Kim, Ronald G. Dreslinski
IEEE Journal of Solid-State Circuits 55(4), 933-944, Institute of Electrical and Electronics Engineers (IEEE), 2020
matrix multiplication, multiplication, sparse matrix, memory hierarchy, cache, graph, chip, spectral efficiency, discrete mathematics, physics

Sparse-TPU: adapting systolic arrays for sparse matrices
Xin He, Subhankar Pal, Aporva Amarnath, Siying Feng, Dong-Hyeon Park, Austin Rovinski, Haojie Ye, Yuhan Chen, Ronald Dreslinski, Trevor Mudge
Proceedings of the 34th ACM International Conference on Supercomputing, ACM, 2020
sparse matrix, systolic array, matrix, multiplication, kernel, floating point, overhead, throughput, computational science, computer science

Transmuter: Bridging the Efficiency Gap using Memory and Dataflow Reconfiguration
Subhankar Pal, Siying Feng, Dong-hyeon Park, Sung Kim, Aporva Amarnath, Chi-Sheng Yang, Xin He, Jonathan Beaumont, Kyle May, Yan Xiong, Kuba Kaszyk, John Magnus Morton, Jiawen Sun, Michael O'Boyle, Murray Cole, Chaitali Chakrabarti, David Blaauw, Hun-Seok Kim, Trevor Mudge, Ronald Dreslinski
Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques, pp. 175-190, ACM Association for Computing Machinery, 2020
dataflow, hardware acceleration, dennard scaling, control reconfiguration, data structure, throughput, field programmable gate array, dynamic data, computer architecture, computer science


2019

Parallelism Analysis of Prominent Desktop Applications: An 18-Year Perspective
Siying Feng, Subhankar Pal, Yichen Yang, Ronald G. Dreslinski
2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 202-211, IEEE
task parallelism, dennard scaling, multi core processor, clock rate, software, thermal design power, parallelism, computer architecture, computer science, state

A 7.3 M Output Non-Zeros/J Sparse Matrix-Matrix Multiplication Accelerator using Memory Reconfiguration in 40 nm
Subhankar Pal, Dong-hyeon Park, Siying Feng, Paul Gao, Jielun Tan, Austin Rovinski, Shaolin Xie, Chun Zhao, Aporva Amarnath, Timothy Wesley, Jonathan Beaumont, Kuan-Yu Chen, Chaitali Chakrabarti, Michael Taylor, Trevor Mudge, David Blaauw, Hun-Seok Kim, Ronald Dreslinski
2019 Symposium on VLSI Technology, IEEE
matrix multiplication, sparse matrix, multiplication, memory hierarchy, central processing unit, chip, cache, system on a chip, topology, physics


2018

OuterSPACE: An Outer Product Based Sparse Matrix Multiplication Accelerator
Subhankar Pal, Jonathan Beaumont, Dong-Hyeon Park, Aporva Amarnath, Siying Feng, Chaitali Chakrabarti, Hun-Seok Kim, David Blaauw, Trevor Mudge, Ronald Dreslinski
2018 IEEE International Symposium on High Performance Computer Architecture (HPCA), pp. 724-736, IEEE Computer Society
matrix multiplication, multiplication algorithm, sparse matrix, speedup, xeon, massively parallel, high bandwidth memory, spmd, parallel computing, computer science


2017

A carbon nanotube transistor based RISC-V processor using pass transistor logic
Aporva Amarnath, Siying Feng, Subhankar Pal, Tutu Ajayi, Austin Rovinski, Ronald G. Dreslinski
2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED), pp. 1-6, IEEE
pass transistor logic, carbon nanotube field effect transistor, logic gate, cmos, transistor, threshold voltage, critical path method, adder, electrical engineering, electronic engineering, engineering


2014

A New Design of an N-Bit Reversible Arithmetic Logic Unit
Subhankar Pal, Chetan Vudadha, P. Sai Phaneendra, Sreehari Veeramachaneni, Srinivas Mandalika
2014 Fifth International Symposium on Electronic System Design, pp. 224-225, IEEE
adder, serial binary adder, carry save adder, pass transistor logic, arithmetic logic unit, three input universal logic gate, dissipation, quantum gate, arithmetic, electronic engineering, mathematics