Subhankar Pal
contact information
Postdoctoral Researcher, Efficient and Reliable Systems
Thomas J. Watson Research Center, Yorktown Heights, NY, USA
2022
Versa: A 36-Core Systolic Multiprocessor With Dynamically Reconfigurable Interconnect and Memory
Sung Kim, Morteza Fayazi, Alhad Daftardar, Kuan-Yu Chen, Jielun Tan, Subhankar Pal, Tutu Ajayi, Yan Xiong, Trevor Mudge, Chaitali Chakrabarti, David Blaauw, Ronald Dreslinski, Hun-Seok Kim
IEEE Journal of Solid-State Circuits 57(4), 986-998, 2022
HetSched: Quality-of-Mission Aware Scheduling for Autonomous Vehicle SoCs
Aporva Amarnath, Subhankar Pal, Hiwot Kassa, Augusto Vega, Alper Buyuktosunoglu, Hubertus Franke, John-David Wellman, Ronald Dreslinski, Pradip Bose
arXiv, 2022
OnSRAM: Efficient Inter-Node On-Chip Scratchpad Management in Deep Learning Accelerators
Subhankar Pal, Swagath Venkataramani, Viji Srinivasan, Kailash Gopalakrishnan
ACM Trans. Embed. Comput. Syst., Association for Computing Machinery, 2022
Just Accepted
2021
A Holistic Solution for Reliability of 3D Parallel Systems
Javad Bagherzadeh, Aporva Amarnath, Jielun Tan, Subhankar Pal, Ronald G. Dreslinski
J. Emerg. Technol. Comput. Syst. 18(1), Association for Computing Machinery, 2021
Towards Closing the Programmability-Efficiency Gap using Software-Defined Hardware
Subhankar Pal
2021
closing, software, computer hardware, computer science, adaptive hardware
SparseAdapt: Runtime Control for Sparse Linear Algebra on a Reconfigurable Accelerator
Subhankar Pal, Aporva Amarnath, Siying Feng, Michael O'Boyle, Ronald Dreslinski, Christophe Dubach
MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 1005-1021, ACM, 2021
control reconfiguration, cache, synchronization, oracle, linear algebra, parallel computing, multiplication, granularity, shared resource, computer science
Efficient Management of Scratch-Pad Memories in Deep Learning Accelerators
Subhankar Pal, Swagath Venkataramani, Viji Srinivasan, Kailash Gopalakrishnan
2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 240-242, IEEE
Heterogeneity-Aware Scheduling on SoCs for Autonomous Vehicles
Aporva Amarnath, Subhankar Pal, Hiwot Tadese Kassa, Augusto Vega, Alper Buyuktosunoglu, Hubertus Franke, John-David Wellman, Ronald Dreslinski, Pradip Bose
IEEE Computer Architecture Letters 20(2), 82-85, Institute of Electrical and Electronics Engineers (IEEE), 2021
scheduling, design space exploration, dynamic priority scheduling, distributed computing, space exploration, reduction, computer science, mission time, multiple applications, processor scheduling
CoSPARSE: A Software and Hardware Reconfigurable SpMV Framework for Graph Analytics
Siying Feng, Jiawen Sun, Subhankar Pal, Xin He, Kuba Kaszyk, Dong-hyeon Park, Magnus Morton, Trevor Mudge, Murray Cole, Michael F. P. O'Boyle, Chaitali Chakrabarti, Ronald Dreslinski
58th Design Automation Conference, ACM Association for Computing Machinery, 2021
software, computer architecture, computer science, graph analytics
Versa: A Dataflow-Centric Multiprocessor with 36 Systolic ARM Cortex-M4F Cores and a Reconfigurable Crossbar-Memory Hierarchy in 28nm
Sung Kim, Morteza Fayazi, Alhad Daftardar, Kuan-Yu Chen, Jielun Tan, Subhankar Pal, Tutu Ajayi, Yan Xiong, Trevor Mudge, Chaitali Chakrabarti, David Blaauw, Ronald Dreslinski, Hun-Seok Kim
2021 Symposium on VLSI Circuits, pp. 1-2, IEEE
memory hierarchy, versa, central processing unit, dataflow, mobile processor, arm architecture, synchronization, control reconfiguration, parallel computing, computer science
2020
R2D3: A Reliability Engine for 3D Parallel Systems
Javad Bagherzadeh, Aporva Amarnath, Jielun Tan, Subhankar Pal, Ronald G. Dreslinski
2020 57th ACM/IEEE Design Automation Conference (DAC), pp. 1-6, IEEE
failure rate, reliability, performance improvement, controller, reliability engineering, baseline, computer science, degradation, reliability management
STOMP: A Tool for Evaluation of Scheduling Policies in Heterogeneous Multi-Processors
Augusto Vega, Aporva Amarnath, John-David Wellman, Hiwot Kassa, Subhankar Pal, Hubertus Franke, Alper Buyuktosunoglu, Ronald G. Dreslinski, Pradip Bose
arXiv preprint arXiv:2007.14371, 2020
scheduling, distributed computing, chip, computer science, criticality, general purpose, homogeneous
HETSIM: Simulating Large-Scale Heterogeneous Systems using a Trace-driven, Synchronization and Dependency-Aware Framework
Subhankar Pal, Kuba Kaszyk, Siying Feng, Bjorn Franke, Murray Cole, Michael O'Boyle, Trevor Mudge, Ronald G. Dreslinski
2020 IEEE International Symposium on Workload Characterization (IISWC), pp. 13-24, IEEE
emulation, synchronization, trace, software, distributed computing, dependency, chip, computer science, power, scale
Accelerating Deep Neural Network Computation on a Low Power Reconfigurable Architecture
Yan Xiong, Jian Zhou, Subhankar Pal, David Blaauw, Hun-Seok Kim, Trevor Mudge, Ronald Dreslinski, Chaitali Chakrabarti
2020 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1-5, IEEE
cache, recurrent neural network, node, artificial neural network, bridging, control reconfiguration, computer architecture, transformer, kernel, computer science
Accelerating Linear Algebra Kernels on a Massively Parallel Reconfigurable Architecture
Anuraag Soorishetty, Jian Zhou, Subhankar Pal, David Blaauw, Hun-Seok Kim, Trevor Mudge, Ronald Dreslinski, Chaitali Chakrabarti
ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1558-1562, IEEE
lu decomposition, cache, qr decomposition, linear algebra, control reconfiguration, massively parallel, triangular matrix, transformer, matrix, solver, parallel computing, computer science
A 7.3 M Output Non-Zeros/J, 11.7 M Output Non-Zeros/GB Reconfigurable Sparse Matrix-Matrix Multiplication Accelerator
Dong-Hyeon Park, Subhankar Pal, Siying Feng, Paul Gao, Jielun Tan, Austin Rovinski, Shaolin Xie, Chun Zhao, Aporva Amarnath, Timothy Wesley, Jonathan Beaumont, Kuan-Yu Chen, Chaitali Chakrabarti, Michael Bedford Taylor, Trevor Mudge, David Blaauw, Hun-Seok Kim, Ronald G. Dreslinski
IEEE Journal of Solid-State Circuits 55(4), 933-944, Institute of Electrical and Electronics Engineers (IEEE), 2020
matrix multiplication, multiplication, sparse matrix, memory hierarchy, cache, graph, chip, spectral efficiency, discrete mathematics, physics
Sparse-TPU: adapting systolic arrays for sparse matrices
Xin He, Subhankar Pal, Aporva Amarnath, Siying Feng, Dong-Hyeon Park, Austin Rovinski, Haojie Ye, Yuhan Chen, Ronald Dreslinski, Trevor Mudge
Proceedings of the 34th ACM International Conference on Supercomputing, ACM, 2020
sparse matrix, systolic array, matrix, multiplication, kernel, floating point, overhead, throughput, computational science, computer science
Transmuter: Bridging the Efficiency Gap using Memory and Dataflow Reconfiguration
Subhankar Pal, Siying Feng, Dong-hyeon Park, Sung Kim, Aporva Amarnath, Chi-Sheng Yang, Xin He, Jonathan Beaumont, Kyle May, Yan Xiong, Kuba Kaszyk, John Magnus Morton, Jiawen Sun, Michael O'Boyle, Murray Cole, Chaitali Chakrabarti, David Blaauw, Hun-Seok Kim, Trevor Mudge, Ronald Dreslinski
Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques, pp. 175-190, ACM Association for Computing Machinery, 2020
dataflow, hardware acceleration, dennard scaling, control reconfiguration, data structure, throughput, field programmable gate array, dynamic data, computer architecture, computer science
2019
Parallelism Analysis of Prominent Desktop Applications: An 18-Year Perspective
Siying Feng, Subhankar Pal, Yichen Yang, Ronald G. Dreslinski
2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 202-211, IEEE
task parallelism, dennard scaling, multi core processor, clock rate, software, thermal design power, parallelism, computer architecture, computer science, state
A 7.3 M Output Non-Zeros/J Sparse Matrix-Matrix Multiplication Accelerator using Memory Reconfiguration in 40 nm
Subhankar Pal, Dong-hyeon Park, Siying Feng, Paul Gao, Jielun Tan, Austin Rovinski, Shaolin Xie, Chun Zhao, Aporva Amarnath, Timothy Wesley, Jonathan Beaumont, Kuan-Yu Chen, Chaitali Chakrabarti, Michael Taylor, Trevor Mudge, David Blaauw, Hun-Seok Kim, Ronald Dreslinski
2019 Symposium on VLSI Technology, IEEE
matrix multiplication, sparse matrix, multiplication, memory hierarchy, central processing unit, chip, cache, system on a chip, topology, physics
2018
OuterSPACE: An Outer Product Based Sparse Matrix Multiplication Accelerator
Subhankar Pal, Jonathan Beaumont, Dong-Hyeon Park, Aporva Amarnath, Siying Feng, Chaitali Chakrabarti, Hun-Seok Kim, David Blaauw, Trevor Mudge, Ronald Dreslinski
2018 IEEE International Symposium on High Performance Computer Architecture (HPCA), pp. 724-736, IEEE Computer Society
matrix multiplication, multiplication algorithm, sparse matrix, speedup, xeon, massively parallel, high bandwidth memory, spmd, parallel computing, computer science
2017
A carbon nanotube transistor based RISC-V processor using pass transistor logic
Aporva Amarnath, Siying Feng, Subhankar Pal, Tutu Ajayi, Austin Rovinski, Ronald G. Dreslinski
2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED), pp. 1-6, IEEE
pass transistor logic, carbon nanotube field effect transistor, logic gate, cmos, transistor, threshold voltage, critical path method, adder, electrical engineering, electronic engineering, engineering
2014
A New Design of an N-Bit Reversible Arithmetic Logic Unit
Subhankar Pal, Chetan Vudadha, P. Sai Phaneendra, Sreehari Veeramachaneni, Srinivas Mandalika
2014 Fifth International Symposium on Electronic System Design, pp. 224-225, IEEE
adder, serial binary adder, carry save adder, pass transistor logic, arithmetic logic unit, three input universal logic gate, dissipation, quantum gate, arithmetic, electronic engineering, mathematics