Analog AI Accelerators for Transformer-based Language Models: Hardware, Workload, and Power Performance

Abstract

Transformer-based Large Language Models (LLMs) demand large weight capacity, efficient computation, and high-throughput access to large amounts of dynamic memory. These challenges present significant opportunities for algorithmic and hardware innovations, including Analog AI accelerators. In this paper, we describe recent progress on Phase Change Memory-based hardware and architectural designs that address these challenges for LLM inference.