Analog AI Accelerators for Transformer-based Language Models: Hardware, Workload, and Power Performance

Abstract

Transformer-based Large Language Models (LLMs) demand large weight capacity, efficient computation, and high-throughput access to large amounts of dynamic memory. These challenges present significant opportunities for algorithmic and hardware innovations, including Analog AI accelerators. In this paper, we describe recent progress on Phase Change Memory-based hardware and architectural designs that address these challenges for LLM inference.