GPFS (General Parallel File System)       


Robert B. Garner

GPFS (General Parallel File System) - overview

Since 1991, the Spectrum Scale / General Parallel File System (GPFS) group at IBM Almaden Research has spearheaded the architecture, design, and implementation of the IT industry's premier high-performance, big-data, clustered parallel file platform. The Almaden Research group advances GPFS in anticipation of the needs of large-scale storage and analytics applications, working closely with its companion product organization and customers.

Spectrum Scale/GPFS is deployed by thousands of customers in applications including scientific and engineering high-performance computing (HPC), big-data analytics, health care analytics, bioinformatics, seismic data processing, media streaming, databases, high-speed data ingest, interdepartmental and remote file sharing, NAS serving (file, web, email), and business enterprise operations.

GPFS provides a single cluster-wide name space with standard POSIX file system semantics. Because GPFS is a parallel system, data flows concurrently between all nodes and storage devices, in clusters ranging from 2 to more than 10,000 nodes. GPFS stripes application data across all nodes, with no single-server bottleneck and no centralized metadata server, so performance scales nearly linearly with the number of nodes. GPFS implements a distributed token lock manager, distributed metadata logging, and optimized byte-range write locking that all but eliminates the overhead of POSIX compliance. For applications that do not share files for writing, performance is nearly equivalent to running on a local file system. GPFS provides uninterrupted file system access in the presence of node failures or disk failures, using either replication or the space-efficient GPFS Native RAID feature.
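The striping described above can be illustrated with a minimal sketch. It assumes simple round-robin placement of fixed-size blocks across a set of disks; the function names, the block size, and the disk count are hypothetical choices made for illustration and are not taken from the GPFS implementation.

```python
from collections import Counter


def locate(offset: int, block_size: int, num_disks: int) -> tuple[int, int]:
    """Map a file byte offset to (disk index, block index on that disk).

    Assumption: blocks are placed round-robin, so file block b lands on
    disk b % num_disks. This is an illustrative model only.
    """
    block = offset // block_size
    return block % num_disks, block // num_disks


def disk_load(file_size: int, block_size: int, num_disks: int) -> Counter:
    """Count how many blocks of one file land on each disk."""
    num_blocks = -(-file_size // block_size)  # ceiling division
    return Counter(b % num_disks for b in range(num_blocks))


if __name__ == "__main__":
    # Hypothetical numbers: a 1 GiB file, 4 MiB blocks, 16 disks.
    load = disk_load(1 << 30, 4 << 20, 16)
    print(sorted(load.items()))
```

Under this model, consecutive blocks of a large file land on different disks, so a sequential read or write can keep every disk busy at once, and no disk holds more than one block more than any other.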

GPFS features spearheaded at Almaden Research include: the GPFS Native RAID (GNR) advanced software-based physical disk controller, local file placement optimization/shared-nothing cluster (FPO/SNC), a Swift object interface, a Hadoop interface, cross-cluster file system mounts (multi-cluster), user quotas, and file access control lists (ACLs). Additional data management features include: file snapshots, wide-area network caching and replication (Active File Management, AFM), policy-driven management via storage pools and filesets for performance optimization, remote replication, file lifetime management (Information Lifecycle Management, ILM), and a DMAPI interface for hierarchical storage managers.

Users can deploy GPFS in one of three configurations: cluster nodes with back-end storage controllers (Network Shared Disk, NSD); cluster nodes with local private storage (Shared-Nothing Cluster, SNC); or cluster nodes connected to a shared Storage Area Network (SAN).

With GPFS's NSD configuration, GPFS Native RAID (GNR) can be used to directly manage physical disks, whether HDDs, SSDs, or NVMe devices. GNR employs a high-performance "declustered RAID" scheme that combines a Reed-Solomon erasure code with a data dispersal algorithm, significantly reducing the load on user applications during disk rebuild operations. GNR can be deployed on hardware using either "twin-tailed storage," with hundreds of dual-ported disks in JBOD enclosures attached to dual servers, or a cluster of "single-tailed storage-rich" servers, with dozens of single-ported disks in each server. In the latter "network RAID" configuration, the application data and erasure code information are randomly and uniformly distributed across all the disks in the cluster.
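Why declustering reduces rebuild load can be seen in a small simulation. The sketch below is not GNR's placement algorithm: it assumes stripes of a fixed width placed either on fixed disk groups (a conventional RAID layout) or on uniformly random sets of disks (a stand-in for declustered placement), and it assumes every surviving strip of an affected stripe is read during rebuild, which overstates the work for codes that need only a subset. All names and parameters are hypothetical.

```python
import random
from collections import Counter


def conventional_layout(num_disks: int, width: int, num_stripes: int) -> list[list[int]]:
    """Fixed RAID groups: each stripe lives entirely on one group of `width` disks."""
    groups = num_disks // width
    return [
        list(range((s % groups) * width, (s % groups) * width + width))
        for s in range(num_stripes)
    ]


def declustered_layout(num_disks: int, width: int, num_stripes: int, seed: int = 0) -> list[list[int]]:
    """Each stripe is placed on `width` distinct disks chosen uniformly at random."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return [rng.sample(range(num_disks), width) for _ in range(num_stripes)]


def rebuild_reads(layout: list[list[int]], failed: int) -> Counter:
    """Count the strips each surviving disk reads to rebuild stripes touching `failed`."""
    reads = Counter()
    for stripe in layout:
        if failed in stripe:
            reads.update(d for d in stripe if d != failed)
    return reads


if __name__ == "__main__":
    # Hypothetical cluster: 24 disks, stripes 8 strips wide, 2400 stripes.
    for name, layout in (
        ("conventional", conventional_layout(24, 8, 2400)),
        ("declustered", declustered_layout(24, 8, 2400)),
    ):
        reads = rebuild_reads(layout, failed=0)
        print(name, "disks involved:", len(reads), "max reads on one disk:", max(reads.values()))
```

In the conventional layout, only the other members of the failed disk's group take part in the rebuild, and each reads a strip for every affected stripe. In the randomized layout, the same kind of work is spread over nearly all surviving disks, so each one carries a much smaller share and has more bandwidth left for applications.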

As a shipping IBM product for 18 years, Spectrum Scale/GPFS continues to adapt to new applications and cluster architectures. Its scale-out, distributed design has allowed single-name-space POSIX semantics to coexist with a Swift object interface at minimal overhead. The GPFS research vision encourages new research and development advances together with a path into a shipping product, including new data interfaces, self-tuning and self-monitoring, and new functionality and application domain features.

GPFS History

The GPFS project began as the Shark video-on-demand server project at IBM Almaden Research Center in September 1991. In 1993, it was successfully deployed in one of the first video server customer field trials.

In March 1993, the project expanded to "Tiger Shark: A Scalable, Reliable File System for Video on Demand" with the objective of a parallel, wide-striped, single-namespace, fault-tolerant file system capable of driving an arbitrarily large number of video streams. Successful field deployments of the IBM Multimedia Server on the RS/6000 SP/2 occurred in 1994 and 1995.

On May 29, 1998, GPFS V1R1 was released as a general-purpose high-performance computing (HPC) parallel file system on RS/6000 systems in conjunction with IBM's Server Technology Group (STG) in Poughkeepsie, NY. At Supercomputing 2004, GPFS won Jim Gray's annual TeraByteSort and MinuteSort performance grand titles on a Linux cluster.

The Almaden Research project was named Shark because a video server should not stop streaming, just as a shark, to stay alive, should not cease swimming.