GPFS (General Parallel File System) - overview
Since 1991, the Spectrum Scale / General Parallel File System (GPFS) group at IBM Almaden Research has spearheaded the architecture, design, and implementation of the IT industry's premier high-performance clustered parallel file platform for big data. The Almaden Research group proactively advances and enhances GPFS in anticipation of large-scale storage and analytic application domain needs, working closely with its companion product organization and with customers.
Spectrum Scale/GPFS is deployed by thousands of customers in applications including scientific and engineering high-performance supercomputing (HPC), big-data analytics, health care analytics, bioinformatics, seismic data processing, media streaming, databases, high-speed data ingest, interdepartmental and remote file sharing, NAS serving (file, web, email), and business enterprise operations.
GPFS provides a single cluster-wide name space with standard POSIX file system semantics. As a parallel system, data flows concurrently between all nodes and storage devices, with clusters comprising 2 to 10,000+ nodes. GPFS stripes application data across all nodes with no single-server bottleneck and no centralized metadata server, so performance scales nearly linearly with the number of nodes. GPFS implements a distributed token lock manager, distributed metadata logging, and optimized byte-range write locking that all but eliminate POSIX compliance overhead. The performance of non-write-sharing applications is nearly equivalent to running on a local file system. GPFS allows uninterrupted file system access in the presence of either node failures or disk failures, using either replication or the space-efficient GPFS Native RAID feature.
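The byte-range write locking described above is standard POSIX behavior that any conforming file system exposes; the sketch below (using Python's standard fcntl module, file names illustrative, not GPFS-specific code) shows the access pattern GPFS's distributed token manager is optimized for: two writers locking disjoint ranges of one shared file never contend for the same lock.

```python
import fcntl
import os
import tempfile

def exclusive_ranged_write(f, offset, payload):
    """Write `payload` at `offset` under an exclusive POSIX byte-range lock.

    Only the [offset, offset + len(payload)) range is locked, so other
    processes may concurrently lock and write disjoint ranges of the same
    file; this is the non-overlapping write-sharing pattern a parallel
    file system can serve with near-local performance.
    """
    fcntl.lockf(f, fcntl.LOCK_EX, len(payload), offset, os.SEEK_SET)
    try:
        f.seek(offset)
        f.write(payload)
        f.flush()
    finally:
        fcntl.lockf(f, fcntl.LOCK_UN, len(payload), offset, os.SEEK_SET)

# Two "writers" touching disjoint 1 KiB ranges of a shared scratch file.
with tempfile.NamedTemporaryFile() as f:
    f.truncate(2048)
    exclusive_ranged_write(f, 0, b"A" * 1024)      # writer 1
    exclusive_ranged_write(f, 1024, b"B" * 1024)   # writer 2
    f.seek(0)
    data = f.read()

assert data == b"A" * 1024 + b"B" * 1024
```

On a local file system both lock calls are resolved by one kernel; in a GPFS cluster the equivalent range tokens are negotiated among nodes, which is why minimizing that negotiation matters for performance.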
GPFS features spearheaded at Almaden Research include: the GPFS Native RAID (GNR) advanced software-based physical disk controller, local file placement optimization/shared-nothing cluster (FPO/SNC), the Swift object interface, the Hadoop interface, multi-cluster cross-cluster file system mounts, user quotas, and file access control lists (ACLs). Additional data management features include: file snapshots, wide-area network caching and replication (Active File Management, AFM), policy-driven management via storage pools and file sets for performance optimization, remote replication, file lifetime management (Information Lifecycle Management, ILM), and a DMAPI interface for hierarchical storage managers.
Users can deploy GPFS in one of three configurations: cluster nodes with back-end storage controllers (Network Shared Disk, NSD); cluster nodes with local private storage (Shared-Nothing Cluster, SNC); or cluster nodes connected to a shared Storage Area Network (SAN).
With GPFS's NSD configuration, GPFS Native RAID (GNR) can be used to directly manage physical disks, whether HDDs, SSDs, or NVMe. GNR employs a high-performance "declustered RAID" layout with a Reed-Solomon erasure code and a data dispersal algorithm, resulting in a significantly reduced load on user applications during disk rebuild operations. GNR can be deployed on hardware using either "twin-tailed storage," with hundreds of dual-ported disks in JBOD enclosures attached to dual servers, or on a cluster of "single-tailed storage-rich" servers, with dozens of single-ported disks in each server. In the latter "network RAID" configuration, the application data and erasure code information are randomly and uniformly distributed across all the disks in the cluster.
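GNR's production code is a Reed-Solomon erasure code spread over large declustered disk arrays; as a toy sketch only, the simplest erasure code, single XOR parity, illustrates the core rebuild idea: a lost strip is reconstructed from the surviving strips plus parity, and because strips are dispersed across many disks, that rebuild read load is spread thinly rather than hammering one mirror.

```python
from functools import reduce

def xor_parity(strips):
    """Compute a single parity strip as the byte-wise XOR of data strips.

    Toy stand-in for GNR's Reed-Solomon code: XOR parity tolerates the
    loss of exactly one strip per stripe.
    """
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*strips))

def rebuild(surviving_strips, parity):
    """Recover one lost data strip from the survivors plus the parity strip.

    XOR is its own inverse, so XOR-ing everything that remains yields
    exactly the missing strip.
    """
    return xor_parity(surviving_strips + [parity])

# A stripe of three equal-sized data strips (contents illustrative).
data = [b"alpha---", b"bravo---", b"charlie-"]
parity = xor_parity(data)

# Simulate losing the disk holding strip 1, then rebuild it.
lost = data[1]
recovered = rebuild([data[0], data[2]], parity)
assert recovered == lost
```

Real Reed-Solomon codes generalize this to tolerate two or three concurrent strip losses per stripe; the declustered placement, not the code itself, is what distributes the rebuild traffic across the whole array.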
As an IBM shipping product for 18 years, Spectrum Scale/GPFS continues to adapt to new applications and cluster architectures. Its scale-out, distributed design has allowed single-name-space POSIX semantics to coexist with a Swift object interface at minimal overhead. The GPFS research vision encourages new research and development advances together with a path into a shipping product, including new data interfaces, self-tuning and self-monitoring, and new functionality and application domain features.