IBM Research Storage Sub-Discipline       

It is expected that by 2020, humankind will store a remarkable 40 zettabytes of digital data. Even more astonishing is the diversity of data types, applications, and hardware technologies available now and expected to reach the market in the near future. Sensor data generated by billions of Internet of Things (IoT) devices, traditional structured data in ubiquitous databases, massive medical records and images, results of high-precision scientific simulations, and terabytes of video and audio material are only a few examples of storage-hungry data types.

Modern data is accessed by a myriad of applications that exhibit diverse I/O characteristics and are sensitive to different performance properties of storage systems, including latency, sequential throughput, level of parallelism, and metadata scalability.

Furthermore, evolving applications continuously demand new interfaces and mechanisms to access and manage data. Among these are object stores, adaptive tiering, snapshots, and quality of service (QoS).

At the same time, hardware vendors are rapidly advancing storage technologies, widening the spectrum of usable performance levels to a previously unseen extent. Today's storage system designers can mix and match high-throughput mechanical disk drives, low-latency flash-based solid-state drives, and byte-addressable non-volatile memories with nanosecond latencies.

The grand challenge of modern storage research is to keep up with the exponential growth of data while also addressing an unprecedented level of diversity among data types, applications, and hardware. Furthermore, the traditional characteristics of data storage -- reliability, fault tolerance, high availability, and security -- need to be preserved and creatively evolved where necessary. After all, losing data is not an option in a modern economy. IBM Research works in many directions to extend the capabilities of modern storage technologies. Our focus is on:

+ Scale-out, parallel, and distributed file systems
+ Object, key-value, and NoSQL storage
+ Cloud storage
+ New storage technologies
+ Solid state and non-volatile memory systems
+ Archival and long-term storage
+ Data deduplication and compression
+ Autonomic storage systems
+ Performance analysis, optimization, and automatic configuration of storage systems
+ Empirical evaluation of storage systems
+ Storage for high-performance computing environments
+ Multi-tiered storage and efficient caching
+ Erasure coding and RAID technologies
+ Efficient data indexing, search, and retrieval
+ Storage virtualization
+ Specialized storage for big data analytics, brain-inspired computing, and sensor data

Our community contributes to many products, such as IBM Spectrum Scale (also known as GPFS), DS8000, and XIV, and to open-source projects, such as OpenStack and Cloud Foundry. Modern storage is tightly integrated with the rest of the system. As a result, our storage researchers publish in a broad variety of systems and domain-specific conferences. Our publications often appear at top storage conferences: USENIX FAST, MSST, ACM SYSTOR, and USENIX HotStorage.

See also IBM Research Storage Systems.