Cloud is an ecosystem of services that enables rapid prototyping, scalability, resiliency, and iterative evolution of software applications. Thanks to cloud computing, developers do not need to worry about infrastructure or the organization and management of their software. Instead, they can focus on developing their core ideas.
For example, a fundamental principle in software evolution is modularity. Cloud computing takes modularity to the next level, allowing an organization to offer its capabilities as cloud services accessible via APIs. To increase the speed of innovation, a developer can compose many different APIs. API-driven composability allows organizations to take advantage of other organizations' innovations, thus speeding up the evolution of their services.
While cloud is the preferred platform for software innovation, challenges remain. First, cloud needs to efficiently support innovation in computation-heavy emerging areas such as artificial intelligence. Second, it needs to allow developers to integrate capabilities safely across multiple organizations. Third, it must help developers securely integrate capabilities across clouds, both public and on-premises. Finally, it needs to guarantee privacy, integrity, and regulatory compliance for applications and data. These challenges require innovation in systems, programming models, and algorithms.
Our research seeks to develop technologies that will enable cloud to become the most efficient and most secure platform for modern workloads. We study system architectures and distributed platforms for these workloads, container technologies for application portability and security, programming models, software architecture paradigms, networking and storage, all with an eye to enabling resiliency, agility, and holistic security.
Projects that IBM researchers are working on include:
Secure Container Platform. Container technologies have emerged in recent years as a means to transform DevOps. Containers package the entire run-time environment of an application, serve as the primary means of workload virtualization and isolation, and run directly on the host operating system. Efficient packaging formats, such as Docker, enable unprecedented workload portability across a hybrid cloud. Moreover, lightweight container design leads to high resource utilization and much-improved DevOps functionality. Container transparency also allows a cloud provider to gain insight into application security, compliance, and performance, enabling new kinds of user-facing, application-centric services. Yet container adoption has been held back by the perception that containers are less secure than virtual machines. Our research aims to reverse this imbalance: we build technologies that make container platforms surpass virtual machines in security.
Cloud-native DevOps and Microservices. Both are primarily motivated by the quest for agility, which we define as innovation with speed and insight. Microservice architecture embraces a collection of software design principles that call for decoupling unrelated business functions into separate API endpoints. But API-driven composability comes with risks: one service may degrade the quality of another, or unauthorized parties may gain access to sensitive data. Technologies and methodologies must be developed to achieve development agility with less risk. These include dynamic discovery and binding of endpoints, built-in fault tolerance, state separation, elasticity, canary updates, resilience testing, access trails, and security analytics and enforcement, all in support of agility, quality, and robustness. We develop these capabilities by leveraging and contributing to Istio, the open source service mesh for microservice architectures.
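As an illustration of one capability named above, canary updates, the following is a minimal sketch of percentage-based traffic splitting. It is not Istio's API or configuration format; the function name `pick_version`, the version labels, and the hash-bucket scheme are assumptions chosen for the example.

```python
import hashlib


def pick_version(request_id: str, canary_percent: int) -> str:
    """Route a request to 'canary' or 'stable' (hypothetical labels).

    The request ID is hashed into one of 100 buckets, so the same ID
    always lands on the same version while the rollout percentage is fixed.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


if __name__ == "__main__":
    # Send roughly 10% of requests to the new version.
    for rid in ("user-1", "user-2", "user-3"):
        print(rid, pick_version(rid, canary_percent=10))
```

Hashing the request ID rather than drawing a random number keeps routing sticky per caller, which makes it easier to compare the behavior of the two versions during a rollout.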
Programming Models and Tools for Hybrid Cloud. Application developers are increasingly turning to multiple cloud platforms to build their applications. They do so in search of the best capabilities, security, and cost efficiency. To attract them, we rely on a unified programming and management model that simplifies the development and life-cycle management of applications across multiple clouds, both public and on-premises.
Our research builds upon our innovations in serverless computing, API management, microservice architecture, and container technologies, and focuses on rich, platform-wide composition and orchestration of application components based on functions, stream processing, analytics, web services and data.
Cloud Data Governance. For large organizations, data security is the most important concern when moving applications to the cloud. Cloud providers need to assure enterprise security teams that they will have data privacy and integrity, as well as control over data storage and management. Our research aims to enable comprehensive data governance across multiple cloud data services in a hybrid cloud. We focus on providing governance enforcement mechanisms through the entire data life-cycle. Our innovations include policy-driven transformations for regulatory compliance, elimination of side-channel attacks on data, bi-directional data lineage tracking, and in-memory data protection.
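To make the idea of a policy-driven transformation concrete, here is a minimal sketch in which a per-field policy decides whether a value is passed through, hashed, or redacted before a record leaves a governed store. The policy format, the action names (`allow`, `hash`, `redact`), and the function `apply_policy` are assumptions made for illustration and do not describe the mechanisms developed in this research.

```python
import hashlib


def apply_policy(record: dict, policy: dict) -> dict:
    """Return a copy of `record` transformed field by field.

    `policy` maps field names to an action; fields without an entry
    default to 'allow'. All action names here are hypothetical.
    """
    transformed = {}
    for field, value in record.items():
        action = policy.get(field, "allow")
        if action == "allow":
            transformed[field] = value
        elif action == "redact":
            transformed[field] = "***"
        elif action == "hash":
            # One-way hash: values stay comparable without being readable.
            transformed[field] = hashlib.sha256(
                str(value).encode("utf-8")
            ).hexdigest()
        else:
            raise ValueError(f"unknown policy action: {action!r}")
    return transformed


if __name__ == "__main__":
    policy = {"email": "hash", "ssn": "redact"}
    record = {"name": "Ada", "email": "a@example.com", "ssn": "123-45-6789"}
    print(apply_policy(record, policy))
```

Keeping the policy as data, separate from the transformation code, means a compliance rule can change without redeploying the services that read the data.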
Cloud Infrastructure Services. Infrastructure-as-a-Service (IaaS) offerings help clients obtain almost instantaneous access to significant compute and storage resources with no capital investment. Major challenges remain in enterprise cloud services, however, including much stricter security requirements; ever-growing demand for network bandwidth; service-level agreement demands; the need to customize service management and related business processes; and the enablement of legacy applications on the cloud. At IBM Research, we tackle these challenges by developing innovations in data-center design, efficient bare-metal provisioning, smart network adapter technologies and Software-Defined Networks, scalable orchestration platforms, hardware-enabled security, and electrical energy management. We also pursue the application of AI technologies to improve the operational efficiency and resiliency of IaaS.
Cloud Resource Management. We study topics pertaining to the management of computation and data in large-scale distributed environments where resources are shared among multiple applications and users. Most recently, such systems have been studied in the context of cloud computing. We are developing management technologies to enable advanced workload-centric resource management in private and hybrid cloud contexts and across multiple heterogeneous clouds. Our particular interest is in scheduling solutions for AI workloads, which require innovations in batch scheduling, in efficiently managing batch workloads alongside interactive workloads, and in managing specialized hardware such as GPUs. To solve this interdisciplinary problem, we are drawing from and pursuing research in several disciplines: distributed systems, optimization theory, control theory, performance modeling, statistical analysis, and AI.
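The tension between batch and interactive workloads on scarce GPUs can be shown with a deliberately simple sketch: a greedy placement pass that serves interactive jobs first and lets batch jobs fill whatever GPUs remain. The `Job` type, the `schedule` function, and the greedy policy itself are assumptions for illustration only; the schedulers studied in this research are not described in the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    gpus: int          # number of GPUs the job requests
    interactive: bool  # True for latency-sensitive work, False for batch


def schedule(jobs, free_gpus):
    """Greedily place jobs on a pool of `free_gpus` identical GPUs.

    Interactive jobs are considered before batch jobs; within each class
    the submission order is preserved (sorted() is stable). Returns the
    names of placed jobs and of jobs left waiting.
    """
    ordered = sorted(jobs, key=lambda job: not job.interactive)
    placed, waiting = [], []
    for job in ordered:
        if job.gpus <= free_gpus:
            placed.append(job.name)
            free_gpus -= job.gpus
        else:
            waiting.append(job.name)
    return placed, waiting


if __name__ == "__main__":
    queue = [
        Job("train-a", 4, False),
        Job("notebook", 1, True),
        Job("train-b", 2, False),
        Job("demo", 2, True),
    ]
    print(schedule(queue, free_gpus=6))
```

With six free GPUs, the two interactive jobs are placed first, the four-GPU batch job no longer fits, and the smaller batch job backfills the remaining capacity. A greedy pass like this can starve large batch jobs, which is one reason the problem calls for the optimization and control techniques listed above.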