
Splash - overview

Project Description



Smarter Planet solutions increasingly need to bring together multiple models from a broad range of areas to guide investment, planning, and policy decisions around highly complex issues such as population health, public safety, and disaster preparedness. These include simulation models developed both inside and outside of IBM.

The Splash project provides a prototype platform for combining existing heterogeneous simulation models and datasets to create composite simulation models of complex systems, thereby facilitating cross-disciplinary modeling, simulation, and optimization.

Splash loosely couples models via data exchange, exploiting and extending data-integration, workflow management, and simulation technologies. Splash enables interoperability and reuse of models and data across multiple disciplines by semi-automatically exploiting model schemas and data mappings to create new combinations of models and data. The resulting composite models can be used to conduct deep predictive analytics, enabling "what-if" analyses to assess intended and unintended consequences. The "human-in-the-loop" technology for schema mapping helps ensure that models are combined in a semantically meaningful way, while automating straightforward harmonization tasks such as conversion of measurement units. This approach provides a powerful way for scientists, engineers, and decision makers to collaborate across disciplines, and to effectively and efficiently understand the tradeoffs inherent in complex problems and their proposed solutions.
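As a rough illustration of the harmonization step described above, the following Python sketch applies a field mapping confirmed by a person and then converts measurement units automatically. It is not Splash code: the conversion table `TO_BASE`, the functions `convert` and `harmonize`, and all field names are hypothetical and chosen only for this example.

```python
# Hypothetical unit table: unit -> (dimension, factor to the base unit).
# Not part of Splash; a real system would use a much richer unit catalog.
TO_BASE = {
    "kg": ("mass", 1.0),
    "lb": ("mass", 0.45359237),
    "m": ("length", 1.0),
    "km": ("length", 1000.0),
    "mi": ("length", 1609.344),
}


def convert(value, src_unit, dst_unit):
    """Convert a value between two units of the same dimension."""
    src_dim, src_factor = TO_BASE[src_unit]
    dst_dim, dst_factor = TO_BASE[dst_unit]
    if src_dim != dst_dim:
        raise ValueError(f"cannot convert {src_unit} to {dst_unit}")
    return value * src_factor / dst_factor


def harmonize(record, mapping, src_units, dst_units):
    """Build a target-model input record from a source-model output record.

    mapping maps each target field to a source field; this is the part a
    person reviews. Unit conversion is then applied without further input.
    """
    out = {}
    for dst_field, src_field in mapping.items():
        out[dst_field] = convert(
            record[src_field], src_units[src_field], dst_units[dst_field]
        )
    return out


# Example: one model emits pounds and miles, the next expects kg and km.
source_output = {"weight_lb": 10.0, "trip_mi": 2.0}
mapping = {"mass_kg": "weight_lb", "distance_km": "trip_mi"}
src_units = {"weight_lb": "lb", "trip_mi": "mi"}
dst_units = {"mass_kg": "kg", "distance_km": "km"}
target_input = harmonize(source_output, mapping, src_units, dst_units)
```

Keeping the mapping as data separate from the conversion logic mirrors the division of labor in the text: the mapping is where human judgment enters, while the unit arithmetic is mechanical.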

More precisely, Splash is a platform that facilitates systematic and reliable creation of composite models from simpler component models. The main idea is that multiple component models and data sources can be linked together through directed data flow to create a coherent and useful composite model. Both data sources and models are heterogeneous and independently created, using different data formats, programming languages, operating systems, simulation paradigms, and so on, and embodying different assumptions. To make disparate models and data work together in Splash, all models and data are described with metadata using the Splash Actor Description Language (SADL). SADL descriptions are fundamental to the interoperability of models and data, just as schemas are fundamental to the interoperability of data. They enable model and data discovery, semi-automatic generation of data transformations between models, orchestration of a simulation run, and the design and execution of simulation experiments. Splash can exploit technologies such as Hadoop to achieve scalability in its data transformations, and the experiment-management component allows systematic and efficient exploration of the behavior of composite models, as well as sensitivity analysis and stochastic optimization.
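The idea of linking component models through directed data flow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Splash implementation and not SADL syntax: each actor is described here by a hypothetical tuple of input names, an output name, and a callable, and the function `run_composite` and the toy models are invented for the example. Execution order is derived from the declared inputs and outputs using the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def run_composite(actors, initial_data):
    """Run actors in an order consistent with their data dependencies.

    actors maps an actor name to (input_names, output_name, function).
    This tuple is a hypothetical stand-in for a metadata description.
    """
    # Record which actor produces each named dataset.
    producers = {output: name for name, (_, output, _) in actors.items()}
    # An actor depends on every actor that produces one of its inputs;
    # inputs with no producer must come from initial_data.
    graph = {
        name: {producers[i] for i in inputs if i in producers}
        for name, (inputs, _, _) in actors.items()
    }
    data = dict(initial_data)
    for name in TopologicalSorter(graph).static_order():
        inputs, output, function = actors[name]
        data[output] = function(*(data[i] for i in inputs))
    return data


# Toy component models, invented for illustration only.
def demand_model(population):
    return population // 10


def cost_model(demand, unit_cost):
    return demand * unit_cost


actors = {
    # Listed out of order on purpose; dependencies decide the run order.
    "cost": (["demand", "unit_cost"], "total_cost", cost_model),
    "demand": (["population"], "demand", demand_model),
}
result = run_composite(actors, {"population": 1000, "unit_cost": 3})
```

Because the run order comes from the declared inputs and outputs rather than from the models themselves, the component models stay independent of one another, which is the loose coupling the text describes.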

Contact: Peter J. Haas