Physical Analytics - PAIRS: Big Geospatial Data and Analytics
Description: IBM's Physical Analytics Integrated Data Repository and Services (PAIRS) is a data and analytics service. It is built on a cloud-based Big Data analytics platform coupled with a massive data store of pre-processed and curated geo-spatial (or spatio-temporal) data, which enables complex analytical services such as high-accuracy weather forecasting or agricultural models to run globally. By design, PAIRS is highly scalable, meaning that the time to query data is independent of the size of the data searched. PAIRS also offers an easy-to-use platform for assembling and evaluating geo-spatial Big Data sets, lowering time-to-discovery significantly by reducing the data management burden.
The underlying technology is based on the open source distributed data store HBase/Hadoop, which hosts and manages petabytes of data cost-efficiently. PAIRS leverages efficient data indexing methods that result in spatially and temporally linked data layers, both for structured data (e.g. satellite images, weather, soil, land use, etc.) and unstructured data (e.g. data from telco, social media, distributed sensor networks, etc.). As all data layers are aligned on global grids, any location on the globe can be queried in an intuitive and logical manner. The user obtains results in standard file formats such as GeoTIFF, CSV and XML.
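To illustrate why aligning layers on a common global grid makes any location directly addressable, here is a minimal sketch in Python. It is not the PAIRS indexing scheme, which this page does not describe. The regular latitude/longitude grid, the cell size, the key layout and the names `grid_cell` and `row_key` are all assumptions chosen for illustration.

```python
import math


def grid_cell(lat, lon, cell_deg):
    """Return the (row, col) of the cell containing (lat, lon).

    Assumes a regular global latitude/longitude grid with square cells of
    `cell_deg` degrees, with row 0 at the south pole and col 0 at 180 W.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude or longitude out of range")
    n_rows = round(180.0 / cell_deg)
    n_cols = round(360.0 / cell_deg)
    # Clamp so that lat = 90 and lon = 180 fall into the last cell.
    row = min(int(math.floor((lat + 90.0) / cell_deg)), n_rows - 1)
    col = min(int(math.floor((lon + 180.0) / cell_deg)), n_cols - 1)
    return row, col


def row_key(layer_id, lat, lon, epoch_seconds, cell_deg=0.25):
    """Build a hypothetical key-value store key: layer, grid cell, timestamp."""
    row, col = grid_cell(lat, lon, cell_deg)
    return f"{layer_id}:{row:06d}:{col:06d}:{epoch_seconds:012d}"


# Two different layers queried at the same place and time share the same
# spatial and temporal key components, so their values can be linked directly.
weather_key = row_key("weather", 40.7, -74.0, 1450000000)
population_key = row_key("population", 40.7, -74.0, 1450000000)
print(weather_key)
print(population_key)
```

Because the key is computed from the coordinates rather than searched for, looking up one location costs the same no matter how much data the store holds. This is one plausible reading of the scalability claim above, under the stated assumptions.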
One of the unique features of PAIRS is its scalable and fast cross-layer query capability. For example, a query such as: “Show me all urban areas where it is sunny for the next 10 days and where the population density is larger than 500 people per square mile and there are two coffee shops in a 500 sq mile area” requires filtering and querying multiple layers across different spatial and temporal scales. PAIRS handles such multi-layer queries with speed and ease. Data discovery is possible at continental scale, potentially 10 to 100 times faster than conventional methods.
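The cross-layer query described above can be sketched as a conjunction of per-layer filters evaluated on cells of a shared grid. The following Python example is only an illustration under stated assumptions. It does not use the PAIRS query interface, which this page does not describe. The function name `cross_layer_query`, the layer names, the toy values and the thresholds (including reading "two coffee shops" as "at least two") are all hypothetical.

```python
def cross_layer_query(layers, predicates):
    """Return the sorted grid cells where every predicate holds.

    layers:     dict mapping layer name -> {(row, col): value}
    predicates: dict mapping layer name -> function(value) -> bool
    All layers are assumed to be aligned on the same grid, so a cell
    index refers to the same area in every layer.
    """
    names = list(predicates)
    if not names:
        return []
    # Only cells present in every queried layer can satisfy the query.
    common = set(layers[names[0]])
    for name in names[1:]:
        common &= set(layers[name])
    return sorted(
        cell
        for cell in common
        if all(predicates[name](layers[name][cell]) for name in names)
    )


# Toy data on a 2 x 2 grid (values are made up for illustration).
layers = {
    "sunny_days_next_10": {(0, 0): 10, (0, 1): 7, (1, 0): 10, (1, 1): 10},
    "population_per_sq_mile": {(0, 0): 800, (0, 1): 900, (1, 0): 300, (1, 1): 650},
    "coffee_shops": {(0, 0): 2, (0, 1): 5, (1, 0): 4, (1, 1): 1},
}
predicates = {
    "sunny_days_next_10": lambda days: days >= 10,
    "population_per_sq_mile": lambda density: density > 500,
    "coffee_shops": lambda count: count >= 2,  # assumed reading of the query
}

matches = cross_layer_query(layers, predicates)
print(matches)
```

Since every layer uses the same cell indices, combining layers reduces to a lookup per cell rather than a geometric join. That is the property the grid alignment is meant to provide.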
PAIRS curates and updates satellite imagery, weather data, census, land use and business location data as they become available. PAIRS can also integrate additional datasets, including custom and proprietary data layers, which are automatically linked with existing data layers so that data discovery queries can be run on them. Besides access to data layers, PAIRS runs three operational analytics: (1) significantly improved weather forecast data based on data blending (approx. 30% improvement compared with existing weather forecasts), (2) solar radiation and wind forecasting for renewable energy production and (3) irrigation forecasts for precision agriculture.
News:
- PAIRS is available for trials for academic users
- Our paper on PAIRS (S. Lu, X. Shao, M. Freitag, et al., IBM PAIRS Curated Data Service for Accelerated Geospatial Data Analytics and Discovery) won the Best Paper Award at the 1st IEEE International Workshop on Big Spatial Data of the 2016 IEEE Conference on Big Data
- Recent presentations at 2015 AGU and IEEE Big Data Conference
Recent presentations:
- 2016 IEEE Big Spatial Data Workshop
- AGU presentation
- 2015 IEEE Big Data Presentation
Recent publications:
- PAIRS: A scalable geo-spatial data analytics platform
L.J. Klein, F.J. Marianno, C.M. Albrecht, M. Freitag, H.F. Hamann
2015 IEEE Conference on Big Data (Big Data) 1290-1298, (2015).
doi:10.1109/BigData.2015.7363884
- IBM PAIRS curated big data service for accelerated geospatial data analytics and discovery
S. Lu, X. Shao, M. Freitag, L. Klein, J. Renwick, F. Marianno, C. Albrecht, H.F. Hamann
2016 IEEE Conference on Big Data (Big Data) 1290-1298, (2016).
doi:10.1109/BigData.2016.7840910