Smarter Urban Dynamics - Public Transport Awareness

PTA (Public Transport Awareness) is a high-throughput mechanism to sense, optimize and manage public transit infrastructure. PTA processes, stores and displays passenger and vehicle movements across a public transport network in real time. Examples of data sources supported by PTA include Automatic Vehicle Location (AVL) systems, Automatic Number Plate Recognition systems, induction loops at traffic intersections, and other opportunistic sources of trajectory data such as smartphones and Bluetooth readers. PTA uses the data collected from these sources to gain more accurate insight into the performance of the transport infrastructure and to enable real-time traffic monitoring and management. PTA features include:

  • Scalable real-time analytics that continuously analyze data from standard Automatic Vehicle Location (AVL) systems and compute Key Performance Indicators (KPIs) about the state of the transport infrastructure.
  • Automated generation of reliable infrastructure data including transport routes and stop locations from historical vehicle trajectory traces.
  • Inference of road speed, vehicle volume, and bus bunching.
  • Adaptive Estimated Time-of-Arrival (ETA) predictive analytics that adjust the quality of the ETA to the information available at the moment.
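
As an illustration of the bus bunching feature listed above, the following is a minimal Python sketch, not PTA's actual analytic. It assumes that bunching can be flagged when two consecutive vehicles reach the same stop with a headway far below the scheduled one. The names Arrival and detect_bunching, and the 25% threshold, are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Arrival:
    """One observed arrival of a vehicle at a stop (hypothetical record type)."""
    vehicle_id: str
    stop_id: str
    timestamp: float  # seconds since an arbitrary epoch


def detect_bunching(arrivals, scheduled_headway_s, ratio=0.25):
    """Return (stop_id, leader_id, follower_id, headway_s) for each pair of
    consecutive arrivals at a stop whose headway is below
    ratio * scheduled_headway_s. The ratio is an assumed threshold."""
    # Group arrivals per stop so headways are computed stop by stop.
    by_stop = {}
    for a in arrivals:
        by_stop.setdefault(a.stop_id, []).append(a)

    bunched = []
    for stop_id, stop_arrivals in by_stop.items():
        stop_arrivals.sort(key=lambda a: a.timestamp)
        # Compare each arrival with the one immediately before it.
        for prev, cur in zip(stop_arrivals, stop_arrivals[1:]):
            headway = cur.timestamp - prev.timestamp
            if headway < ratio * scheduled_headway_s:
                bunched.append((stop_id, prev.vehicle_id, cur.vehicle_id, headway))
    return bunched


if __name__ == "__main__":
    observed = [
        Arrival("bus-A", "stop-1", 0.0),
        Arrival("bus-B", "stop-1", 600.0),
        Arrival("bus-C", "stop-1", 660.0),  # only 60 s behind bus-B
    ]
    print(detect_bunching(observed, scheduled_headway_s=600.0))
```

With a scheduled headway of 600 seconds, only the pair bus-B and bus-C falls under the assumed 150-second threshold. A production analytic would likely use the timetable per route and time of day rather than a single scheduled headway.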


PTA must answer major challenges in order to be flexible and powerful enough to handle diverse demands from a large user base. The first challenge is scalability. As sensor technologies become ubiquitous, the massive amounts of data they produce must be analyzed in real time and fused into large system states (e.g., road networks) that can contain millions of elements. A second challenge is the quality of the data. As sensor deployments grow larger, so does the risk of receiving invalid data from faulty sensors, or noisy data from less accurate sensors. Data sparsity is another challenge. Even though the volume of data can be extremely large, the dimension space of the system is equally large, so the volume is usually insufficient for deriving accurate traffic models. Another challenge is the development of the computing infrastructure required to support the needed functionality of Intelligent Transportation Systems (ITS), especially given the large volumes and variety of data available and the diverse set of parties involved in providing this data. These systems access a broad spectrum of data source types that produce a heterogeneous mix of content with varying degrees of quality. Data sources developed by different parties may require proprietary software components to access the data, and these components are not necessarily developed with interoperability in mind. A final challenge is that different kinds of end users have different needs for traffic data. These users not only issue large numbers of simultaneous analysis requests, but also require analyses of significantly different natures, the results of which must be displayed in near real time in their respective clients.
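
To make the data quality challenge concrete, here is a minimal Python sketch of one common way to reject noisy position reports: drop any GPS fix that would imply an implausible speed relative to the last accepted fix. This is an illustrative assumption, not a description of how PTA cleans its data. The function names and the 35 m/s speed limit are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def filter_implausible(points, max_speed_mps=35.0):
    """Keep fixes whose implied speed from the last kept fix is plausible.

    points: iterable of (timestamp_s, lat, lon), assumed sorted by time.
    max_speed_mps: assumed upper bound on vehicle speed (hypothetical value).
    """
    kept = []
    for t, lat, lon in points:
        if kept:
            t0, lat0, lon0 = kept[-1]
            dt = t - t0
            if dt <= 0:
                continue  # duplicate or out-of-order timestamp
            if haversine_m(lat0, lon0, lat, lon) / dt > max_speed_mps:
                continue  # jump too large for the elapsed time
        kept.append((t, lat, lon))
    return kept


if __name__ == "__main__":
    trace = [
        (0.0, 53.3498, -6.2603),
        (10.0, 53.3500, -6.2603),
        (20.0, 53.4500, -6.2603),  # roughly 11 km jump in 10 s
        (30.0, 53.3502, -6.2603),
    ]
    print(filter_implausible(trace))
```

The sketch trusts the first fix unconditionally, so a bad first point would cause later good points to be dropped. A more robust filter would need to handle that case, for example by comparing against several recent fixes.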

To address those challenges, PTA analytics are executed on the IBM InfoSphere Streams computing platform. InfoSphere Streams was selected because of its ability to execute complex analytics with very low latency on massive amounts of data with a minimal hardware footprint. The data streams programming paradigm also provides a platform on which scalable solutions can be built to address the challenges of data quality, data sparsity, and diversity of data sources and user requests. Furthermore, InfoSphere Streams applications are modular and consist of reusable components from libraries for data mining, time series analytics and geospatial analytics. Thanks to this programming model, PTA is highly flexible and can be adapted easily and rapidly to city-specific requirements.
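
The modular, operator-based style of stream processing described above can be approximated with plain Python generators. The sketch below is only an analogy. It does not use InfoSphere Streams or its SPL language, and every function name, the record format 'route,timestamp_s,speed_kmh', and the thresholds are assumptions made for illustration. Each generator plays the role of one reusable operator, and operators are chained to form a pipeline.

```python
from collections import defaultdict


def parse(lines):
    """Operator 1: parse 'route,timestamp_s,speed_kmh' records, skipping malformed ones."""
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) != 3:
            continue
        try:
            yield parts[0], float(parts[1]), float(parts[2])
        except ValueError:
            continue


def drop_invalid(records, max_speed_kmh=120.0):
    """Operator 2: discard readings outside an assumed plausible speed range."""
    for route, t, speed in records:
        if 0.0 <= speed <= max_speed_kmh:
            yield route, t, speed


def tumbling_mean_speed(records, window_s=60.0):
    """Operator 3: emit (window_start, {route: mean speed}) when each window closes.

    Assumes records arrive in time order.
    """
    window_start = None
    sums = defaultdict(lambda: [0.0, 0])  # route -> [sum of speeds, count]
    for route, t, speed in records:
        if window_start is None:
            window_start = t - (t % window_s)
        # Close every window that ended before this record's timestamp.
        while t >= window_start + window_s:
            if sums:
                yield window_start, {r: s / n for r, (s, n) in sums.items()}
            sums.clear()
            window_start += window_s
        acc = sums[route]
        acc[0] += speed
        acc[1] += 1
    if sums:  # flush the last, partially filled window
        yield window_start, {r: s / n for r, (s, n) in sums.items()}


if __name__ == "__main__":
    feed = ["46A,0,30", "46A,30,50", "garbled", "46A,45,999", "46A,70,20"]
    for window in tumbling_mean_speed(drop_invalid(parse(feed))):
        print(window)
```

Because each stage only consumes and produces a stream of tuples, a stage can be replaced or reused in another pipeline without touching the others, which is the property the paragraph above attributes to the streams programming model.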

PTA Architecture

[Figure: PTA architecture]



  • Predicting arrival times of buses using real-time GPS measurements. M Sinn, JW Yoon, F Calabrese, E Bouillet. 15th International IEEE Conference on Intelligent Transportation Systems (ITSC), 2012.
  • Time of arrival predictability horizons for public bus routes. C Coffey, A Pozdnoukhov, F Calabrese. Proceedings of the 4th ACM SIGSPATIAL International Workshop on Computational Transportation Science, 2011.
  • System and Analytics for Continuously Assessing Transport Systems from Sparse and Noisy Observations: Case Study in Dublin. L Gasparini, E Bouillet, F Calabrese, O Verscheure, B O'Brien, M O'Donnell. IEEE International Conference on Intelligent Transportation Systems, 2011.