## Statistics - Traffic Prediction Tool

The IBM Traffic Prediction Tool (TPT) is a statistical model for the near-term prediction of traffic conditions. It was developed at IBM and has been tested in Singapore, where the Land Transport Authority is working with IBM and others to develop technology that will provide one-hour traffic predictions.

As gasoline prices rise, a system that helps reduce traffic congestion becomes increasingly important. With real-time prediction of near-future road traffic conditions, traffic controllers can take preemptive measures to mitigate imminent congestion, and commuters can decide whether to take a particular road. TPT also predicts when a congested area will return to normal.

At work in a real-world laboratory
TPT has been tested in Singapore, where the Land Transport Authority is working with IBM and others to develop technology that will give city administrators an hour's notice about traffic conditions. The system combines information obtained from video cameras, GPS devices in taxis and sensors embedded in streets.

Average volume and average speed (the number of cars passing a location in a given time interval, and their mean speed) are the key indices for assessing traffic conditions. Under ideal conditions, real-time information about speed and volume on a city's road network is monitored continuously and recorded by multiple detectors. TPT's goal is to provide near-term predictions of average volume and speed, at fine time resolution, for every link in a road network.
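As an illustration of these two indices, the sketch below aggregates raw detector readings into fixed intervals. It is not TPT code: the record layout (a timestamp in seconds plus a speed in km/h per passing vehicle), the function name `aggregate` and the sample data are all assumptions made for this example.

```python
from collections import defaultdict


def aggregate(records, interval_s=300):
    """Group per-vehicle readings from one detector into fixed intervals.

    records: iterable of (timestamp_seconds, speed_kmh) pairs (hypothetical layout).
    Returns {interval_start: (volume, mean_speed)} ordered by interval start.
    """
    buckets = defaultdict(list)
    for ts, speed in records:
        start = int(ts // interval_s) * interval_s  # start of the interval this vehicle falls in
        buckets[start].append(speed)
    return {
        start: (len(speeds), sum(speeds) / len(speeds))
        for start, speeds in sorted(buckets.items())
    }


# Five vehicles passing one location within ten minutes (made-up data).
records = [(0, 60.0), (120, 50.0), (299, 40.0), (300, 30.0), (450, 50.0)]
summary = aggregate(records)
for start, (volume, mean_speed) in summary.items():
    print(start, volume, mean_speed)
```

The default of 300 seconds matches the 5-minute resolution used in Figure 1; any other interval length can be passed in.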

Traffic conditions are time-averaged observable phenomena of a system that consists of many vehicles (traffic participants). A statistical approach is therefore natural, since statistics rests on the "laws of large numbers." Some researchers have historically doubted that road traffic conditions can be predicted, arguing instead for "chaotic behavior." We found that the actual data suggest otherwise.

Figure 1 shows clearly observable patterns across multiple periods: weekday, weekend, morning vs. evening and cross-weekly.

Figure 1. Average volume (every 5 minutes) at one location over three weeks.

Developing the statistical model
The statistical model we developed at IBM consists of two components:

• Capturing the trend (periodic component), as suggested by Figure 1.
• Accounting for deviation from the trend.
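The two components can be illustrated with a simple decomposition. The sketch below estimates the trend as the average value in each slot of a repeating period and treats whatever remains as the deviation. Slot averaging is an assumption chosen for illustration, not necessarily how TPT estimates its trend, and the names `periodic_profile` and `deviations` are hypothetical. A real weekly profile at 5-minute resolution would use a period of 2016 slots; the example uses a period of 4 to stay readable.

```python
def periodic_profile(series, period):
    """Average the series slot by slot over all complete and partial periods."""
    sums = [0.0] * period
    counts = [0] * period
    for i, value in enumerate(series):
        sums[i % period] += value
        counts[i % period] += 1
    # Slots that were never observed get a profile value of 0.0.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def deviations(series, profile):
    """Subtract the periodic trend, leaving the component still to be modeled."""
    period = len(profile)
    return [value - profile[i % period] for i, value in enumerate(series)]


# Two periods of made-up volume data with a period of 4 slots.
series = [10, 20, 30, 20, 12, 22, 28, 20]
profile = periodic_profile(series, period=4)
residual = deviations(series, profile)
print(profile)
print(residual)
```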

To address these points, we established a spatial-temporal model motivated by the serial correlation and spatial correlation present in traffic data. The model is comparable to models of water flow over a network. Using model selection criteria such as Akaike's Information Criterion, we determined the number of neighboring locations that have a significant effect on local traffic patterns. We determined the order of serial correlation from the same data. The model is recalibrated at the beginning of each week on data from the most recent six weeks, and the updated model is then used for real-time forecasting throughout the week.
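The role of Akaike's Information Criterion in choosing the number of neighbors can be sketched as follows. This is a toy example under stated assumptions, not the TPT model: the deviations are simulated, the serial correlation order is fixed at 1, the regression is ordinary least squares without an intercept, and every name (`ols_rss`, `aic_for_k`, the simulated coefficients) is hypothetical. The AIC is written in its Gaussian form, n·ln(RSS/n) + 2·(number of parameters), up to an additive constant.

```python
import math
import random


def ols_rss(X, y):
    """Fit least squares via the normal equations; return the residual sum of squares."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yi in zip(X, y))


def aic_for_k(target, neighbors, k):
    """AIC of a lag-1 model using the target's own past and its k nearest neighbors."""
    X = [[target[t - 1]] + [neighbors[j][t - 1] for j in range(k)]
         for t in range(1, len(target))]
    y = target[1:]
    n = len(y)
    # Parameters: 1 own-lag coefficient, k neighbor coefficients, 1 noise variance.
    return n * math.log(ols_rss(X, y) / n) + 2 * (k + 2)


# Simulated deviations: only the nearest neighbor (index 0) truly influences the target.
rng = random.Random(0)
T = 500
neighbors = [[rng.gauss(0, 1) for _ in range(T)] for _ in range(3)]
target = [0.0]
for t in range(1, T):
    target.append(0.5 * target[t - 1] + 0.4 * neighbors[0][t - 1] + rng.gauss(0, 0.3))

aics = {k: aic_for_k(target, neighbors, k) for k in range(4)}
best_k = min(aics, key=aics.get)
print(aics, best_k)
```

Adding the influential neighbor lowers the AIC sharply, while each further neighbor must reduce the residual error enough to offset the penalty of 2 per extra parameter. The same comparison could be run over candidate lag orders.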

Figures 2 and 3 show ten-minute-ahead predictions of volume and speed at nine locations. Black points indicate actual data; blue points indicate the ten-minute-ahead predictions. For forecasts made 5, 10, 15, ..., 60 minutes ahead, the average accuracy across 500 locations during a one-week field run ranged from 85% to 93% for traffic volume and from 87% to 95% for vehicle speed.
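The text does not state how accuracy was computed. One common convention, assumed here purely for illustration, is 100% minus the mean absolute percentage error between forecast and actual values. The function name `percent_accuracy` and the numbers below are made up.

```python
def percent_accuracy(actual, predicted):
    """Return 100 * (1 - mean absolute percentage error).

    Assumed definition for illustration; intervals with an actual value of 0 are skipped
    because the percentage error is undefined there.
    """
    errors = [abs(p - a) / abs(a) for a, p in zip(actual, predicted) if a != 0]
    if not errors:
        raise ValueError("no intervals with a nonzero actual value")
    return 100.0 * (1.0 - sum(errors) / len(errors))


# Made-up 5-minute volumes and their forecasts at one location.
actual = [100, 200, 50]
predicted = [90, 210, 55]
accuracy = percent_accuracy(actual, predicted)
print(round(accuracy, 2))
```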

Figure 2. 10-minute-ahead volume forecast (blue) vs. actual value (black).

Figure 3. 10-minute-ahead speed forecast (blue) vs. actual value (black).

Summary
This work has produced a traffic prediction system that is accurate, fast, and able to cover wide and detailed road networks. It is poised to become an essential component of urban and highway planning systems.
