As cities worldwide deploy more sensors and intelligent systems to manage traffic, a hidden problem is undermining their efforts: missing data. Sensor failures, communication dropouts, and harsh environmental conditions often lead to gaps in traffic information, complicating everything from real-time traffic light control to long-term urban planning.
The study
In a new review published in AIAS, researchers from Shandong Technology and Business University survey the latest AI-powered techniques designed to fill these data gaps automatically. The paper, titled "A Brief Review on Missing Traffic Data Imputation in Intelligent Transportation Systems," provides a systematic classification of existing methods and compares their performance under various missing data scenarios.
"When traffic data is incomplete, it affects signal timing, congestion prediction, and even emergency response planning," says Kaiyuan Wang, the lead author of the study. "Our goal was to provide a clear framework to help choose the right method for the right situation."
Methods and categories
The review divides data imputation techniques into two broad categories. Structure-based methods rely on the inherent low-rank structure and spatiotemporal patterns of traffic data. Learning-based methods use deep learning models such as generative adversarial networks (GANs), graph neural networks (GNNs), and attention mechanisms to learn complex relationships in the data.
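To give a feel for the low-rank idea behind structure-based methods, here is a minimal sketch in plain Python. It is not a method taken from the review: it assumes a toy sensors-by-time matrix that is exactly rank 1, marks gaps with `None`, and fits one factor per sensor and one per time step by alternating least squares over the observed cells only. The function name `impute_rank1` and the sample readings are hypothetical.

```python
def impute_rank1(readings, n_iters=200):
    """Fill None cells with a rank-1 fit u[i] * v[j] learned from observed cells."""
    n_rows, n_cols = len(readings), len(readings[0])
    u = [1.0] * n_rows  # one factor per sensor (row)
    v = [1.0] * n_cols  # one factor per time step (column)

    def observed(i, j):
        return readings[i][j] is not None

    for _ in range(n_iters):
        # Least-squares update of each row factor, using that row's observed cells.
        for i in range(n_rows):
            num = sum(readings[i][j] * v[j] for j in range(n_cols) if observed(i, j))
            den = sum(v[j] ** 2 for j in range(n_cols) if observed(i, j))
            if den > 0:
                u[i] = num / den
        # Least-squares update of each column factor, using that column's observed cells.
        for j in range(n_cols):
            num = sum(readings[i][j] * u[i] for i in range(n_rows) if observed(i, j))
            den = sum(u[i] ** 2 for i in range(n_rows) if observed(i, j))
            if den > 0:
                v[j] = num / den

    # Keep observed values as they are; fill only the gaps with the fitted product.
    return [
        [readings[i][j] if observed(i, j) else u[i] * v[j] for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Hypothetical toy data: rows are sensors, columns are time steps, None marks a gap.
readings = [
    [10.0, None, 30.0, 40.0],
    [20.0, 40.0, 60.0, None],
    [None, 60.0, 90.0, 120.0],
]
filled = impute_rank1(readings)
```

Because the toy matrix is exactly rank 1, the three filled cells should land close to the values that complete the pattern (20, 80, and 30). Real traffic data is only approximately low rank, so practical methods use higher ranks and regularization.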
"Structure-based methods are often more interpretable and work well with moderate missing rates," explains Dr. Xiaobo Chen, the corresponding author. "But in cases of high missing rates or complex patterns, learning-based methods, especially those using graph neural networks or generative models, can be more powerful."
The study also summarizes publicly available datasets, such as PeMS, METR-LA, and TaxiBJ, and standard evaluation metrics, including mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE), providing a valuable resource for researchers looking to benchmark their models.
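For readers unfamiliar with these metrics, the sketch below shows one common way to compute MAE, MAPE, and RMSE for an imputation benchmark. It rests on some assumptions of its own: errors are scored only at positions that were deliberately hidden (flagged in `eval_mask`), MAPE skips true values equal to zero, and the function name and sample numbers are hypothetical rather than taken from the review.

```python
import math


def imputation_errors(truth, imputed, eval_mask):
    """Score imputed values against ground truth at the held-out positions only."""
    pairs = [(t, p) for t, p, held_out in zip(truth, imputed, eval_mask) if held_out]
    if not pairs:
        raise ValueError("eval_mask selects no positions to score")

    n = len(pairs)
    mae = sum(abs(t - p) for t, p in pairs) / n
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in pairs) / n)

    # MAPE is undefined where the true value is zero, so those points are skipped.
    nonzero = [(t, p) for t, p in pairs if t != 0]
    if nonzero:
        mape = 100.0 * sum(abs((t - p) / t) for t, p in nonzero) / len(nonzero)
    else:
        mape = float("nan")

    return {"MAE": mae, "MAPE": mape, "RMSE": rmse}


# Hypothetical speeds: the last three readings were hidden, then imputed.
truth = [50.0, 60.0, 40.0, 80.0]
imputed = [50.0, 57.0, 44.0, 70.0]
eval_mask = [False, True, True, True]
scores = imputation_errors(truth, imputed, eval_mask)
```

RMSE penalizes the single large miss (10 units on the last reading) more heavily than MAE does, which is why benchmarks usually report the metrics together.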
Key findings
On the practical side, the team tested multiple methods under unified conditions and developed a decision-making workflow that helps users select an approach based on the type of missing data, the missing rate, and the available computational resources.
Despite these advances, challenges remain. "Real-world traffic data is messy—it’s not just randomly missing. It can be influenced by traffic signals, weather, or even time of day," says Wang. "We also need methods that are fast enough for real-time use and that can quantify uncertainty in their predictions."
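The difference between randomly scattered gaps and structured ones can be made concrete with a small sketch. The two helpers below are hypothetical and not drawn from the review: one hides individual readings independently at a given rate, the other hides one contiguous block per sensor, loosely mimicking an outage. In both masks, `True` means the reading was observed.

```python
import random


def random_missing_mask(n_sensors, n_steps, rate, seed=0):
    """Each reading is hidden independently with probability `rate`."""
    rng = random.Random(seed)
    return [[rng.random() >= rate for _ in range(n_steps)] for _ in range(n_sensors)]


def block_missing_mask(n_sensors, n_steps, block_len, seed=0):
    """Each sensor loses one contiguous run of `block_len` readings."""
    if block_len > n_steps:
        raise ValueError("block_len cannot exceed n_steps")
    rng = random.Random(seed)
    mask = [[True] * n_steps for _ in range(n_sensors)]
    for row in mask:
        # Pick a start so the whole block fits inside the time range.
        start = rng.randrange(n_steps - block_len + 1)
        for t in range(start, start + block_len):
            row[t] = False
    return mask


# Hypothetical sizes: 3 sensors, 12 time steps.
scattered = random_missing_mask(3, 12, rate=0.25)
outages = block_missing_mask(3, 12, block_len=4)
```

A method that copes well with scattered gaps can still struggle with outages, because a long block leaves no nearby readings from the same sensor to lean on.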
Future directions
Looking ahead, the authors highlight several promising directions, including multi-source data fusion, lightweight AI models for edge computing, and uncertainty-aware imputation techniques.
"We’re moving toward a future where AI doesn’t just fill in missing data, it understands why it’s missing and how to best reconstruct it in a way that supports safer and smarter cities," adds Dr. Chen.
This review offers both a scholarly resource and a practical guide for anyone working in smart transportation, urban analytics, or AI-enabled infrastructure management.