AI Weather Models Predict Hurricane Paths Well But Struggle With Storm Physics

A large-scale analysis of hundreds of tropical cyclones shows that while AI can forecast where storms will travel with remarkable speed and accuracy, replicating the internal physics that drive real-world hurricane damage remains a major challenge for next-generation weather models.

Research: Climatological Benchmarking of AI-Generated Tropical Cyclones. Image Credit: Stock Lpa / Shutterstock

Artificial intelligence is rapidly transforming weather prediction, enabling forecasts that once required hours of supercomputing time to run in just minutes. But as AI tools play an expanding role in high-stakes hazard modeling, researchers at Rice University say an essential question remains: Do AI-generated storms behave realistically?

Their new study, published in the Journal of Geophysical Research: Atmospheres, provides a comprehensive evaluation of how AI-based global weather models simulate tropical cyclones. The researchers found that while leading AI systems perform well at predicting storm tracks and large-scale behavior, they can struggle to reproduce the physical structure of storms, particularly the wind patterns that drive real-world impacts.

AI weather models dramatically accelerate global forecasting

"In recent years, we've seen an explosion in AI-based weather models," said corresponding author Avantika Gori, assistant professor of civil and environmental engineering at Rice. "These systems are trained on massive atmospheric datasets, and once trained, they can generate global forecasts in just a minute or two, which is dramatically faster than traditional physics-based models."

That speed represents a major shift for forecasting. Conventional numerical weather models simulate atmospheric processes by solving complex physical equations, a computationally expensive approach. AI models instead learn statistical relationships from historical data, allowing them to produce forecasts with remarkable efficiency.

But their complexity also introduces challenges.

"Because these models are so large with millions or billions of parameters, we don't always have visibility into how they generate their predictions," Gori said. "For high-consequence events like tropical cyclones, that makes systematic evaluation critically important."

Study tests two major AI weather models using hundreds of tropical cyclones

The study evaluated two prominent AI global weather models, Pangu-Weather and Aurora, using storms from the North Atlantic and western North Pacific basins between 2020 and 2025. To ensure a rigorous test, the researchers simulated roughly 200 storms outside the models' training periods, then compared AI-generated storm characteristics with ERA5, the global atmospheric reanalysis dataset produced by the European Centre for Medium-Range Weather Forecasts (ECMWF).

"We wanted to determine whether the models could reproduce the climatology and physical behaviors observed in real storms," said postdoctoral student and first author Yanmo Weng. "Many prior evaluations examined only one or two cyclones, but by analyzing hundreds of storms, we were able to draw more accurate and generalizable conclusions about model performance."

AI systems show strong performance in forecasting cyclone tracks

The analysis showed that the AI models were most successful at forecasting storm paths.

"We found that the AI models we evaluated performed remarkably well in predicting cyclone tracks," Gori said. "They reproduced where storms traveled and where they made landfall with a high degree of consistency, which is reassuring since forecasting a storm's path helps shape evacuation decisions and early warnings."
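The article does not say which track metric the study used. As a minimal sketch, one common way to score a forecast track is the mean great-circle distance between forecast and reference storm-center positions at matching times. The function names and the (latitude, longitude) tuple format below are illustrative assumptions, not the authors' code.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed spherical Earth)


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_track_error_km(
    forecast: list[tuple[float, float]],
    reference: list[tuple[float, float]],
) -> float:
    """Average position error over time-matched (lat, lon) storm centers."""
    if not forecast or len(forecast) != len(reference):
        raise ValueError("tracks must be non-empty and the same length")
    errors = [
        great_circle_km(f_lat, f_lon, r_lat, r_lon)
        for (f_lat, f_lon), (r_lat, r_lon) in zip(forecast, reference)
    ]
    return sum(errors) / len(errors)
```

Under this sketch, a forecast center displaced by one degree of longitude at the equator would score an error of roughly 111 km.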

Storm intensity predictions show uneven but improving results

Storm intensity, which has traditionally been a challenge for AI weather models, showed uneven but promising improvement. Earlier AI systems often underestimated the strength of tropical cyclones, missing the highest winds and lowest pressures associated with major storms. In the Rice benchmarking, Aurora more closely matched ERA5 intensity distributions, while Pangu-Weather exhibited larger biases for the most intense cyclones.

Even so, accurately representing extreme storms remains difficult. The researchers emphasized an important caveat: ERA5 tends to underestimate peak intensity relative to observations, so agreement with the reanalysis does not automatically imply accuracy.

Windfield structure remains a major challenge for AI storm simulations

The study's most significant caution involved the physical realism of simulated windfields, meaning the internal structure of winds within AI-generated storms. Although many simulations appeared visually convincing, closer analysis revealed that they did not always satisfy established physical constraints. Tests of gradient wind balance, a fundamental relationship in which the pressure gradient force offsets the Coriolis and centrifugal forces in mature cyclones, showed notable deviations, particularly near storm centers.

"These inconsistencies are not always obvious," Gori said. "Windfields can look realistic while still violating key aspects of atmospheric physics."
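For readers unfamiliar with gradient wind balance, the relationship can be written as V²/r + fV = (1/ρ) ∂p/∂r, where V is the tangential wind, r the distance from the storm center, f the Coriolis parameter, ρ the air density, and p the pressure. The sketch below is only an illustration of that textbook relation, not the study's actual test: it computes the balanced wind for a given radial pressure gradient and the residual for any given wind. The constant density, the Northern Hemisphere latitude, and the function names are assumptions.

```python
import math

OMEGA = 7.2921e-5  # Earth's rotation rate in rad/s


def coriolis(lat_deg: float) -> float:
    """Coriolis parameter f = 2 * Omega * sin(latitude)."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))


def gradient_wind(r_m: float, dpdr: float, lat_deg: float, rho: float = 1.15) -> float:
    """Cyclonic wind (m/s) satisfying V^2/r + f*V = (1/rho) * dp/dr.

    r_m is the radius in meters, dpdr the radial pressure gradient in Pa/m,
    and rho an assumed constant air density in kg/m^3.
    """
    half_fr = 0.5 * coriolis(lat_deg) * r_m
    # Positive root of the quadratic V^2 + f*r*V - r*dpdr/rho = 0.
    return -half_fr + math.sqrt(half_fr ** 2 + r_m * dpdr / rho)


def balance_residual(r_m: float, v: float, dpdr: float, lat_deg: float, rho: float = 1.15) -> float:
    """Imbalance in m/s^2: zero when the wind is in gradient balance."""
    return v * v / r_m + coriolis(lat_deg) * v - dpdr / rho


# Illustrative values: 50 km from the center at 20 degrees N,
# with pressure rising 0.02 Pa per meter outward.
v_balanced = gradient_wind(50e3, 0.02, 20.0)
residual = balance_residual(50e3, 1.2 * v_balanced, 0.02, 20.0)
```

A wind that is stronger than the pressure field supports gives a positive residual, and a weaker one gives a negative residual, so a windfield can look plausible on a map while still failing this kind of check.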

The team also found that both AI models tended to overestimate inner core size, especially in stronger storms. Such biases matter because cyclone impacts depend not only on track but also on how winds are organized, which shapes projections of wind damage, rainfall, and storm surge. Accurately capturing storm structure is therefore critical for risk assessment: when windfields are not physically consistent, the errors can carry through to downstream hazard and damage predictions.

Researchers say AI forecasting still requires scientific expertise

Still, Gori said the findings guide improvement rather than undermine the promise of AI forecasting.

"Our work helps identify where bias corrections or additional interpretation may be necessary," Gori said. "For instance, if a model systematically underestimates intensity, forecasters can adjust rather than relying on the raw output."

Beyond specific model biases, the researchers also emphasized a broader lesson: AI tools still depend on field expertise.

"These systems are extraordinarily powerful, but they are not self-validating," Gori said. "Close collaboration between atmospheric scientists and AI developers is essential to ensure that model outputs remain physically meaningful, and advancing these technologies responsibly will require continuous input and refinement by the scientific community."

In other words, as weather forecasting continues to embrace AI models, those models should be used as a complement to, and not a replacement for, human expertise and understanding.

Funding

This research was supported by the National Science Foundation.
