Google DeepMind isn’t the only big tech company applying AI to weather forecasting. Nvidia released FourCastNet in 2022. And in 2023, Huawei developed its Pangu-Weather model, which was trained on 39 years of data. Pangu-Weather produces deterministic forecasts: those that give a single number rather than a range, such as a forecast that tomorrow’s temperature will be 30°F or that 0.7 inches of rain will fall.
GenCast differs from Pangu-Weather in that it produces probabilistic forecasts, that is, probabilities for various weather outcomes rather than a single predicted value. For example, the forecast might be “There is a 40% chance that the temperature will reach a low of 30°F” or “There is a 60% chance that 0.7 inches of rain will fall tomorrow.” This type of analysis helps managers understand the likelihood of different weather events and plan accordingly.
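To make the distinction concrete, here is a minimal sketch of one common way to turn a set of forecasts into a probability: run many plausible forecasts (an ensemble) and count how many cross a threshold. The article does not describe how GenCast computes its probabilities, so the ensemble approach, the function name `exceedance_probability`, and the rainfall numbers below are all illustrative assumptions, not DeepMind's method or data.

```python
def exceedance_probability(ensemble, threshold):
    """Return the fraction of ensemble members at or above the threshold."""
    if not ensemble:
        raise ValueError("ensemble must contain at least one member")
    hits = sum(1 for value in ensemble if value >= threshold)
    return hits / len(ensemble)


# Hypothetical rainfall forecasts for tomorrow, in inches (made-up values),
# one per ensemble member.
rain_members = [0.0, 0.2, 0.9, 0.7, 1.1, 0.4, 0.8, 0.0, 0.75, 1.3]

# Deterministic style: collapse everything into a single number.
single_value = sum(rain_members) / len(rain_members)

# Probabilistic style: report the chance of at least 0.7 inches of rain.
chance_of_rain = exceedance_probability(rain_members, 0.7)

print(f"Single-number forecast: {single_value:.2f} inches")
print(f"Chance of at least 0.7 inches: {chance_of_rain:.0%}")
```

With these made-up numbers, six of the ten members reach 0.7 inches, so the probabilistic statement is a 60% chance, while the single-number summary hides how widely the members disagree.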
These results do not mean the end of conventional meteorology as a field. The model is trained on past weather conditions, and applying it to the distant future can lead to inaccurate predictions for a changing and increasingly erratic climate.
GenCast still relies on a data set like ERA5, which is an hourly estimate of various atmospheric variables dating back to 1940, says Aaron Hill, an assistant professor at the University of Oklahoma School of Meteorology, who was not involved in this research. “The backbone of ERA5 is a physics-based model,” he explains.
Additionally, there are many variables in our atmosphere that we don’t directly observe, which is why meteorologists use physical equations to estimate them. These estimates are combined with the available observational data to feed a model like GenCast, and new data will always be needed. “A model trained until 2018 will do worse in 2024 than a model trained until 2023 will do in 2024,” explains Ilan Price, a researcher at DeepMind and one of the creators of GenCast.
In the future, DeepMind plans to test models that work directly from data such as wind or humidity readings, to see how well predictions can be made from observational data alone.