Billions of dollars are being invested in the development of advanced machine learning and artificial intelligence-based weather forecasting models in response to the global upsurge in extreme weather events. Prominent tech behemoths like Google and IBM are leading the charge for faster and more accurate predictions.
Climate experts in India have also started experimenting with AI. Kiren Rijiju, the minister of earth sciences, said in December 2023 that the ministry had set up a virtual center devoted to creating and improving AI and machine learning methods for better weather forecasting.
Since then, the nation has seen a significant increase in interest in AI-based weather forecasting. There is, however, a problem: insufficient reliable data.
According to Amitabha Bagchi, a computer science professor at the Indian Institute of Technology Delhi, AI-based modeling “extrapolates and builds scenarios based on the available data and past trends.” Data management accounts for 95% of the work of developing AI models, Bagchi says, and reliable data is essential to it.
Gathering such data is difficult in India, particularly in the Himalayas, according to Irfan Rashid, an assistant professor of geoinformatics at the University of Kashmir. Rashid is working on a Ministry of Earth Sciences project that profiles 15 glacial lakes in Jammu and Kashmir and Ladakh. The aim is to increase data collection in the Himalayan cryosphere (the frozen portion of the Earth system) and perhaps improve AI predictions of glacial lake outburst floods, or GLOFs.
Fewer than 30 glaciers have been the subject of thorough glaciological investigations, even though the Geological Survey of India has reported approximately 9,575 glaciers in the Himalayas. This lack of data hampers the development of AI-based early warning systems. “At this time, there are no reliable in-situ measurements available to determine the water volume in a glacier lake. High levels of uncertainty are connected with data derived from empirical models. Building an AI and ML-based model with such data could mimic or foresee unreliable situations or outcomes,” says Rashid.
Madhavan Nair Rajeevan, a former earth sciences secretary and one of India’s leading climate experts, shares these concerns. He points out that the Himalayas are not well covered in the nation’s data sets, which affects the accuracy of AI and machine learning forecasts for the region’s complicated topography. India has solid data sets for the fundamental meteorological parameters (rainfall, temperature, humidity, wind speed and so on). Nevertheless, Rajeevan notes, “We hardly have enough data to work on GLOFs, and we don’t have enough across the Himalayas.”
Weather forecasting with machine learning
Traditional weather forecasting usually relies on physics-based computer simulations. AI and deep learning, a subset of machine learning, instead use massive amounts of raw, unprocessed data to forecast the weather. Combined with conventional physical models and statistical techniques, they can improve the accuracy and dependability of weather forecasts.
“Traditionally, weather forecasting models use a bunch of different starting points and then use physics equations to build models that give out various probabilistic scenarios,” Mohak Shah, the founder and managing director of Praescivi Advisors, a strategic AI advisory firm based in California, USA, tells Dialogue Earth. Machine learning, by contrast, uses correlations in past data to speed up weather predictions.
Machine learning has benefits and drawbacks, much like any other technology, according to Shah: “It is relatively low-cost… scalable too and can democratize weather forecasting. However, we make the assumption that sufficient granular data is available, which isn’t the case [in India], at least not at this time.” A lack of information at the local level may be a major issue, he adds.
Although there is no substitute for high-quality data if results are to be ideal, Shah tells Dialogue Earth that machine learning can simulate missing information using data from related places, giving forecasters a head start in overcoming data scarcity.
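The idea of filling a gap with data from nearby places can be illustrated with a minimal sketch. It does not reproduce any method described by Shah: it swaps in plain inverse-distance weighting, a simple interpolation rule rather than a trained machine learning model, and the coordinates, readings and function names are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def fill_from_neighbours(target, neighbours, power=2.0):
    """Estimate a missing reading at `target` (lat, lon).

    `neighbours` is a list of (lat, lon, value) tuples. Closer stations
    get larger weights (1 / distance ** power). This is an assumption
    made for illustration, not a method taken from the article.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lat, lon, value in neighbours:
        distance = haversine_km(target[0], target[1], lat, lon)
        if distance == 0:
            # A station at exactly the target location: use its value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0:
        raise ValueError("at least one neighbouring station is required")
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Hypothetical daily rainfall readings (mm) at three nearby stations.
    stations = [(34.10, 74.80, 12.0), (34.05, 74.95, 18.0), (34.30, 74.60, 7.0)]
    print(fill_from_neighbours((34.12, 74.85), stations))
```

An estimate of this kind always lies between the lowest and highest neighbouring readings, so it can only stand in for a missing value; it cannot capture local extremes that no nearby station recorded, which is why such gap-filling does not replace real measurements.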
Shah also worries about the opaque, “black box” character of machine learning models. Because traditional weather models are based on physics equations, they have a definable margin of error, so specific flaws can be identified and corrected. AI and machine learning models rely on historical correlations and frequently lack this kind of transparency, which makes it hard to pinpoint the precise causes of their errors.
Data conundrum
Climate scientist Roxy Mathew Koll works at the earth sciences ministry’s Indian Institute of Tropical Meteorology in Pune. He has had difficulty getting the data needed to create an AI-based dengue forecasting model.
“We have analyzed historical data on a number of variables, such as humidity, temperature, and rainfall, that influence the prevalence of dengue. However, obtaining health information about the city’s daily disease caseload was extremely difficult. The agencies concerned were not prepared to release the information. Obtaining permission to use and publish the data required a lot of door-knocking,” says Koll.
Koll emphasizes the clear link between high-quality data and AI’s predictive power, saying that “AI would be able to provide high-resolution forecasting [for] climate-sensitive diseases such as dengue, malaria, chikungunya, etc. if it is trained on very high-resolution data.” He adds: “The problem is getting access to data from the individual health departments, but the AI-based modeling for dengue in Pune may be duplicated in other places.”
Government scientists have dealt with this issue as well. “I attempted to obtain some health information from the top government authorities even during my tenure as secretary, the highest-ranking administrative officer in the government. Nothing appeared. We do not have a culture of gathering and preserving detailed socio-economic data. We need these kinds of data if we want impact studies,” adds Rajeevan. The former secretary continues, “Without it, the research does not translate into real-world benefits.” Like Koll, he maintains that the quality of the technologies depends on the data they are fed.
Bagchi concurs. “Machine learning is the future and possesses the best mathematical tools now available to us. However, there are issues with data quantity, quality, and integrity in the Indian context, which could impede the advancement of AI-based weather forecasting,” he says.
Shah regards AI and machine learning not as a panacea but as a complement to existing methods. He states, “We have to view machine learning as an extra tool in our toolbox.”
Himalayan GLOFs
Kalachand Sain, director of the Wadia Institute of Himalayan Geology, located in Doon Valley, Uttarakhand, is pushing for AI and ML to be used in an improved warning system for glacial hazards.
Sain studied the Chamoli incident in detail. In February 2021, an avalanche in Uttarakhand’s Chamoli district badly damaged two hydropower facilities and killed over 200 people.
“We do not monitor seismic activity around glaciers, but our study found that the rock-ice avalanche appears to have been initiated by seismic precursors which were continuously active for 2.5 hours prior to main detachment,” Sain told Dialogue Earth.
In Uttarakhand, Sain’s institute has been identifying potential GLOF risk zones. He singles out the tectonically active Alaknanda-Dhauliganga-Rishiganga basin as a priority location because of the 39 hydroelectric projects that are under construction there or at various stages of proposal.
“We need satellite data, real-time meteorological data, real-time hydrological data, real-time seismic and GPS data, and general field survey for an AI-based integrated early warning system for glacial hazards,” says Sain. He emphasizes how urgent it is to establish a dedicated glaciological center in the area, which would cost between Rs 10 crore and Rs 12 crore (USD 1.2 million to 1.4 million).
Rashid agrees, pointing to the catastrophic Chamoli incident of 2021 and the GLOF that struck Sikkim in 2023, both of which happened while monitoring equipment was malfunctioning.
“Seismic data surrounding glaciers is currently nonexistent throughout the Indian Himalayan region. Furthermore, there isn’t a specific GLOF risk,” he says. Because current research is fragmented, it offers only an inadequate understanding of glacial risk in the Himalayas.
Rashid favors a uniform approach to gathering field data on glaciers and glacial lakes throughout the area. An all-encompassing AI-based forecast and alert system would require this data. Rashid concludes, “Money will come only if there is a strong political will. This massive exercise will need money.”