According to sources cited by The Economic Times, the Indian government intends to forge its own path in artificial intelligence (AI) by creating a foundational model specifically designed to meet the requirements of Indian businesses, entrepreneurs, scholars, and researchers.
This mammoth project, which is scheduled to be launched following the current Lok Sabha elections in 2024, will begin with an initial investment of Rs 2,000 crore.
The proposed endeavor, which may be headed by the IndiaAI Innovation Center, is expected to place India in the company of countries like China and the US, which have already started comparable projects. The center will be set up under the India AI Mission, which is projected to cost Rs 10,000 crore.
“To work on a foundational model, the government will probably tap eminent higher education institutions and prominent researchers working on AI in the private sector,” a senior source stated.
According to the official, the model is designed to generate outputs that can be used for a wide variety of applications and services. It is envisioned as either a large action model (LAM) or a large multimodal model (LMM).
Foundational models, which are effectively pre-trained generative transformers, serve as the basis for developing other AI models. Given user inputs, they generate new responses by drawing on existing data. Addressing India's specific needs and preferences, the official said: “This [foundational model] will aim to provide output in more than one native language, borrowing from all the work that has been done so far on projects such as Bhashini.”
Challenges with AI translation
Bhashini, an AI-based language translation platform and model developed by the IT ministry and introduced in 2022, is a case in point. Foundational models are usually created by governments and corporations. Data from the Stanford Center for Research on Foundation Models shows that, as of April 2024, more than 330 such models had been created by governments and private companies worldwide.
Despite problems such as bias, hallucinations, and gaps in comprehension, models created by companies like Google, Amazon, and Microsoft-backed OpenAI remain ahead of the competition. To train its model, the Indian government plans to use digitized literature, publicly available data, and anonymized non-personal data collected from multiple stakeholders.
Nonetheless, copyright and privacy issues pertaining to the data still need to be resolved. Pointing to a potential remedy, an official said: “We may also look at a platform exclusively for Indian startups where non-personal and anonymized data can be volunteered for training of the model.” To improve its performance, the base model will also be trained on international datasets using open-source machine learning technologies.
Data exchange is essential
A cooperative strategy built on data exchange is considered necessary for this project to succeed. Officials stress the importance of making training data available so that a variety of use cases can be developed prior to commercial deployment.