What if artificial intelligence (AI) could communicate fluently in any Indian language?
At the Mint Digital Innovation Summit 2024 in Mumbai, Pranav Mistry, founder and CEO of Two Platforms Inc., shared his vision for India to become a leader in artificial intelligence in non-English markets. Mistry emphasized that by developing up-to-date multilingual models that do not require training from scratch, India can overcome the complexity of its diverse languages and achieve this ambitious goal.
“Language is the interface between humans, and the development of artificial intelligence has placed it at the center of communication with machines. Over the last few years, LLMs (large language models) have made great strides, demonstrating human-level performance, but only in English. For other languages, LLMs were unable to grasp the cultural contexts and often gave inconsistent and incorrect answers,” Mistry explained.
Mistry believes that Indian startups, working to improve existing models, can drive this change. By rethinking their approach and developing up-to-date multilingual models that do not require training from scratch, they can help India become an AI leader in non-English markets.
“But such a gap in LLMs should not exist, and given India’s diverse languages and dialects, we deserve an AI that is fluent in multiple languages. But only if we can get around the complexities of Indian languages,” he added.
Two Platforms, a Silicon Valley-based deep tech startup backed by Mukesh Ambani’s Jio Platforms and South Korea’s Naver Corp., recently released Sutra, a multilingual model designed specifically for the Indian market.
“The fundamental innovation of Sutra is the separation of concept learning from language. In Sutra, we have our own up-to-date 256K tokenizer, a tokenizer that covers all languages in a very balanced way, along with high-quality data,” Mistry explained.
Sutra, he said, outperforms most local Indian LLMs, as well as models like GPT-3.5, GPT-4 and Llama, not only in Hindi but also in languages like Gujarati.
“The English-centric approach of large language models cannot solve our problem,” he noted.
Posted: May 24, 2024 9:11 pm ET