India’s AI Push Might Be Pointless Without National Language Standardisation

Lack of engagement in vernacular languages forces AI development to depend on synthetic datasets.
India's AI ambitions will remain incomplete unless its vernacular languages are brought into the ecosystem for the benefit of its vast population. That effort still has an old problem to tackle: the lack of real-world data on Indic languages.

Many of us have struggled with copying text from a PDF in an Indic language like Hindi, Kannada, or Telugu, only to see it pasted as boxes and symbols. The problem may seem related to fonts, but it runs deeper than that, dating back to 1988.

The foundation itself is broken, according to Vivekanand Pani, co-founder and CTO of Reverie Language Technologies, who argued that efforts and investments in sovereign models and data collection will remain limited unless Indian language computing standards are fixed.

Mohit Pandey
Mohit writes about AI in simple, explainable, and often funny words. He's especially passionate about chatting with those building AI for Bharat, with the occasional detour into AGI.