Project Vaani to English Gyani, This IISc Professor is Going Places

Professor Prasanta Kumar Ghosh from IISc Bangalore has figured out a unique way to collect speech data in different languages and dialects.
Image by Nikhil Kumar
In the first phase, the team travelled to 80 districts, showed local people a picture, asked them to describe it, and recorded the descriptions. This has given the Google-funded Project Vaani around 16,000 hours of speech data.

The team is open-sourcing the Vaani corpus and is manually transcribing around 10% of the data. The aim is to collect 150,000 hours of speech data from 773 districts of India. “India is not one language and not even several dialects, it's a continuum of languages, which requires a lot of research and development,” said Ghosh, in an exclusive interview with AIM.

Apart from this, Ghosh and his research team are also working on RESPIN

Mohit Pandey
Mohit writes about AI in simple, explainable, and often funny words. He's especially passionate about chatting with those building AI for Bharat, with the occasional detour into AGI.