Why the Next Leap in Speech AI Comes from Kalpa Labs

“The next frontier is not just expressive voices. It’s about building systems that behave like humans in a real call.”
In recent years, large language models (LLMs) have unified a wide range of text-based tasks under a single architecture. One model can code, translate, summarise and generate with remarkable fluidity. This unification, however, has not yet reached the world of speech. Existing speech models are either fast but inaccurate or accurate but slow, largely because of high token usage (50 tokens per second of audio) and fixed input padding, which increases computing costs.
Speech-focused AI startup Kalpa Labs is building fast, multilingual, real-time speech-to-text models that address the latency and inefficiency of systems like Whisper. Kalpa Labs aims to reduce audio token rates, eliminate unnecessary padding with configurable “register” tokens and use sparse architectures like mix
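To make the padding cost concrete, here is a rough back-of-the-envelope sketch in Python. It takes the 50 tokens per second figure quoted above and assumes a fixed 30-second input window, which is how Whisper-style models are commonly described; the window length, the function name `padding_overhead` and the returned field names are illustrative assumptions, not details from Kalpa Labs.

```python
# Illustrative arithmetic only. The 30-second window is an assumption about
# Whisper-style models; it is not a figure taken from the article.
TOKENS_PER_SECOND = 50       # audio token rate quoted in the article
FIXED_WINDOW_SECONDS = 30    # assumed fixed input length that short clips are padded to


def padding_overhead(clip_seconds: float) -> dict:
    """Estimate how many token positions in a fixed window are padding."""
    if not 0 < clip_seconds <= FIXED_WINDOW_SECONDS:
        raise ValueError("clip_seconds must be in (0, FIXED_WINDOW_SECONDS]")

    total = TOKENS_PER_SECOND * FIXED_WINDOW_SECONDS   # positions the model always processes
    useful = round(TOKENS_PER_SECOND * clip_seconds)   # positions covering real audio
    padding = total - useful                           # positions spent on padding
    return {
        "total_tokens": total,
        "useful_tokens": useful,
        "padding_tokens": padding,
        "wasted_fraction": padding / total,
    }


if __name__ == "__main__":
    stats = padding_overhead(5)
    print(stats)
```

Under these assumptions, a 5-second utterance fills only 250 of 1,500 token positions, so roughly five-sixths of the work goes to padding. That is the kind of overhead a lower token rate and variable-length inputs would aim to cut.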

Smruthi Nadig
Smruthi brings over two years of experience in reporting on the global energy industry. They hold a Master's Degree from the University of Leeds in International Journalism and a Bachelor's Degree from Christ University in Media Studies, Economics and Political Science.