Published on September 10, 2025
In AI Startups

Why the Next Leap in Speech AI Comes from Kalpa Labs

“The next frontier is not just expressive voices. It’s about building systems that behave like humans in a real call.”

By Smruthi Nadig

In recent years, large language models (LLMs) have unified a wide range of text-based tasks under a single architecture. One model can code, translate, summarise and generate with remarkable fluidity. This unification, however, has not yet reached the world of speech. Existing models are either fast but inaccurate or accurate but slow, largely because of high token usage (50 tokens per second of audio) and fixed input padding, which increases computing costs. Kalpa Labs, a speech AI startup, is building fast, multilingual, real-time speech-to-text models to resolve the latency and inefficiency issues of systems like Whisper. The company aims to reduce audio token rates, eliminate unnecessary padding with configurable “register” tokens and use sparse architectures like mix
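To put the padding cost in rough numbers, here is a minimal back-of-the-envelope sketch in Python. It uses the 50 tokens per second figure quoted above and assumes a Whisper-style fixed 30-second input window. The function names are hypothetical and purely illustrative; this is not Kalpa Labs' method or code.

```python
import math

TOKENS_PER_SECOND = 50      # audio token rate quoted in the article
FIXED_WINDOW_SECONDS = 30   # assumption: Whisper-style fixed 30-second input window


def padded_tokens(duration_s: float) -> int:
    """Tokens processed when each clip is padded up to whole fixed windows."""
    windows = max(1, math.ceil(duration_s / FIXED_WINDOW_SECONDS))
    return windows * FIXED_WINDOW_SECONDS * TOKENS_PER_SECOND


def unpadded_tokens(duration_s: float) -> int:
    """Tokens processed if only the actual audio were encoded."""
    return math.ceil(duration_s * TOKENS_PER_SECOND)


def wasted_fraction(duration_s: float) -> float:
    """Share of processed tokens that correspond to padding, not audio."""
    padded = padded_tokens(duration_s)
    return (padded - unpadded_tokens(duration_s)) / padded


if __name__ == "__main__":
    for seconds in (5, 15, 30):
        print(seconds, padded_tokens(seconds), unpadded_tokens(seconds),
              round(wasted_fraction(seconds), 2))
```

Under these assumptions, a five-second utterance is processed as 1,500 tokens even though only 250 of them carry audio, which is the kind of overhead that matters in a live call made up of short turns.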


Smruthi Nadig

Smruthi brings over two years of experience in reporting on the global energy industry. They hold a Master's Degree from the University of Leeds in International Journalism and a Bachelor's Degree from Christ University in Media Studies, Economics and Political Science.
