This Bengaluru Startup Made the Fastest Inference Engine, Beating Together AI and Fireworks AI

Simplismart’s software-level optimisations enabled Llama 3.1 8B to achieve a throughput of over 343 tokens per second.
Image by Nalini Nirad
Inference speed is a hot topic right now as companies rush to fine-tune and deploy their own AI models. Conversations around test-time compute are also heating up, with models like OpenAI's o1 showcasing 'thinking' and reasoning skills after the prompt, relying on infrastructure-heavy computation even after training is complete.

This is why companies like Groq, SambaNova, and Cerebras Systems have gained traction by building their own hardware, delivering unparalleled inference performance and competing with the likes of NVIDIA and AMD. However, Simplismart, a Bengaluru-based startup led by former Oracle and Google engineers, has emerged as a leader in creating high-performance AI deployment tools. It competes on inference speed from the software side, rather than the hardware. Sim

Mohit Pandey
Mohit writes about AI in simple, explainable, and often funny words. He's especially passionate about chatting with those building AI for Bharat, with the occasional detour into AGI.