Google’s Gemini 2.5 Pro Now Best at Building Web Apps

Is this Google’s response to OpenAI’s agreement to buy Windsurf?
Image by Diksha Mishra

Google announced an update to the Gemini 2.5 Pro Preview model on Tuesday, enhancing its coding capabilities. 

The model now leads the WebDev Arena Leaderboard, which ranks models on their ability to build visually pleasing and functional web apps, as judged by human preference. It tops the chart with a score of 1419.95 points, ahead of Anthropic’s Claude 3.7 Sonnet in second place with 1357.10 points.
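For a rough sense of what that gap means, the short sketch below converts the two scores into an implied head-to-head preference rate. It assumes the leaderboard's points behave like Elo-style ratings on the conventional 400-point logistic scale. The article does not state this, and the function name `expected_win_rate` is purely illustrative, so treat the result as a back-of-the-envelope estimate rather than an official figure.

```python
def expected_win_rate(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Elo-style probability that A is preferred over B.

    Assumption: scores follow the standard Elo logistic curve with a
    400-point scale. This is not confirmed by the article.
    """
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / scale))


# Scores as reported in the article.
gemini_2_5_pro = 1419.95
claude_3_7_sonnet = 1357.10

gap = gemini_2_5_pro - claude_3_7_sonnet
p = expected_win_rate(gemini_2_5_pro, claude_3_7_sonnet)

print(f"Score gap: {gap:.2f} points")
print(f"Implied head-to-head preference for Gemini 2.5 Pro: {p:.1%}")
```

Under that assumption, a lead of this size points to a clear but not overwhelming edge in pairwise human votes, not a sweep.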

Michael Truell, CEO of Cursor, praised Gemini 2.5 Pro, saying, “We’re observing internally that the new model has a significant reduction in its failure to call tools, an improvement we believe our users will find makes 2.5 Pro even more effective than before in Cursor.”

The model is available to developers through the Gemini API in Google AI Studio and Vertex AI. It is also accessible to users in the Gemini app, where its web app development capabilities can be used through the built-in Canvas feature.

Even before this update, Google’s Gemini 2.5 Pro model ranked highly, both on benchmarks and among developers with real-world coding experience.

On the Aider Polyglot leaderboard, which evaluates LLMs’ capabilities in writing and editing code, Gemini 2.5 Pro Preview scored 72.9%, ahead of Claude 3.7 Sonnet (64.9%), OpenAI’s o1 (61.7%), and o3-mini high (60.4%). However, OpenAI’s newly released o3 model scored higher than Gemini 2.5 Pro, at 79.6%.

On several other benchmarks, Google’s Gemini 2.5 Pro is either second to or on par with OpenAI’s new o3 and o4-mini models.

Interestingly, Google said the update was originally meant to be released in the next ‘couple of weeks’. “But based on the overwhelming enthusiasm for this model, we wanted to get it in your hands sooner so people can start building,” the company said.

Notably, this update comes only hours after reports emerged that OpenAI had reached an agreement to buy Windsurf, the AI-enabled coding platform, for $3 billion. The deal marks OpenAI’s attempt to further strengthen the coding capabilities it offers users, alongside the already powerful o3 and o4-mini models and the Codex CLI tool.

Adding to the competition, Anysphere, the company behind the AI-enabled coding platform Cursor, recently raised $900 million in a round led by Thrive Capital, Andreessen Horowitz (a16z), and Accel Ventures.


Supreeth Koundinya
Supreeth is an engineering graduate who is curious about the world of artificial intelligence and loves to write stories on how it is solving problems and shaping the future of humanity.