New DeepSeek-R1 Is as Good as OpenAI o3 and Gemini 2.5 Pro

The new DeepSeek-R1-0528 has “significantly improved its depth of reasoning and inference capabilities.” 

Chinese AI model maker DeepSeek announced an update to its R1 reasoning model on Wednesday. The updated model, DeepSeek-R1-0528, is available on Hugging Face.

“In the latest update, DeepSeek R1 has significantly improved its depth of reasoning and inference capabilities by leveraging increased computational resources and introducing algorithmic optimisation mechanisms during post-training,” said DeepSeek. 

The company also shared the model’s benchmark results, which showed that it achieved performance parity with OpenAI’s o3 and Google’s Gemini 2.5 Pro models on multiple evaluations.

On the AIME 2025 test, DeepSeek-R1-0528 scored 87.5%, close to OpenAI's o3 (88.9%) and ahead of Gemini 2.5 Pro (83.0%).

The model also achieved scores on par with leading AI models on other coding, mathematics, and reasoning evaluations, as seen on Artificial Analysis.

It scored 77% on LiveCodeBench, a coding benchmark, matching Gemini 2.5 Pro (77%) and nearly matching OpenAI's o3 (78%). On MMLU-Pro, a reasoning and general knowledge benchmark, DeepSeek-R1 achieved 85%, comparable to Gemini 2.5 Pro (84%) and level with OpenAI's o3 (85%).

Source: Artificial Analysis

Several users have already downloaded and deployed the model locally, according to their social media posts. Ivan Fioravanti, CTO of CoreView, said on X that he could run DeepSeek-R1-0528-4bit at around 21 tokens per second on a device powered by Apple's M3 Ultra chip.

The original DeepSeek-R1 reasoning model, released last year, created quite a storm across the AI ecosystem. At launch, it surpassed several competing models on benchmarks.

DeepSeek prioritises efficient techniques in its model architectures to improve performance, rather than relying on sheer computing power.

One of DeepSeek's previous models, V3, was trained on 2,048 NVIDIA H800 GPUs yet achieved performance better than most open-source models.

Andrej Karpathy, a former OpenAI researcher, said DeepSeek V3's level of capability is "supposed to require clusters of closer to 16,000 GPUs". The efficiency claims led many to question demand for AI-related hardware, contributing to a market cap loss of over $500 billion for NVIDIA in a single day.

Numerous startups and products deploy the open-source DeepSeek model, and its capabilities are widely recognised across sectors in China. It was recently reported to be used in research and development for the country's 'most advanced warplanes'. German automotive leader BMW has also revealed plans to incorporate DeepSeek into its vehicles in China.

Last month, The New York Times reported that courtroom officials are using DeepSeek to draft legal documents in minutes, and that doctors and agencies are employing the model to locate missing persons. The report further noted that numerous companies are "encouraging" employees to adopt DeepSeek for design and customer service tasks.


Supreeth Koundinya
Supreeth is an engineering graduate who is curious about the world of artificial intelligence and loves to write stories on how it is solving problems and shaping the future of humanity.