Spencer Kimball on How CockroachDB Will Power Efficient AI Agents https://analyticsindiamag.com/global-tech/spencer-kimball-on-how-cockroachdb-will-power-efficient-ai-agents/ Mon, 29 Sep 2025 08:24:37 +0000

Spencer Kimball sees AI agents outnumbering humans, and CockroachDB is building the database backbone to handle that.

The database industry is undergoing a fundamental transformation as AI reshapes how organisations design and operate their data systems. With billions of human interactions soon to be joined by trillions of AI-driven agents, the demands on infrastructure are unprecedented. 

At the centre of this change is a new emphasis on scalability, resilience, and affordability, all while navigating the complexities of multi-cloud environments.

In a conversation with AIM, Spencer Kimball, CEO of CockroachDB, shared his perspective on the company’s current standing in a rapidly evolving database market. 

While acknowledging the competition from other players in the space, like MongoDB and AWS, Kimball emphasised CockroachDB’s focus on solving scale, resilience, and multi-cloud challenges.

Scaling for an AI-driven world

Kimball underlined that the next big transformation is the sheer demand created by AI agents. “All of the activity that databases have had to deal with up till now has been humans. Now it’s going to be agents… You could have a trillion. And that’s going to happen.” 

He explained that CockroachDB’s distributed design positions it to absorb such explosive growth, which traditional databases might struggle to handle.

But scale is not just about growth; it is about efficiency as well. “Cockroach needs to become the most efficient and performant database at scale,” he said, adding that optimisation is underway. 

Version 25.2 improved performance by 50%, and the long-term vision is to become “the cheapest database available at scale.” 

For enterprises already straining under AI workloads, the promise of efficiency without compromising resilience is an attractive proposition.

Kimball also gave a glimpse into the architectural choices that enable such efficiency. He highlighted that CockroachDB uses data triplication instead of duplication, ensuring automatic self-healing in the event of failures. 

While this is more expensive than traditional duplication, it prevents costly data loss and operational chaos. 

“If that happens frequently, your teams burn out, your customers leave you, you get reputational brand damage, and the dollars and cents add up,” he warned, underscoring why resilience is worth the investment.
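
To make the replication point concrete, here is a minimal sketch of inspecting and raising the replication factor, assuming a local single-node test cluster on CockroachDB’s default port and using the standard psycopg2 driver (CockroachDB speaks the PostgreSQL wire protocol). The connection string and database name are placeholders.

```python
# Minimal sketch: inspecting and raising CockroachDB's replication factor.
# Connection details are placeholders for a local, insecure test cluster.
import psycopg2

conn = psycopg2.connect("postgresql://root@localhost:26257/defaultdb?sslmode=disable")
conn.autocommit = True

with conn.cursor() as cur:
    # Ranges are replicated three ways by default ("triplication"), which
    # is what lets the cluster self-heal after losing a node.
    cur.execute("SHOW ZONE CONFIGURATION FROM RANGE default")
    print(cur.fetchall())

    # Critical data can trade extra storage for extra fault tolerance.
    cur.execute("ALTER DATABASE defaultdb CONFIGURE ZONE USING num_replicas = 5")

conn.close()
```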

Competing, Collaborating, and Differentiating

Kimball acknowledged MongoDB as “a smart company” with a developer-friendly model but noted that CockroachDB rarely goes head-to-head with it. 

“The team will typically know in advance whether they want a relational database or a document database,” he said.

He sees AI reshaping that dynamic, since rational, AI-driven decisions about architecture will favour capability over familiarity. Developers might have reservations about CockroachDB’s additional complexity, but AI tools are expected to render the adoption curve irrelevant in the long run, and may help developers realise the additional benefits such systems offer.

On partnerships, Kimball highlighted CockroachDB’s complicated but fruitful relationship with AWS. 

“In many ways, we think of AWS as our true north competitor. But they’re also one of our best partners,” he said.

In a $100 billion and growing operational database market, there is enough room to collaborate, compete, and still thrive. AWS itself may sometimes recommend CockroachDB to customers who need scale beyond Aurora or multi-cloud support, showing how competition and cooperation coexist.

Kimball also pointed to the broader market reality. “The hardest thing isn’t beating the competitor. It’s just trying to find a way to help the customer migrate, which is very difficult right now. But AI, I think, can substantially change those costs.” 

Migration complexity remains a bottleneck, and Kimball believes AI will play a critical role in easing that process.

India As a Proving Ground

Discussing India, Kimball stressed that the market’s scale and regulatory environment align well with CockroachDB’s strengths. 

He pointed to UPI and brokerage platforms as prime examples of workloads requiring resilience, compliance, and massive scalability. 

“India actually does match up extraordinarily well with Cockroach’s differentiators. So that does help us command a premium price because we’re bringing such value.”

One example is Groww, one of India’s leading brokerage platforms, which relies on CockroachDB to manage enormous transaction volumes at scale. With only 5% market penetration so far, Kimball noted the growth opportunity is immense and a perfect match for CockroachDB’s distributed architecture.

He added that India’s financial sector, with rapid digitisation and strict regulatory oversight, creates strong demand for distributed systems that ensure regional survivability. 

“Indian regulators are very concerned about the cloud provider risks. And that is also true in Europe. It’s not so true in the United States.” This makes CockroachDB’s ability to guarantee region survivability a natural fit for Indian enterprises.

Kimball noted CockroachDB’s commitment to Bengaluru as its APAC hub, and described India as an early, yet natural choice for enterprise-scale innovation. 

“We’re very committed to Bangalore, despite the traffic situation,” he quipped, adding that the company is scaling operations and visibility across the region. Over the past three years, India has grown from an experiment to the anchor of their Asia-Pacific strategy.

The Age of Agentic AI

Looking ahead, CockroachDB is aligning itself with the rise of agent-based systems. Kimball emphasised CockroachDB’s decisive move towards becoming the optimal solution for agentic AI in the enterprise, highlighting features like multi-tenancy, bring-your-own-cloud, and Kubernetes operator for enterprise flexibility and control.

Kimball also discussed how developers at CockroachDB are adapting to the AI-driven pace of change.

While workloads have intensified, AI itself has become a productivity multiplier. 

He emphasised that the best engineers are those who learn to “manage AI” as a resource, turning it into a career-defining capability.

As Kimball summed up, the database wars will not be won on features alone but on efficiency, resilience, and cost-effectiveness at unprecedented scale.

CockroachDB is betting on a future where distributed systems are not a luxury but a necessity, as billions of humans and trillions of agents demand reliable data infrastructure.

With Rust, Cloudflare is Trying to Tackle the Industry’s Inference Bottleneck https://analyticsindiamag.com/global-tech/with-rust-cloudflare-is-trying-to-tackle-the-industrys-inference-bottleneck/ Thu, 25 Sep 2025 05:30:00 +0000

“Because we have quite a few engineers with deep expertise in Rust, we found this was a worthwhile investment.”

Cloudflare has introduced Infire, a new LLM inference engine built in Rust, to run AI workloads on its distributed network. 

Unlike hyperscalers that rely upon large centralised data centres packed with expensive GPUs, Cloudflare operates a lean global network that sits within 50 milliseconds of 95% of internet users. That unique architecture demands a more efficient way to serve inference.

Mari Galicer, group product manager at Cloudflare, in an interaction with AIM, explained how inference is a different challenge for them compared to hyperscalers. “Most hyperscalers operate large, centralised data centres with nodes dedicated to AI compute, whereas Cloudflare operates a lean, distributed network, with each compute node needing to serve different types of traffic.”

“This makes CPU and GPU overhead a challenge, and as a consequence, we have to manage resources much more dynamically and efficiently,” she said.

The company shared in a blog post that the motivation was to avoid tackling scaling challenges by simply throwing money at the problem and buying more GPUs. Instead, it needed to utilise every bit of idle capacity and be agile about where each model is deployed. 

Building Infire in Rust

The company had initially relied on vLLM, the widely used open-source inference engine, but discovered it was not optimised for dynamic, distributed edge workloads. Running Python-based vLLM also required sandboxing for security, which slowed performance and consumed valuable CPU cycles.

So, the team decided to build Infire in Rust. “The primary tradeoff was the up-front development cost of building something from the ground up,” Galicer said. “But because we have quite a few engineers with deep expertise in Rust, we found this was a worthwhile investment.”

The caveat matters because Rust, as a choice of programming language, can spell trouble for a team without that depth of expertise.

Rust’s safety guarantees play a central role. Galicer explained that Rust’s compile-time memory safety ensures protection against common vulnerabilities without the performance overhead of a garbage collector. This inherent security enables Cloudflare to deploy Infire directly and reliably on servers, alongside other services, thereby eliminating the necessity for resource-intensive sandboxing.

The architectural shift might explain Infire’s improved CPU efficiency. Benchmarks show Infire sustaining over 40 requests per second while using just 25% of CPU resources, compared with vLLM’s 140% usage on bare metal. 

“Infire’s design reduces CPU overhead primarily by being built in Rust,” Galicer explained.

“This eliminates the need for a heavy security sandbox like gvisor, which Cloudflare had to use with the Python-based vLLM, thereby removing a major source of CPU consumption.”

Also Read: Cloudflare Just Became an Enemy of All AI Companies

Performance Edge 

At the technical core, Infire employs techniques such as continuous batching, paged KV caching, and just-in-time CUDA kernel compilation. The latter is tailored for Nvidia Hopper GPUs. 

“Infire compiles CUDA kernels at runtime that are specifically tailored for the exact model architecture and Cloudflare’s Nvidia Hopper GPUs,” Galicer said. 

She highlighted that customised kernel generation, tailored to the specific operations and parameter sizes of a model, offers superior optimisation opportunities compared to the traditional method of integrating generic, pre-compiled kernels.
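
Infire itself is not open source, so the snippet below is only a generic illustration of the runtime-compilation idea, using CuPy to compile a CUDA kernel string on the fly with a model dimension baked in as a constant. All names, sizes, and the toy computation are assumptions for demonstration, not Cloudflare’s code.

```python
# Generic illustration of just-in-time CUDA kernel compilation (not
# Infire's actual code): CuPy compiles the kernel source with NVRTC at
# runtime, so constants like the hidden size can be baked in for the
# exact model being served.
import cupy as cp

HIDDEN = 4096  # assumed model dimension, fixed at compile time

kernel_src = f"""
extern "C" __global__
void scale_bias(const float* x, const float* b, float* y) {{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < {HIDDEN}) y[i] = x[i] * 2.0f + b[i];
}}
"""
scale_bias = cp.RawKernel(kernel_src, "scale_bias")

x = cp.random.rand(HIDDEN, dtype=cp.float32)
b = cp.zeros(HIDDEN, dtype=cp.float32)
y = cp.empty_like(x)
threads = 256
blocks = (HIDDEN + threads - 1) // threads
scale_bias((blocks,), (threads,), (x, b, y))  # launch: grid, block, args
```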

The result is higher throughput, lower latency, and greater GPU utilisation. Benchmarks show Infire delivering up to 7% faster completions than vLLM on unloaded hardware and significantly better performance under real-world load. 

According to Galicer, workloads with “high-throughput and many concurrent requests, which are typical in a large, distributed edge network, see the most improvement.”

Currently, Infire powers the Llama 3.1 8B model in Workers AI, and Cloudflare says more models will follow. 

While the company has not yet tested models like DeepSeek or Qwen, Galicer confirmed that Infire will evolve alongside Cloudflare’s AI catalogue. 

Future Direction

When asked about open sourcing, Galicer was cautious: “We’re in the very early stages of developing Infire, and as the project matures, we will continue to evaluate whether we should open source it.”

For Cloudflare, the project sits at the centre of its long-term strategy. 

“Infire is a foundational part of our AI strategy because it provides a highly efficient and secure engine for running AI inference directly on our globally distributed network,” Galicer said. 

Infire serves as the platform through which the company believes it can address numerous performance hurdles, ultimately leading to faster and more cost-effective inference at Cloudflare.

Also Read: Why Gleam Could Be The Next Most Admired Programming Language After Rust

From J.P. Morgan to Netflix to Meta: The H-1B Conversation Continues https://analyticsindiamag.com/global-tech/from-j-p-morgan-to-netflix-to-meta-the-h-1b-conversation-continues/ Wed, 24 Sep 2025 06:53:12 +0000

The conversation about visas could be, in reality, a deliberation about where the future of technology will be built.

The H-1B visa program has long proved to be both a pathway for talent and a flashpoint in political debates. When sudden shifts occur, as recently seen after the Trump administration raised the visa fee exorbitantly, the ripples stretch across the globe. Corporate leaders, entrepreneurs, and former big tech employees have all weighed in, each highlighting a different aspect of the controversy. 

These debates lay bare how deeply Silicon Valley has relied on immigrant talent.

Together, their perspectives outline a divided landscape—one part concerned about America’s attractiveness to skilled workers, another pushing for more targeted reform, and a third grappling with the reality that US schools may not be producing enough “builders.”

The CEO Perspective: Shock and Structural Change

In an interview with The Times of India, Jamie Dimon, CEO of J.P. Morgan, described the new H-1B announcement as an unexpected blow. “It came out of the blue,” he said, pointing to how global firms depend on mobility. 

“For us, visas matter because we move people around globally—experts who get promoted to new jobs in different markets.”

Dimon framed immigration as a foundational American strength. 

“My grandparents were Greek immigrants who never finished high school. America is an immigrant nation, and that’s part of its core strength.” 

His concern was not only about talent pipelines, but also about the broader message sent to skilled professionals considering a future in the US.

Netflix co-founder Reed Hastings had a different perspective. 

Writing on X, he supported the introduction of a hefty tax on H-1Bs. “Trump’s $100k per year tax is a great solution. It will mean H1-B is used just for very high value jobs, which will mean no lottery needed, and more certainty for those jobs.” 

Hastings’ view implies that by filtering visas through cost, only specialised roles will remain, reducing randomness and instability for applicants.

Both perspectives underline the tension between openness and restriction. While Dimon stressed America’s need to remain welcoming, Hastings suggested a leaner, more premium version of the program could offer long-term clarity.

The Tech Worker Reality

The reliance on H-1B workers is not abstract for those who have worked inside Silicon Valley. 

Zach Wilson, founder of DataExpert.io and a former Meta employee, recalled his team composition in a post on X.

“I was on a team of 17 people. 15 of the 17 were on H1b visas. I was one of two Americans on the team.” 

His remark on the new visa rules: “just for core growth data engineering, that’s $1.5m in visa fees.”

Wilson had a sharper observation about the shift in opportunity. For Americans aspiring to secure a prominent role in the tech industry, this is an opportune moment, as over 80% of their competitors have, in essence, disappeared overnight.

For him, restrictions might open doors for domestic graduates who had previously struggled to find space in the market.

Netizens on X, however, pushed back. Nitin Ahirwal, a full-stack developer, argued online that “Meta’s core growth data engineering wasn’t built on passports, it was built on talent. US schools are producing users of tech, not builders.” 

He warned that displaced engineers would not vanish, but instead re-establish themselves in Bangalore, Hyderabad, Toronto, or Singapore. 

The conversation drew in more voices. Another user pointed to the data shared by Wilson, and said that the reality was evident in the fact that 15 out of 17 engineers at Meta were on H1Bs. 

He highlighted that the current talent gap, which was evident in the past, persists because US schools are not producing enough qualified individuals. Eliminating visas would only highlight this existing deficit rather than resolving it.

A Fragile Balance

The latest H-1B debate has become less about paperwork and more about national positioning. 

For Dimon, the question is whether America can stay competitive while shutting its doors. For Hastings, the solution lies in transforming the programme into something more exclusive. And for workers like Wilson, the changes expose just how dependent big tech has been on immigrant skill.

What emerges is a fragile balance: reform may reshape opportunity, but disruption risks driving innovation elsewhere. The conversation about visas could be, in reality, a deliberation about where the future of technology will be built.

TikTok’s US Future Shaped by Trump, Powered by Oracle and Murdoch https://analyticsindiamag.com/global-tech/tiktoks-us-future-shaped-by-trump-powered-by-oracle-and-murdoch/ Tue, 23 Sep 2025 12:33:51 +0000

“Oracle will operate, retrain and continuously monitor the US algorithm to ensure content is free from improper manipulation or surveillance.”

The fate of TikTok in the United States has evolved into a theatre of politics, technology and business. At the centre of the latest chapter is US President Donald Trump, who announced progress on the app’s sale following a phone call with Chinese President Xi Jinping. 

With Oracle positioned as the guardian of TikTok’s algorithm and American investors, including the Murdoch family and Silicon Valley heavyweights, the deal is shaping up to be one of the most scrutinised transactions in recent times.

On Truth Social, Trump set the tone of optimism. “I just completed a very productive call with President Xi [Jinping] of China. We made progress on many important issues, including…the approval of the TikTok deal.”

“I also agreed with President Xi that we would meet at the APEC Summit in South Korea,” he added.

Oracle as the Algorithm Gatekeeper

The agreement is built around Oracle Corp. retraining and securing TikTok’s algorithm for American users, a Bloomberg report stated, citing a White House official.

“Oracle, the US security partner, will operate, retrain and continuously monitor the US algorithm to ensure content is free from improper manipulation or surveillance,” the official said.

Under the proposed plan, ByteDance would lease a copy of its algorithm, which would then be retrained “from the ground up”. 

With all US user data to be stored securely in an Oracle-managed cloud, Beijing-based ByteDance would lose access to both user data and the algorithm within the US market. While this setup might satisfy some security experts, certain aspects still require further clarification.

The future of TikTok in the US faces a dilemma: the US law mandates ByteDance’s complete divestiture, yet Chinese export regulations prohibit the transfer of sensitive software. 

Lawmakers, including US representative John Moolenaar, have raised concerns that any continued reliance on ByteDance technology would result in the “ongoing reliance by the new TikTok on a ByteDance algorithm”.

The White House is “100% confident” that the deal will go through, a senior White House official told Politico.

Ownership will tilt firmly in favour of the American side. Six of seven board seats would go to US representatives, with ByteDance restricted to one seat and excluded from the security committee. The new entity will be majority-owned by a consortium that includes Oracle, venture capital firm Andreessen Horowitz, and private equity firm Silver Lake Management. 

Oracle’s pre-existing Project Texas hosting deal ensures the company deepens its hold as TikTok’s infrastructure partner. Moreover, the company’s stock jumped ~6% with analysts estimating that the deal could generate $1-2 billion in additional revenue annually under Oracle’s oversight role, as per MarketWatch.

The Murdochs, Ellison and the Politics of Ownership

The personalities circling the deal have made headlines of their own. 

During an interview on Fox News, Trump implied the Murdochs’ involvement, stating, “I hate to tell you this, but a man named Lachlan is involved. Lachlan is…a very unusual name, Lachlan Murdoch. And Rupert is probably going to be in the group.”

As per the Bloomberg report, Trump also brought up Larry Ellison, Oracle’s chairman, and Michael Dell, chairman of Dell Technologies.

The White House’s official announcement named Oracle, Andreessen Horowitz and Silver Lake, but not the Murdochs or Michael Dell, who was also cited by Trump. A source told Bloomberg that Fox Corporation would be engaged in the consortium, rather than the Murdochs personally.

For Trump, the deal carries multiple meanings. The US President reportedly suggested that the US should receive a “fee plus” as part of the deal. His administration intends to extend ByteDance’s liquidation deadline by 120 days to allow more time for negotiations to conclude.

The Chinese government’s own language has been measured. Its foreign ministry noted that the Chinese government respects the company’s wishes and hopes that productive commercial negotiations, conducted in accordance with market rules, will lead to a solution. They hope that the solution will comply with China’s laws and regulations while considering the interests of both parties.

TikTok’s future in the US now lies at the intersection of campaign politics, national security concerns and billion-dollar corporate stakes. 

Oracle is poised to become both a watchdog and an enabler of the platform’s algorithm, while investors from Silicon Valley and beyond line up for influence. 

Whether the arrangement satisfies both Washington and Beijing’s regulators is the next test, one that could decide whether TikTok remains a cultural mainstay or a casualty of geopolitical brinkmanship.

Amid H-1B Row, Can Remote Work Save Big Tech? https://analyticsindiamag.com/global-tech/the-100k-h-1b-gamble-for-big-tech/ Tue, 23 Sep 2025 10:34:30 +0000

Firms may either go the remote-work way or simply pay up when they really want an employee in the US.

While US President Donald Trump and commerce secretary Howard Lutnick initially announced that the H-1B visa fee would be revised to $100,000 annually, it was later clarified that this was a one-time fee, and wouldn’t affect existing visa holders or renewals. 

It likely brought relief for companies like Amazon, which had reportedly instructed employees to return to the US immediately, or not to leave the country, to avoid any trouble.

Even as a one-time payment, the $100,000 fee massively changes the price tag of an H-1B visa. So, how are big tech companies going to deal with this? 

The primary option is to relocate work and projects to offshore offices or GCCs if they struggle to pay the new fees. India is a leading hub for GCCs and hosts most of the big tech companies from the United States. 

But how does this idea of remote work look in practice? After all, it costs nothing to obtain what is jokingly called a “TCP/IP visa” for an employee. 

How Does Remote Work Sound?

“There is always a preference for working in the same time zone,” said Sriram Subramanian, founder at CloudDon, a market research firm. “Companies still feel that, under in-person interactions, they can accomplish more,” he told AIM, adding that “collaborations, relationships, and camaraderie are invaluable aspects.” 

Alongside difficulties arising in the work environment, remote work may also require additional investment. Neeti Sharma, CEO of TeamLease Digital, a staffing and recruiting firm, told AIM that “managing people across multiple time zones needs another level of people managers, adding to the overall costs.” 

Additionally, employees working outside the US may lack access to essential infrastructure, labs, or specialised hardware, which can limit their performance. 

Therefore, big tech companies shifting projects previously staffed by H-1B workers to GCCs will inevitably face trade-offs associated with remote work. 

But what about startups and small-scale companies, which won’t be able to afford both the $100,000 visa fee and a physical presence in a country like India?

Hiring employees on a contractual basis in India is an option, but that presents its own challenges. 

Sharma said that these companies will face issues stemming from contractors having to manage their own income tax and the lack of standard employment benefits, such as social security and insurance. This renders these employment contracts less appealing. 

“Payments add another layer of complexity with cross-border transfers, forex charges, and compliance paperwork. On top of that, contractors often see these roles as short-term since they lack stability and career growth,” she added. 

Several voices in the industry agree that small-scale companies and startups may be the worst affected by this new ruling. 

Steady Growth of H-1B Visas

The H-1B visa is capped at 85,000 new slots annually, allocated through a lottery system, and granted for three years with an option to extend it up to six.

However, the Trump administration also suggested modifications to the allocation process, giving applicants with higher wages and better positions a greater chance of winning the lottery. 

The revised system allows applicants earning at Level IV wages to submit four entries to the lottery, followed by Level III wages with three entries, Level II wages with two entries and Level I wages with one entry.

Notably, Level I (~17th percentile) wages pertain to entry-level positions, Level II (~34th percentile) to qualified workers, Level III (~50th percentile) to experienced workers, and Level IV (~67th percentile) to fully competent workers for the specific role at the relevant location.
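
A minimal sketch of how such a weighted draw could work appears below; the entry weights mirror the proposal above, while the applicants, cap, and selection mechanics are invented for illustration and do not reflect official USCIS procedure.

```python
# Minimal sketch of the proposed weighted selection: each applicant gets
# 1-4 lottery entries depending on wage level. All specifics here are
# illustrative, not official USCIS mechanics.
import random

ENTRIES = {"I": 1, "II": 2, "III": 3, "IV": 4}

applicants = [("app-0001", "I"), ("app-0002", "IV"), ("app-0003", "II")]

# Higher wage levels appear more often in the pool, so they are more
# likely to be drawn before the cap is reached.
pool = [aid for aid, level in applicants for _ in range(ENTRIES[level])]
random.shuffle(pool)

selected, cap = [], 2  # cap stands in for the 85,000 annual slots
for aid in pool:
    if aid not in selected:
        selected.append(aid)
    if len(selected) == cap:
        break
print(selected)
```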

The administration states that this would encourage employers to offer higher wages to H-1B workers and “disincentivise the existing widespread use of the H-1B program to fill lower paid or lower skilled positions”.

Amazon led all employers with 10,044 H-1B visa approvals in FY 2025, followed by Tata Consultancy Services with 5,505 and Microsoft with 5,189. Meta obtained 5,123 approvals, while Apple secured 4,202. 

Note that these numbers include both new H-1B petitions and renewals/extensions of previously approved ones.

Over the years, albeit with mixed trends among a few companies, most have consistently secured new H-1B visas every year, with no signs of significant decline. 

Barney Hussey-Yeo, founder of Cleo, said in a post on X: “Spoke with a senior big tech exec [executive] about the H1-B changes today. They’re totally unaffected.”

“They’ll just hire the same people through their international offices for less, or pay the $100k when they really need someone in the US,” he added.

Is $100,000 Negligible for Big Tech?

George Fishman, a senior legal fellow at the Center for Immigration Studies, argued that H-1B workers are paid less than locals for the same role. Big tech firms, in his opinion, will not hesitate to pay H-1B workers’ salaries alongside the new fee. 

Citing a report from the US Citizenship and Immigration Services (USCIS), he wrote in a blog post that in 2023, the average annual salary that employers promised to their petitioned-for H-1B ‘computer-related occupations’ (CRO) workers for initial employment was around $99,000.

He estimated that if an employer had to pay the new one-time fee for an H-1B worker, the average H-1B salary would only rise to $115,667. 

He also stated that the Bureau of Labor Statistics (BLS) data show that software developers at the 75th percentile earned $167,540 and at the 90th percentile $208,620 in 2023. 

Under a one-time $100,000 fee, the average H-1B CRO worker would still earn $51,873 less than the 75th percentile and $92,953 less than the 90th percentile. 
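
These figures line up if the one-time $100,000 fee is amortised over the H-1B’s maximum six-year term, an assumption Fishman’s arithmetic appears to make; the quick check below reproduces his numbers under it.

```python
# Reproducing the arithmetic behind Fishman's estimate, assuming the
# one-time $100,000 fee is spread over the visa's six-year maximum.
avg_salary = 99_000
fee_per_year = 100_000 / 6                 # ~= 16,667 per year
effective_cost = round(avg_salary + fee_per_year)
print(effective_cost)                      # 115,667

# Gap to BLS software-developer percentiles (2023)
print(167_540 - effective_cost)            # 51,873 (75th percentile)
print(208_620 - effective_cost)            # 92,953 (90th percentile)
```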

“These higher fees would not be unreasonable amounts for employers to pay for those H-1B workers who truly are ‘the best and the brightest,’” reasoned Fishman.

Several industry experts claim a minimal impact on a company’s expenditure if the new structure is implemented. 

Deedy Das, partner at Menlo Ventures, estimates that the new H-1B visa costs would represent just 0.4 to 0.7% of total research and development expenditure across major technology companies, assuming new petitions comprise 35% of all applications.

Current data suggests this assumption may be generous. 

Amazon, which secured the most H-1B approvals in fiscal year 2025 with 10,044 total petitions, filed ~1,700 new applications, less than 20% of the total. Maintaining similar petition levels going forward would cost Amazon roughly 0.4% of its fiscal 2024 marketing budget.

Yet, the significance of these sums remains debatable. The same technology giants that might absorb H-1B fees without difficulty have recently conducted mass layoffs affecting thousands of employees, generating savings of comparable magnitude. 

Meanwhile, companies demonstrate a willingness to pay premium rates for exceptional talent. Meta was widely reported to have offered over $100 million to secure top AI researchers and engineers from competitors. 

But no major technology company has issued public statements regarding the proposed changes, leaving critical questions unanswered. 

Microsoft, Meta and Amazon had not responded to AIM’s queries on this issue at the time of publication.

Note: The headline has been updated for clarity.

OpenTelemetry Is Ageing Like Fine Wine https://analyticsindiamag.com/global-tech/opentelemetry-is-ageing-like-fine-wine/ Tue, 23 Sep 2025 04:28:58 +0000

Enterprises and AI frameworks are embracing OpenTelemetry to standardise data, cut integration costs, and build trust in AI systems.

With roots in the OpenTracing project launched in 2016, OpenTelemetry set out to end the observability nightmare. Enterprises were drowning in custom integrations, where each data source demanded its own connector for metrics, traces, and logs. 

Today, in an AI-first world, OpenTelemetry continues to affirm its original purpose: to provide enterprises with an open, standardised, and vendor-neutral framework for collecting machine data from numerous sources.

For example, Splunk, the leading contributor to the OpenTelemetry project, incorporates the framework extensively into its newest AI observability solutions. Its AI Agent Monitoring feature, in its cloud platform, tracks LLM-based applications through performance, cost, and behaviour metrics – all built on OpenTelemetry’s vendor-neutral foundation.

Morgan McClean, the director of product management at Splunk and the co-founder of OpenTelemetry, told AIM that OpenTelemetry is the only agent mechanism being used in Splunk’s Observability Cloud platform. 

Splunk originally used the Smart Agent from SignalFx (a company it acquired a few years ago), but in 2022 switched entirely to OpenTelemetry as a standalone mechanism.

“It was a big strategic bet we made at the time, but it has paid off,” said McClean, adding, “we also switched all of the Kubernetes log instrumentation to the Splunk Platform. That all now uses OpenTelemetry.”

Even Splunk’s recently announced Database Monitoring solution follows the same pattern. The tool provides query-level insights such as wait times, CPU usage, memory consumption, and execution plans, via OpenTelemetry instrumentation. 

The trend extends across major cloud providers. Companies like IBM, AWS, Dynatrace, Insightfinder and others provide OpenTelemetry-based AI observability solutions. Recently, AWS also launched a Generative AI Observability Preview, where, using the AWS Distro for OpenTelemetry (ADOT) SDK, AI agents and frameworks can be auto-instrumented. 

The recently announced Gemini CLI GitHub Actions from Google is integrated with OpenTelemetry, allowing users to stream logs and metrics to platforms such as Google Cloud Monitoring.

But platform adoption is only part of the story. The more important change is that AI frameworks now themselves generate OpenTelemetry-compliant data natively. 

AI Frameworks Embrace Native Telemetry

This closes the loop. Frameworks produce standardised data using OpenLLMetry, and platforms like Splunk or AWS ingest and analyse it without custom adapters.

“One of the other things happening in OpenTelemetry due to AI is just a lot more semantic standardisation around AI. The most visible part of this is OpenLLMetry,” said McClean. 

The OpenTelemetry community has recently established semantic conventions specifically for AI workloads, known as OpenLLMetry. These conventions standardise how telemetry for aspects such as model inputs/outputs, token usage, response metadata, and others are recorded. 

OpenTelemetry normalises these signals, enabling consistent comparison and analysis of AI workloads, regardless of whether the underlying model is served via OpenAI, Hugging Face, Anthropic, or a custom deployment. 

To support this further, instrumentation libraries are being developed and released. For example, there are ongoing efforts to provide plug-in instrumentation for the OpenAI Python API and other similar software development kits (SDKs). These libraries automatically capture telemetry data like prompts, responses, and token counts, without manual instrumentation from developers. 
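
To make that concrete, here is a minimal sketch of recording one LLM call as a span with the standard OpenTelemetry Python SDK. The gen_ai.* attribute keys follow the still-evolving GenAI semantic conventions and may differ between spec versions; the model call itself is elided and the token counts are placeholders.

```python
# Minimal sketch: recording an LLM call as an OpenTelemetry span carrying
# GenAI semantic-convention attributes. The gen_ai.* keys follow the
# evolving conventions and may vary across spec versions.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("demo.llm")

with tracer.start_as_current_span("chat gpt-4o") as span:
    span.set_attribute("gen_ai.system", "openai")
    span.set_attribute("gen_ai.request.model", "gpt-4o")
    # ... the actual model call would happen here ...
    span.set_attribute("gen_ai.usage.input_tokens", 42)    # placeholder count
    span.set_attribute("gen_ai.usage.output_tokens", 128)  # placeholder count
```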

And these semantics are what make it possible for AI frameworks to support OpenTelemetry data out of the box. As McClean notes, this means the community no longer has to build one-off integrations reactively as each new framework emerges. 

Instead, frameworks like CrewAI, LangGraph, and PydanticAI emit OpenTelemetry data natively, aligning with OpenLLMetry semantics from the start. 

This makes it easy for teams who are using these frameworks in their AI/agentic applications to export telemetry data to platforms that offer observability features. 

“This further validates the vision and strategy that everyone in the OpenTelemetry community has had,” said McClean, as he discussed the motivation with which the project began. 

The Origins

Organisations needed to capture data from applications written in different languages, running on various operating systems, and using countless frameworks and libraries. 

The challenge was both technical and economic. “Each individual integration is pretty straightforward, but there are so many of them. And even if you go and build them all, you need to maintain them perpetually,” he said. 

Traditional platform monitoring vendors faced the same constraints. Even with teams of 40-100 engineers dedicated to building monitoring software and integrations, they could only support a limited set of languages. 

“They would support Java and .NET, but if you went in and said I have an application written in Go, they’d say ‘Well, we can’t help you’,” added McClean. 

OpenTelemetry solved this problem by creating a standardised, open-source framework that could be used to instrument, generate, collect, and export telemetry data. Companies like Google, Microsoft, Splunk and others came together to create, maintain and update this standard. 

In 2019, the framework was integrated into the Cloud Native Computing Foundation (CNCF), home to the popular Kubernetes framework today. OpenTelemetry is the second most popular project in CNCF, just behind Kubernetes. 

What highlights the importance of OpenTelemetry in the AI era, however, is what happens after it does its job of providing observability to teams. 

Observability Is AI’s Trust Foundation

Jeetu Patel, president and chief product officer at Cisco, explained why observability is the need of the hour. 

“One of the big constraints of AI is a trust deficit. And if people don’t trust the system, they’re not going to use it. You can’t really trust something that you don’t see,” Patel told AIM, adding that visibility is essential across the stack, from the silicon to the agent. 

“And you know how the GPU is performing, you know how the models are performing, you know the data going into the models, you know what the tokenomics look like,” Patel explained.

All of those things put together are what allow organisations to start reimagining and automating workflows, he concluded.

OpenTelemetry plays a key role in integrating data. As tools and platforms enhance their observability features, it’s still essential for companies to input the correct data into these systems to achieve meaningful insights. Poor data quality obscures outage root causes and system bottlenecks.

Although AI platforms and products largely meet their advertised capabilities, occasional downtime remains a challenge. In recent months, there have been several instances where ChatGPT experienced downtime, and situations where Google Cloud faced an outage, causing disruptions to dependent platforms. 

These reliability challenges fuel continued innovation in observability. As platforms work toward zero downtime, observability companies find endless opportunities to build more sophisticated monitoring solutions.

How Confluent Helped Notion Scale AI and Productivity for 100M+ Users https://analyticsindiamag.com/global-tech/how-confluent-helped-notion-scale-ai-and-productivity-for-100m-users/ Mon, 22 Sep 2025 09:08:35 +0000

Notion, as a top productivity tool, utilised Confluent’s solutions to handle its operations efficiently.

Notion has become the productivity companion for more than 100 million people worldwide. But maintaining that scale needed more than sleek design and collaborative features. It needed a real-time backbone. 

In an interaction with AIM, Rohit Vyas, director of solutions engineering and customer success for South Asia at Confluent, explained how the company powered Notion’s scale, productivity gains, and AI ambitions. 

The official Notion-Confluent case study echoes this transformation, noting how Confluent’s data streaming platform enabled Notion to triple its productivity while overcoming scaling limits. What the case study summarises in numbers, Vyas unpacked in detail, describing a story of event-driven architecture, lean engineering teams, and real-time AI.

The SaaS Match: Notion and Confluent

Vyas said Notion was “born in the digital ecosystem” and already ran as a SaaS-first company. Its architecture had to handle millions of daily events, from task creation to collaborative edits.

“All of these events that get triggered, they are all managed and funnelled and distributed and collected over Confluent’s data streaming platform,” Vyas said.

Confluent offered Notion a fully managed SaaS service running on AWS. This gave Notion the “affordable, flexible, highly scalable and capable” backbone it needed, Vyas said.

The official blog post added that Confluent’s managed Kafka solution integrated with AWS, allowing Notion to stream events into Amazon S3, Snowflake, and PostgreSQL without managing infrastructure. This freed the lean engineering team to focus on the product itself.
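
As a rough sketch of the producer side of such a pipeline, the snippet below publishes a single event with the confluent-kafka Python client; the broker address, topic name, and event shape are placeholders rather than Notion’s actual configuration.

```python
# Minimal sketch of event production into a managed Kafka topic with the
# confluent-kafka Python client. Broker, topic, and event fields are
# placeholders, not Notion's real configuration.
import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "<broker-host>:9092"})

event = {"type": "page.edited", "workspace": "w-123", "ts": 1695370000}

def delivery_report(err, msg):
    # Invoked from poll()/flush(); surfaces per-message delivery failures.
    if err is not None:
        print(f"delivery failed: {err}")

producer.produce(
    "workspace-events",              # hypothetical topic name
    key=event["workspace"],
    value=json.dumps(event),
    on_delivery=delivery_report,
)
producer.flush()  # block until outstanding events are delivered
```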

Powering Notion AI with Real-Time Streams

Notion’s AI features, centred on “the find” and “the do,” require more than static data lakes.

“There is no AI without data, and there is no trustworthy AI without trustworthy data,” Vyas said. Confluent became the real-time supply chain for Notion’s AI workloads.

That meant streaming events as they happened, enabling retrieval-augmented generation for precise search, and feeding large language models with fresh, contextual data.

Vyas explained that Confluent Flink, the platform’s stream processing engine, played a central role. It transforms, aggregates, and routes real-time streams for Notion’s AI features.

He compared it to “tapping into a flowing pipe to modify, combine, or divert the water.” This allows real-time functions, such as helping a user find relevant information as they type.

The blog post noted that Confluent’s backbone ensures changes in Notion are instantly reflected in its vector database, keeping search and generation tools up to date.

Cost Optimisation and Reliability

Scaling is not only about performance but also about cost. Vyas explained that the total cost of ownership for data platforms comes from infrastructure, human resources, downtime risks, and reputational costs.

“If you look at these four drivers of total cost of ownership, on every [factor], Confluent comes out as extremely capable and robust,” he said.

Reliability was also key. Confluent provides 99.99% uptime annually, which is less than one hour of downtime per year.

Vyas stressed that such guarantees let Notion’s engineers stay focused on product experiences rather than firefighting outages. The Notion case study also highlighted this leap, noting how Confluent helped save time and triple productivity.

Privacy and Security at Scale

Handling sensitive user data across millions of workspaces makes privacy and security critical.

Vyas said Confluent works within cloud safeguards such as encryption at rest and in motion, firewalling, and identity management, while adding its own controls.

“We also expose a very strong capability called client-side field-level encryption, which basically encrypts the data payload completely to an extent that even Confluent can’t see it,” he explained.

With role-based access, strong authentication, and compliance with regional data residency rules, Notion ensures customer data is protected throughout its journey.

This layered security posture, combining AWS infrastructure, Confluent’s features, and Notion’s governance, allows it to deliver AI-driven features while meeting global privacy expectations.

The Bigger Picture

For Notion, the move to Confluent was more than infrastructure modernisation. It created a cultural shift across teams.

“If you, as Confluent, become the central data platform, then the transactional teams, operational teams, analytical teams, cybersecurity teams, event and logging teams and GenAI teams all play a part of that,” Vyas said.

In effect, Confluent helped Notion not only support 100 million users but also build a centralised, secure, AI-ready data foundation. This enabled faster innovation, easier data access across teams, and a sharper focus on the future of connected work.

As Vyas summed up, Notion could focus its resources on improving the platform once freed from managing complexities.

Traditional ROI Models Don’t Work for AI https://analyticsindiamag.com/global-tech/traditional-roi-models-dont-work-for-ai/ Fri, 19 Sep 2025 07:15:01 +0000

Genpact’s Vikas Behrani believes AI’s true value lies beyond cost savings—spanning invisible returns, hidden costs, and a shift from short-term gains to long-term impact.

For decades, businesses measured technology success through traditional return on investment (ROI) models. These worked when costs and outputs were predictable. In the age of AI, where models evolve and markets shift constantly, that approach falls short.

At Cypher 2025, India’s biggest AI conference organised by AIM in Bengaluru, Vikas Behrani, vice president of AI practice and innovation lab at Genpact, urged organisations to rethink ROI. He said AI’s value is fluid, intangible, and always changing. “You have to keep looking at it, innovate, and at the same time recalculate what AI value will mean in the next few years.”

Beyond the Obvious Gains

Behrani noted that AI returns extend beyond cost savings, automation, or revenue growth. The bigger impact lies in what he called invisible ROI—customer delight, brand reputation, talent attraction, agility, and knowledge retention. These are hard to measure but build long-term advantage.

He also flagged hidden costs. “More often than not, invisible costs are two to three times your direct costs,” he said. Legal reviews, compliance checks, employee training, and integration with legacy systems quietly drain budgets.

Treat ROI as a Journey

Behrani stressed that ROI should be seen as a continuous process. 

In the pre-implementation stage, businesses must define the problem, set success metrics, ensure data quality, and test pilots. During rollout, they should train employees, work with domain experts, monitor performance, and plan for maintenance. Post-implementation, AI should be managed like a portfolio investment, with expectations reset and ROI reassessed regularly.

From Cost to Value

For Behrani, the shift is about mindset. “You have to move from cost reduction to value creation. From short-term gains to long-term impact. From single solutions to portfolio solutions.”

He concluded with a clear message. “The bottom line is not just a cost case anymore, but a strategic value equation. Those who master AI ROI measurement will define the future of value in this AI era.”

Why Decision Making Remains the Missing Piece in AI Adoption https://analyticsindiamag.com/global-tech/why-decision-making-remains-the-missing-piece-in-ai-adoption/ Thu, 18 Sep 2025 05:44:22 +0000

IA can amplify agentic AI and transform workflows in organisations, says Soumendra Mohanty of Tredence.

Soumendra Mohanty, chief strategy officer at Tredence, wants the industry to rethink AI. Human instinct and cognition, when combined with AI, form the foundation for decision-making and innovation, he said, terming this approach IA, or individual dexterity.

Mohanty was addressing the audience at Cypher 2025, India’s largest AI conference organised by AIM from September 17-19.

Drawing a parallel between human evolution and the rise of AI, he said that instinctive, fast responses need to coexist with deliberate, strategic thinking.

While AI tools are solving narrow problems, Mohanty said strategic decision-making remains a challenge. Citing MIT research, he noted that about 95% of GenAI pilots fail to reach production. Managing paradoxes, bounded rationality, and human cognition are key to strategic decision making, he said.

He explained that people struggle when faced with multiple options involving value trade-offs. At the same time, they face information overload from the rapid pace of AI innovation. Balancing instinctive “fast thinking” with deliberate “slow thinking,” described by Daniel Kahneman as System 1 and System 2, is critical, Mohanty said.

As organisations test autonomous systems, he questioned the role of humans in the workflow. Should humans remain in the loop, above it as observers, or outside it entirely, setting rules and policies? Developing clarity on this balance, he said, is vital for both efficiency and accountability.

Mohanty argued that many AI pilots fail because they are designed around patterns, but ignore anti-patterns. “In software engineering, we also think about anti-patterns. But in the case of AI and agentic AI, especially LLMs at the core of things, we actually don’t understand how it does,” he said.

To bridge this gap, he proposed Agentic Operating Systems—engines that orchestrate scale, performance and governance with built-in guardrails. Governance, he said, cannot be an afterthought. It must be visible, calibrated, and embedded in the system.

He also introduced the Ambidex Framework, a blueprint for balancing AI and IA. It is designed to assess workflows, mitigate anti-patterns, build skills for agentic systems, integrate mechanisms, and design for ambiguity. “We live in a world full of uncertainties. Ambidex is about creating systems that can adapt to ambiguity, just like humans do,” he said.

Closing his session, Mohanty unveiled Milky Way, a workflow system developed by Tredence aimed at analysis-oriented workflows, which he called one of the hardest problems in AI.

Mohanty concluded that the future of AI is not just about smarter machines but about amplifying human dexterity. “AI will only reach its true potential when it works in harmony with IA, which is you and me,” he said.

Quantum Computing Could Make GPUs Obsolete, Says NxtGen CEO https://analyticsindiamag.com/global-tech/quantum-computing-could-make-gpus-obsolete-says-nxtgen-ceo/ Wed, 17 Sep 2025 11:36:19 +0000

“Over the last 10 years, big tech companies have doubled their operating profit, from 34% to 63%.”

Artificial intelligence today is influenced as much by capital and geopolitics as by technology. Control of GPUs has become a strategic lever for countries to maintain their sovereignty. 

At Cypher 2025, India’s largest AI conference organised by AIM, AS Rajgopal, founder and CEO of NxtGen, built on this theme in his keynote. He said only a handful of countries can build frontier models, while almost 95% of GPUs come from companies such as NVIDIA.

“Over the last 10 years, big tech companies have doubled their operating profit, from 34% to 63%,” Rajgopal said. He pointed out that nearly ₹3 lakh crore is being spent this year alone on compute infrastructure.

Between AGI Dreams and Quantum Horizons

Rajgopal cautioned against conflating generative AI with artificial general intelligence. “GenAI is token prediction. It’s statistical pattern matching. Please don’t mistake it for creativity,” he said. “AGI, if it comes, is much farther away because humans themselves are still trying to understand consciousness.”

He described quantum computing as a force that could make GPUs redundant. He recalled visiting the IBM Watson Research Centre in New York to explore IBM’s System Two quantum computer. “Quantum is right on the horizon, and it will happen in my lifetime,” Rajgopal said.

IBM has since announced plans for the world’s first large-scale, fault-tolerant quantum system, the IBM Quantum Starling, to be developed at its new quantum data centre in Poughkeepsie, New York, by 2029.

India’s Edge Lies in Talent and Use Cases

Despite constraints in infrastructure and funding, Rajgopal sees India’s advantage in its talent base and adoption of open source. “India doesn’t have the kind of AI talent like Singapore proportionally, but in terms of AI awareness and usage, we are way ahead. This awareness will drive our leadership position,” he said.

NxtGen is building a cooperative platform called M, combining open-source models, GPU infrastructure, and India’s growing talent pool. Rajgopal stressed that the aim is not to make another large model, but to deliver real-world use cases that improve productivity and quality of life.

He invited Anil Porter, NxtGen’s partner, to share how ITQ is modernising the travel industry. Porter said legacy systems remain a bottleneck, but ITQ solved a 50-year-old problem around airline fare rules through work with Air India and other carriers. “Travel is intrinsic to human behaviour, and with 70% of India’s population under 30, these young consumers are real,” Porter said.

He added that agentic, hybrid AI platforms could reshape travel experiences within a single quarter, continuously adapting to consumer preferences.

A Balanced Path for India

Rajgopal argued that India’s position lies in blending global innovations with local strengths. “Perhaps India is evolving to be one where we will leverage what’s happening across the globe, and we will also leverage our own innovations, and then bring it together so that we do something pretty balanced for the world to understand,” he said.

He also warned that regions without infrastructure might become “neo-colonies” in the new AI order. Yet he praised the government’s seven-pillar India AI mission, spanning GPUs, startup financing, and data access, for creating a niche for the country.

Rajgopal underlined the need for tens of thousands of GPUs to scale fully but noted that open-source models are helping level the field. “Open-source models are now very, very good and can do stuff that is very comparable to frontier models from closed-source big tech companies,” he said.

He concluded with a reminder that the task ahead is massive, but India must play smart.

Oracle Wants AI Agents to Turn HR Hassle into HR Hustle https://analyticsindiamag.com/global-tech/oracle-wants-agents-ai-to-turn-hr-hassle-into-hr-hustle/ Tue, 16 Sep 2025 12:23:17 +0000

The new agents can assist employees in exploring internal job opportunities and assessing job fit while providing recommendations for career growth.

Oracle has launched new AI agents within its Fusion Cloud Applications to help HR leaders automate workflows, manage talent and improve employee performance across organisations. 

Embedded within Oracle Fusion Cloud Human Capital Management (HCM), the new AI agents help streamline HR processes from hiring to retirement, enabling smarter decisions and greater efficiency for HR teams, managers, recruiters and employees.

Embedded AI Agents

In an interview with AIM, Yvette Cameron, senior vice president of global HCM product strategy at Oracle, shared insights into the company’s new AI Agents and the significant impact they are set to deliver.

“These agents are not just answering questions,” Cameron explained. Unlike traditional RAG-based systems that only answer questions, Oracle’s AI agents can automate workflows, deliver real-time insights and even take action on behalf of users to reduce friction in everyday tasks.

What sets Oracle apart from competitors like Microsoft or Salesforce is that these agents are built into Oracle Cloud Infrastructure at no additional cost, which ensures strong security, data privacy and high performance, while keeping deployment quick and cost-effective.

“Getting started is low cost, fast, and customers don’t have to worry about integration and data risk. It’s all delivered on their trusted, unified Fusion applications,” Cameron said.

The new AI capabilities include agents that assist employees in exploring internal job opportunities and assessing job fit while providing recommendations for career growth. Recruiters can automate interview scheduling, coordinate calendars and manage reminders. 

For career development, the agents summarise team performance updates, help managers set and track goals, support skill development and guide promotion and talent planning decisions.

Oracle’s AI agents also provide support for core HR functions, such as responding to employee questions about compensation, benefits, leave and payroll, while helping managers make staffing and budget decisions. Moreover, the suite of agents extends to employee lifecycle, succession planning and payroll administration, offering support on company policies, succession coverage and payroll irregularities.

Early Adoption and Measurable Impact

Since the launch of Oracle’s AI Agent Studio in March, adoption among customers has been swift, particularly for tasks that alleviate repetitive or time-consuming work. Cameron said that when Oracle first introduced AI agents in 2024, employees primarily used them for goal writing and performance reviews. Notably, these continue to be the most widely adopted use cases even today.

She revealed that companies using these AI capabilities have seen up to 70% improvement in performance review completion and a 27–54% reduction in time-to-hire, boosting employee satisfaction and productivity.

With India emerging as a hub for talent, Oracle sees enormous potential for its AI-driven HCM solutions in the region. 

“To scale the growth of skills and individuals, you need to understand the skills that employees have. Our agents guide individuals on career paths, training opportunities and dynamic skill assessments,” Cameron said. 

The broader vision, she explained, is not just about AI readiness but also change readiness. “AI is going to change the way we work. Organisations need to embrace new ways of working and new interactions to achieve results that weren’t previously possible,” she added.

Cameron concluded by highlighting the strategic role of people in Oracle’s growth. “Every business that Oracle is engaged in, whether it’s cloud infrastructure, financial management or supply chain, is operated by people. Our approach is about removing friction and enabling employees to focus on what truly matters,” she said.

With over 100 AI capabilities already embedded within Oracle Cloud HCM and more rolling out quarterly, the company is positioning itself as a leader in transforming how work is done, making AI an integral and trusted part of everyday workflows.

Canva’s Fight for Relevance in the Age of Google Nano Banana https://analyticsindiamag.com/global-tech/canvas-fight-for-relevance-in-the-age-of-google-nano-banana/ Mon, 15 Sep 2025 12:26:26 +0000 https://analyticsindiamag.com/?p=10177665

The race is no longer about who builds the smartest AI, but who delivers the most useful one. 


Google’s new image generation and editing model, Nano Banana, has taken the internet by storm. Officially known as Gemini 2.5 Flash Image, the AI model has quickly gained millions of users, processing millions of image edits in just a few days. 

Creators across India are using Nano Banana to transform selfies, animals and famous personalities into miniature 3D figurines. Visuals are detailed, stylised and made to resemble collectables one could display on a shelf.

One of Nano Banana’s standout features is its ability to interpret natural language prompts. Users can simply type commands like “turn this photo into a 3D figurine” or “place me in a medieval landscape”, and the AI executes the edits with remarkable precision. 

But Google isn’t the only player in this space. Chinese company ShengShu Technology has also launched a new Reference-to-Image feature for its flagship generative AI platform Vidu, which allows users to create true-to-life images from up to seven reference photos. Similarly, ByteDance, the parent company of TikTok, has unveiled Seedream 4.0, its newest AI image generation tool. The company claims that the tool outperforms Google DeepMind’s viral Nano Banana editor on several major benchmarks.

The success of Nano Banana and similar models signals the beginning of a new era in image editing, one that is pushing players like Canva and Adobe to change their strategy. In an exclusive interview with AIM, Danny Wu, head of AI products at Canva, said that he sees Canva as both a design and a visual communications platform.

Rethinking Design in the AI Age

New additions like Canva Sheets and Canva Docs expand the platform beyond traditional design. Citing the former, Wu explained that he doesn’t see it as a traditional spreadsheet, but rather as a record or database layer to help users track assets while working on marketing campaigns.

Meanwhile, tools like Magic Switch allow users to instantly convert designs across multiple formats, from TikTok videos to pitch decks.

Wu said that Canva follows a three-pronged AI strategy, which includes developing proprietary models for core design tasks, integrating best-in-class models from partners like OpenAI, Google and Anthropic, and supporting third-party apps through its ecosystem. Each model is chosen based on quality, latency, cost and suitability for specific tasks.

For instance, Canva collaborated with Google Cloud to integrate the generative video model Veo 3, which Wu described as a significant leap in quality for AI videos and audio. He added that use cases range from storyboarding and inspiration to creating b-roll for presentations. 

Google DeepMind CEO Demis Hassabis recently said the future of creative tools will increasingly focus on interacting with them conversationally, allowing users to iterate naturally and refine outputs.

Meanwhile, Josh Woodward, VP of Google Labs and Gemini, wrote on X that the Gemini app has added 13 million new users in the last four days, bringing the total to over 23 million. Moreover, the app has surpassed 500 million images, with 300 million new ones added during this period. 

What is Even Real Anymore?

Hayao Miyazaki, the legendary Japanese animator and co-founder of Studio Ghibli, has been a vocal critic of AI-generated art. In a widely circulated video from 2016, Miyazaki saw a demo of an AI-generated animation project and described it as “an insult to life itself.” 

Wu acknowledged the ongoing debates about whether AI-generated content counts as art. “Personally, I think AI art is art,” he said. “One definition I love is that art is transforming your hallucinations into tangible form. That’s true whether you’re painting, taking a photo or using AI.”

To address concerns, Canva launched Canva Shield, which includes a $200 million creator fund and gives users control over how their data is used for training. “We can’t be hand-waving around the impacts of AI,” Wu stressed. “We have to think about the right ethical and safe way to approach this technology.”

India, a Key Market 

India is a market that embraces visuals with great enthusiasm. After ChatGPT introduced its image generation feature with the popular Ghibli-style filter, users worldwide created over 700 million images in just one week. India emerged as the fastest-growing market during this surge, showing an especially eager response to visual creativity.

The same is true for Canva. India has become the platform’s fourth-largest market, with more than 660 million designs created in the past year. Tools like background removal and Magic Resize features are especially popular among Indian users.

Wu said that Canva’s approach is to be truly local. Beyond app translation, this involves sourcing stock photography and templates from local creators and ensuring that the content is culturally relevant. “We really want to make Canva not just localised, but local for every single country,” he said.

The race is no longer about who builds the smartest AI, but who delivers the most useful one. Canva’s relevance will rest on embedding intelligence into workflows so naturally that it feels invisible.

OpenAI’s Stargate Could Find Its First Indian Home in Aamchi Mumbai https://analyticsindiamag.com/global-tech/openais-stargate-could-find-its-first-indian-home-in-aamchi-mumbai/ Thu, 11 Sep 2025 13:23:27 +0000 https://analyticsindiamag.com/?p=10177505

City’s stability of power supply, private distribution, and position as the country’s financial hub give it an edge.


Yotta, one of India’s leading data centre companies, is working with NVIDIA to deploy Blackwell GPUs. Co-founder and CEO Sunil Gupta said that the company has ordered 8,000 GPUs, expected to go live by December–January. Given the global shortage and sky-high demand for Blackwell-family GPUs (NVIDIA’s post-Hopper architecture), this scale of procurement is significant.  

During an exclusive interaction with AIM, when asked about OpenAI’s search for partners to set up a data centre in India, Gupta said Yotta is open to collaboration. 

He emphasised that India’s push for digital sovereignty should not be mistaken for isolationism. “Sovereign for sure never means we as a country start looking inwards and close ourselves to the rest of the world,” he said.

Instead, government initiatives like Make in India are designed to attract investment and build local infrastructure, while keeping global players engaged.

India’s Digital Personal Data Protection Act (2023), along with draft operational rules and sectoral localisation mandates, will force or strongly incentivise many firms to keep data and compute local. This is expected to create predictable demand for domestic AI infrastructure.

If OpenAI were to invest directly in India, Gupta said it could drive economic development, job creation, lower latency, and a better user experience for Indian users.

Typical round-trip latency from India to US cloud regions is 200–350 ms, depending on telecom providers and exact destinations. By comparison, local cloud hosting in India typically delivers 20–50 ms latency. This means hosting inference or cloud services domestically can cut round-trip times by roughly 150–330 ms versus US-based hosting.
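
A quick calculation from those figures shows where that range comes from; the sketch below simply subtracts the best and worst cases.

```python
# Round-trip savings implied by the latency figures above:
us_rtt_ms = (200, 350)   # India -> US cloud regions
local_rtt_ms = (20, 50)  # in-country hosting

low = us_rtt_ms[0] - local_rtt_ms[1]   # best US case minus worst local case
high = us_rtt_ms[1] - local_rtt_ms[0]  # worst US case minus best local case
print(f"round-trip savings: {low}-{high} ms")  # round-trip savings: 150-330 ms
```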

India’s data centre market itself is in a period of explosive growth. According to the Economic Survey 2024–25, the industry is expected to expand from $4.5 billion in 2023 to $11.6 billion by 2032. A JLL report further estimates that overall capacity will surge by 77% by 2027, reaching 1.8 GW. 

To put this into perspective, the global data centre market was valued at $347.60 billion in 2024 and is projected to reach $652.01 billion by 2030, growing at a CAGR of 11.2% between 2025 and 2030.

Mumbai dominates the sector, accounting for 45–50% of India’s data centre market. The city’s stability of power supply, private distribution, and position as the country’s financial hub make it the preferred choice for hyperscalers. Cushman & Wakefield ranks Mumbai sixth globally for under-construction data centre capacity, ahead of London and Dublin.

Chennai follows with about 21% of capacity, thanks to subsea cable connectivity with Singapore. Hyderabad, despite being landlocked, has emerged due to aggressive state incentives. Bengaluru, Delhi NCR, and Pune make up the rest, though at a smaller scale.

Why Mumbai stands out

Gupta noted that Mumbai’s early lead came from the availability of three separate operator facilities that could form a resilient “triangle” of availability zones. While historical details explain the city’s legacy role, Mumbai continues to outpace rivals due to its connectivity, infrastructure, and sheer market demand.

Along with Yotta Data Services, OpenAI is in preliminary discussions with other data centre providers, including Sify Technologies, E2E Networks, and CtrlS Datacenters.

Expanding data centres into smaller Indian cities faces a major hurdle: fibre availability. Gupta explained that while land and power are available in cities like Indore, Bhubaneswar, or Lucknow, redundancy in fibre networks is lacking. “Even after multiple cuts, your data centre cannot afford to be cut off,” he said.

For large-scale projects, location is critical not just for connectivity but also cost and sustainability. Gupta added that cheaper and greener energy will be key, and argued that the government will need to take the lead in developing large, planned clusters. 

As India races towards an $11.6 billion valuation for its data centre market, partnerships like Yotta–NVIDIA and potential collaborations with OpenAI could redefine the country’s digital infrastructure. Whether Stargate lands in Mumbai or another hub will depend on how India balances sovereignty, scalability and sustainability.

Why Groq Loves Mixture of Experts Models https://analyticsindiamag.com/global-tech/why-groq-loves-mixture-of-experts-models/ Thu, 11 Sep 2025 03:26:27 +0000 https://analyticsindiamag.com/?p=10177284

Groq's LPUs thrive on MoE inference while GPUs struggle with memory bottlenecks.


Mixture-of-Experts (MoE) architectures power most of today’s frontier AI models, at least the ones whose designs are public, thanks to their open-weights nature.

This includes models from DeepSeek, Moonshot AI’s Kimi, and even the recently announced OpenAI’s gpt-oss series. 

For context, an MoE architecture activates only a small subset of parameters per token while retaining a much larger total parameter count. And for companies like Groq, which has built its entire business around inference, MoE models are a perfect match for the company’s LPU (Language Processing Unit) chips, as per CEO Jonathan Ross. 
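
For readers who want to see the mechanics, here is a minimal sketch of top-k expert routing in PyTorch; the dimensions, expert count and routing rule are illustrative, not taken from any production model.

```python
import torch
import torch.nn as nn

class TinyMoELayer(nn.Module):
    """Minimal mixture-of-experts layer: a router picks top-k experts per token."""
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Linear(d_model, d_model) for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (num_tokens, d_model)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)      # mixing weights for chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):         # only the selected experts run
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

tokens = torch.randn(16, 64)
print(TinyMoELayer()(tokens).shape)  # torch.Size([16, 64])
```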

Groq’s LPUs are hardware systems designed specifically for AI inference, and they outperform traditional GPU systems in output speed. 

Ross was in Bengaluru recently at an event hosted by Lightspeed Ventures India, where he explained how MoE models align directly with the company’s fundamental advantages. While GPUs struggle with MoE architectures due to memory bottlenecks, Ross said that Groq’s LPUs thrive on them for inference.

“If you look at NVIDIA GPUs, you can calculate their performance on a problem by calculating how long it takes to load the weights and the KV [key-value] cache from the HBM [High Bandwidth Memory]. They have way more compute than they have memory,” said Ross. 

GPUs achieve performance through batching, which is running the same model weights across many users simultaneously. But MoE models disrupt this efficiency. 

Ross said that with mixture-of-experts models, each input only uses a small subset of the model’s weights, so when you process one batch after another, the GPU often has to load a completely different set of weights. 

That breaks the normal batching advantage, where the same weights can be reused across many inputs, so more data ends up being moved. Deployments compensate by running larger batches, but this increases latency.
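
A back-of-the-envelope calculation illustrates why weight movement dominates; the model size and bandwidth figures below are illustrative assumptions, not Groq or NVIDIA specifications.

```python
# Token generation is usually memory-bandwidth-bound, so per-step latency is
# roughly (bytes read) / (memory bandwidth). Both figures are assumptions.
bytes_read_gb = 60        # assumed weights + KV cache touched per decode step
hbm_bandwidth_gbs = 3000  # assumed HBM bandwidth of a high-end GPU, in GB/s

latency_ms = bytes_read_gb / hbm_bandwidth_gbs * 1000
print(f"~{latency_ms:.0f} ms per step if the weights are re-read each time")

# With MoE, tokens in a batch route to different experts, so fewer weight
# reads are amortised across the batch and effective bytes-per-token rises.
```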

Groq’s architecture sidesteps this entirely. “In our case, we have everything in the memory on the chips, so we don’t have to read from an external memory. We can still use smaller batching and still get good performance,” said Ross. 

The LPU stores model weights in on-chip SRAM rather than external DRAM, eliminating the memory bandwidth penalties that plague GPU-based MoE inference. 

Ross demonstrated this with Moonshot AI’s new open-weights large language MoE model, Kimi K2, running on 4,000 chips. “Each chip has a small part of the model. Each chip does a very small amount of computation and then hands it off to the next set of chips to do their part. Almost like an assembly line or a factory,” he described. 

This distributed approach creates a memory efficiency advantage: instead of requiring 500 copies of the model across GPU clusters (which would necessitate 500 terabytes of memory), Groq’s architecture needs only one terabyte in total, since each chip holds just its portion. 
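
The arithmetic behind that claim is straightforward, assuming the roughly one terabyte of weights implied by Ross’s figures.

```python
# Worked arithmetic from the figures Ross cited (assuming ~1 TB of weights):
model_tb = 1.0
gpu_copies = 500   # one full replica per GPU serving group
groq_chips = 4000

print(gpu_copies * model_tb, "TB needed for replicated GPU serving")  # 500.0
print(model_tb * 1000 / groq_chips, "GB held by each Groq chip")      # 0.25
```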

The economics reveal how different Groq’s architecture is from GPU-based inference. Ross explained that his team debated whether to charge the same price for both their 20B and 120B parameter MoE models, since the actual serving costs are nearly identical. Both models have similar active parameter counts despite the massive difference in total parameters. 

“And I really wanted to serve both of them at the same price, just to show the market how different our architecture is,” said Ross.
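
The pricing logic follows from the parameter counts. The figures below are OpenAI’s publicly reported numbers for its gpt-oss models, quoted approximately.

```python
# Publicly reported gpt-oss figures, approximate: (total, active) params in billions
models = {"gpt-oss-20b": (21, 3.6), "gpt-oss-120b": (117, 5.1)}
for name, (total_b, active_b) in models.items():
    print(f"{name}: {total_b}B total, {active_b}B active "
          f"({active_b / total_b:.0%} of weights touched per token)")
# Per-token compute tracks *active* parameters, which is why serving costs for
# the two models can be close despite the ~6x gap in total size.
```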

Lessons from AlphaGo

Before Ross set out to start his journey as the founder of Groq, he was an integral part of Google’s team that built the Tensor Processing Units (TPUs). 

Ross said that during his time at Google, his team was approached by a group that would not reveal which competition it was entering, only that it carried a million-dollar prize purse. The group had lost badly in a test run and contacted Google to ask if its TPUs were fast enough.

When Google said they were, the group decided to port its AI model to the TPUs, and only then revealed that the model was AlphaGo and that it would be playing the world champion in Go in 30 days. “He [world champion] had played the test game on a GPU, and he had beaten the GPUs badly — it wasn’t even close,” said Ross. 

“So, 30 days later, we had it on the TPUs at Google, and we won four out of five games,” added Ross. 

The famous shoulder-hit move by AlphaGo was a 1-in-10,000 probability play that was present in the training data. Ross said that when they tested the model again on a GPU, it could not generate the move that the TPU-run system had produced, despite the move existing in its training data. 

The insight connected Go to language models in a fundamental way: “Go is a sequence of moves on a board, and you have almost 300 different moves you can pick from. An LLM is the same thing, except you have about 100,000 moves that you can pick from.”

Both problems involve sequential decision-making where each choice affects future options. “Just like you can’t pick the 30th move in Go until you figure out what the 29th is, you can’t pick the 100th token until you know what the 99th is,” Ross explained. 

This sequential dependency is where his architecture thesis crystallised: “CPUs are good at sequential, GPUs are good at parallel, and LPUs are a perfect blend between both of them.”

The Competition 

Having said that, Ross dismissed any claims that GPU makers, such as NVIDIA, are at war with inference-providing companies like Groq on multiple occasions. 

In a podcast episode earlier this year, he stated that AI model training must be conducted on GPUs and noted that if Groq were to deploy high volumes of lower-cost inference chips, the demand for training would increase. 

“The more inference you have, the more training you need, and vice versa,” he said. 

While Groq offers access to LPUs via GroqCloud, the company faces intense competition from Cerebras, which on several occasions has claimed to achieve the fastest inference speed among all. 

According to Artificial Analysis, a benchmarking platform, Cerebras’ lead in output speed is reflected among several AI models, providing an output speed of over 3,000 tokens per second on MoE models like OpenAI’s latest gpt-oss-120b model. 

The company’s Wafer-Scale Engine (WSE) incorporates massive on-chip memory, targeting both throughput and scale, while also supporting model training. Cerebras excels in performance when it comes to larger AI models. 

However, Groq has its own advantages — it still offers the lower latency and the quicker time to first token, and the company also claims that it provides the lowest cost per token. 

For deploying smaller models, which are increasingly being preferred for agentic AI systems today, Groq also offers a much larger context window (exceeding 100k tokens) compared to Cerebras (32k tokens), for models like Llama 4 Maverick. 

Trump’s HIRE Act Push Could Hurt Big Tech More Than India https://analyticsindiamag.com/global-tech/trumps-hire-act-push-could-hurt-big-tech-more-than-india/ Wed, 10 Sep 2025 07:02:21 +0000 https://analyticsindiamag.com/?p=10177409

Nation’s vast and highly skilled tech talent pool is unlikely to sit idle.


US President Donald Trump has signalled a policy shift, urging US tech companies to stop outsourcing jobs to India as part of his broader America First push to protect domestic employment and innovation. 

Trump and his allies are backing a new US bill, the Halting International Relocation of Employment (HIRE) Act, which discourages American companies from outsourcing jobs—particularly to countries like India—by imposing a 25% tax on payments made for foreign-based services consumed in the US. 

Contrary to the expected outcomes, the move could open new opportunities for India’s tech sector, especially for startups, as per industry voices across platforms and a few that AIM spoke to. 

Even if US outsourcing restrictions trigger workforce reductions, India’s vast and highly skilled tech talent pool is unlikely to sit idle, Gaurav Vasu, founder and CEO of UnearthInsight, told AIM. UnearthInsight is a software firm whose platform lets users analyse and compare operating metrics with competitors at the click of a button.

Native product engineering firms, boutique consultancies, and the vibrant startup ecosystem will continue to absorb top talent, he said. 

Vasu added that opportunities beyond the tech sector are also thriving, with BFSI, retail, healthcare, manufacturing, and logistics investing heavily in digital transformation.

Caleb Friesen, a tech influencer, recently suggested on X that a US ban on IT outsourcing to India could paradoxically act as a huge blessing in disguise. He argued that major Indian IT firms like TCS, Infosys, Wipro, and HCL could see significant revenue losses, triggering mass layoffs. However, he added that tens of thousands of talented engineers, many based in Bengaluru, might take this as an opportunity to finally start their own companies or join existing startups.

In a recent interview with Moneycontrol, union minister Ashwini Vaishnaw said India is in active discussions with multinationals and foreign governments to safeguard the growth of its nearly $300 billion tech sector.

Similarly, on September 9, Karnataka’s IT minister Priyank Kharge said that proposed US tariffs on Indian IT services are not expected to significantly impact the country’s billion-dollar export industry. “It’s my personal opinion, the dependence of various geographical entities is way too much on the talent that we have here,” he said.

Big Tech’s Deep Dive 

Vasu said that as of 2025, 174 Fortune 500 companies have established a presence in India, primarily through Global Capability Centres (GCCs). These companies operate over 390 centres and collectively employ around 950,000 professionals in India, he said, stressing the importance of India’s tech ecosystem and talent.

The likes of Microsoft, Amazon, Goldman Sachs, Google, JP Morgan Chase, Eli Lilly, Pfizer, AstraZeneca, Roche, Bayer, Ford, Renault Nissan, Caterpillar, Hitachi Energy, McDonald’s, and Sandoz—industry leaders across sectors—depend on their India operations, said Vasu.

Leading tech companies such as Google, Apple, Meta, Microsoft, and OpenAI are deepening their bets on India. 

By expanding office footprints, establishing new engineering hubs, and pursuing AI-focused partnerships, global tech giants are positioning India as a central pillar of their international growth plans.

Google and Meta have announced AI and digital partnerships with Reliance while expanding local R&D and data infrastructure. OpenAI, meanwhile, is setting up a large data centre in India through local collaborations, has opened its India office in Delhi, and has recently begun hiring.

In an exclusive interview with AIM, Microsoft’s Puneet Chandok spoke about the company’s two major commitments in India—a $3 billion investment over two years in cloud and AI infrastructure, and a pledge to train 10 million Indians in AI by 2030. 

“We’re on track with our investment plans, and on the skilling front, we’ve already trained 2.4 million people,” he said, adding that 65% of those trained are women and 74% come from tier-2 and tier-3 towns. Chandok added that Microsoft’s goal is to scale from 2.4 million to 10 million over the next five years.

Similarly, AWS is ramping up its India investments. By 2030, it plans to invest $12.7 billion across its Mumbai and Hyderabad regions, contributing $23.3 billion to India’s GDP and supporting over 1.31 lakh full-time jobs annually.

Google is investing $6 billion in India to build Asia’s largest data centre and green energy infrastructure in Visakhapatnam, Andhra Pradesh, marking its single biggest project in the country and its first hyperscale facility there.

How the HIRE Act, if brought into practice, affects these investment plans and collaborations remains to be seen.  

Who Really Gets Hurt? 

Vasu highlighted the scale of US reliance on Indian outsourcing, noting that about $190–200 billion, or nearly 65–70% of India’s IT services, GCCs, BPO, and product engineering business, is tied to the US market. He cautioned that while a complete halt would certainly hurt India, the bigger shock would be felt in America. 

“For US enterprises, eliminating outsourcing could drive up costs by $700–800 billion, cut margins by 2–4% or more depending on destinations like India, Mexico, Poland, or Southeast Asia, and even impact share prices across sectors,” he explained.

Beyond cost, Indian engineers contribute significantly to innovation, with 60% of some firms’ tech solutions and patents originating from India-based teams, according to UnearthInsight.

The story of Indian talent goes beyond cost savings. With nearly 6 million professionals in STEM fields and strong growth in non-US markets, India remains indispensable for global tech innovation. 

As US companies grapple with talent shortages and wage inflation, Indian engineers, startups, and growing domestic industries are set to benefit, turning a potential policy shock into an opportunity for India to consolidate its position as a global tech powerhouse.

Databricks Says Enterprises Are Flying Blind on AI Agents, And It Has a Fix https://analyticsindiamag.com/global-tech/databricks-says-enterprises-are-flying-blind-on-ai-agents-and-it-has-a-fix/ Tue, 09 Sep 2025 11:30:00 +0000 https://analyticsindiamag.com/?p=10177346

The data and AI company bets on its Agent Bricks and Lakebase to help companies build agents that solve problems.


Enterprises worldwide are still wrestling with the challenge of taking AI agents from polished demos to working production systems. Databricks thinks it has cracked the problem with two key innovations: Agent Bricks, a framework for domain-specific AI agents, and Lakebase, an AI-native operational database. Both are part of the company’s larger push to unify data, analytics, and AI.

Recently, the company announced that it hit a $4 billion revenue run-rate, surpassing $1 billion from AI, and is raising a $1 billion Series K round at a valuation of over $100 billion to expand Agent Bricks, Lakebase, and its global AI growth.

Nick Eayrs, VP of field engineering for APJ at Databricks, in an exclusive interaction with AIM, said he believes the missing piece has been automation in evaluation and optimisation. 

He explained that most enterprises are “flying blind” when building agents, without reliable ways to measure quality or balance costs. The complexity, he added, makes progress slow and expensive.

Solving the Agent Bottleneck

Agent Bricks was introduced to address these hurdles directly. Rather than forcing teams to manually tweak prompts, models, and retrieval pipelines, Databricks’ system allows users to simply declare the task in natural language and let the framework auto-optimise. The platform automatically generates evaluation suites, applies techniques such as prompt engineering or reward models, and balances quality with cost.

Eayrs said, “Agent Bricks delivers both higher quality and lower cost through many breakthroughs from our Mosaic AI research team.”

Eayrs mentioned some examples of breakthroughs. Test-time Adaptive Optimisation (TAO), for instance, teaches models to improve at tasks using past input examples, often bringing open source models up to the quality of expensive proprietary ones. 

Similarly, Prompt-Guided Reward Models allow rules to be updated with prompts while still delivering reliable assessments.
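
Databricks has not published the internals of these reward models, but the underlying idea can be sketched generically; the judge_model call below is a hypothetical stand-in for any LLM client, not a Databricks API.

```python
# Hypothetical sketch of a prompt-guided reward model: the scoring rules live
# in the prompt, so experts can change them without retraining a judge.
RUBRIC = """Score the answer from 0 to 10.
Rules: every claim must cite a source; refuse requests for legal advice."""

def reward(question: str, answer: str, judge_model) -> float:
    # judge_model is a placeholder callable: prompt in, text score out
    prompt = f"{RUBRIC}\n\nQuestion: {question}\nAnswer: {answer}\n\nScore only:"
    return float(judge_model(prompt).strip())

# Updating the rules is a prompt edit, not a retraining run:
RUBRIC += "\nRule: answers longer than 200 words lose two points."
```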

What sets Agent Bricks apart is the role of human feedback. With Agent Learning from Human Feedback, domain experts can guide agent behaviour without needing deep technical knowledge. As Eayrs put it, this “democratises agent development,” making it possible for subject experts to shape outcomes directly.

The result, Databricks claims, is that enterprises can iterate faster, reduce costs, and bring agents into production with far less friction. Crucially, the framework continues to re-optimise agents even after deployment, ensuring adaptability in real-world settings.


A Database for the AI-First Era

On the data side, Databricks is challenging the dominance of decades-old operational databases with Lakebase. Built on Postgres and powered by Neon technology, Lakebase is designed for the fast, concurrent data demands of AI applications and agents. 

“Traditional operational databases (OLTP) are a $100-billion-plus market, but they are based on decades-old architecture designed for slowly changing apps, making them difficult to manage, expensive, and prone to vendor lock-in,” Eayrs said. 

Unlike traditional OLTP systems, which are often expensive and prone to vendor lock-in, Lakebase sits within the lakehouse architecture and offers seamless autoscaling.

By converging operational and analytical layers, Lakebase reduces latency and gives enterprises real-time access to current information. 

This is key, Eayrs noted, because “every AI application, agent, recommendation, and automated workflow needs fast, reliable data at the speed and scale of AI agents.”

Governance and security are also central to the design. Lakebase integrates with Unity Catalog to enforce consistent permissions, separates storage and compute for efficient scaling, and supports real-time synchronisation between operational and analytical systems. 

This combination, Eayrs said, ensures enterprises can move faster without compromising compliance or security.

He explained, “At its core, Lakebase is built upon proven open source technologies. Unlike proprietary systems, it avoids vendor lock-in, promotes transparency, and enables community-driven innovation.”

“Lakebase leverages Postgres, which is widely used by developers and has seen rapid adoption over the last few years,” he said.

With many large language models already trained on Postgres data, the system aligns naturally with the workflows of AI-native applications.


The Road Ahead

Together, Agent Bricks and Lakebase sketch out Databricks’ vision of a unified platform where data, analytics, and AI converge. 

Eayrs sees them as “key pillars” of the company’s strategy, with both expected to drive adoption in different ways. While Agent Bricks tackles the complexity of deploying agents, Lakebase reimagines the database for an AI-first world.

If enterprises do manage to move beyond demos, it will likely be through systems that blend simplicity, scalability, and openness. Databricks is betting that the combination of intelligent agents and AI-ready databases is exactly what the market has been waiting for.

The $900 Billion Google Business NVIDIA Fears https://analyticsindiamag.com/global-tech/the-900-billion-google-business-nvidia-fears/ Tue, 09 Sep 2025 07:53:02 +0000 https://analyticsindiamag.com/?p=10177309

'Google could unlock massive value if it decided to sell its TPUs beyond its own cloud’


Google is stepping up its push into the AI hardware business, taking aim at NVIDIA with its proprietary Tensor Processing Units (TPUs). The company has begun courting smaller cloud providers such as Fluidstack, Crusoe and CoreWeave to host TPUs, offering an alternative to NVIDIA’s dominant chips.

According to D.A. Davidson analysts, cited by MarketWatch, combining Google’s TPU business with its DeepMind AI research unit could be valued at around $900 billion.

The sixth-generation Trillium TPUs are already in demand, and the upcoming seventh-gen Ironwood TPUs, built for large-scale inference, are expected to drive further interest.

Google’s TPU Business Could Rival Search in Value

Dylan Patel, chief analyst at SemiAnalysis, an independent research and analysis firm, believes Google could unlock massive value if it decided to sell its TPUs beyond its own cloud. 

In a recent YouTube podcast, Patel pointed out that demand for custom silicon from companies like Amazon, Google and Meta has surged. While Amazon is still figuring out how to fully scale its Trainium chips, Patel said, “I totally think Google should sell TPUs externally, not just renting, but physically.”

According to him, such a move could even rival the market value of Google’s search business, given the growing adoption of open-source models and the falling cost of AI deployment. 

“It’s kind of funny if a side hobby, in theory, has a higher company value potential than your entire business,” he said, adding that internally, Google has discussed the idea, but it would require “a big reorg of culture” across its TPU and software teams.

Sunil Gupta, CEO of Yotta, a cloud services company, in an exclusive interview with AIM, said he is excited about the prospect of Google selling TPUs and would be open to hosting them in Yotta’s data centres if the opportunity arises.

Meanwhile, as part of the third tender under the IndiaAI Mission’s compute infrastructure expansion, 1,050 Google Trillium TPUs were added to the national cluster, marking their first inclusion alongside thousands of GPUs.

Reliance recently unveiled its new venture, Reliance Intelligence, which will use Google Cloud infrastructure running on TPUs. “Google and Reliance are partnering to transform all of Reliance’s businesses with AI—from energy and retail to telecom and financial services. To enable this adoption, we are creating a dedicated cloud region for Reliance, bringing Google Cloud’s AI and compute, powered by Reliance’s clean energy and connected by Jio’s advanced network,” said Sundar Pichai, CEO of Google.

Developer activity around TPUs on Google Cloud grew by 96% in just six months, indicating the growing momentum among engineers and researchers outside Google.

Stiff Competition

The sixth-generation Trillium units are already in high demand, while the upcoming seventh-generation Ironwood TPUs are expected to attract even more interest as the first designed specifically for large-scale inference, the stage where AI models are deployed after training.

“Ironwood is comparable to Blackwell GPUs from NVIDIA. The Edge TPUs that Google is selling will face stiff competition from newer generation CPUs, and for inferencing, they will be up against players like SambaNova and Cerebras,” A S Rajgopal, MD and CEO of NxtGen Cloud Technologies, a specialised cloud platform for financial services, healthcare and government, told AIM.

He added that at the edge, Google TPUs will compete with regular CPUs from AMD and Intel. “We are running the OSS (20B) model on a 128-core AMD CPU with strong results. Today, newer CPUs can already handle models of up to 10B parameters, and they are progressively integrating more AI capabilities that will compete with Google’s Edge TPUs,” he said.

Globally, OpenAI has begun using Google’s TPUs for AI inference to reduce costs compared with NVIDIA GPUs. The company, which had primarily rented NVIDIA hardware from Microsoft and Oracle datacentres, has now added Google’s Tensor Processing Units to its infrastructure.

Last year, Apple also revealed that it employed Google’s cloud-based TPU clusters, specifically the v4 and v5p versions, to train its Apple Foundation Model (AFM). 

Some in the developer community believe Google should go all in on TPUs rather than continuing to support NVIDIA hardware. As one developer put it on X, “Google should just go all the way and stop supporting NVIDIA with JAX—just start selling TPUs to people. Full TPU, full JIT heaven.”

Sasank Chilamkurthy, the founder of Qure AI and Von Neumann AI, however, told AIM that TPUs remain a “niche but the best alternative to NVIDIA” at scale, even though their adoption outside Google’s ecosystem faces challenges. 

He noted that TPUs are tightly integrated with Google’s own software stack—particularly JAX, which has largely replaced TensorFlow—but support for PyTorch, the industry’s favourite, remains weak. “PyTorch doesn’t really work well with TPUs. JAX is the way to go if you use TPUs, but it is still very Google-centric,” he explained.
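
The JAX-first workflow he describes is easy to picture; below is a minimal sketch, assuming a machine with JAX installed. The same jitted function runs on whatever backend is available, which is why JAX is the default path to TPUs.

```python
import jax
import jax.numpy as jnp

@jax.jit
def affine(w, x, b):
    # jitted once, then compiled for the available backend (CPU, GPU or TPU)
    return jnp.dot(w, x) + b

print(jax.devices())  # lists TpuDevice entries on a TPU VM, CpuDevice otherwise
w, x, b = jnp.ones((4, 4)), jnp.ones(4), jnp.zeros(4)
print(affine(w, x, b))  # [4. 4. 4. 4.]
```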

While he sees TPUs as competitive on performance, Chilamkurthy believes the bigger challenge lies in commercialisation. “For Google, selling TPUs externally is a sales problem—they would need to build customer support teams, software integrations, and a broader ecosystem,” he said, pointing out that Google’s previous attempt with Edge TPUs fizzled out for similar reasons.

Echoing similar thoughts, Ankush Sabharwal, founder of Corover.ai, a Bengaluru-based conversational gen AI platform, told AIM that Google’s Trillium TPUs are apparently great for scalable and cost-effective training in Google Cloud, especially with JAX and TensorFlow. “NVIDIA’s H100 and Blackwell GPUs offer flexibility and work efficiently with PyTorch and CUDA,” he added.  

Sabharwal added that GPUs would be the better choice for projects requiring diverse framework support.

Rajgopal explained that Google’s Edge TPUs are designed for inferencing on local, compact, and low-power devices, making them suitable for embedding into consumer appliances, surveillance systems, and even cars. 

He noted that these TPUs can run existing or pre-trained models very efficiently, which helps bring down costs and opens opportunities for startups to build new applications. “Many companies are also working on providing options for running inferencing in the cloud, such as SambaNova and Cerebras,” he added.

On potential customers, Chilamkurthy was sceptical about startups or smaller players, suggesting instead that large enterprises and big tech firms could be early adopters. “Startups don’t buy GPUs, they just use APIs. If anyone bites, it will be enterprises like Meta, who control their entire stack,” he said.

Ban on Social Media Giants Leaves Nepal in Limbo https://analyticsindiamag.com/global-tech/ban-on-social-media-giants-leaves-nepal-in-limbo/ Fri, 05 Sep 2025 13:30:00 +0000 https://analyticsindiamag.com/?p=10177202

“Blocking these platforms endangers livelihoods, risks substantial business losses and severely undermines Nepal's digital entrepreneurial backbone.”


The Nepal government has initiated one of the most sweeping digital restrictions in South Asia, instructing internet service providers to ban 26 popular social media and messaging platforms. 

The move cuts off access to global giants such as Facebook, Instagram, YouTube, X (formerly Twitter), WhatsApp, Reddit, LinkedIn and Clubhouse. Officials said the platforms had failed to comply with regulations that required them to register with the government.

The ban, which took effect late September 4, has triggered confusion, backlash and fears of censorship, with critics warning of severe economic and democratic repercussions.

The Official Rationale

As per Nepal’s communication and IT ministry, the ban followed a Supreme Court ruling that these companies must also establish local offices, implement grievance-handling systems and uphold mechanisms for self-regulation to legally operate in the country. 

“We cannot allow foreign companies to operate without accountability in Nepal,” a ministry spokesperson told Al Jazeera, emphasising the risk of cybercrime, misinformation and lack of consumer recourse.

The government argues that registration is a matter of sovereignty. Officials reportedly believe the framework will bring foreign tech companies within the ambit of Nepalese law, enabling oversight and potential taxation.

Yet the sudden enforcement, which came with little warning and no phased compliance plan, has drawn sharp criticism for being drastic. “While ensuring compliance and accountability is important, social platforms are also vital enablers of economic opportunity, free expression and civic engagement,” Harshi Gilara, CMO of Petonic AI, told AIM.

A Digital Blackout with Real-World Consequences

Countries in South Asia, such as Pakistan, Bangladesh and Sri Lanka, are already debating stricter digital laws. Nepal’s move could embolden them to adopt sudden bans as a compliance tool rather than pursuing dialogue with platforms.

India has had its own tussles with Twitter, WhatsApp and YouTube over compliance with IT Rules. Nepal’s enforcement may strengthen arguments within India’s policy circles for more extreme crackdowns.

Economic Shockwaves

The economic fallout may prove even more damaging than the social disruptions. Small and medium enterprises (SMEs), which account for over 90% of Nepal’s businesses and employ nearly three million people, are heavily dependent on social media for marketing, payments and customer engagement.

“Over 1.5 million SMEs likely rely on Facebook, Instagram, and YouTube for commerce,” said Shantanu Sahajpal, co-founder of IntelliContent Studios and former Twitter India executive. “Blocking these platforms endangers livelihoods, risks substantial business losses and severely undermines Nepal’s digital entrepreneurial backbone.”

Sahajpal further warned that the move could erode public trust in Nepal’s democratic institutions. 

A Call for Balance

Experts argue that while platform regulation is legitimate, blanket bans are disproportionate.

“Social media has evolved into a critical layer of public life, from conversations and entertainment to business,” said Sagar Vishnoi, director at Future Shift Labs. “For Nepal, it is imperative that policy decisions around these platforms are guided by evidence, foresight and balance. Oversight mechanisms must be designed to protect national interests without stifling innovation or public expression.”

Vishnoi urged the government to “recalibrate”. He believes this will be “key to preserving Nepal’s economic growth and reinforcing its democratic credibility”.

Legal analysts echo that sentiment. Soham Mukherjee, a senior legal analyst at NexInfo Solutions and former counsel at the Calcutta High Court, highlighted the economic risks.

“For Nepal, where the majority of internet users rely on platforms like Facebook and YouTube for commerce, learning and remittances, the disruption is more than social; it’s economic. Take examples of nations like the EU and Australia, which have shown that platform accountability can be enforced through dialogue and phased regulation, not blackouts,” he explained.

Notably, some platforms, such as TikTok, Viber, WeChat and Nimbuzz, are still accessible, which further adds to the confusion. Government officials maintain that these companies have complied with registration requirements, but critics argue that the selective enforcement appears arbitrary. 

“Why block Facebook and YouTube but leave TikTok untouched?” a digital rights researcher asked the Kathmandu Post. “The inconsistency undermines the government’s credibility and suggests the bans are political as much as regulatory”.

Free Expression Under Threat

International watchdogs have regarded the move as a dangerous precedent. The Committee to Protect Journalists (CPJ) said the ban “severely restricts press freedom and public access to information”.

“Nepal’s order to block these platforms is a disproportionate response that stifles free expression,” Beh Lih Yi, CPJ’s Asia programme coordinator, told reporters. “We urge authorities to reverse this measure and instead pursue transparent, consultative regulation.”

Digital rights activists fear that the bans could prompt other developing nations to follow suit. “If Nepal can silence platforms overnight, what stops other South Asian states from doing the same?” one advocate told Al Jazeera. “This is part of a larger trend of digital authoritarianism.”

Moreover, VPN downloads have reportedly surged across Nepal as citizens scramble to bypass restrictions. However, experts caution that not everyone has the technical literacy or financial means to do so.

This blackout has also hit students, social media content writers and freelancers, cutting them off from key communication links. One Kathmandu University student told the Kathmandu Post, “My entire study group is on WhatsApp. Now we don’t know how to coordinate”.

“For journalists, activists, and everyday citizens, they provide a voice and a means to participate in public discourse. Shutting them down risks silencing that potential,” Gilara said. 

What Comes Next

Observers also noted that Nepal has previously banned platforms like TikTok and Telegram, only to reinstate them after the platforms complied with local regulations. That precedent has left some analysts cautiously optimistic that the current ban may not last.

Yet, the damage to Nepal’s reputation as a democratic, open society could linger. As Vishnoi observed, “Preserving Nepal’s economic growth and reinforcing its democratic credibility will require the government to step back from the brink and adopt a more inclusive regulatory path.”

“Dialogue and phased compliance, not sudden blackouts, are the hallmarks of sound governance. Nepal must find a way to balance sovereignty with freedom,” Mukherjee agreed.

Nepal’s social media blackout is more than a regulatory dispute; it is a stress test for the country’s democracy, economy and digital future. The coming weeks will reveal whether the ban is a temporary show of force to compel compliance or the start of a more authoritarian trajectory in the nation’s digital governance.

“Constructive engagement between governments and tech platforms can lead to frameworks that uphold national interests without compromising digital rights. The focus should be on building trust and transparency, not barriers,” Gilara concluded.

Reliance Intelligence Ditches NVIDIA GPUs for Google TPUs https://analyticsindiamag.com/global-tech/reliance-intelligence-ditches-nvidia-gpus-for-google-tpus/ Fri, 05 Sep 2025 10:05:38 +0000 https://analyticsindiamag.com/?p=10177182

Jio Platforms had earlier entered a partnership with NVIDIA to develop India’s own foundational LLM for diverse languages.


The future of the Indian AI ecosystem looks promising with Reliance entering the scene. The company that reshaped the telecom landscape with Jio now aims to replicate its success in AI through Reliance Intelligence, launched in partnership with Meta and Google. 

The timing couldn’t be more perfect, as reports have emerged that OpenAI is currently seeking local partners to develop a 1 gigawatt data centre in India, which is its second-biggest market after the US. Previously, it was reported that OpenAI was also looking to partner with Reliance.

However, it appears that the deal didn’t go through, and Reliance partnered with Meta instead, which had also been in the discussions.

At the same time, Reliance is also building a data centre in Jamnagar with a total capacity of three gigawatts. It will surpass existing facilities, which typically operate at less than one gigawatt.

During the NVIDIA Summit in Mumbai last year, CEO Jensen Huang had a fireside chat with Mukesh Ambani, where he announced an upgraded deployment of Blackwell AI processors for a 1-gigawatt data centre in Jamnagar. 

This was a follow-up to the September 8, 2023 announcement, when Reliance Industries’ telecom arm, Jio Platforms, entered a partnership with NVIDIA to develop India’s own foundational LLM for diverse languages. 

As part of the agreement, NVIDIA committed to supplying its advanced GH200 Grace Hopper Superchip and DGX Cloud supercomputing services. 

There had been no updates on the partnership until Google announced a fresh collaboration with Reliance. “Google and Reliance are partnering to transform all of Reliance’s businesses with AI—from energy and retail to telecom and financial services. To enable this adoption, we are creating a dedicated cloud region for Reliance, bringing Google Cloud’s AI and compute, powered by Reliance’s clean energy and connected by Jio’s advanced network,” said Sundar Pichai, CEO of Google, at Reliance AGM 2025 in a pre-recorded video.

“Reliance announced a partnership with NVIDIA about a year back, with Jensen & Mukesh having a chat. They seem to have dropped that process with NVIDIA a year later and aligned with Google for infrastructure and Meta for LLMs,” A S Rajgopal, CEO and MD at NxtGen Cloud Technologies, told AIM.

AIM reached out to Reliance for an update on the deal, but did not receive a response.

Rajgopal said that “Reliance Intelligence will essentially serve as a front-end to open-source models from Meta, running on Google Tensor Processing Units (TPUs),” adding that Stargate, in contrast, is focused on deploying “much larger and more versatile GPU infrastructure.” 

Reliance is also using AMD GPUs. The company has already deployed a significant proof of concept (POC) with Jio and is working on future deployment plans that are yet to be announced. 

Rajgopal added that open-source models are rapidly catching up with systems like ChatGPT in performance and capabilities, and since these models are publicly available, Reliance does not have exclusive access.

He further added that Meta and Google have done extensive work on Indian languages, and both hold large datasets pertaining to Indian users. The new alliance between Reliance, Google and Meta could benefit Indian citizens with services that are contextually Indian. 

Can Reliance Challenge OpenAI?

For Reliance to succeed in the consumer segment, it needs to challenge OpenAI, which recently introduced ChatGPT Go in India at INR 399.

“Reliance is not attempting to replicate Stargate’s 10 GW ambition but to counter it with a different playbook. Its strategy is to localise compute at a gigawatt scale inside India, insulate it with self-supplied green energy, and distribute it through Jio’s digital grid,” Sanchit Vir, CEO of Greyhound Research, told AIM. 

This sovereign-first model directly addresses India’s compliance needs under the DPDP Act and resolves the chronic scarcity of affordable GPU capacity. 

Similarly, Rajgopal said that Reliance Intelligence will not have a role to play globally. “It will be focused on delivering value to users within India. Meta & Google have their own established agenda globally and do not need Reliance to play a role.”

Vir said Stargate is built to safeguard OpenAI’s product advantage. “It underwrites successive GPT generations and powers premium API, subscription, and enterprise sales,” he said, adding that the infrastructure itself is more of an enabler than a retail product. 

In contrast, Reliance Intelligence seeks to monetise across layers — from sovereign cloud and GPU-as-a-service through its Google Cloud partnership to packaged AI solutions with Meta’s Llama, and distribution via Jio’s connectivity, retail, and consumer platforms.

Notably, Reliance launched its AI Cloud 2.0, which offers AI-powered features beyond storage, allowing users to search photos, videos, and documents in their own languages using voice commands. The company also introduced an AI voice assistant named Riya, integrated with JioHotstar.

Rajgopal said the success of ChatGPT has given OpenAI access to large amounts of capital, enabling it to build the infrastructure needed to process vast datasets. “Through the Stargate project, OpenAI is deploying infrastructure in Abilene in the US, Abu Dhabi in the UAE, and has also announced sites in Norway and India,” he explained, adding that Stargate is a long-term investment aimed at powering the next generation of AI.

What About Indian LLMs?

On the LLM front, instead of building LLMs from scratch, Reliance announced a joint venture with Meta to build and scale enterprise AI solutions for businesses in India and select international markets. Under the agreement, the two companies have committed an initial investment of approximately INR 855 crore (US$100 million) to capitalise the venture. RIL will contribute 70% of the capital, while Meta will provide the remaining 30%.

Reliance’s use of Meta’s expertise remains an open question. Since Llama 2, Meta has seen limited success with LLMs and is now concentrating resources on its Superintelligence Labs to develop a frontier model.

Rajgopal said Reliance should have invested in developing the core technologies and capabilities needed for Sovereign AI in India. Instead, he noted, the company chose to focus on commercial ventures—likely to be successful given Jio’s access to millions of Indian users.

He added that sovereign AI is being built by small Indian companies with very limited resources, unlike projects such as Stargate. “Our ‘India moment’ will come from an innovation like DeepSeek. Constraints will bring out the best in people.”

Why Everyone is Rushing to Build Reinforcement Learning Environments https://analyticsindiamag.com/global-tech/why-everyone-is-rushing-to-build-reinforcement-learning-environments/ Tue, 02 Sep 2025 12:01:47 +0000 https://analyticsindiamag.com/?p=10177025

“RL environment specifications are among the most consequential things we can write as AI researchers.”


Building reinforcement learning (RL) environments is quickly emerging as the next big thing in AI. OpenAI co-founder Andrej Karpathy recently noted in his post on X that the evolution of AI training can be broken down into three distinct eras—pretraining, supervised finetuning and, now, reinforcement learning environments.

“In the era of pretraining, what mattered was internet text,” Karpathy explained. The priority then was to gather a large, diverse and high-quality collection of online documents to train models.

With supervised finetuning, the focus shifted to conversations. “Contract workers were hired to create answers for questions, a bit like what you’d see on Stack Overflow or Quora, but geared towards LLM use cases,” he said.

According to Karpathy, neither of these approaches is going away. Instead, the current era adds something—environments. 

Unlike static text or curated conversations, environments allow models to interact, take actions, see outcomes and improve. This creates opportunities to move “beyond statistical expert imitation.” Environments can be used for both training and evaluation, but just like before, the challenge is assembling a large, diverse, high-quality set.

This is not the first time Karpathy has spoken about RL environments. Earlier this year, he asked his friends working in the open-source space to help construct a high diversity of RL environments that would help elicit LLM cognitive strategies.

Industry Moves Into RL Environments

In a recent podcast with Nikhil Kamath, Deedy Das of Menlo Ventures said he considers RL-as-a-service and RL environments to be a booming area. 

Companies like OpenAI, Google and Anthropic are already working in this space. For instance, Google DeepMind recently launched Genie 3, a world model that allows AI systems to use their understanding of the world to simulate aspects of it and predict both how an environment will evolve and how their actions will impact it.

OpenAI’s first project, Gym, launched in 2016 as an open-source toolkit for RL. It created a framework for standardised environments, offering a common way to develop and test algorithms across tasks ranging from Atari games, like Pong, to robotic control.
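
That standardised interface is easy to see in code. Below is a minimal sketch written against Gymnasium, the open-source successor to Gym's API; the guessing-game task, spaces and reward values are invented purely for illustration:

```python
import gymnasium as gym
from gymnasium import spaces


class GuessTheNumber(gym.Env):
    """Toy task: guess a hidden integer in [0, 9]."""

    def __init__(self):
        # Observation: 0 = last guess too low, 1 = correct, 2 = too high
        self.observation_space = spaces.Discrete(3)
        self.action_space = spaces.Discrete(10)  # guesses 0..9
        self._target = None

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)  # seeds self.np_random
        self._target = int(self.np_random.integers(0, 10))
        return 0, {}  # initial observation and info dict

    def step(self, action):
        if action == self._target:
            return 1, 1.0, True, False, {}  # correct: reward 1.0, episode ends
        obs = 0 if action < self._target else 2
        return obs, 0.0, False, False, {}  # (obs, reward, terminated, truncated, info)
```

Because every environment exposes the same reset/step contract, the same training loop can be pointed at Atari, robotic control or text-based tasks without modification.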

In 2023, OpenAI also acquired Global Illumination to build creative tools, infrastructure and digital experiences for AI agents.

So far, RL in LLMs has performed best in domains with clear, verifiable rewards, such as coding and math, where success is easy to measure. In contrast, models have struggled in areas where reward functions are less defined.

Das explained that RL differs from traditional data collection because it requires an environment where models learn by pursuing rewards. These rewards can be straightforward, such as solving a math problem, or more nuanced, like text generation, where an LLM can act as the judge. 

For him, RL is less about generating synthetic data and more about environment design, much like AlphaGo, where the reward was simply winning the game.

Sachin Dharashivkar, chief executive officer at AthenaAgent, told AIM that RL environments can approximate aspects of real life, allowing models to specialise in different domains. “Right now, everybody is asking which LLM is going to win. I don’t think that’s going to happen, because as we get newer environments, new domains will open up,” he said.

According to him, this will lead to the development of domain-specific AI systems. “If you want a companion, most of the time you don’t need Einstein-level intelligence. You want to create an environment that fosters engagement. Coding is one example, but accounting can be another, where you want extremely precise instructions to follow. In insurance, it might be something very different.”

When asked how RL environments differ from fine-tuning, he said, “In fine-tuning, we are changing the parameters of the model with supervised examples. But in RL…you don’t care how it is solved; you care about the end result. The simulator enables that.”

Reinforcement as a Service 

Today, several companies are offering reinforcement as a service. One of them is Prime Intellect, which recently launched Environments Hub. The company argues that RL environments are the key bottleneck to the next wave of AI progress, but big labs are keeping them closed.

Similarly, Shyamal Anadkat of OpenAI believes RL for large language models is still in its early stages, but will soon gain traction as companies look to adapt models for their own domains. “We’ll see many organisations customising or optimising domain-specific intelligence models with RL—once platforms and evaluations improve and make it easier to justify the activation energy,” he said. 

According to Banghua Zhu, assistant professor at the University of Washington and principal research scientist at NVIDIA, the role of environment engineer is emerging. These are specialists who design high-quality RL environments with verifiable rewards. 

In reinforcement learning from human feedback (RLHF), humans supply the feedback that trains models. In reinforcement learning with verifiable rewards (RLVR), the feedback comes directly from the environment through rewards that can be checked automatically. The job, then, is about creating challenging tasks, understanding what LLMs can and cannot do, and engineering environments that give clear signals. Zhu said doing so requires an understanding of LLM limits, creativity in task design and engineering rigour to build reliable environments and reward systems.
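
To make the RLVR idea concrete, here is a hedged sketch of a verifiable reward for a maths-style task; the answer-extraction heuristic and function names are illustrative assumptions, not any lab's actual grader:

```python
import re


def extract_final_answer(completion: str) -> str | None:
    """Pull the last number out of a model completion (illustrative heuristic)."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return matches[-1] if matches else None


def verifiable_reward(completion: str, ground_truth: str) -> float:
    """RLVR-style reward: 1.0 if the final answer matches, else 0.0.

    The environment, not a human rater, supplies the signal, so the
    reward is cheap, objective and reproducible.
    """
    answer = extract_final_answer(completion)
    return 1.0 if answer is not None and float(answer) == float(ground_truth) else 0.0


assert verifiable_reward("So the total is 42.", "42") == 1.0
assert verifiable_reward("I am not sure.", "42") == 0.0
```

The appeal is that the signal comes from the environment itself, which is what makes reward engineering tractable at scale.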

The range of possible environments is vast, spanning terminals, operating systems, enterprise workflows and even text or video games. Startups are already entering this space, which some see as a critical pathway towards developing generalist agents.

“Setting up the environment to have the right reward function is one thing, but another aspect is engineering it well. Creating scalable, robust environments is a key technical challenge,” said SemiAnalysis in its blog post.

“RL environment specifications are among the most consequential things we can write as AI researchers. A relatively short specification (eg, less than 1,000 words of instructions saying what problems to create and how to grade them) often gets expanded either by humans or via synthetic methods into thousands of datapoints,” said Meta’s Jason Wei, on X.

Evaluation and the Future

Building effective RL environments, however, is not a straightforward task. “RL environment is not just about creating the interface. The knowledge of the environment comes from the domain…You need domain experts,” Dharashivkar said.

He added that horizontal RL environment startups may provide some value, but domain-specific expertise will be critical for production use cases. “You can always reach 70-80% accuracy by covering common cases. The real challenge is in corner cases. Demos work well, but production is where systems fail.”

Similarly, Ross Taylor, CEO at General Reasoning, in his recent post on X, said that there are hardly any high-quality RL environments and evaluations available. “Most agentic environments and evaluations are flawed when you look at the details. It’s a crisis, and no one is talking about it because they’re being hoodwinked by labs marketing their models on flawed evaluations,” he said.

On evaluating RL environments, Dharashivkar argued that current benchmarks may not be sufficient. “We have so many professions and people—one brain does not solve for everything. Just like in human work, specialisation matters. I personally believe we will have thousands or millions of models for separate use cases.”

The post Why Everyone is Rushing to Build Reinforcement Learning Environments appeared first on Analytics India Magazine.

]]>
Anthropic Throws Users’ Data Privacy to the Wolves https://analyticsindiamag.com/global-tech/anthropic-throws-users-data-privacy-to-the-wolves/ Mon, 01 Sep 2025 11:00:22 +0000 https://analyticsindiamag.com/?p=10176970

“LLM companies are running out of usable training data, which makes retaining all chat logs essential for their further development."

The post Anthropic Throws Users’ Data Privacy to the Wolves appeared first on Analytics India Magazine.

]]>

Anthropic’s latest consumer policy update has left many users feeling more puzzled than reassured. On August 28, the company announced that chats and coding sessions from Claude’s Free, Pro and Max plans may now be used for model training, unless individuals explicitly opt out. Yet, the process of opting out seems less straightforward than suggested.

At the core of this update is a new five-year data retention rule, replacing the previous 30-day limit for users who allow their data to be used for training. Anthropic argues that this will strengthen safeguards against scams and abuse, while also improving Claude’s coding, analysis and reasoning skills. Yet, because the option is enabled by default, critics worry that many users may not realise what they’re consenting to.

The update does not apply to Claude’s enterprise, education or government versions, nor to API access via Amazon Bedrock or Google Cloud’s Vertex AI. For millions of individual users, however, unless one intervenes, their conversations and code could now be stored and processed for years.

Confusion With Data Sharing Could Lower Adoption

The ambiguity has struck a nerve in workplaces, where many developers already tread carefully when it comes to AI use. 

Eduard Ruzga, former staff engineer at Prezi, told AIM that the shift could complicate Claude’s adoption within companies. 

“Up until now, with ChatGPT and Claude, the default was that data was not used in learning.” That made it simpler for company policies to permit their use. “Now things will be much more confusing when it comes to allowing the use of Claude for work,” he explained. 

Ruzga suggested businesses may be forced into stricter rollouts or dedicated team plans to avoid the risk of unintentional data leaks.

In a LinkedIn post, he also mentioned, “I can only imagine what kind of mess that could mean for companies in terms of policies of how to allow such tool use at work with internal information.”

That corporate unease is already surfacing. Denis Stebunov, CTO of ivelum, shared on X that his developers had been instructed to disable model training in privacy settings. 

While praising Claude Code as “best in class”, he warned that leaving the feature on would expose proprietary client code, something their firm cannot risk. His frustration was blunt. The update, he argued, “is not user-friendly”, and if such practices persist, migration to alternatives will be inevitable.

Developers, Users are Not Happy

The backlash has also taken on an emotional edge. AI researcher Pierre-Marcel De Mussac described the change as Anthropic “reversing their entire privacy stance”. In his view, the opt-out toggle was buried, the retention window stretched unreasonably and everyday users were left unprotected compared to enterprise customers. 

“Big businesses protected, users thrown to the wolves. Classic,” he wrote on X.

Joel Latto, threat advisor at F-Secure, argued that Anthropic’s motives are less about safety and more about necessity. 

“LLM companies are running out of usable training data, which makes retaining all chat logs essential for their further development,” he told AIM.

For him, the real problem lies in the defaults, which are opt-out rather than opt-in; he described them as “by design anti-user, anti-privacy”. Latto also warned that many users may not realise that opting out later does not erase data already collected. 

In his view, Anthropic’s emphasis on AI safety in its announcement serves more as polish to soften what is essentially a business-driven decision. At the same time, Latto commended the company’s transparency in its threat intelligence reporting.

Sooraj Sathyanarayanan, a security researcher, told AIM that the update highlights the unsustainability of Anthropic’s “privacy-first” stance. 

“The five-year retention period is a complete philosophical reversal that fundamentally expands the attack surface,” he warned, pointing out that longer storage multiplies the risk of breaches, subpoenas or insider misuse. He also criticised the interface as a dark pattern, with an oversized ‘Accept’ button and a default-on toggle designed to maximise opt-ins rather than informed consent.

Anthropic, for its part, maintains that user choice remains central, promising that deleted conversations will not be used in training and that settings can be changed at any time. However, the optics tell a different story, where the default settings nudge users towards data sharing, while the responsibility to protect sensitive material falls squarely on individuals.

The broader question is whether this shift undermines trust. Enterprises may shield themselves with team plans, but independent professionals and casual users face the murkier end of the policy. For a company built on trust and safety as its brand identity, this new chapter in data collection may test just how much control users truly have over their data.

The post Anthropic Throws Users’ Data Privacy to the Wolves appeared first on Analytics India Magazine.

]]>
AI Is Entering Its DOS to GUI Era, Says Microsoft CPO  https://analyticsindiamag.com/global-tech/ai-is-entering-its-dos-to-gui-era-says-microsoft-cpo/ Sun, 31 Aug 2025 05:42:12 +0000 https://analyticsindiamag.com/?p=10176922

Don’t think of AI as a threat but as an amplifier, advises Microsoft CPO Aparna Chennapragada.

The post AI Is Entering Its DOS to GUI Era, Says Microsoft CPO  appeared first on Analytics India Magazine.

]]>

Microsoft 365 is focused on combining intelligence with enterprise context and data to unlock new possibilities, according to the chief product officer (CPO) at Microsoft, Aparna Chennapragada, who told AIM in an exclusive conversation that “this is the golden age of building.”

Chennapragada said she thinks this is “the DOS-to-GUI moment of AI,” and added that Microsoft is pursuing two paths—integrating AI into familiar tools like Word, Excel and PowerPoint, and developing AI-first interfaces such as the M365 Copilot app and new AI editors. 

She mentioned the recently launched COPILOT() function in Excel, which lets users analyse, generate content, and brainstorm directly in the grid. “We want to meet users where they are and help them there,” she said, pointing to use cases like taking meeting notes in real time. “We will continue to build such products and make them better.”

The CPO said Microsoft is also exploring new interface ideas, including a notebook-like experience for projects. Instead of repeatedly prompting an agent or Copilot, users would be able to view a document-style record of their work. “I do think there’ll be a lot of experimentation in interfaces, and different use cases will need just the right combination of chat plus canvas,” she noted.

Similarly, Ryo Lu, head of design at Cursor, wrote in a post on X that future interfaces should “flow like water”—meeting people the way they actually think, not the way we assume they should. “Some of us think in bullet points, some in conversations, others need to see everything mapped out visually,” he explained. 

The real breakthrough, he noted, will come when interfaces can shift seamlessly between chat, canvas, tables, or timelines depending on a user’s cognitive style—all while acting on the same ground truth.

“We’re moving from ‘here’s one single-purposefully built interface, deal with it,’ to interfaces that genuinely fit how minds and teams operate. not just understanding what you want, but how you want to receive and work with it,” Lu added. 

Taking the idea further, Karina Nguyen, research & product at OpenAI, wrote on X, “My vision for the ultimate AGI interface is a blank canvas. The one that evolves, self-morphs over time with human preferences and invents novel ways of interacting with humans, redefining our relationship with AI technology and the entire Internet.”

Interface for AI Agents 

There is a growing demand to build dedicated interfaces for AI agents. At YC’s AI Startup School, Andrej Karpathy argued that we need a new interface designed specifically for agents, noting that a new kind of software user has emerged—neither a human clicking through a GUI nor a backend system making API calls.

The changing face of the internet is another focus area. Microsoft recently announced NLWeb as an open standard to make websites agent-readable, a move Chennapragada believes is crucial for preserving the publishing ecosystem. 

“Today people think about search engine optimisation. Tomorrow we must think about making information available to AI engines in a way that preserves the publishing ecosystem,” she said.

Most recently, former Twitter CEO Parag Agrawal launched a new startup that will allow AI agents to interact with the public web in real time. “Agents are going to be the primary customers of the web going forward. They will use the web a lot more than humans ever have,” said Agrawal.

Karri Saarinen, co-founder and CEO of Linear, a tool for planning and building software products, argues that while AI is here, it still lacks proper form—and without it, its full potential will remain untapped. “Chat is the beginning, but not the final destination. We still need great interfaces to guide and set the context.”

Strategy to Build New Products

Chennapragada acknowledged that Microsoft has had to adapt quickly and operate with greater agility in the age of AI. She highlighted three major changes shaping the company’s approach.

The first, she explained, is making sure customers gain access to the latest advancements in intelligence as early as possible. “When GPT-5 launched, even before it was fully rolled out into ChatGPT, we gave Microsoft customers the ability to try it,” she said.

The second is about bringing customers along the journey so they can adopt and learn new ways of working. To achieve this, Microsoft has introduced what it calls the Frontier Program. “The idea is that the cutting-edge features, whether it’s the latest AI capability in Excel or a new M365 Copilot agent, should be available for any Microsoft customer to use,” she explained.

The third change, Chennapragada noted, is internal—focused on how Microsoft approaches product development in the age of AI. Traditionally, companies began with detailed product requirement documents and specifications before moving to prototypes and stakeholder reviews. With AI, however, the process looks very different. “You start with the prompt sets, not so much the PRDs (product requirements document), and you rely heavily on evaluations and benchmarks,” she said.

Team of AI Agents

Looking ahead, Chennapragada envisions a workplace where every employee has not just a Copilot but also a set of agents supporting their daily tasks. “If you hire a new employee, they should get a badge, a PC, and a team of virtual agents—whether that’s a researcher, a data scientist, or an HR helper. Employees will need to learn how to direct these agents effectively, which is a managerial skill in itself.” 

The challenge, she said, lies not only in developing these high-value agents but also in training employees to use them and ensuring enterprise-grade trust, security, and compliance.

For students and young professionals worried about jobs, Chennapragada’s advice is to treat AI as a contact sport. “Start building and using it. Some tasks will be automated, but new jobs like software operators will emerge. Don’t think of AI as a threat but as an amplifier.”

The post AI Is Entering Its DOS to GUI Era, Says Microsoft CPO  appeared first on Analytics India Magazine.

]]>
Inside Tredence’s Milky Way, Where AI Agents Learn to Work Like a Team https://analyticsindiamag.com/global-tech/inside-tredences-milky-way-where-ai-agents-learn-to-work-like-a-team/ Fri, 29 Aug 2025 10:13:41 +0000 https://analyticsindiamag.com/?p=10176900

Milky Way’s agents act as digital co-workers across functions, including marketing, supply chain, analytics, insights and product lifecycle management. 

The post Inside Tredence’s Milky Way, Where AI Agents Learn to Work Like a Team appeared first on Analytics India Magazine.

]]>

Data analytics and AI services company Tredence is taking a bold step in enterprise AI with the launch of Milky Way, its new multi-agent decision intelligence platform. Positioned as a leap beyond traditional AI tools, Milky Way has been designed to function less like software and more like a team of collaborators that can think, reason, and solve problems alongside human decision-makers.

Praveen Koushik Satyanarayana, director of customer experience management at Tredence, has been closely involved in conceptualising and building the product. Talking about the inspiration behind Milky Way, he told AIM: “Complex functions within the enterprise, like the job of an analyst, require numerous steps. There is no one tool today that does all of those things.”

Milky Way introduces more than 15 prebuilt agents for business roles and over 50 specialised agents trained on real-world enterprise scenarios. These agents act as digital co-workers across functions, including marketing, supply chain, customer analytics, shopper insights and product lifecycle management.

From Tools to Teammates

Tredence is framing Milky Way as a transition from conventional AI-driven tools to what Satyanarayana describes as teammates. Instead of simply generating an output based on a prompt, Milky Way deploys multiple specialised agents that interact with each other to arrive at meaningful insights.

This approach is particularly relevant for exploratory analytics, where the end goal is not always known in advance and the system must work through ambiguity, much like a human analyst would. 

Satyanarayana explained the difference with an example. “When you know the output you want, you can prompt your way to it with AI. But generating insights is different,” he said, adding that this is where autonomous systems come into the picture, which can think and reason with you, rather than just produce a response.

According to Tredence, early deployments in retail, consumer packaged goods, telecommunications and healthcare have shown a five-fold improvement in time-to-insight and 50% lower analytics costs. A global retailer reportedly cut manual effort in merchandising operations by 60% through assortment planning and pricing optimisation. Healthcare organisations used the system to automate patient data aggregation and triage.

Inside the Milky Way

The architecture of Milky Way reflects this philosophy. It is built around a network of domain-specific and specialised agents, orchestrated on a single platform. These agents break down complex questions into smaller, manageable steps. 

For example, when asked a diagnostic question like “why are my sales down?”, Milky Way deploys a series of agents that work together in sequence. A clarification agent refines the question, followed by a hypothesis agent that proposes possible reasons for the decline. A data sufficiency agent then connects the business terms with enterprise data, while a text-to-SQL agent retrieves the relevant information. Finally, analysis agents process the data, generate insights, and compile reports. 

Satyanarayana said that this kind of orchestration allows Milky Way to behave more like a team of analysts than a single tool.
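
As a rough illustration of that hand-off pattern, the sketch below chains stub agents over a shared context; it mirrors the roles described above but is not Tredence's implementation:

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Shared state handed from one agent to the next."""
    question: str
    notes: dict = field(default_factory=dict)


def clarification_agent(ctx: Context) -> Context:
    ctx.notes["clarified"] = f"{ctx.question} (scoped to last quarter, all regions)"
    return ctx

def hypothesis_agent(ctx: Context) -> Context:
    ctx.notes["hypotheses"] = ["pricing change", "stock-outs", "seasonality"]
    return ctx

def text_to_sql_agent(ctx: Context) -> Context:
    ctx.notes["sql"] = "SELECT region, SUM(sales) FROM orders GROUP BY region;"
    return ctx

def analysis_agent(ctx: Context) -> Context:
    ctx.notes["insight"] = "Sales dip concentrated in two regions with stock-outs."
    return ctx


PIPELINE = [clarification_agent, hypothesis_agent, text_to_sql_agent, analysis_agent]

def run(question: str) -> Context:
    ctx = Context(question)
    for agent in PIPELINE:  # each agent enriches the shared context in turn
        ctx = agent(ctx)
    return ctx

print(run("Why are my sales down?").notes["insight"])
```

In a real system each stub would wrap an LLM call or a database query, and the orchestrator could branch or retry rather than run strictly in sequence.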

Under the Hood 

Another distinguishing feature of  Milky Way is its composable architecture. Rather than binding clients to a single vendor or technology stack, the platform has been designed to integrate with both open-source and proprietary tools. Enterprises can swap out specific components, such as text-to-SQL agents, with their own solutions if they already exist.

“We’re not married to one model or vendor,” Satyanarayana said. “Our goal is to solve the problem. If a client already has a capability within their ecosystem, we can integrate that.”

This flexibility is expected to be particularly appealing to large enterprises that already have significant investments in platforms such as Snowflake, Databricks, or Azure. Milky Way sits on top of existing data systems without requiring companies to replicate or migrate data, which reduces friction and accelerates adoption.

The Bigger Vision

Beyond the product’s immediate capabilities, Tredence sees Milky Way as part of a broader shift in how AI will augment enterprise decision-making. The company is developing custom benchmarks to evaluate the system’s ability to replicate the reasoning processes of expert analysts and is also experimenting with proactive background agents that can surface insights without explicit prompts.

Satyanarayana summed up the vision by explaining that the goal is to make insights available at scale, 24/7. In this model, a wealth advisor could quickly get answers about a client’s portfolio, a store manager could find out why stocks are running low, and a marketing manager could see why customer churn is rising. The idea is that insights should no longer depend on technical expertise or long wait times; they should be instantly accessible, anytime.

If Tredence can deliver on its vision, Milky Way could mark a turning point in how organisations think about decision intelligence.

The post Inside Tredence’s Milky Way, Where AI Agents Learn to Work Like a Team appeared first on Analytics India Magazine.

]]>
NVIDIA Positions Itself as Backbone of $4 Trillion AI Gold Rush https://analyticsindiamag.com/global-tech/nvidia-positions-itself-as-backbone-of-4-trillion-ai-gold-rush/ Fri, 29 Aug 2025 05:26:07 +0000 https://analyticsindiamag.com/?p=10176871

Governments worldwide are racing to build sovereign AI infrastructure and NVIDIA has become the go-to supplier.

The post NVIDIA Positions Itself as Backbone of $4 Trillion AI Gold Rush appeared first on Analytics India Magazine.

]]>

NVIDIA is no longer just a chipmaker. Its latest earnings suggest it has become the world’s most critical supplier of infrastructure for AI.

The company posted record quarterly revenue of $46.7 billion, up 56% from a year ago, driven by insatiable demand for its data centre products. Sales of its latest Blackwell and Blackwell Ultra GPUs are ramping at unprecedented speed, with NVIDIA now producing 1,000 AI racks a week.

By 2030, NVIDIA expects trillions to be spent worldwide on AI infrastructure, from chips and data centres to software platforms and supercomputers. “Blackwell and Rubin AI factory platforms will be scaling into the $3 trillion to $4 trillion global AI factory build out through the end of the decade,” CEO Jensen Huang said on the earnings call.

Huang said the company expects revenue of $54 billion – plus or minus 2% – for the next quarter, excluding any shipments to China.

CFO Colette Kress added that sales to China could contribute between $2 billion and $5 billion. “There is interest in our H20s. We have received the initial set of licenses, and we have supply ready,” Kress said. “That is why we indicated that shipments this quarter could potentially fall in the range of $2 billion to $5 billion.”

Competitors in China

Huang estimated that China alone represents about a $50 billion opportunity for NVIDIA this year, if the company could compete there with its latest products. “And if it’s $50 billion this year, you’d expect it to grow 50% annually, just as the rest of the world’s AI market is growing,” he added.

At the same time, China is actively competing with NVIDIA in the GPU space primarily through domestic semiconductor companies like Huawei, Alibaba, Baidu, Cambricon Technologies and others who are developing GPUs for AI and gaming. 

Cambricon on Thursday reported a record-breaking profit in the first half of the year, fueled by rising demand from companies like ByteDance for locally produced semiconductors as alternatives to NVIDIA chips. The Beijing-based firm announced on Tuesday that it earned Rmb1 billion ($140 million) in profit over the first six months, a sharp turnaround from a loss of Rmb533 million in the same period last year. 

On the other hand, Huawei’s Ascend 910C, an evolution of Huawei’s 2019 Ascend 910, is engineered for AI inference tasks. Initially slated for mass production in mid-2025, the Ascend 910C’s rollout has been delayed due to technical challenges. Huawei now plans to commence production by the end of 2025.

Recent benchmarks indicate that it delivers approximately 60% of the performance of NVIDIA’s H100 GPU in inference workloads.

Huawei is preparing to test its new AI chip, the Ascend 910D, in China, while it has also introduced the Ascend 920 AI chip. 

According to DigiTimes Asia, the chip is slated to enter mass production in the second half of 2025. Industry experts believe the Ascend 920 could be a viable alternative to NVIDIA’s H20 GPUs. 

The Rise of Sovereign AI

Beyond big tech and China, governments worldwide are racing to build sovereign AI infrastructure to reduce dependence on foreign cloud providers. NVIDIA has quickly become the go-to supplier.

The EU is investing €20 billion in 20 AI factories, while the UK’s Isambard-AI supercomputer, built with NVIDIA, is now the most powerful in the country. In the Middle East, sovereign wealth funds are pouring billions into AI data centres powered by NVIDIA’s hardware.

Kress said NVIDIA expects over $20 billion in sovereign AI revenue this year, more than double last year. In other words, nations are treating AI as a matter of national security, and NVIDIA is cashing in.

Chatbots to Reasoning Systems

The company’s biggest bet is that today’s generative AI systems are just the beginning. Huang argued that the next generation, so-called agentic or reasoning AI, will make current chatbots look primitive.

“Where chatbots used to be one-shots, you give it a prompt and it would generate the answer, now the AI does research,” Huang explained. “It thinks and does a plan, and it might use tools.”

This shift will require exponentially more computing power. “The amount of computation necessary… could be 100x, 1,000x and potentially even more,” Huang said.

That explosion in demand is why NVIDIA built its Blackwell NVLink 72 rack-scale system, a new architecture designed to link thousands of GPUs with extreme efficiency. 

If AI systems are going to be 100x more compute-intensive, they’ll also be 100x hungrier for power. In Huang’s view, that turns energy efficiency into revenue efficiency.

“In a world of power-limited data centres, perf per watt drives directly to revenues,” he said. “The more you buy, the more you grow.”

Can Rivals Catch Up?

Huang shared that Blackwell’s data centre revenue had increased by 17% sequentially, reflecting its strong market adoption. Looking ahead, he discussed the upcoming Rubin architecture, expected to enter volume production in late 2025 and become available in early 2026. He revealed that six new Rubin chips are already in fabrication at TSMC. 

Some rivals are betting on custom ASIC chips to challenge NVIDIA’s GPUs. But Huang dismissed the threat, arguing that accelerated computing is “the ultimate, most extreme computer science problem the world’s ever seen.”

What sets NVIDIA apart, he said, is its full-stack approach, including GPUs, CPUs, networking, and CUDA software, all designed to work seamlessly. That ecosystem, developed over decades, is why developers and enterprises keep flocking to NVIDIA.

The post NVIDIA Positions Itself as Backbone of $4 Trillion AI Gold Rush appeared first on Analytics India Magazine.

]]>
Intel is ‘Too Big to Fail’ But Too Weak to Win https://analyticsindiamag.com/global-tech/intel-is-too-big-to-fail-but-too-weak-to-win/ Thu, 28 Aug 2025 11:16:40 +0000 https://analyticsindiamag.com/?p=10176841

Analysts believe the US government’s involvement in the semiconductor company is a double-edged sword.

The post Intel is ‘Too Big to Fail’ But Too Weak to Win appeared first on Analytics India Magazine.

]]>

In a highly unusual move, the Trump administration on Monday announced an $8.9 billion investment in Intel, giving the US government a 9.9% stake in the company.

The investment will be funded through $5.7 billion in pending CHIPS and Science Act grants and $3.2 billion from the Secure Enclave program. Combined with $2.2 billion in earlier CHIPS grants, total government support for Intel now stands at $11.1 billion.

“As the only semiconductor company that does leading-edge logic R&D and manufacturing in the US, Intel is deeply committed to ensuring the world’s most advanced technologies are American-made,” Intel CEO Lip-Bu Tan said.

The agreement also provides the government with a five-year option to purchase an additional 5% of Intel shares at $20 per share, exercisable only if Intel’s ownership stake in its foundry business falls below 51%. Existing claw-back and profit-sharing provisions tied to earlier grants will be removed.

This follows SoftBank’s $2 billion investment in Intel, which made the Japanese firm one of Intel’s top shareholders. 

“The primary significance of the government’s investment is its powerful endorsement, reinforcing the belief that Intel is ‘too big to fail’. While this does not directly help Intel’s advanced node technology, it encourages the market to rerate the company’s Price-to-Book (P/B) ratio using a lower discount rate,” said Ming-Chi Kuo, market analyst at TF International Securities. He added that this helps raise Intel’s valuation floor, improves its stock performance, and indirectly benefits its operations.

The chip maker has invested $108 billion in capital and $79 billion in research and development over the past five years, primarily in US-based manufacturing. The company is currently building new facilities in Arizona with high-volume production expected later this year.

Tan, who became CEO in March, has moved to strengthen Intel’s financials and execution. The new agreement, Intel said, will help advance its US manufacturing and technology expansion plans.

However, Intel has also flagged potential risks from the US government’s equity stake. In an SEC filing, the company warned that foreign customers or governments might retaliate or reconsider partnerships, a concern given that 76% of Intel’s revenue comes from overseas, including 29% from China.

Analysts believe the government’s involvement is a double-edged sword. “It encourages partnerships from industry, and we think that this means a higher chance of orders and partnerships from American fabless companies to avoid tariffs,” Doug O’Laughlin, president of SemiAnalysis, a research and analysis firm specialising in the semiconductor and AI industries, told AIM.

He added that to win significant business, Intel must deliver on performance, cost, and time-to-market compared to TSMC, regardless of funding sources.

Intel’s Turnaround Gamble

After years of setbacks, Tan has aggressively retooled Intel’s strategy. He slashed thousands of jobs, cut costs, and reworked the troubled foundry plans pushed by former CEO Pat Gelsinger.

Notably, Intel is re-prioritising its process roadmap. According to recent reports, Tan is shifting focus from the 18A node to 14A for external customers. 

“The US needs Intel. The world needs Intel,” said Dylan Patel of SemiAnalysis in a recent interview, arguing that despite setbacks, Intel still holds a critical role in the semiconductor ecosystem. He added that while Intel lags TSMC, it remains ahead of Samsung in advanced process technology.

According to Patel, customers who have tested chips at both companies generally view Intel as “further along in 2nm-class process technology than Samsung”. That positions Intel as the clear second player behind TSMC.

Patel said that some investors have pushed for Intel to separate its design and foundry businesses, but he cautioned against it. “The process of splitting Intel would take so much executive time and effort that the company could go bankrupt in the process.”

Competition and Product Roadmap 

This government-industry alignment must be weighed against the global competitive landscape. Taiwan’s TSMC remains the undisputed leader. It is building a $65+ billion cluster of Arizona fabs (with CHIPS grants and loans) to produce 2nm chips for Apple, NVIDIA, AMD and others.

Meanwhile, Samsung scored a major win this month. Tesla announced it would manufacture its next-generation A16 chip at Samsung Foundry’s Taylor, Texas, facility. For Samsung, long trailing TSMC, this marks both a symbolic and practical boost in the US market.

Intel, by contrast, is racing to prove that 14A can attract external customers. Without landing a marquee client such as Apple, NVIDIA, or AMD, Intel has warned it may even exit the leading-edge foundry business if it fails to secure customers for 14A.

O’Laughlin said the focus on 14A is to listen to customers early in the design cycle and attract a major whale customer (e.g. mobile), as Intel’s internal products alone would not generate enough volumes to make up for the development cost.

On the other hand, Intel’s 18A node, powering its upcoming ‘Panther Lake’ client CPUs, is now positioned primarily as an internal ramp. The real bet for external customers comes with the next-gen 14A process, which Intel says will be competitive with TSMC’s leading-edge offerings.

Beyond CPUs, Intel is also expanding its GPU and accelerator lineup. In mid-2025, Intel unveiled the Arc Pro B60/B50 GPUs for workstation AI and launched Gaudi 3 AI accelerators for data-centre inference. 

Despite the product launches, technical challenges loom. Recent reports from Reuters indicate Intel’s 18A process has struggled with low yields on its first chips, meaning only a small fraction of manufactured dies are currently usable.

“Gaudi is not competitive, and Jaguar Shores remains questionable with an unproven software stack,” said analyst Gerald Wong from SemiAnalysis.

“Panther Lake only serves to maintain Intel’s leading position in the mobile PC space. If Intel can increase its margins from using 18A, that would help with attracting external customers,” he added.

Intel CFO David Zinsner expressed confidence in an interview that yields will improve by year-end, but many investors note that Intel’s turnaround depends on these next-generation nodes achieving volume viability. And crucially, foundry success hinges on landing big outside customers.

In summary, the government’s $8.9 billion equity investment is about more than balance sheets; it is about geopolitics and supply chain strategy. For Intel, it provides a bolstered balance sheet to fund fabs and R&D in the US while management tries to restore technical leadership and attract foundry clients.

The post Intel is ‘Too Big to Fail’ But Too Weak to Win appeared first on Analytics India Magazine.

]]>
The ARR Mirage https://analyticsindiamag.com/global-tech/the-arr-mirage/ Tue, 26 Aug 2025 05:35:24 +0000 https://analyticsindiamag.com/?p=10176701

Annual recurring revenue, once a straightforward measure of contractual commitments, has morphed into an ambiguous, and at times manipulated, metric in venture capital. 

The post The ARR Mirage appeared first on Analytics India Magazine.

]]>

When 11x raised $50 million at a $350 million valuation from venture capital firms Andreessen Horowitz and Benchmark, the US-based AI startup proudly touted nearly $10 million in annual recurring revenue (ARR). 

But months later, investigations revealed that the actual figure was closer to $3 million. The company, critics alleged, had been counting customers on short three-month trials as if they were locked into full-year contracts. 

While 11x disputed the findings of the investigation, this fiasco brought to light a deeper problem plaguing startup finance. 

ARR, once a straightforward measure of contractual commitments, has morphed into an ambiguous and at times manipulated metric in venture capital. 

As AI companies race to demonstrate explosive growth, the gap between reported ARR and business reality has never been wider. 

In this scenario, founders often lean on extrapolation, the villain in this story, taking a single month’s revenue and multiplying it by 12. 

True ARR represents contractual certainty—customers who have committed to paying for a whole year or more. Annualised revenue or annual run rate, by contrast, is mere mathematical wishful thinking, assuming that one month’s performance will repeat itself 12 times. 

When ARR is Meaningful and When It Isn’t 

Consider the mathematics: a startup that earns $100,000 in July can legitimately claim $1.2 million in annualised revenue. 

“As an investor, when we see someone say, ‘I have annualized $1.2 million of revenue’, versus a founder that says ‘I have $1.2 million in ARR’, I know immediately that the founder with $1.2 million annualised revenue does not have a 12-month contract,” Jay Krishnan, head of investments of India at Beyond Next Ventures, said in an interview with AIM.

“The likelihood that an early founder has an ARR is reasonably slim,” he added. 
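
The contrast is easy to pin down in code. Below is a minimal sketch with invented contract data, showing how the two figures diverge for the same business:

```python
from datetime import date

# Invented data: (annual contract value, start date) for 12-month contracts
annual_contracts = [
    (120_000, date(2025, 3, 1)),
    (60_000, date(2025, 7, 15)),
]
monthly_trial_revenue = 40_000  # this month's revenue from trials and month-to-month users
today = date(2025, 8, 1)

def contract_active(start: date, on: date) -> bool:
    """True if a 12-month contract starting at `start` is in force on `on`."""
    return start <= on <= date(start.year + 1, start.month, start.day)

# True ARR counts only committed, in-force annual contracts
arr = sum(acv for acv, start in annual_contracts if contract_active(start, today))

# Annualised run rate extrapolates one month's total revenue, commitments or not
this_month = sum(acv / 12 for acv, _ in annual_contracts) + monthly_trial_revenue
run_rate = this_month * 12

print(f"ARR:      ${arr:,.0f}")       # $180,000 of contractual certainty
print(f"Run rate: ${run_rate:,.0f}")  # $660,000 of extrapolation
```

Summing only the contracts in force at a snapshot date also deals with staggered start dates, one of the calculation headaches discussed below.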

The confusion is not limited to terminology. Joseph Johnson, founder of LedgerUp, told AIM, “Folks often include pilots or one-time fees in ARR, not because they mean to mislead investors, but just because they’re confused on what ARR is. It’s also challenging if you have a heavy usage-based business to share ARR, and you typically go with run rate.” 

Besides, even when used correctly, ARR may fail to paint the whole picture. A startup could boast seemingly high recurring revenue, yet still operate at a steep net loss if each customer is expensive to acquire.

Read More: AI Startups Depend on Costly APIs of Companies Burning Billions

Complicating matters further, founders often interpret ARR in their own way. “Founders want to calculate ARR in lots of different ways, and everyone treats it slightly uniquely,” Johnson said. Founders, especially in the software-as-a-service (SaaS) sector, face genuine mathematical challenges when calculating ARR. 

In the traditional mathematical sense, monthly recurring revenue (MRR) is simply multiplied by 12 to obtain ARR. 

“Let’s say you are a client of mine. I signed up with you in August. Tomorrow, in September, I will sign up with another customer. In October, I signed up with another customer. Each of these contracts has an MRR value, but they are staggered over time. Therefore, calculating ARR for a particular company becomes difficult. You can’t just sum up all three clients. That’s one challenge,” Krishnan explained. 

Besides, when customers complete a monthly contract and do not renew, multiplying MRR by 12 no longer yields a meaningful ARR.

Moreover, AI startups are billing customers by tokens, compute time, number of queries and more. “And these tend to go up or down based on utilisation, because even the clients are trying to figure out how to best use AI,” he added. 

Over the last few months, ARR has taken centre stage in some of the biggest startup growth stories. Yet, when calculated correctly, ARR is a good way to see the predictability and legacy of both customers and business, Krishnan argues. This is probably why it remains one of the most widely used metrics. 

Besides, for a startup that is only two months old and speaks of annualised revenue of, say, $1.2 million, the figure still matters. “Because in two months, he or she is able to crack 100k of revenue, and therefore, they’ve just used the word annualised to project forward. So it has to be taken in context,” he further said. 

If Not ARR, Then What? 

For investors seeking genuine signals amid the ARR noise, Krishnan advocates for a more comprehensive approach to evaluating startup health. Rather than fixating on revenue metrics alone, he emphasises looking at multiple characteristics that reveal the true trajectory of a business. 

He recommends examining the relationship between ARR and MRR. “ARR divided by MRR gives you an indication of growth,” he said. Yet, tracking the progression from bookings to billing to revenue is equally important. “If startups have a business development team that has done bookings, the finance team has done collection, and then there’s revenue. If that demonstrates a funnel, that means there’s growth.”

Customer retention, however, may be the single most telling metric. “Net revenue retention is a great metric because it shows upsell opportunity, which means that the customer is using the product and wants to use it more,” Johnson emphasised. He warns against superficial enterprise deals: “It’s dangerous when someone signs a large enterprise that has an AI budget but does not actually use the product enough.”

Krishnan points to gross revenue retention and net revenue retention as essential metrics, but reserves special attention for churn, which refers to the rate at which customers stop doing business with an entity.

“The golden metric for SaaS models for retention is churn, measured in percentages…The lower the churn, the better.” This focus on keeping customers reveals far more about product-market fit than inflated acquisition numbers.

For measuring customer engagement, he suggests examining daily-to-monthly active user ratios, feature adherence rates, and time spent in the product. “The more you launch features and the more the customer adheres to it, as opposed to launching features but nobody using them, that’s a good way to measure engagement.”

Besides, the efficiency metrics that matter most revolve around unit economics. Krishnan highlights the “golden ratio” of lifetime value to customer acquisition cost at 3:1.

“That gives you an indication of how much money you need to raise or at least a percentage of the money that you raise goes towards demonstrating that your efficiency gets to a point where LTV to CAC will flip from one or two to eventually get to three,” he added. 
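
The definitions behind these metrics are straightforward to compute; the sketch below uses invented cohort numbers purely to illustrate them:

```python
def net_revenue_retention(start_mrr, expansion, contraction, churned):
    """NRR: revenue kept plus upsell from an existing cohort, as a percentage."""
    return 100 * (start_mrr + expansion - contraction - churned) / start_mrr

def churn_rate(customers_start, customers_lost):
    """Logo churn: share of customers lost over the period, as a percentage."""
    return 100 * customers_lost / customers_start

def ltv_to_cac(avg_monthly_revenue, gross_margin, monthly_churn, cac):
    """Golden ratio: lifetime value versus cost to acquire; healthy is about 3:1."""
    lifetime_value = avg_monthly_revenue * gross_margin / monthly_churn
    return lifetime_value / cac

# Invented example cohort
print(net_revenue_retention(100_000, 15_000, 5_000, 8_000))  # 102.0: upsell offsets churn
print(churn_rate(200, 6))                                    # 3.0% monthly logo churn
print(ltv_to_cac(500, 0.8, 0.03, 4_000))                     # ~3.3, just above the 3:1 bar
```

The last line lands right around Krishnan's golden ratio of 3:1, the point at which acquisition spend starts paying for itself.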

This measurement problem requires discipline from founders and investors alike: clear definitions, honest reporting and a focus on the business fundamentals that ARR is meant to signify. Only then can the metric capture genuine, sustainable growth instead of hiding accounting tricks.

The post The ARR Mirage appeared first on Analytics India Magazine.

]]>
Why Google DeepMind CEO Demis Hassabis Can’t Sleep At Night https://analyticsindiamag.com/global-tech/why-google-deepmind-ceo-demis-hassabis-cant-sleep-at-night/ Mon, 25 Aug 2025 13:47:22 +0000 https://analyticsindiamag.com/?p=10176684

“The pace of progress inside the company has reached a level where the company is pretty much releasing something every day.”

The post Why Google DeepMind CEO Demis Hassabis Can’t Sleep At Night appeared first on Analytics India Magazine.

]]>

Google DeepMind may well be the most underrated AI lab today. In the past few weeks alone, the company has shipped more real-world breakthroughs than most AI companies manage in an entire year.

In just two weeks, DeepMind and Google have rolled out a wave of big AI launches, including Genie 3, Gemini 2.5 Pro, AlphaEarth and Aeneas, alongside tools such as Storybook, Kaggle Game Arena, Jules, AI Mode for Search in the UK, and NotebookLM Video Overviews. Their open model, Gemma, has already been downloaded over 200 million times.

“Now you know why I don’t get much sleep, too busy pushing the frontier,” wrote Google DeepMind chief Demis Hassabis in a post on X.

“DeepMind is my favourite lab. I love how they are not limited to research just about LLMs, but healthcare, Physics, robots, vision, world models, Biology, etc,” said an AI student on X.

Building Thinking Models 

In a recent podcast with Logan Kilpatrick, Hassabis said that the pace of progress inside the company has reached a level where the company is pretty much releasing something every day. “It’s hard to keep up, even internally.”

“The biggest achievement in the past two years is that Google and DeepMind now trust each other,” said Shixiang Shane Gu, senior staff research scientist at Google DeepMind, in a post on X. 

He further said that Google has committed to using AI to reimagine its core businesses and internal tools, and that DeepMind has proven it can deliver industrial-grade models beyond research.

As AI competition intensifies globally, DeepMind has positioned itself as both a research leader and a contributor to Google’s commercial AI offerings. Hassabis discussed the lab’s recent announcements, including Deep Think, the International Mathematical Olympiad (IMO) gold medal model and Genie 3

Hassabis further said that Deep Think draws on DeepMind’s long history with agent-based systems such as AlphaGo and AlphaZero. “We’ve always worked on systems that can complete a whole task,” he said. These systems now combine multimodal inputs with reasoning and planning capabilities.

“Once you have ‘thinking’, you can do ‘deep thinking’ or ‘extremely deep thinking’ and then have ‘parallel planning’,” he explained during the discussion. This allows AI to plan before producing output, especially for complex domains. “For things like Maths, coding, scientific problems and also gaming, you’re going to need to process and plan and basically do this thinking and not just output the first thing that the model comes up with.”
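
One common way to turn thinking into parallel planning is best-of-n sampling: draw several candidate solutions, score each with a verifier, and return the strongest. The sketch below uses stub generate and score functions; it illustrates the general pattern, not how Deep Think is implemented:

```python
import random


def generate(prompt: str, temperature: float) -> str:
    """Stub model call; a real system would sample an LLM at this temperature."""
    return f"candidate plan {random.randint(0, 999)} for: {prompt}"

def score(candidate: str) -> float:
    """Stub verifier; a real one might run tests, check a proof, or use an LLM judge."""
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    """Sample n candidate solutions, then keep the one the verifier rates highest."""
    candidates = [generate(prompt, temperature=0.9) for _ in range(n)]
    return max(candidates, key=score)

print(best_of_n("Prove the inequality holds for all n >= 1."))
```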

Genie 3 and World Model 

Genie 3, DeepMind’s demonstration of a world model, understands the structure and behaviour of the physical world, including solids, liquids and reflections. Hassabis said generating realistic worlds is one way to demonstrate the depth of understanding such a model can achieve.

Hassabis said the same technology is being applied in Scalable Instructable Multiworld Agent (SIMA), a simulated agent that can take control of and play existing computer games. This, he explained, can produce unlimited training data for robotics and artificial general intelligence research. He also said that simulation-based training could have potential applications in other domains where physical testing is expensive or slow.

Deedy Das of Menlo Ventures told AIM that DeepMind’s recent work on Genie 3 and generative environments was one of the most incredible tech demos he had ever seen in his life.

Regarding the model’s applications in robotics, he added, “It’s still too early to know whether that will work. Startups continue to operate across different layers of the robotics stack—from training environments and data labelling to software, intelligence, and hardware—with several attempting to tackle multiple parts of it.”

According to Hassabis, work is also moving towards an omni model capable of performing a wide range of tasks. “We’re starting to see convergence of those models together into what we call an omni model, which can do everything,” he said, adding that the goal is for an AGI system to handle all tasks at the same quality as today’s specialised models, but within a single, unified model.

Setting New Standards for AI Evaluation

On evaluation, Hassabis said there is a need for new benchmarks that go beyond current tests, which mostly measure performance on specific, static tasks. DeepMind’s Game Arena is one such attempt, with games that evolve in complexity and are unique to each match to prevent overfitting.

“Evaluation is an unsolved problem,” Hassabis said. “You need to be able to measure reasoning, planning, memory and physical intelligence, not just pattern matching.”

He also discussed the importance of safety benchmarks. These would test for behaviours such as deception or manipulation, ensuring systems remain reliable when deployed in real-world environments.

Anticipating Technology Shifts

On the topic of integrating tools into AI systems, Hassabis said the decision depends on impact. “It’s very much an empirical question. Does adding that capability help the other capabilities? If it does, then do it. If it harms the other general capabilities, then maybe consider using it as a tool.”

Hassabis said that product managers and designers must think ahead about where technology will be when a product launches. “You’ve got to be really close and understand the technology world to intercept where that technology will be in a year’s time,” he said.

He noted that DeepMind has crossed the milestone of processing more than a quadrillion tokens each month. While not a direct measure of capability, it signals the scale at which the company is now operating.

What Google Could Do Better

Das said that he genuinely believes that Google DeepMind is one of the “least evil” and most engineering-centric companies in the world. “They push the frontier of innovation, have tremendous results on difficult problems with no clear revenue, they make AI models cheaper, they’ve built the best video model by far.”

However, he sees gaps in how Google ships and shares innovation. “They launch blog posts often without a product users can play with, and I feel strongly this should just be reprimanded by the entire tech community,” he said. “We’re in 2025. Tech companies should ship products, not words. If it’s not ready, ship nothing.”

He also pointed to speed, decision-making and ecosystem engagement. “They move slower than I’d like,” Das said. “They hesitate to make bold decisions while trying to appeal to everyone and maintain their brand image…They are less entrenched in the external ecosystem, and I wish they would partner more closely with startups and other companies where they can do experimental stuff.”

Sometimes, he noted, the culture can be insular. “Their researchers are more Google-minded than they are tech-minded because of how high the metaphorical walls around the city of Google are.”

Das suggested that Google release a larger open source model than Gemma. “People would love it,” he said. He also believes Google’s marketing and product packaging could be improved.

Citing Google’s Veo 3 video model launch as an example, he described the friction in accessing it. “It was packaged in an unrelated product called Flow and you had to click 10 things to get to it,” Das said. 

The post Why Google DeepMind CEO Demis Hassabis Can’t Sleep At Night appeared first on Analytics India Magazine.

]]>
Big Tech Brings Back In-Person Interviews For Technical Rounds  https://analyticsindiamag.com/global-tech/big-tech-brings-back-in-person-interviews-for-technical-rounds/ Sat, 23 Aug 2025 06:09:15 +0000 https://analyticsindiamag.com/?p=10176601

“AI makes hiring faster, but humans make it more meaningful.”

The post Big Tech Brings Back In-Person Interviews For Technical Rounds  appeared first on Analytics India Magazine.

]]>

Remote hiring, once the norm during the pandemic, is now facing new challenges as candidates increasingly turn to AI for support in interviews. The concern has become widespread enough that big global firms are responding. Companies like Google, Cisco, and McKinsey are reinstating in-person rounds to evaluate applicants on their own merit, without reliance on AI tools.

However, to strike a balance, some organisations are experimenting with hybrid approaches. Microsoft, for instance, allows candidates to use AI tools during specific stages of the recruitment process, such as take-home projects, while reserving in-person sessions to evaluate unassisted skills.

According to Puneet Chandok, president of Microsoft India and South Asia, the company is integrating AI across its recruitment and HR processes while ensuring human involvement at every stage.

“AI helps in the process with the human in the loop always,” Chandok told AIM in an exclusive interview. “Our recruitment processes…are using AI where required to complement and augment capabilities, but always keep a human in the loop.”

On whether candidates themselves are allowed to use AI during the hiring process, Chandok confirmed there are no restrictions. He linked this approach to Microsoft’s emphasis on AI fluency. 

“You need a workforce that is fluent in AI, that knows how to work with the tools and build these new agents,” Chandok said. “Every time I’m looking at hiring and interviewing people, I go really deep into the AI skills that they bring to the table, because that’s what you need.”

Chandok also clarified that interviews are not entirely conducted in person. “It’s a combination. We’re using AI as part of all our processes, including hiring, and always with the human in the loop,” he said.

Meanwhile, earlier in June, Google CEO Sundar Pichai said on the Lex Fridman podcast that the company is reintroducing at least one round of in-person interviews, particularly for engineering and programming roles. The move comes in response to the widespread use of AI tools by candidates during virtual technical interviews, where more than half have been suspected of using AI to cheat in real-time coding assessments.

Pichai explained that in-person assessments are essential to confirm whether candidates truly possess the computer science fundamentals and technical depth required for the role, something that AI-assisted virtual interviews make harder to evaluate. He added that this hybrid approach will also give candidates a clearer sense of Google’s culture while offering interviewers a more reliable way to verify technical competencies.

Why Tech Giants are Returning to Face-to-Face

An NVIDIA engineer told AIM that in-person interviews are becoming increasingly important in the age of AI, as subtle cues often reveal when candidates are relying on external help. He noted that behaviours such as avoiding eye contact with the camera or constantly shifting their gaze are clear signs. He also pointed out that tools like CoderPad are being used to conduct technical interviews.

CoderPad is an online platform that allows candidates to write, execute and debug code in real time while being observed by interviewers. Often described as a “virtual whiteboard for coding”, it has become one of the most widely used tools for live programming assessments.

Similarly, Ankush Sabharwal, CEO of Corover.ai, told AIM that at their startup, they ensure at least one of the rounds is in-person. “Face-to-face interactions provide invaluable insights that technology can’t replicate. Micro expressions and body language reveal a wealth of information about a person, which humans are biologically trained to pick up on,” he said. 

Citing examples of instances where he realised candidates were misusing AI to cheat, he said, “I’ve noticed situations where candidates almost read out AI-generated answers. The responses are polished and bookish but lack depth, spontaneity and a personal touch.”

Sabharwal observed that many candidates struggle to think on their feet. When asked to expand or think aloud, they often falter. He noted that while AI can provide quick information, it cannot replace genuine knowledge, skills, vision and lived experience.

However, he clarified that the company still uses AI for the hiring process. “We use AI to handle repetitive tasks like resume parsing, skill matching and preliminary assessments. Once a candidate passes that stage, I believe in meaningful conversations, whether in person or over video, where we focus on cultural fit, problem-solving ability, stress-handling and authenticity. AI makes hiring faster, but humans make it more meaningful.”

A Rebalancing Act

Notably, not all firms discourage AI in interviews. One candidate on Reddit recalled, “I was told I could use anything—Google, ChatGPT, etc—just had to keep it on the shared screen. I explained my logic, used ChatGPT to get the right line of code, pasted it, made a few changes, tested it, and it worked from all angles.”

The return of in-person interviews does not signal a rejection of AI altogether. Most companies continue to use automation and AI to screen resumes or conduct early-stage assessments. However, final decisions are increasingly shifting back into physical spaces where authenticity is harder to fake.

Several AI startups are also offering services across the hiring process.

In a previous interaction with AIM, Rahul Veerwal, CEO and founder of GetWork, said the company eliminates the first three steps of traditional hiring, which usually consume around 80% of the time spent on recruitment. These steps include matching resumes to job descriptions, conducting initial calls to gather missing data such as salary expectations or shift preferences and assessing soft skills like communication.
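
To make that automation concrete, the resume-to-JD matching step such platforms handle first can be approximated with classic text similarity. The sketch below uses TF-IDF cosine similarity over made-up resumes; it illustrates the general technique, not GetWork's actual pipeline.

```python
# A minimal sketch of resume-to-JD matching via TF-IDF cosine similarity.
# Illustrative only; a production system would use learned embeddings and
# structured fields (skills, experience, salary expectations).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

job_description = "Backend engineer with Python, distributed systems, Kubernetes."
resumes = [
    "Five years building Python microservices on Kubernetes.",
    "Graphic designer skilled in Figma and branding.",
    "Distributed systems researcher with Go and Python experience.",
]

vectoriser = TfidfVectorizer(stop_words="english")
matrix = vectoriser.fit_transform([job_description] + resumes)

# Similarity of the job description (row 0) against every resume.
scores = cosine_similarity(matrix[0], matrix[1:])[0]

for resume, score in sorted(zip(resumes, scores), key=lambda p: -p[1]):
    print(f"{score:.2f}  {resume}")
```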

For candidates, the message is clear. Using AI as a preparation tool may help polish responses, but relying on it during interviews can be counterproductive. Recruiters are getting better at spotting overly rehearsed or machine-like answers. What stands out now is the ability to think on one’s feet, build trust and engage naturally in conversation.

India Gives Siemens an Upper Hand in the Digital Twin Race https://analyticsindiamag.com/global-tech/india-gives-siemens-an-upper-hand-in-the-digital-twin-race/ Tue, 19 Aug 2025 05:19:52 +0000 https://analyticsindiamag.com/?p=10176138

“The largest concentration of people we have is in India.”

Siemens Digital Industries Software is positioning India as the cornerstone of its global expansion, with the country now hosting the German industrial giant's largest workforce concentration outside its home base. According to the company, the move comes as its comprehensive digital twin strategy forces competitors into billion-dollar acquisition sprees to keep pace.

Addressing the Siemens Realize LIVE event, company executives revealed that India houses the firm's biggest employee base globally, with close to 10,000 software professionals across various divisions. This represents a significant portion of the company's 28,000-strong Digital Industries Software workforce.

The India focus coincides with Siemens racing ahead of its own cloud transformation timeline, with executives disclosing that 50% of business will likely shift to cloud-based revenue by year-end, a full year ahead of schedule.

Cloud Numbers Tell the Story

Robert Jones, chief revenue officer at Siemens Digital Industries Software, shared striking metrics that suggest the company’s Software-as-a-Service pivot is gaining serious momentum. “We have over 23,000 customers that adopted the hybrid SaaS capability [since launching the cloud strategy in fiscal 2021],” he said. 

“Existing customers adopted the hybrid SaaS so they could do some basic data management collaboration…and that’s continuing to grow.”

The transformation appears to be democratising advanced engineering tools. Brenda Discher, CMO of Siemens Digital Industries Software, revealed that 87% of customers transitioning to Siemens' X cloud portfolio are small and medium enterprises. This shift is particularly evident in India and Asia, where numerous smaller companies are adopting technology that was previously accessible only to larger organisations.

Perhaps more impressive are the 8,000 enterprise cloud accounts representing over two lakh active users on the company's X solutions portfolio. "That, to me, is just tremendous," noted Joe Bohman, executive vice president of PLM Products at Siemens Digital Industries Software. He added that many customers switched from competitive products specifically owing to Siemens' cloud offering.

Indian Success Stories and Competitive Pressure

Indian companies are already showcasing the potential of Siemens’ integrated approach.

Ather Energy, a leading electric two-wheeler brand, used the full Siemens suite to bring its new model to market in just 17 months, a record time for the company.

Tony Hemmelgarn, president and CEO of Siemens Digital Industries Software, said, "The largest concentration of people we have is in India. We have a broad presence here, including all of our EDA, electronics design automation."

Other notable success stories include Sedemac, a tier-one auto supplier that achieved a 50% increase in workflow efficiency. Meanwhile, companies like Skyroot Aerospace, a manufacturer and commercial launch service provider for small satellites, are leveraging Siemens' full lifecycle coverage, from design to testing.

The results haven't escaped competitors' notice. Hemmelgarn pointed to Synopsys' recent $35 billion acquisition of software firm Ansys as evidence that rivals are scrambling to match Siemens' capabilities. "One knows they're doing something right in the comprehensive digital twin when the competition is trying to copy what they're doing," he remarked.

However, Hemmelgarn claimed such acquisitions may be insufficient. Notably, Synopsys and Ansys, even combined, cannot perform manufacturing simulation or create 3D models, gaps that leave their digital twin offering incomplete compared to Siemens' comprehensive approach.

Acquisition Strategy and AI Integration

Siemens' aggressive acquisition strategy, totalling 36 or 37 deals during Hemmelgarn's tenure, is designed to build what the company calls its "digital currency". Recent additions include DownStream Technologies for printed circuit board design and Wevolver for supply chain intelligence.

The acquisition of Altair Engineering stands out as particularly strategic for India. Siemens invested $10 billion in Altair to add advanced simulation, high-performance computing (HPC), data science and AI capabilities to strengthen its industrial software portfolio.

“Right now, Altair is the leader in computer-aided engineering (CAE) here in India as well as China,” Hemmelgarn noted. The deal filled critical gaps in CAE capabilities, especially nonlinear simulation for sudden impact scenarios.

Looking ahead, Siemens is betting heavily on AI integration across its platform. The company’s approach leverages vast data repositories through Teamcenter and manufacturing systems. 

Hemmelgarn illustrated the potential: “Now, [with AI in place], a new engineer coming into the company has the entire knowledge and the history of anybody that ever machined a part in that company.”

This AI integration, combined with cloud accessibility, could give Siemens a significant advantage over competitors still relying on traditional desktop solutions. The strategy appears to be working—the company expects to hit 50% cloud-based revenue well ahead of its original timeline, whilst competitors rush to acquire capabilities Siemens has been building organically for years.

As manufacturing digitalisation accelerates globally, Siemens’ India-centric approach and comprehensive digital twin strategy likely position the company to capitalise on both local growth and the worldwide shift toward integrated industrial software solutions.

Peeking Into Microsoft Research India’s AI Experiments https://analyticsindiamag.com/global-tech/peeking-into-microsoft-research-indias-ai-experiments/ Mon, 18 Aug 2025 14:40:58 +0000 https://analyticsindiamag.com/?p=10176117

While contributing to IndiaAI Mission, the tech giant is also devising solutions to societal challenges in India.

Many people believe that big tech companies in India primarily hire for coding roles, but Microsoft Research India challenges this assumption. The lab is working on generative AI projects covering foundational technologies, industry applications, and solutions to societal challenges in India.

“A common thread that runs through our work is an open, collaborative style where we partner with academia, government, NGOs, and more and share our work through open publication and open-source software,” said Dr Venkat Padmanabhan, managing director, Microsoft Research India, in an exclusive interview with AIM.

Building the foundations for generative AI

Padmanabhan shared that Microsoft Research is currently pursuing projects in retrieval and energy-efficient AI deployment. One of them is ReFoRM, which aims to improve retrieval and retrieval-augmented generation (RAG) for chat, search, and recommendation applications, especially over private enterprise data.

“ReFoRM’s model seeks to deliver step-function gains in retrieval accuracy over all existing retrieval models akin to what LLMs have delivered for reasoning, while keeping the cost low,” he said. 

By comparison, Google integrates RAG into its Gemini AI and Vertex AI RAG Engine platforms for enterprise customers, combining large language models with advanced retrieval capabilities.

Meanwhile, social media giant Meta uses RAG to improve search and recommendation systems through large-scale embedding models and retrieval pipelines. Last year, the company introduced the RAFT (retrieval-augmented fine-tuning) approach, which combines RAG with supervised fine-tuning to improve domain-specific adaptation of language models.
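
For readers unfamiliar with the pattern, the core of any RAG system is the retrieval step: embed documents and the query as vectors, rank by similarity, and feed the top matches to the model as context. The minimal sketch below uses a toy hash-based embedding purely for illustration; production systems like those described above rely on learned retrieval models.

```python
import numpy as np

# Toy corpus standing in for private enterprise documents.
DOCS = [
    "Quarterly revenue grew 12% on cloud subscriptions.",
    "The HR policy allows four weeks of parental leave.",
    "Data centre cooling costs fell after the retrofit.",
]

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Stand-in embedding: hash words into a fixed-size vector.
    A real system would call a learned retrieval model here."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

def retrieve(query: str, k: int = 2) -> list:
    """Rank documents by cosine similarity to the query embedding."""
    q = embed(query)
    scores = np.stack([embed(d) for d in DOCS]) @ q  # normalised, so dot = cosine
    top = np.argsort(scores)[::-1][:k]
    return [DOCS[i] for i in top]

# The retrieved passages become the context prepended to the LLM prompt.
context = retrieve("How much did cloud revenue grow?")
print("Answer using only this context:\n" + "\n".join(context))
```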

Alongside ReFoRM, Microsoft Research India is exploring sustainability challenges through Greenferencing, which combines renewable energy with intelligent load distribution across micro data centres. Padmanabhan believes such initiatives can make AI deployments both energy-aware and cost-efficient.
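
Microsoft has not published implementation details for Greenferencing here, but the core scheduling idea can be sketched: route each inference job to the micro data centre with the most spare renewable headroom. Everything in the snippet below, site names and numbers included, is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class MicroDC:
    name: str
    renewable_kw: float  # renewable power currently available
    load_kw: float       # power already committed to running jobs
    capacity_kw: float   # total site capacity

def greenest_site(sites: list, job_kw: float) -> MicroDC:
    """Route a job to the site with the most spare renewable headroom.
    Illustrative only; a real scheduler would also weigh latency,
    data locality and carbon-intensity forecasts."""
    feasible = [s for s in sites if s.load_kw + job_kw <= s.capacity_kw]
    return max(feasible, key=lambda s: s.renewable_kw - s.load_kw)

sites = [
    MicroDC("hyderabad-solar", renewable_kw=80, load_kw=50, capacity_kw=120),
    MicroDC("pune-wind", renewable_kw=60, load_kw=20, capacity_kw=100),
]
print(greenest_site(sites, job_kw=15).name)  # -> pune-wind (40 kW headroom)
```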

Complementing these initiatives, Microsoft Research India has been developing foundational systems that enable the next generation of AI applications.

One example is the DiskANN project on approximate nearest neighbour search (ANNS), initiated in 2018. It enables generative AI tools to embed documents, images, and user queries as high-dimensional vectors and find the most relevant matches efficiently. 

“DiskANN represents the state-of-the-art ANNS algorithm, able to serve an index of trillions of vectors at high quality and a fraction of the cost possible otherwise,” Padmanabhan said. He shared that the technology has influenced industry solutions, including Cassandra and Pinecone.

DiskANN is not alone, however. Meta's FAISS has become the default open-source choice for similarity search, driving recommendation engines and vector databases across industries.

On the other hand, Google ScaNN is built for efficiency, with a strong focus on speed and accuracy. Unlike FAISS, which supports both exact and approximate search, ScaNN is dedicated solely to approximate nearest neighbour search, making it well-suited for scenarios where speed is critical.
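
The trade-off all three systems navigate is the same: exact search scans every vector, while approximate search restricts the scan to a few learned clusters. A minimal FAISS example, with random vectors standing in for real embeddings, shows both modes:

```python
import numpy as np
import faiss  # pip install faiss-cpu

d, n = 128, 100_000
rng = np.random.default_rng(0)
xb = rng.random((n, d), dtype=np.float32)  # database vectors
xq = rng.random((5, d), dtype=np.float32)  # query vectors

# Exact search: compares every query against all n vectors.
flat = faiss.IndexFlatL2(d)
flat.add(xb)
D_exact, I_exact = flat.search(xq, 5)

# Approximate search: partition the space into cells, then
# probe only a handful of cells per query.
quantizer = faiss.IndexFlatL2(d)
ivf = faiss.IndexIVFFlat(quantizer, d, 256)
ivf.train(xb)   # learn the coarse partitioning
ivf.add(xb)
ivf.nprobe = 8  # more probes = higher recall, slower search
D_approx, I_approx = ivf.search(xq, 5)

print(I_exact[0], I_approx[0])  # neighbour lists usually overlap heavily
```

Raising nprobe trades speed back for recall; graph-based systems such as DiskANN push the same trade-off further by keeping the index on SSD so it can grow well beyond available RAM.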

Padmanabhan shared that Microsoft Research India is also working on generative AI for software coding, with two research papers published in this area. The first, CodePlan, combines LLMs with formal methods for program analysis to handle tasks that require coordinated changes across large codebases. 

The second, Code Researcher, conducts deep research on large system codebases and commit histories to generate solutions such as crash-preventing patches for the Linux kernel. “The research underlying both these systems has been published openly, sparking further research and development in industry and academia,” he said.

Collaborating for national impact

Microsoft Research India is contributing to IndiaAI’s mission to skill 500,000 individuals by 2026, including students, educators, women entrepreneurs, and government officials. “Our research solutions have already been piloted in partnership with NGOs and various state governments. We hope that working with the IndiaAI Mission would enable us to roll these out nationwide,” Padmanabhan said.

Earlier this year, Microsoft partnered with Yotta Data Services to accelerate AI adoption in India by integrating Microsoft Azure AI services with Yotta’s Shakti Cloud platform. 

Yet, despite these initiatives, startups under the IndiaAI Mission have not launched an LLM built from scratch. Meanwhile, Google also announced support for IndiaAI startups at its recent Google I/O event in Bengaluru.

Besides the startups selected under the IndiaAI Mission, Google is also working with others, including CoRover, Glance, Entri, InVideo, Nykaa, Dashverse, and Toonsutra.

The tech giant reiterated its support for India’s AI startup ecosystem through programmes like the Google for Startups Accelerator (GFSA), which has supported over 230 Indian startups. 

On the other hand, Microsoft Research India has undertaken extensive benchmarking of LLMs across Indian languages through initiatives like MEGA, Pariksha, and Updesh. 

Pariksha involved workers from Karya, an ethical data company, marking the first time rural Indians were directly involved in AI evaluation pipelines. “Karya workers contributed to evaluation and benchmark creation, bringing a wider set of perspectives into AI development,” said Padmanabhan.

The real question is whether Microsoft is truly solving India's problems or just keeping pace with other global players. Google has been building Indian language datasets through Project Vaani, while Meta has released open-source LLMs like Llama 4, which supports 12 languages natively, including Hindi, offering full capabilities such as translation, understanding cultural nuances, and generating grammatically correct text.

Meanwhile, Microsoft’s research has focused on on-ground deployment, such as teacher training, healthcare assistants, and rural workforce involvement in evaluation.

Padmanabhan said that Shiksha Copilot, piloted in 2024 with 1,000 government schoolteachers in Karnataka, is now scaling to 8,000 teachers in Karnataka and Telangana. 

In healthcare, expert-in-the-loop AI assistants operate across hospitals and rural health systems in four states, reaching over 2,500 patients and 3,000 community health workers with multilingual, expert-verified guidance.

“Our approach is to see how AI can work in partnership with humans, with AI helping humans scale and humans helping ensure the accuracy of AI-generated answers,” said Padmanabhan. The lab also launched A4I, a partnership with Microsoft Philanthropies and IIIT Bangalore, to build open-source AI tools as digital public goods.

Microsoft Research India’s work shows that AI research can move beyond the lab and into spaces where it solves pressing, real-world problems.
