By the end of 2025, at least 30% of generative AI initiatives will not progress beyond the proof-of-concept stage, according to a forecast by Gartner. The firm attributes this to a mix of issues: poor data quality, inadequate risk controls, escalating costs and unclear business value.
At the Data Engineering Summit 2025, Sandesh Gawande, CEO of iceDQ, posed a question: If data teams have been tracking quality for over two decades, why do AI projects still keep failing? With a background in mechanical engineering and a fondness for factory metaphors, Gawande unpacked a problem many in the room had likely experienced but perhaps never framed in those terms.
He argued that the core issue is not that organisations fail to measure data quality. It’s that they measure it too late. By the time dashboards detect poor accuracy, missing values, or inconsistencies, the damage is already done. AI models, like factories, depend on reliable inputs and stable processes. When either falters, the entire operation can collapse.
The Factory Model
The failures are not theoretical. Gawande pointed to TSB Bank’s IT breakdown, where overlooked data testing triggered a cascade of quality issues, fines nearing £50 million, and a direct loss of £330 million. He also noted the sobering twist that the project’s CIO was personally fined for the oversight.
Other sectors haven’t fared much better. The Titan submersible imploded without ever being certified for its operating depth. Along similar lines, a Boeing aircraft lost a door plug mid-flight.
Gawande stressed that, in both cases, the problem wasn’t a lack of monitoring; it was that the monitoring only began after the systems had already launched.
Gawande’s framing of data pipelines as “data factories” helped underline what’s missing in current practice. He described three distinct phases: assembling the system, running the processes, and inspecting the output. Most organisations focus on the third, where traditional data quality checks occur. However, this, he said, is like waiting until a car rolls off the line to inspect whether the factory was calibrated.
“By the time you are measuring, it’s too little, too late,” he said. “What you’re measuring is the output generated by your system.”
He believes the real problem is that assessment takes place only after the system has already failed. He likened it to being alerted to a failure while already on a sinking ship: by that point, crisis management is the only option left.
Instead, data reliability starts in the development phase. He raised questions like: Are the pipelines tested before they run? Are developers writing unit and integration tests and maintaining proper version control? Are tools embedded into the system that can flag process failures before they manifest as data issues? If these steps are skipped, even the best-designed dashboards won’t save the day.
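To make the idea concrete, here is a minimal sketch of the kind of pre-deployment data test he described, assuming a hypothetical clean_orders transformation written in Python with pandas and run under pytest. It is illustrative only, not a reference implementation of any iceDQ feature.

```python
# A minimal sketch of pre-deployment pipeline testing (illustrative only).
# `clean_orders` is a hypothetical transformation; the checks mirror the kind
# of unit tests that should run before a pipeline ever goes live.
import pandas as pd


def clean_orders(raw: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical transformation: drop rows missing an order_id, then de-duplicate."""
    return raw.dropna(subset=["order_id"]).drop_duplicates(subset=["order_id"])


def test_clean_orders_removes_nulls_and_duplicates():
    raw = pd.DataFrame({"order_id": [1, 1, None, 2], "amount": [10, 10, 5, 7]})
    result = clean_orders(raw)
    assert result["order_id"].notna().all()   # no missing keys reach production
    assert result["order_id"].is_unique       # no duplicate orders slip through
    assert len(result) == 2                   # only valid rows survive
```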
The second phase is production monitoring, where data flows in from various sources, vendors, and systems. At this stage, reliability depends on the ability to detect malformed files, schema changes, and other failures, many of which go unnoticed.
A pipeline might run twice or not at all, or an empty file might be loaded without an alert. According to Gawande, these are not rare exceptions; they are common operational pitfalls that demand active data monitoring, not post-mortem reports.
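As an illustration of what such checks might look like in code, a pre-load validation could reject exactly these failure modes before bad data enters the pipeline. The file paths, expected columns, and log file below are assumptions made for the example, not anything Gawande or iceDQ prescribed.

```python
# Illustrative pre-load checks for the failure modes described above:
# empty files, duplicate runs, and schema drift.
import os
import pandas as pd

EXPECTED_COLUMNS = {"order_id", "product_id", "ship_date"}  # assumed schema
PROCESSED_LOG = "processed_files.log"                       # assumed run ledger


def validate_incoming_file(path: str) -> pd.DataFrame:
    # 1. Reject empty files instead of silently loading nothing.
    if os.path.getsize(path) == 0:
        raise ValueError(f"Empty file received: {path}")

    # 2. Refuse to process the same file twice.
    processed = set()
    if os.path.exists(PROCESSED_LOG):
        with open(PROCESSED_LOG) as log:
            processed = set(log.read().splitlines())
    if path in processed:
        raise ValueError(f"File already processed: {path}")

    # 3. Detect schema drift before the data reaches downstream tables.
    df = pd.read_csv(path)
    missing = EXPECTED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")

    # 4. Record the file only after every check has passed.
    with open(PROCESSED_LOG, "a") as log:
        log.write(path + "\n")
    return df
```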
Redefining Reliability for the AI Era
Most organisations claim to be managing data quality in the third phase, when outputs from the data factory are inspected. However, by then, as Gawande repeatedly emphasised, it’s often too late. Worse, these late-stage measures tend to dominate data governance conversations, skewing the focus away from prevention and towards patchwork fixes.
The difference between quality and reliability, as he put it, lies in time. “Quality is an instance in time. Reliability is the delivery of that quality consistently over time.”
To drive the point home, he offered a battlefield analogy: “Would you go to war with a high-quality sword that might break after two strikes or a reliable one that does not break?”
He noted that the sword might have looked great in tests, but if it shatters mid-battle, it betrays the user. The same holds true for brittle data pipelines.
This shift requires a different cultural mindset. Quality assurance teams must move beyond UI testing and learn to validate pipeline logic and input integrity, and to build monitoring into systems from the earliest phases of development.
He highlighted that 80% of data defects can be prevented during development, and another 15% during operations. That leaves less than 5% to be caught through traditional data quality checks and data observability.
Business users must participate in defining rules, such as “a product cannot be shipped without an order”, and ensure those rules are embedded directly into the workflow. Systems must be designed to stop when such business rules are violated.
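A hedged sketch of what embedding such a rule might look like in a Python pipeline follows; the table and column names are hypothetical, and the point is simply that the rule halts the load the moment it is violated rather than surfacing later in a report.

```python
# Illustrative enforcement of the rule "a product cannot be shipped without an order".
import pandas as pd


class BusinessRuleViolation(Exception):
    """Raised to stop the pipeline when a business rule is broken."""


def enforce_shipment_rule(shipments: pd.DataFrame, orders: pd.DataFrame) -> pd.DataFrame:
    # Flag shipments that reference no existing order.
    orphaned = ~shipments["order_id"].isin(orders["order_id"])
    if orphaned.any():
        raise BusinessRuleViolation(
            f"{int(orphaned.sum())} shipment(s) reference no existing order; load aborted."
        )
    return shipments  # only rule-compliant data continues downstream
```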
None of this, he said, happens by accident. “Bad quality is an accident. Reliable quality is by design.”
He referenced the Bhagavad Gita to underline the approach: focus on the process, and the outcome will follow. Good data reliability requires this triad—people, process, and tools—to be aligned from the start.
A critical element is a cultural shift, with management encouraging sound engineering practices and investing in training for business users, developers, testers, and data pipeline engineers.
Ultimately, establishing a sound Data Reliability Engineering (DRE) practice is essential for delivering reliable data: quality data maintained consistently over time.