
March 26, 2026

Stop Buying AI Tools. Start Fixing Your Data.

The tool is not the constraint. Your data is.

The Seductive Logic of "Better Tools"

When AI produces bad output, the instinct is to blame the tool and try a different one. This instinct is wrong 90% of the time.

The tool is not the variable. The data going into the tool is the variable. Switching from GPT-4o to Claude does not fix fragmented data architecture. Moving from one analytics platform to another does not fix the fact that your operational data lives in three systems that have never been connected. A better AI layer on top of bad data produces better-sounding bad output.

Here is what nobody in the AI tool market wants you to realize: every product in the space is competing for credit that belongs to your data infrastructure. When a tool works, it is usually because the underlying data is clean and accessible. When it doesn't, the tool takes the blame — and gets replaced with another tool that will fail for the same reason.

The AI tool vendor benefits from this confusion. If you believe the problem is the tool, you keep buying tools. If you figure out the problem is your data foundation, you stop being a customer for the next three to four years while you fix it. The vendor's incentive is to let you blame the tool.

Gartner estimates poor data quality costs organizations an average of $12.9 million per year — bad decisions made on bad data, labor spent correcting errors, and analysis that simply cannot be done because the data isn't in usable form. Every dollar of that cost is a dollar that a better AI tool cannot fix.

The Three Signs Your Data Is Not AI-Ready

You do not need a formal audit. Three questions will tell you.

Can you answer a basic operational question without pulling from multiple systems? Pick a question you get asked regularly: What was last week's production volume by line? What is our current inventory position by SKU? What was our on-time delivery rate for the last 30 days? If answering it requires exporting from one system, cross-referencing another, and manually reconciling the differences — your data is not ready.

Does your data live partially in spreadsheets that only one person knows how to navigate? Spreadsheets are where data goes to become inaccessible. They accumulate formulas nobody remembers writing, tabs referenced from other files, structures that made sense to the builder and nobody else. If the answer to any operational question runs through a spreadsheet, you have a fragility problem.

When someone leaves, does knowledge of where data lives leave with them? If the institutional knowledge of your data architecture sits in one person's head, their departure takes it with them. The data still exists, but nobody left behind knows where it is or how to interpret it.

What "Fixing Your Data" Actually Means

People hear "fix your data" and imagine a multi-year governance initiative with a team of data engineers and a seven-figure software contract. That is not what this means.

A fair warning first: this advice assumes your operational systems have usable APIs or can export data on a predictable schedule. Many older ERP systems, vendor-locked platforms, and legacy tools do not. If that describes your situation, the path is longer — you may need to solve the data extraction problem before the centralization problem. For those businesses, the right first step is auditing what you can actually access programmatically, not choosing which database to centralize into.

For everyone else, fixing your data means three concrete things.

Centralize operational data into one database you own and control. Not a vendor platform, not a SaaS dashboard. A database — Postgres via Supabase is a reasonable starting point — that you own, can query directly, can connect to anything. Pick the ten to fifteen data points you use to run the business. Get them in.
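The centralization step is small enough to sketch in code. This is a minimal illustration, not a prescription: SQLite stands in for the Postgres database the text suggests, and the table name, columns, and values are hypothetical examples of the ten to fifteen data points you might pick.

```python
import sqlite3

# Stand-in for a Postgres database you own; the schema below is a
# hypothetical example of "pick your core data points and get them in."
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE production_runs (
        run_date TEXT NOT NULL,     -- ISO date, e.g. '2026-03-26'
        line     TEXT NOT NULL,     -- production line identifier
        shift    TEXT NOT NULL,     -- 'day' / 'swing' / 'night'
        units    INTEGER NOT NULL,  -- units produced
        source   TEXT NOT NULL      -- which upstream system this row came from
    )
""")
conn.execute(
    "INSERT INTO production_runs VALUES (?, ?, ?, ?, ?)",
    ("2026-03-26", "line-1", "day", 1240, "mes_export"),
)
conn.commit()
count = conn.execute("SELECT COUNT(*) FROM production_runs").fetchone()[0]
print(count)  # → 1
```

The `source` column is the one design choice worth copying: when rows carry their system of origin, reconciling the three systems that were never connected becomes a query instead of a spreadsheet exercise.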

Automate data ingestion so the database stays current without manual intervention. Most operational software has an API or can export on a schedule. Connect those exports to your database. For most modern systems, this is a day or two per data source.
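A scheduled ingestion step can be as plain as the sketch below, which assumes the source system emits a CSV export. The column names (`date`, `line`, `shift`, `units`) are hypothetical; map them to whatever your export actually contains.

```python
import csv
import io
import sqlite3

# SQLite stands in for the central database; in practice this function
# would run on a schedule against each system's export.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE production_runs "
    "(run_date TEXT, line TEXT, shift TEXT, units INTEGER, source TEXT)"
)

def ingest_export(conn, csv_text, source):
    """Parse one scheduled CSV export and append it to the central table."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (r["date"], r["line"], r["shift"], int(r["units"]), source)
        for r in reader
    ]
    conn.executemany(
        "INSERT INTO production_runs VALUES (?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)

sample = (
    "date,line,shift,units\n"
    "2026-03-25,line-1,day,1180\n"
    "2026-03-25,line-1,night,640\n"
)
loaded = ingest_export(conn, sample, "mes_export")
print(loaded)  # → 2
```

A cron job or scheduled cloud function calling this once per export cycle is the whole "day or two per data source" the text describes for a cooperative modern system.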

Build two or three queries that answer the questions you ask most often. Not sophisticated. "Give me the last 30 days of production by shift" is not sophisticated. But having it run in ten seconds against current data, without opening three systems — that changes how you operate.
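The query itself really is unsophisticated. Here it is run against a toy table, with SQLite again standing in for Postgres and the rows invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE production_runs (run_date TEXT, shift TEXT, units INTEGER)"
)
conn.executescript("""
    INSERT INTO production_runs VALUES (date('now', '-1 day'),   'day',   500);
    INSERT INTO production_runs VALUES (date('now', '-3 days'),  'day',   700);
    INSERT INTO production_runs VALUES (date('now', '-2 days'),  'night', 300);
    INSERT INTO production_runs VALUES (date('now', '-40 days'), 'day',   999);
""")

# "Give me the last 30 days of production by shift" -- a GROUP BY
# and a date filter, nothing more.
query = """
    SELECT shift, SUM(units) AS total_units
    FROM production_runs
    WHERE run_date >= date('now', '-30 days')
    GROUP BY shift
    ORDER BY shift
"""
for shift, total in conn.execute(query):
    print(shift, total)
# → day 1200
# → night 300
```

Note the 40-day-old row is correctly excluded. Once data sits in one queryable place, answers like this cost seconds, not an afternoon of exports.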

This is a four-to-eight week project for most mid-size operations businesses. It does not require a data engineering team. It requires someone with basic SQL knowledge, access to your operational systems, and a few weeks of focused work.

The Counterintuitive ROI

A business that spends $20,000 fixing its data infrastructure and zero on AI tools will outperform a business that spends zero on infrastructure and $20,000 on AI tools.

The business with clean infrastructure can answer operational questions in seconds. It can spot trends because the data is in one place and queryable. It can build automated reporting with a few lines of SQL. When it adds AI, the AI works — because it has clean, structured, accessible data to work with. The results are good on the first try.

The business that bought tools first is still troubleshooting. Inconsistent output. Nobody certain whether the problem is the tool, the integration, or the data. The answer is almost always the data. But because money was spent on the tool, the instinct is to keep making the tool work rather than fix the actual problem.

Data infrastructure is unsexy. There is no demo that shows it off. No vendor is selling it to you as the key to competitive advantage. But it is the foundation that determines whether every AI investment that follows it works or doesn't.

The Right Sequence for the Next 90 Days

Month 1: List every system holding operational data relevant to running the business. Identify the five to ten most important data points. Stand up a database. Get those data points into it — even manually at first. Goal: data in one place you control.

Month 2: Automate the data feeds. Connect APIs, parse exports, schedule updates. Goal: the database refreshes itself without manual intervention.

Month 3: Add one AI layer. One report, one alert system, one natural-language query interface. Not five. One. Build it, calibrate it, confirm it works reliably.
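As a sense of scale for "one alert system": the plumbing underneath is a single query and a single threshold. The table, columns, and threshold below are hypothetical, and SQLite again stands in for the central database.

```python
import sqlite3

ALERT_FLOOR = 0.95  # hypothetical on-time delivery rate you expect to hold

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deliveries (ship_date TEXT, on_time INTEGER)")
# Illustrative data: 8 on-time and 2 late deliveries in the last week.
conn.executemany(
    "INSERT INTO deliveries VALUES (date('now', ?), ?)",
    [("-1 day", 1)] * 8 + [("-2 days", 0)] * 2,
)

def check_on_time_rate(conn):
    """One question, one threshold: did last week's rate fall below the floor?"""
    (rate,) = conn.execute(
        "SELECT AVG(on_time) FROM deliveries "
        "WHERE ship_date >= date('now', '-7 days')"
    ).fetchone()
    if rate is not None and rate < ALERT_FLOOR:
        return f"ALERT: on-time rate {rate:.0%} is below {ALERT_FLOOR:.0%}"
    return None  # quiet when the metric is healthy

print(check_on_time_rate(conn))  # → ALERT: on-time rate 80% is below 95%
```

Swap `print` for whatever notification channel you already use. Any AI layer — a natural-language interface, an LLM-drafted summary — sits on top of queries like this; it does not replace them.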

At the end of Month 3, you have something most organizations never achieve: a data foundation that actually supports AI, and one working AI application on top of it. That foundation will support every AI investment you make after it.

Fix the data. The tools are the easy part.
