In 2025, AI‑assisted tooling will let a single operator prototype most department‑level data pipelines at low cost in 8–12 weeks, provided they enforce basic governance and testing. This does not replace enterprise systems. It unlocks the unapproved projects in your backlog and converts domain experts into credible builders.
Scope & Guardrails (what is out of bounds)
There are always limits on where a new technology should be used, and this is no different: the tooling is not yet ready for complex, mission-critical work, and it only earns its place when basic safety and testing guardrails are actually implemented and followed.
When system uptime and reliability must meet strict service-level agreements of 99.99% or higher, this approach isn't there yet. The same goes for work subject to SOX, HIPAA, GDPR, PCI-DSS, or data sovereignty restrictions: off-the-shelf solutions aren't yet suitable.
We can get fit-for-purpose solutions at the team and function level, but treating the outputs as a company-wide source of truth is still a work in progress, one that becomes more realistic as AI models and the tooling and agents around them improve.
John Part 1: Then (2010-2015)
Between 2010 and 2015, I built and led an engineering and operations team responsible for developing a full-scale data warehouse for an inventory and supply chain management company that serviced the manufacturing sector in the United States, Mexico and Costa Rica. What we were doing at the time was ambitious and necessary, but it came with real cost.
Our company installed and managed tooling inventory inside manufacturing plants on a real-time basis for our clients. We operated VMI (Vendor Managed Inventory) programs across 35+ unique locations, each running different versions of our proprietary inventory management software. Our task was to build a centralized data warehouse that would aggregate real-time inventory telemetry from these distributed sites, each generating 10-15GB of transactional data daily, and integrate it with our distribution center inventory levels and pricing data from our ERP system.
We started with on-premise SQL Server 2008 R2 clusters, implementing ETL processes through custom SSIS packages that ran on scheduled intervals. The architecture required building staging databases for each location, transformation layers for data normalization, and a star schema warehouse optimized for OLAP queries. Later, we migrated to Microsoft Azure, leveraging Azure SQL Database and blob storage for raw data archival. Everything we built was anchored in Microsoft SQL Server Reporting Services (SSRS), with over 150 custom reports ranging from real-time inventory dashboards to predictive reorder analytics.
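For readers less familiar with the pattern, here is a minimal sketch of what a star schema looks like in practice. It is illustrative only: the table and column names are hypothetical, and the real warehouse ran on SQL Server with SSIS, not the in-memory SQLite used here for brevity.

```python
# Minimal illustration of a star schema of the kind described above.
# Table and column names are hypothetical; the real warehouse ran on
# SQL Server, not the in-memory SQLite used here for brevity.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
-- Dimension tables hold descriptive attributes.
CREATE TABLE dim_location (
    location_id INTEGER PRIMARY KEY,
    plant_name  TEXT,
    country     TEXT
);

CREATE TABLE dim_item (
    item_id     INTEGER PRIMARY KEY,
    sku         TEXT,
    description TEXT
);

-- The fact table stores one row per inventory transaction and references
-- the dimensions by key, which keeps OLAP-style queries fast and simple.
CREATE TABLE fact_inventory_txn (
    txn_id      INTEGER PRIMARY KEY,
    location_id INTEGER REFERENCES dim_location(location_id),
    item_id     INTEGER REFERENCES dim_item(item_id),
    txn_date    TEXT,
    qty_on_hand INTEGER,
    unit_cost   REAL
);
""")

# A typical rollup the reporting layer would run: on-hand quantity by country and SKU.
cur.execute("""
SELECT l.country, i.sku, SUM(f.qty_on_hand) AS total_on_hand
FROM fact_inventory_txn AS f
JOIN dim_location AS l ON l.location_id = f.location_id
JOIN dim_item     AS i ON i.item_id = f.item_id
GROUP BY l.country, i.sku
""")
print(cur.fetchall())
```

The point of the shape is that each fact row carries only keys and measures, while descriptive attributes live in small dimension tables, which is what keeps OLAP rollups fast and the reporting layer simple.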
I was an Ops Analyst, recently turned Director of Operations, and had visibility into every layer of what we were building. The initiative involved a core team of three developers (two backend SQL developers and one .NET developer for integration APIs) and an IT director, but we also relied heavily on consultants, particularly for the Azure migration and performance tuning. By the time the system was stable and delivering insights, we had spent close to $500,000, about $200K in infrastructure (servers, licensing, Azure consumption) and $300K in labor.
The project took over two years, not because the team moved slowly, but because it required a staggering amount of manual data cleansing and organizational alignment. Our systems suffered from years of technical debt: we discovered over 40,000 duplicate SKUs across locations, pricing discrepancies affecting 30% of our catalog, and cost data that hadn't been updated since 2008 in some instances. We also had to realign processes and retrain teams on how new information would be entered into our systems going forward. The data quality issues were so severe that we had to build a custom .NET intranet application with workflow automation for our internal teams to manage master data governance, implement approval chains for pricing changes, and reduce downstream errors through input validation and business rules engines.
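To make the governance problem concrete, here is a small sketch of the kind of duplicate-detection and input-validation rules that system enforced. The field names, rules, and thresholds are invented for illustration; the actual implementation was a custom .NET intranet application with workflow and approval chains, not a Python script.

```python
# Illustrative sketch of the master-data checks described above. The field
# names, rules, and thresholds are hypothetical; the real system was a custom
# .NET intranet application with workflow and approval chains, not this script.
from dataclasses import dataclass

@dataclass
class SkuRecord:
    sku: str
    description: str
    unit_cost: float
    last_cost_update: str  # ISO date string, e.g. "2014-06-01"

def find_duplicate_skus(records: list[SkuRecord]) -> dict[str, int]:
    """Return normalized SKUs that appear more than once across locations."""
    counts: dict[str, int] = {}
    for r in records:
        key = r.sku.strip().upper()
        counts[key] = counts.get(key, 0) + 1
    return {sku: n for sku, n in counts.items() if n > 1}

def validate_record(r: SkuRecord, stale_before: str = "2012-01-01") -> list[str]:
    """Input-validation rules of the sort an approval workflow might enforce."""
    errors = []
    if not r.sku.strip():
        errors.append("missing SKU")
    if r.unit_cost <= 0:
        errors.append("non-positive unit cost")
    if r.last_cost_update < stale_before:  # ISO dates compare correctly as strings
        errors.append(f"cost not updated since {r.last_cost_update}")
    return errors
```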
That level of effort was par for the course at the time. There were no plug-and-play tools like Fivetran or Stitch for data integration, no automated pipelines like Apache Airflow, and certainly no AI-powered data quality tools. Every transformation required hand-coded SQL procedures, every integration needed custom development, and data lineage tracking was managed through extensive documentation rather than automated tools.
Brennan Part 2: Now (2024-25)
I needed to analyse regulatory data from Asia-Pacific clearinghouses for my industry newsletter Global Custody Pro. The normal path would have required a team, budget approval, and months of development. My path was different: Claude Code, working part-time over three months, for under $1,000 in subscriptions.
My first attempt with VS Code and Roo Code was a complete failure. When Claude Code launched unlimited usage, I tried again. I built incrementally between other work, sometimes making an hour of progress, sometimes losing full days to debugging. The cycle was constant: build, test, break, fix, repeat.
Twelve weeks later, I had a working data pipeline with the tests and transformations needed to process regulatory data. Backfilled historical data is now flowing for 46 FMIs (financial market infrastructures), with repeatable processes in place for integrating more as time allows. The system supports Jupyter notebooks and data analysis that would have been impractical for me to build alone without AI assistance.
Here’s the truth nobody mentions: most of my project time is spent debugging AI mistakes. Claude Code confidently generates broken solutions. You validate and test constantly, or you ship disasters. It’s painful, but you learn a lot from each solved problem.
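To give a flavour of that loop, here is a small, hypothetical example of the transform-and-test pattern it settles into. None of this is taken from the actual Global Custody Pro pipeline; the function, field names, and expected values are invented for illustration.

```python
# A sketch of the transform-and-test loop described above. The function,
# field names, and expected values are hypothetical, not taken from the
# Global Custody Pro pipeline.
import pytest

def normalise_settlement_value(raw: dict) -> dict:
    """Convert a raw clearinghouse disclosure row into a consistent shape."""
    return {
        "fmi": raw["fmi_name"].strip(),
        "period": raw["reporting_period"],
        # Disclosures often report values in millions; store the full figure.
        "settlement_value": float(raw["value_millions"]) * 1_000_000,
    }

def test_normalise_settlement_value():
    raw = {
        "fmi_name": " Example CCP ",
        "reporting_period": "2024-Q4",
        "value_millions": "12.5",
    }
    out = normalise_settlement_value(raw)
    assert out["fmi"] == "Example CCP"
    assert out["settlement_value"] == 12_500_000.0

def test_missing_value_fails_loudly():
    # AI-generated transforms tend to fail silently on edge cases,
    # so the tests have to cover the gaps explicitly.
    with pytest.raises(KeyError):
        normalise_settlement_value({"fmi_name": "Example CCP", "reporting_period": "2024-Q4"})
```

Each bug Claude Code introduces tends to turn into another test like these, which is tedious, but it is also how the pipeline earns trust.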
However, what matters is that I’m a project manager who began as a business analyst, not a developer or data engineer, yet I successfully built a functional data pipeline. Years of working with developers and navigating projects to production releases in regulated environments taught me to recognise when something seems wrong. I can’t write the fix, but I know when something’s off and the sort of questions that surface the problems. That’s enough.
Working with Claude Code is like working with a junior engineer who lacks the full business context. But unlike a junior, there's no friction between idea and implementation, just constant feedback loops. That lack of friction carries risk, but it also makes clear why ways of working in technology face a revolution over the next few years.
Part 3: What It Means
When I look at what my co-author, Brennan, built in three months with Claude Code and off-the-shelf cloud tools, it is stunning. Work that once required months of stakeholder buy-in, planning cycles, and budget approvals can now be launched from a laptop, then iterated by one person. Tools like dbt, Snowflake, and Claude Code have stripped friction out of experimentation, and usage-based pricing lets you test without a large upfront commitment.
The most important shift is not only cost, it is capability. Platforms such as Databricks and Tableau, combined with AI, create real technical leverage for non-technical teammates. People who do not think of themselves as engineers can now assemble workflows, applications, and pipelines that once demanded full in-house technical teams. That changes what it means to be a builder inside an organization.
Some fundamentals have not changed. Data integrity and quality management still decide outcomes. No tool, no matter how smart, can make up for weak structure, poor stewardship, or missing validation.
I used to think of data infrastructure as something you build. Today, it is something you orchestrate. AI can scaffold architectures, suggest improvements, write tests, and tune pipelines, but only if you can steer the process with clarity.
This reframes what we expect from operators. Technical literacy matters, but the most valuable operators over the next five years will be those who can:
Ask the right questions
Design for usability and rapid feedback
Collaborate with AI to handle the heavy lifting
Translate business needs into modular workflows
Build repeatable systems that scale with minimal human input
High-quality data, and its power to shape decisions, is no longer a privilege of scale. If you know what “good” looks like, and you can work across business and technical boundaries, you can ship real results, even as a team of one.
Brennan: John’s team spent two years and $500,000 to process inventory data from 35 locations. I spent three months part-time processing regulatory data from 46 FMIs and building repeatable processes to backfill and analyse more data over time.
It’s not entirely fair to compare the needs of John’s system and what I’ve built so far. These AI-enabled data pipelines are not yet ready for full enterprise workloads. Mine requires a lot of accommodation for the quirks of Claude Code. Still, it delivers valuable insights, and over time, will become a valuable resource for data-backed articles as I backfill more historical data.
The revolution isn’t replacing enterprise systems. It’s that the hundreds of data projects in your backlog, the ones that will never get approved, can now be built by the people who need them. Some will be disasters, but every incremental data project that enables better insights or reveals systemic problems adds value.
Conclusion
Your competitors can now build data infrastructure for many use cases more quickly and cost-effectively than they could five years ago. They can prototype in weeks what takes you months to approve. Every month you wait, the gap widens. Domain experts at other companies are building their own tools while you wait for IT resources. They’re testing with real data while you’re still in planning meetings.
The decision is not whether to use AI-assisted development; it is whether you will lead the transition or justify a delay. Your next data project does not need a capital request; it needs an owner who understands the problem and will commit three months of focused execution.
Optionality has improved; absolution has not. You still need disciplined testing and informed domain judgment to confirm that a solution actually works. What has changed is the economics: iteration is inexpensive, prototypes are fast to build, and the $500,000, two-year data warehouse build is a relic.
Brennan McDonald: After 12 years in financial services technology and change roles, Brennan McDonald writes about the human side of AI transformation each week on his Substack. He also writes a newsletter covering the global custody industry at Global Custody Pro.
John Brewton documents the history and future of operating companies at Operating by John Brewton. He is a graduate of Harvard University and began his career as a PhD student in economics at the University of Chicago. Since selling his family’s B2B industrial distribution company in 2021, he has been helping business owners, founders and investors optimize their operations. He is the founder of 6A East Partners, a research and advisory firm asking the question: What is the future of companies? He still cringes at his early LinkedIn posts and loves making content each and every day, despite the occasional protestations of his beloved wife, Fabiola.