I ship production AI. And I've run the enterprise that lives with it.
Most people with opinions on AI have never shipped a model. Most executives who ran technology at scale stopped writing code years ago. I do both. 18+ years owning enterprise P&L, ERP modernizations, and multi-site cloud migrations, now building production AI: local LLMs, agentic orchestration, RAG, and the pipelines that feed real systems. That intersection is the whole point.
A two-model AI pipeline, shipped to production.
Stampede Distribution sells industrial safety gear. Their product data arrived raw from an ERP and from 26 different vendor sites, no two formatted alike, every description duplicated across the web. I built a Cloudflare-native enrichment pipeline that reads the source, extracts structured data, and rewrites every listing for search uniqueness, with hard rules against inventing a single compliance claim. It runs on owned infrastructure. No third-party AI bill, no data leaving the account.
Stage 1 · Extraction
Gemma 3 12B, with a Qwen 2.5 Coder 32B fallback. Reads up to 25KB of cleaned vendor HTML, emits structured JSON: title, specs, features, classified documents (MSDS, spec sheets, compliance PDFs), part numbers, ANSI standards. A quality gate refuses to write anything if the model did not return real content.
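A minimal sketch of how a primary/fallback extraction with a quality gate could be wired, shown in Python for illustration only. The names `extract`, `has_real_content`, and `ModelCall` are hypothetical, and the gate criteria (a non-empty title plus at least one spec or feature) are an assumption, not the pipeline's actual rules.

```python
from typing import Callable, Optional

# Hypothetical type: a model call takes cleaned HTML and returns parsed JSON as a dict.
ModelCall = Callable[[str], dict]

MAX_INPUT_BYTES = 25 * 1024  # the "up to 25KB" input cap described above


def has_real_content(result: dict) -> bool:
    """Quality gate (assumed criteria): non-empty title plus some specs or features."""
    title = str(result.get("title") or "").strip()
    return bool(title) and bool(result.get("specs") or result.get("features"))


def extract(html: str, primary: ModelCall, fallback: ModelCall) -> Optional[dict]:
    """Try the primary model, then the fallback; return None if neither passes the gate."""
    # Clip by bytes, dropping any partial multi-byte character at the cut.
    clipped = html.encode("utf-8")[:MAX_INPUT_BYTES].decode("utf-8", errors="ignore")
    for call in (primary, fallback):
        try:
            result = call(clipped)
        except Exception:
            continue  # timeout or parse error: move on to the next model
        if isinstance(result, dict) and has_real_content(result):
            return result
    return None  # nothing passed the gate, so the caller writes nothing
```

Returning `None` instead of a partial record keeps the "refuse to write" decision in one place: the caller only persists a result it actually received.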
Stage 2 · SEO rewrite
The same model pair rewrites vendor-verbatim copy for uniqueness at temperature 0.1, while preserving every technical fact. Hard rule: never invent an ANSI, OSHA, EPA, or CE claim. On a parse failure it degrades to the vendor copy rather than guessing.
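The degrade-on-parse-failure behavior can be sketched roughly as follows. This is an illustration under assumptions: `generate` is a hypothetical callable standing in for the model call, and the `description` key is an assumed output shape, not the pipeline's documented schema.

```python
import json
from typing import Callable

REWRITE_TEMPERATURE = 0.1  # the temperature stated above


def rewrite_or_keep(vendor_copy: str, generate: Callable[[str, float], str]) -> dict:
    """Return the model rewrite if it parses cleanly, else the untouched vendor copy.

    `generate` is hypothetical: (input_text, temperature) -> raw model output,
    expected to be a JSON object with a "description" string.
    """
    try:
        raw = generate(vendor_copy, REWRITE_TEMPERATURE)
        parsed = json.loads(raw)
        text = parsed["description"]
        if not isinstance(text, str) or not text.strip():
            raise ValueError("empty rewrite")
        return {"description": text.strip(), "source": "model"}
    except (KeyError, TypeError, ValueError):
        # json.JSONDecodeError is a ValueError subclass, so malformed JSON lands here.
        return {"description": vendor_copy, "source": "vendor"}
```

Recording a `source` field alongside the text is one way to make the fallback visible later, for example in an audit trail.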
- Dual-model fallback with a documented swap history: dropped Llama 3.3 70B for timeouts and Llama 4 Scout for parse failures.
- Idempotent re-runs guarded by an enriched_at timestamp. Safe to re-run, safe to force.
- No-hallucinated-compliance rule enforced in the prompt. Regulatory claims are never fabricated.
- Graceful degradation to vendor copy on failure, plus a full enrichment audit trail.
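As a rough illustration of the idempotent re-run guard in the list above, the sketch below skips any record that already carries an `enriched_at` timestamp unless a force flag is set. Only the `enriched_at` field name comes from the text; `enrich_once`, the `force` parameter, and the dict-based record are assumptions made for the example.

```python
from datetime import datetime, timezone
from typing import Callable


def enrich_once(product: dict, enrich: Callable[[dict], dict], force: bool = False) -> bool:
    """Run enrichment at most once per product unless forced.

    Returns True if enrichment ran, False if the record was skipped.
    """
    if product.get("enriched_at") and not force:
        return False  # already enriched: re-running is a no-op
    product.update(enrich(product))
    # Stamp the record so the next run can skip it.
    product["enriched_at"] = datetime.now(timezone.utc).isoformat()
    return True
```

Because the timestamp is written only after `enrich` returns, a run that fails partway leaves the record unstamped and eligible for the next pass.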
Four ways to put AI to work.
Production AI integration
RAG, agentic orchestration, and data-enrichment pipelines wired into systems you already run. Built to ship, measured with eval harnesses, not slideware.
Proof: Stampede pipeline + Team-X
Local-first, sovereign AI
Own the model and the data. On-prem or in your own cloud account, no third-party AI bill, no data leaving your perimeter, no vendor that can change the terms on you.
Proof: Vision Studio + Agent-X
AI strategy for operators
Where AI actually pays versus where it is theater. Honest scoping from someone who has owned the P&L and the SLA, plus the benchmark harnesses that keep a system honest after launch.
Proof: 18+ years operating at scale
Cloud repatriation and cost
The math on what you actually spend renting compute, and a path back to owned infrastructure where it pays. Real break-even analysis, not a migration for its own sake.
Proof: The Cloud Repatriation Reckoning
Don't take my word. Run it.
Three production AI systems, all MIT-licensed and public. Fork them, read them, run them locally.
AI is infrastructure to own, not rent.
When you rent your AI from a closed provider, you rent the terms too. They decide what counts as a violation, they change the pricing, and your data lives in someone else's stack. The systems I build run on infrastructure you own. I write about why, with the numbers behind it.
Can you build AI that runs on our own infrastructure?
Yes. That is the default. The Stampede pipeline runs entirely on the client's own Cloudflare account: their models, their database, their assets. On-prem and your-own-cloud deployments are both standard. No data leaves your perimeter and there is no per-call bill to a third-party AI vendor.
How do you keep a model from hallucinating in a regulated workflow?
Hard rules in the prompt plus a quality gate in code. In the Stampede build the model is forbidden from inventing any ANSI, OSHA, EPA, or CE claim, and the pipeline refuses to write a result that lacks real extracted content. On a parse failure it degrades to the verified source rather than guessing.
What does production AI actually cost versus a SaaS subscription?
It depends on volume, but owned inference removes the per-seat and per-call meter entirely. The Stampede pilot enriched 308 products across 616 model calls at no third-party AI cost. For steady workloads, owning the compute usually beats renting it past a break-even point I will calculate for your case.
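The break-even idea can be sketched as simple arithmetic: divide the upfront cost of owned compute by the monthly saving over renting. This is a simplified model with hypothetical parameter names; it ignores depreciation, staffing, and utilization, which a real analysis would include. The only figures taken from the text are the pilot's 308 products and 616 model calls.

```python
import math
from typing import Optional


def break_even_months(upfront_cost: float, rented_monthly: float,
                      owned_monthly: float) -> Optional[int]:
    """Months until owned compute has paid for itself, or None if it never does.

    All three inputs are assumptions the reader supplies for their own case.
    """
    monthly_saving = rented_monthly - owned_monthly
    if monthly_saving <= 0:
        return None  # owning costs as much or more each month: no break-even
    return math.ceil(upfront_cost / monthly_saving)


# From the pilot figures above: 616 model calls across 308 products.
calls_per_product = 616 / 308  # 2.0, i.e. one call per pipeline stage
```

A `None` result is the honest answer for a workload where renting is cheaper every month, which is why the function does not force a number.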
Do you only do greenfield AI, or integration with existing systems?
Mostly integration. The hard part of enterprise AI is rarely the model; it is the data and the systems around it. I have spent 18+ years inside ERPs, migrations, and infrastructure, so the work plugs into what you already run instead of replacing it.
Who actually does the work?
I do. The person who scopes the engagement is the person who writes the code and ships it. No handoff to junior staff, no account managers in between.
Let's build something you own.
No theater. No rented stack you cannot inspect. A direct conversation about what to build, what it costs, and how to ship it.
