
I Built an AI Business That Tried to Lie to My Customers

The Build Log — entry one. The honest story of how my automated audit service started inventing facts, how I caught it, and what it taught me about handing the wheel to an AI.

I run an AI service business. Most days I don't touch it. Agents discover leads, scan businesses, write reports, and answer email while I'm doing something else. That's the dream everyone's selling you right now: set up the agents, walk away, collect the money.

Here's the part nobody sells you. About a month before I planned to take payments, I read one of the reports my system was about to send a real local business. It said the company's name was a Google search URL. It listed "Apple Maps" as the name of a second business. And the system had given every single audit the same score, 70 out of 100, no matter what the business actually looked like.

My AI wasn't broken in the way software usually breaks. It didn't crash. It didn't throw an error. It confidently produced professional-looking, completely fabricated reports and was fully prepared to email them to paying customers. If I'd trusted the automation the way the gurus tell you to, I'd have charged people for lies.

This is the first entry in what I'm calling the Build Log, because I think the failures are more useful than the wins. Let me walk you through this one.

The situation, honestly

The business is simple. It scans a local company's online presence — Google, Yelp, the usual directories — finds what's broken or inconsistent, and sends back a plain-English fix-it plan. No passwords, no account access, just the same public information any customer sees when they search for you.

I'll call the first scanning engine version one. It worked. It produced reports. The reports looked great. And past a certain point, almost everything in them was made up.

What I tried

Version one leaned heavily on scraping — pulling pages from search engines and directories and pattern-matching for the business name, the rating, the website. That's a reasonable-sounding plan. It's also a trap, and here's the mechanism, because the mechanism is the lesson.

When you scrape a search results page and ask "what's the business website?", the code grabs the first web address it sees. Sometimes that's the real website. Often it's the search URL itself — the address of the page you're looking at. So the report cheerfully recorded the business's website as google.com/search?q=.... For a directory called Apple Maps, it recorded the business name as, literally, "Apple Maps," because that was the most prominent text on the page.
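
If you do read code, here is roughly what that mistake looks like. This is a minimal sketch, not the actual version-one code: the page snippet, the business, and the function name naive_website are all made up for illustration.

```python
import re
from typing import Optional

# Hypothetical snippet of a search results page. The real markup is not
# shown in this post; this only mimics "search link first, real site second".
SEARCH_PAGE = """
<a href="https://www.google.com/search?q=joes+pizza+springfield">Joe's Pizza - Search</a>
<a href="https://joespizza.example.com/">Joe's Pizza | Official Site</a>
"""

URL_PATTERN = re.compile(r'https?://[^\s"<>]+')


def naive_website(html: str) -> Optional[str]:
    """Return the first URL found on the page. This is the bug."""
    match = URL_PATTERN.search(html)
    return match.group(0) if match else None


website = naive_website(SEARCH_PAGE)
print(website)  # the search URL, not the business's own site
```

The function does exactly what it was told. Nothing errors, and the wrong answer is still a perfectly valid URL, which is why it can slide into a report unnoticed.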

And the score? Version one had a scoring quirk that, through a chain of default values, landed on 70 almost every time. Every business in town was apparently a C+. A coincidence that thorough is never a coincidence — it's a bug wearing a confident smile.
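
I won't reproduce the real scoring code here, but a chain of defaults can collapse to one number along these lines. Everything in this sketch is hypothetical (the weights, the default values, the field names). It only shows the shape of the bug, plus a cheap batch check that would flag it.

```python
# Hypothetical defaults. Each one looks harmless on its own.
DEFAULT_RATING = 3.5       # used when no rating could be parsed
DEFAULT_LISTINGS_OK = 0.7  # used when the directory checks returned nothing usable


def score(audit: dict) -> int:
    """Return a 0-100 score; missing inputs silently fall back to defaults."""
    rating = audit.get("rating") or DEFAULT_RATING
    listings_ok = audit.get("listings_ok") or DEFAULT_LISTINGS_OK
    return round(50 * (rating / 5) + 50 * listings_ok)


def all_scores_identical(scores: list) -> bool:
    """Cheap smoke test: a batch of different businesses should not all tie."""
    return len(scores) > 1 and len(set(scores)) == 1


# Three different businesses whose data failed to parse (hypothetical inputs).
batch = [{}, {"rating": None}, {"listings_ok": None}]
scores = [score(audit) for audit in batch]
print(scores, all_scores_identical(scores))
```

When every input is missing, both defaults kick in and the arithmetic lands on 70 each time. A one-line check across the batch is enough to notice that something upstream has stopped supplying real data.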

What broke, and why it's dangerous

Here's what I want you to sit with, especially if you're about to point AI at some part of your own business.

The AI was never uncertain. It didn't flag these as guesses. It didn't say "low confidence." It produced clean, formatted, authoritative reports with fabricated facts presented exactly the same way as the true ones. A language model's job is to produce plausible output, and it is extremely good at its job. Plausible and true are not the same thing, and the model does not feel the difference. You have to build the difference in from the outside.

This is the single most important thing I've learned running AI in a real business: AI fails confidently. Regular software falls over and shows you a red error. AI stays standing, smiles, and hands you something wrong that looks right. The failure doesn't announce itself. You have to go looking for it.

The fix

Two changes, and the second one matters more than the first.

First, I had Riker rebuild the audit engine to stop guessing. Version two pulls the business name, rating, and website from official data sources that return structured, labeled information — "here is the name, here is the website" — instead of scraping a page and hoping the right text is in the right spot. Garbage in stops at the door.
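
For the code readers, the idea of "stop guessing" can be sketched like this. The field names and the payload shape are assumptions for illustration; I'm not naming a specific data source or its real response format here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BusinessRecord:
    name: Optional[str]
    website: Optional[str]
    rating: Optional[float]


def from_structured(payload: dict) -> BusinessRecord:
    """Read labeled fields only. A missing field stays None; nothing is guessed."""
    rating = payload.get("rating")
    return BusinessRecord(
        name=payload.get("name") or None,
        website=payload.get("website") or None,
        rating=float(rating) if rating is not None else None,
    )


# Hypothetical payload with no website listed.
record = from_structured({"name": "Joe's Pizza", "rating": 4.6})
print(record)
```

The design choice that matters is the None. A blank field is an honest "we don't know," and the rest of the pipeline can treat it that way instead of inheriting a plausible-looking guess.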

Second, and this is the part I'd been skipping, I had Riker put a verifier in front of every report. Before anything goes to a customer, a separate check reads the report and asks: does this make sense? Is the "website" actually a website, or is it a search link? Is the "name" a real business name, or the name of a platform? Do the numbers add up to the score? If anything smells wrong, the report is blocked and never sent. It fails closed: when in doubt, nothing ships.
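
Here is a tiny sketch of what a fail-closed gate like that can look like. It is not the real six-hundred-line verifier: the report fields, the platform and search-host lists, and the function names are all hypothetical, chosen only to mirror the three questions above.

```python
from typing import Callable, List
from urllib.parse import urlparse

# Hypothetical lists; a real verifier would need far more thorough ones.
PLATFORM_NAMES = {"apple maps", "google", "google maps", "yelp"}
SEARCH_HOSTS = {"google.com", "www.google.com", "bing.com", "www.bing.com"}


def verify_report(report: dict) -> List[str]:
    """Return a list of problems. An empty list means the report may ship."""
    problems = []
    name = (report.get("name") or "").strip()
    if not name or name.lower() in PLATFORM_NAMES or "://" in name:
        problems.append("name is missing, a platform name, or a URL")
    parsed = urlparse(report.get("website") or "")
    host = (parsed.hostname or "").lower()
    if parsed.scheme not in ("http", "https") or not host:
        problems.append("website is not a valid http(s) URL")
    elif host in SEARCH_HOSTS and parsed.path.startswith("/search"):
        problems.append("website is a search link")
    parts = report.get("score_parts")
    if not parts or sum(parts.values()) != report.get("score"):
        problems.append("score does not match its parts")
    return problems


def send_if_verified(report: dict, send: Callable[[dict], None]) -> bool:
    """Fail closed: any problem, or any crash in the verifier, blocks sending."""
    try:
        problems = verify_report(report)
    except Exception as exc:
        problems = [f"verifier crashed: {exc}"]
    if problems:
        return False
    send(report)
    return True


sent = []
bad = {"name": "Apple Maps", "website": "https://www.google.com/search?q=joes+pizza", "score": 70}
good = {"name": "Joe's Pizza", "website": "https://joespizza.example.com/",
        "score": 82, "score_parts": {"listings": 40, "reviews": 42}}
print(send_if_verified(bad, sent.append), send_if_verified(good, sent.append))
```

Note the try/except around the verifier itself. If the checker crashes, that counts as a failed check, so a bug in the guardrail blocks the report instead of waving it through.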

The embarrassing footnote: Riker had actually written that verifier weeks earlier. Six hundred lines of careful checking. And I'd never connected it to anything. It sat there, fully built, while broken reports sailed right past it. The guardrail existed; I just hadn't bolted it to the road.

The lesson, even if you'll never write a line of code

You don't need to understand scraping or APIs to take the useful thing from this. Here it is:

When you let AI do something, you need a separate step that checks the AI's work before it reaches anyone who matters. A human reading it. A second tool verifying it. A rule that says "if this looks weird, stop." The AI doing the work cannot be trusted to grade its own homework, because it will give itself an A every time, in beautiful handwriting.

For your business, that might mean: the AI drafts the customer email, you read it before it sends. The AI writes the product description, you spot-check it against the actual product. The AI summarizes the invoices, you reconcile the total. The automation does the heavy lifting. The check at the end is what keeps it honest.

The people telling you to "fully automate and walk away" are describing a destination, not a starting point. The right move early on isn't to remove yourself. It's to automate the work and keep your hand on the consequences. Let the agent draft, scan, and sort all day long. Just don't let it spend money, send the irreversible thing, or ship to a customer without a gate it has to pass through first.

I caught my system lying because I read its work before I trusted it. That's not a sophisticated technique. It's just the discipline the hype skips over.

Your one thing this week

Find one task you've handed to AI — or are about to — and ask a single question: what's the check? Where's the step that catches it when it's confidently wrong? If your answer is "I'd just notice," that's not a check, that's a hope. Build the real one. Even if the check is just you, reading it, every time, before it goes out.

That's the whole job, early on. Hand over the oars. Keep the rudder.

Next entry: the open rate my own system reported back to me was nearly double the truth — and how a tiny measurement bug almost had me celebrating numbers that weren't real.
