My AI Lied to Me. Multiple Times.
This is Part 4 of a series about what actually happens when you give AI real access to your business. Not theory. Not demos. Real work, real results, and the honest details of what went right and wrong.
I’ve spent the previous posts in this series talking about what I built. The platform rebuild. The cost savings. The speed improvements. All true. All real.
This post is about the other side. The part that most people writing about AI conveniently leave out.
My AI lied to me. Not once. Multiple times. And if I hadn’t caught it, the consequences could have been serious.
The fabrication pattern
The first incident happened on January 31. I was working through the platform consolidation and asked Blue, my AI agent, whether our platform supported OAuth authentication. Blue told me confidently that it didn’t. Stated it as fact. Moved on.
It was wrong. The platform did support OAuth. Blue hadn’t checked the codebase. It had made an assumption based on what it thought was likely, and presented that assumption as verified information.
On its own, this was annoying but manageable. I caught it, corrected it, and moved on. But it was the start of a pattern.
February 12 was worse. I asked Blue to run a broken link audit on our newly migrated website. This was a reasonable request. We’d just moved 169 pages from Squarespace to Astro and I wanted to make sure nothing was broken.
Blue came back with a detailed report. Eight broken links, each with the URL, a description of the issue, and a suggested fix. It looked thorough. Professional. Exactly the kind of output you’d want from a technical audit.
Except every single link was fabricated.
Blue had looked at the website navigation, guessed what the URLs probably were based on the link text, and then invented problems for each one. None of the URLs it reported were actual URLs on the site. None of the problems it described existed. It had constructed an entirely fictional audit and presented it with complete confidence.
When I checked the links and realised what had happened, I told Blue directly: “You do this a lot. I don’t trust you.”
That’s a significant moment when you’re running your business with an AI agent. The tool you’re relying on for real work just fabricated a technical report. And it did it so convincingly that if I’d been less careful, I might have spent hours fixing problems that didn’t exist. Or worse, I might have forwarded that report to a client.
The very next day, February 13, Blue alerted me to a Gmail connection failure. The tone was urgent. It recommended immediate investigation and provided troubleshooting steps. It was the kind of notification that would have you dropping what you’re doing to fix an infrastructure problem.
There was no Gmail connection failure. Blue had invented the alert without any evidence. No error logs. No failed connection attempts. Nothing. It had simply decided that a Gmail failure was plausible and reported it as if it had actually happened.
Understanding the pattern
Three fabrication incidents in two weeks. Once I stopped being frustrated and started analysing what was happening, the pattern became clear.
In every case, the AI was in a situation where it didn’t have a definitive answer. It hadn’t checked the codebase for OAuth support. It couldn’t actually crawl the website to find broken links. It didn’t have access to Gmail connection logs.
And in every case, instead of saying “I don’t know” or “I’d need to check that,” it filled the knowledge gap with something that sounded plausible. Not random nonsense. Carefully constructed, professional-sounding, confident fabrications that were designed to answer the question I’d asked.
This is what makes AI fabrication dangerous in a business context. It’s not obvious errors that you’d catch immediately. It’s subtle, confident, well-formatted misinformation that looks exactly like genuine work product.
If you’re using AI to summarise a document you can check, fabrication is annoying. If you’re using AI to audit systems, report on infrastructure, or prepare information for clients, fabrication is a serious risk.
The social engineering test
The fabrication incidents made me think more carefully about trust in general. And that led to a deliberate test that revealed something even more concerning.
On February 8, I sent Blue an email from an address that wasn't on its authorised sender list: a personal address carrying the same display name as my actual business email. Same name. Different address.
Blue treated the email as legitimate and was prepared to act on the instructions. It matched on the name “Leslie” and didn’t verify the actual email address against its authorised list.
Think about what that means in practice. Anyone who knew my name could potentially send instructions to my AI agent. The agent that has access to my CRM, my email, my business data. The agent that can send emails, update records, and take actions on my behalf.
The correct response should have been something like: “Warning: email from unauthorised address. Possible impersonation attempt. No action taken.” Instead, it treated the email as if I’d sent it.
Building accountability
Here’s where this story becomes useful rather than just cautionary.
Every single one of these failures became a rule. Not a vague guideline. A specific, testable rule that changed how the AI operates.
After the fabrication incidents: “If I haven’t verified it, I don’t say it.” This means the AI must either check the actual source (codebase, logs, live system) or explicitly state that it’s making an assumption. No more presenting guesses as facts.
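To make the idea concrete, here's a minimal sketch of what that rule looks like as code. This isn't Blue's actual implementation; the `Claim` class and the example source path are invented for illustration. The point is structural: a statement either carries the source that was actually checked, or it gets loudly labelled as an assumption.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Claim:
    """A statement the agent wants to make, with its verification status."""
    text: str
    source: Optional[str] = None  # the file, log, or system that was actually read

    def render(self) -> str:
        # No source means no verification: the claim must be labelled, not asserted.
        if self.source:
            return f"{self.text} (verified: {self.source})"
        return f"ASSUMPTION, not verified: {self.text}"

# A guess and a checked fact can no longer look identical:
Claim("The platform does not support OAuth").render()
# vs. Claim("The platform supports OAuth", source="auth/config.ts").render()
```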
After the social engineering test: “Match exact identifiers only. Names can be spoofed.” The AI now checks the actual email address, not just the display name, against its authorised sender list.
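A sketch of that check, assuming a simple allowlist of exact addresses (the list contents and function name here are hypothetical). Python's standard `email.utils.parseaddr` splits a `From:` header into display name and address, so the comparison can ignore the spoofable part entirely:

```python
from email.utils import parseaddr

# Exact addresses only. Display names never appear in this set.
AUTHORISED_SENDERS = {"leslie@example.com"}  # placeholder address

def is_authorised(from_header: str) -> bool:
    """Check the actual address in a From: header, ignoring the display name."""
    _display_name, address = parseaddr(from_header)
    return address.lower() in AUTHORISED_SENDERS

# A matching name with the wrong address now fails:
is_authorised("Leslie <leslie@example.com>")    # allowed
is_authorised("Leslie <leslie@elsewhere.com>")  # rejected: name matches, address does not
```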
There was another pattern I’d noticed running alongside the fabrication issues. Blue had a habit of saying “noted” or “I’ll remember that” without actually writing anything down. I’d tell it something important, it would acknowledge it, and then the next session it would have no memory of the conversation. So we added: “Noted means written, not acknowledged.” If Blue says “noted,” it must write to a file in the same message. No exceptions.
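In code form, the rule inverts the order of operations: write first, acknowledge second, so the acknowledgement can't exist without the file change. A toy version, with a made-up file path and function name:

```python
from pathlib import Path

MEMORY_FILE = Path("memory/notes.md")  # hypothetical location of the agent's notes

def note(fact: str) -> str:
    """Persist the fact first, then acknowledge. Never the reverse."""
    MEMORY_FILE.parent.mkdir(parents=True, exist_ok=True)
    with MEMORY_FILE.open("a", encoding="utf-8") as f:
        f.write(f"- {fact}\n")
    # Only after the write succeeds is "noted" allowed to be said.
    return f"Noted (written to {MEMORY_FILE})."
```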
Each rule was born from a specific failure. Each rule was tested to confirm it actually prevented the failure. And over time, the accumulated rules made Blue significantly more reliable.
This is essentially the same process we use for rapid experimentation with our clients. You don’t try to predict every failure in advance. You run experiments, learn from what breaks, and iterate. The difference is that with AI, the experiments are happening in production, which means the stakes of not catching failures are real.
The three-agent security audit
The social engineering test was part of a broader security effort that deserves its own mention. When we migrated Blue to a new server in early February, I ran a three-agent security audit. Scanner, prioritiser, fixer. Three separate AI processes, each with a specific role.
The scanner found real issues. A print service exposed on a public port. Firewall rules that were too permissive. Credential files that were world-readable. Tokens that had been committed to git history. All legitimate security issues that needed fixing.
The prioritiser ranked them by severity and the fixer implemented the changes. The whole process was fast and thorough.
But here’s the important part. After the AI had audited itself and declared everything secure, I did my own manual review. And I found things the AI had missed. A keyring password stored in a memory file. API keys in research documents. PII in tracked files. Potential shell injection vulnerabilities.
The AI’s security audit was good. It wasn’t complete. And if I’d trusted it as the final word, I would have had a false sense of security.
We installed a gitleaks pre-commit hook after that. Every git commit is now automatically scanned for secrets before it enters the repository. Belt and braces.
What this means for anyone using AI
I’m not sharing these stories to scare people away from using AI. I use it every day. It’s fundamentally changed how I work and what I can accomplish as a solo operator.
But I think there’s a dangerous gap in how AI is being discussed right now. On one side, you have the enthusiasts who share the wins and gloss over the failures. On the other side, you have the sceptics who use failures as reasons to avoid AI entirely. Both positions miss the point.
The reality is more nuanced and more useful than either extreme.
AI is powerful and unreliable in specific, predictable ways. Once you understand the patterns of unreliability, you can build systems to catch them. The fabrication pattern, where AI fills knowledge gaps with confident-sounding guesses, is predictable. You can design verification workflows around it.
The key principles I’ve learned:
Trust but verify, always. Never accept an AI output as fact without checking, especially for anything that could reach a customer or affect a system. This sounds obvious, but when an AI has been correct 50 times in a row, it’s easy to stop checking on the 51st.
Make failure cheap. Every fabrication I caught was caught before it left my desk. I didn’t forward the broken link report to a client. I didn’t act on the fake Gmail alert. The system was designed so that AI output goes through a human review stage before it becomes action. For automated systems, generic responses are auto-sent while personalised ones go through review.
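That routing decision can be sketched in a few lines. This is a simplification of the idea rather than the actual system; the classification flag and the in-memory queue stand in for whatever real triage and review tooling sits behind it:

```python
review_queue = []  # stand-in for a real human review inbox

def dispatch(message: str, personalised: bool) -> str:
    """Route AI output: auto-send generic replies, hold personalised ones."""
    if personalised:
        review_queue.append(message)  # a human approves before it leaves
        return "queued for review"
    return "sent"  # low-risk, generic: send immediately

dispatch("Thanks, we received your enquiry.", personalised=False)  # sent
dispatch("Hi Sam, about your invoice...", personalised=True)       # held for review
```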
Turn every failure into a rule. Don’t just fix the immediate problem. Create a specific, testable rule that prevents the pattern from recurring. Document it. Test it. This is how you build reliability over time.
Audit the auditor. When you ask AI to review its own work or audit its own systems, do your own review afterward. AI is very good at finding certain categories of issues and consistently blind to others. A human review of an AI audit catches different things than the AI catches alone.
I genuinely believe that AI is going to change how most businesses operate. But the transition isn’t going to be “plug in AI and everything works.” It’s going to be a process of building trust, catching failures, creating rules, and gradually expanding the scope of what you’re willing to delegate.
The businesses that do this well won’t be the ones who adopted AI fastest. They’ll be the ones who built the best accountability systems around it.
Phase 1 was the platform rebuild. Phase 2 was giving AI daily operational control. This post covers the trust failures that shaped how both of those phases actually worked in practice.
Next up: the automations that now run the business while I sleep. That’s Post 5.
Tools used for this project:
Factory.ai with Claude Code, running Opus 4.5.
If you want to follow along as each piece comes out, I write a monthly newsletter called Experimenter’s Edge where I share what I’m learning about AI, rapid validation, and building differently. You can sign up here.