Vibe coding won. That's not a hot take anymore — it's a fact. Andrej Karpathy coined the term in February 2025, and within a year it became Collins Dictionary's Word of the Year. AI now accounts for 42% of all committed code, according to Sonar's 2026 State of Code survey. Developers expect that to hit 65% by 2027. Bolt, Lovable, Cursor, Claude Code — people are shipping real products with these tools, faster and cheaper than anyone thought possible.
This post isn't about whether vibe coding works. It does. It's about the one piece of the workflow that hasn't kept up.
The missing half of the workflow
Here's the thing: AI is perfectly capable of writing tests. You can ask Claude to write Playwright scripts, ask Cursor to generate unit tests, ask Copilot to scaffold a test suite. The technology isn't the bottleneck.
The workflow is.
When you vibe-code a feature, the natural flow is: describe what you want → AI writes it → you see it working → you ship. Testing isn't part of that loop. Not because AI can't do it, but because nobody prompts for it. The tools don't suggest it. The workflow doesn't include it. There's no friction-free moment where testing just happens.
So people skip it. Not maliciously — just naturally. The same way you skip flossing when you're in a rush. And the consequences accumulate quietly.
The gap isn't technical. AI can write tests just as well as it writes code. The gap is behavioral — testing isn't embedded in the vibe coding loop, so it gets skipped at exactly the moment it matters most: when code is being generated faster than ever.
What the data shows
The quality data isn't damning — it's clarifying. It tells us exactly where the gaps are.
CodeRabbit analyzed 470 open-source PRs in December 2025: AI-co-authored code had 1.7x more major issues and 1.57x more security findings than human-only code. Not because AI writes fundamentally worse code — but because AI code tends to get less review. It arrives fast, it looks plausible, and the developer accepts it.
Tenzai tested five leading AI coding tools by having each build three identical web apps. Across 15 applications: 69 vulnerabilities, 6 critical. The interesting part? The tools blocked generic SQLi and XSS perfectly. The failures were all business logic — negative pricing, missing authorization checks, SSRF. The kind of bugs you only catch by actually testing the flows, not by scanning the code.
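To make that failure mode concrete, here is a toy sketch of a negative-pricing bug (the item names, prices, and function names are all invented for this example). Every line is syntactically clean and would sail past a generic scanner; the bug is a missing rule, not bad code.

```python
# Illustrative sketch of a business-logic bug a static scanner won't flag.
# All names and prices are made up for the example.
PRICES = {"t-shirt": 25.00, "mug": 12.00}

def checkout_total(cart):
    """Buggy: trusts client-supplied quantities completely."""
    return sum(PRICES[item] * qty for item, qty in cart.items())

def checkout_total_validated(cart):
    """Same logic with the missing rule: quantities must be positive."""
    for item, qty in cart.items():
        if qty < 1:
            raise ValueError(f"invalid quantity for {item}: {qty}")
    return sum(PRICES[item] * qty for item, qty in cart.items())

# A crafted cart turns the charge into a credit: one t-shirt, minus ten mugs.
malicious_cart = {"t-shirt": 1, "mug": -10}
```

Nothing here looks like SQLi or XSS, which is exactly the point: you only catch it by exercising the checkout flow with a hostile cart.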
Sonar's survey of 1,100 developers captured the paradox in two numbers: 96% don't fully trust AI code, yet only 48% always verify it before committing. The trust is low, but the friction of verification is high enough that people ship anyway.

The pattern is consistent: the issues aren't in the code itself — they're in the flows, the edge cases, the business logic. Exactly the things that end-to-end testing catches and static analysis doesn't.
Why "just write tests" doesn't work
The obvious answer is "just ask your AI to write tests too." And technically, that works. But practically, it doesn't — for three reasons.
First, it breaks the flow. Vibe coding is fast because it's continuous — you describe, you see, you iterate. Stopping to say "now write Playwright tests for what we just built" is a context switch. It's the difference between a freeway and a freeway with a toll booth every mile. Technically you still get there. Practically, everyone finds a back road.
Second, AI-generated test code has the same maintenance problem as human test code. If you ask Claude to write 200 lines of Playwright scripts, you now have 200 lines of test code to maintain. Selectors break. Pages change. The tests you generated last week fail this week because a designer renamed a CSS class. You're back to the old problem, just faster.
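Selector rot is easy to see in miniature. Here is a stdlib-only toy (the markup and class names are invented): the same class-based lookup that matched last week's markup finds nothing after a one-word CSS rename. A real Playwright test dies the same way, just as a timeout instead of a `False`.

```python
# Stdlib-only toy showing why class-based selectors rot.
# Markup and class names are invented for the example.
from html.parser import HTMLParser

class ClassFinder(HTMLParser):
    """Records whether any tag carries the target CSS class."""
    def __init__(self, cls):
        super().__init__()
        self.cls = cls
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs parsed from the tag.
        if ("class", self.cls) in attrs:
            self.found = True

def selector_matches(html, cls):
    finder = ClassFinder(cls)
    finder.feed(html)
    return finder.found

last_week = '<button class="btn-checkout">Buy now</button>'
this_week = '<button class="btn-primary">Buy now</button>'  # designer's rename
```

The button still says "Buy now" and still works for users; only the test broke.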
Third, there's no persistence. You generate tests in one session, and they exist as files in your repo. But they're not connected to a test runner, a dashboard, or a CI pipeline. They're just... code. More code to maintain.
The real solution isn't "AI writes test code." It's "testing becomes a seamless part of the vibe coding workflow, with no code to maintain at all."
What the workflow should look like
Imagine this: you vibe-code a feature. Before you push, you say one sentence: "test my app." The tool launches a browser, clicks through every flow, fills every form, and tells you what broke. No test files. No selectors to maintain. No setup.
Next week, a designer renames a button class. You say "run my tests" and the same tests execute, but the broken selector is automatically healed. The test still passes. You didn't notice anything happened.
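One generic way self-healing can work, sketched below in a few lines of Python, is to store several fingerprints per element (saved selector, visible text, accessibility role) and fall back through them when the primary selector stops matching. This is a hedged illustration of the general strategy, not FastTest's actual implementation; the function and field names are invented.

```python
# Sketch of a generic self-healing lookup: not FastTest's implementation.
# dom is modeled as a list of element dicts; fingerprint is what a
# previous run saved about the element it needs to find.

def find_element(dom, fingerprint):
    # 1. Fast path: the saved CSS class still matches.
    for el in dom:
        if el.get("class") == fingerprint["class"]:
            return el
    # 2. Heal: match on visible text + role instead, then update the
    #    fingerprint so the next run uses the new class directly.
    for el in dom:
        if (el.get("text"), el.get("role")) == (fingerprint["text"], fingerprint["role"]):
            fingerprint["class"] = el.get("class")
            return el
    # 3. Genuinely gone: report a real failure, not a flaky one.
    return None
```

The key design choice is step 2's side effect: the healed selector is written back, so a cosmetic rename costs one fallback lookup once, and nothing after that.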
That's what we built with FastTest. The key insight: testing should work exactly like vibe coding — describe what you want in natural language, and the AI handles the rest.
vibe_shield is our one-command safety net. First run: it explores your app, generates test cases from what it finds, and saves them. Every subsequent run: it executes those tests, self-heals any broken selectors, and reports regressions. No test code to write. No test code to maintain.
The real problem was always workflow
Vibe coding didn't create a testing problem. It revealed one that was always there: testing has always been a separate step, a separate tool, a separate mindset. And when you make coding 10x faster without making testing 10x easier, the gap that was always there becomes impossible to ignore.
The Escape.tech scan that found 2,000 vulnerabilities across 5,600 vibe-coded apps? That's not an indictment of AI. It's an indictment of shipping without testing — something developers have always wanted to do, and AI finally made easy enough to do at scale.
The fix isn't slowing down. The fix is making verification as frictionless as creation. When testing lives in the same editor, speaks the same language, and takes one prompt instead of a context switch — people actually do it.
The same AI that's good enough to write your checkout flow is good enough to test your checkout flow. The question was never capability. It was always workflow.
Where this is going
Vibe coding is going to keep winning. By 2027, the majority of committed code will be AI-generated. The tools are getting better every quarter — better at understanding context, better at generating correct code, better at avoiding the generic vulnerabilities they used to miss.
But better code generation doesn't eliminate the need for testing. It changes what testing looks like. The bugs shift from syntax errors to business logic gaps. From "the code doesn't compile" to "the code compiles but the checkout flow charges $0 when you apply a 100% discount twice." Those are the bugs you only find by running the app and testing the flows.
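That discount bug is worth sketching, because it shows why only flow-level testing catches it (function and field names here are invented for the example). Each line is individually correct; the defect is an absent rule, since nothing records that the coupon was already redeemed.

```python
# Sketch of the "$0 checkout" bug class. Names are invented.

def apply_discount(order, code, percent):
    """Buggy: happily applies the same coupon code twice."""
    order["total"] = round(order["total"] * (1 - percent / 100), 2)
    order.setdefault("codes", []).append(code)
    return order

def apply_discount_once(order, code, percent):
    """The missing business rule: a coupon is single-use per order."""
    if code in order.get("codes", []):
        raise ValueError(f"coupon {code} already applied")
    return apply_discount(order, code, percent)
```

No unit test of `apply_discount` in isolation fails; the bug only appears when a test drives the full flow and applies the coupon twice, exactly the kind of scenario an end-to-end run explores.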
Our bet is simple: testing should feel exactly like vibe coding. Describe what to test, the AI handles the rest. No test files. No selectors. No maintenance. Just a safety net that grows with your app and catches the things that code generation alone can't.
The vibes are good. Let's keep them that way.