The promise of vibe coding is that an AI assistant writes most of your code while you steer. The danger is the same sentence read a different way: an AI writes most of your code, and if you are not careful, you ship code that nobody, including you, has read. That is how a solo builder ends up with a security hole, a silent data bug, or a feature that works in the demo and breaks the moment a real user touches it. Reviewing generated code well is the single skill that separates people who ship reliable apps fast from people who ship fast and then spend a month firefighting.
Why “it runs” is not the bar
The most seductive trap in vibe coding is that generated code usually runs. It compiles, the screen renders, the happy path works, and your brain quietly files it under “done.” But running and correct are different things. Generated code can run while storing passwords in plain text, leaking data between users, mishandling errors, or quietly doing the wrong thing on every input you did not test. The AI optimizes for code that looks right and satisfies the prompt, not for code that is safe and correct in your specific context. Closing that gap is your job, and it is not optional.
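To make the gap between running and correct concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the in-memory dictionaries stand in for a real database, the function names are made up for illustration, and the iteration count is an assumed work factor you would tune yourself. Both versions pass the same login demo. Only one of them is safe to ship.

```python
import hashlib
import hmac
import secrets

# Hypothetical in-memory "databases", for illustration only.
users_plain = {}
users_hashed = {}

ITERATIONS = 600_000  # assumed work factor; tune for your own hardware


def register_plain(username, password):
    # Runs fine and passes a login demo, but stores the secret verbatim.
    users_plain[username] = password


def login_plain(username, password):
    return users_plain.get(username) == password


def _derive(password, salt):
    # PBKDF2-HMAC-SHA256 from the standard library.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register_hashed(username, password):
    # Store a random salt and a derived digest, never the password itself.
    salt = secrets.token_bytes(16)
    users_hashed[username] = (salt, _derive(password, salt))


def login_hashed(username, password):
    record = users_hashed.get(username)
    if record is None:
        return False
    salt, expected = record
    # Constant-time comparison of the two digests.
    return hmac.compare_digest(_derive(password, salt), expected)
```

From the outside, login_plain and login_hashed behave identically on the happy path. The difference only shows up when you read what ends up in storage, which is exactly the kind of thing a demo never reveals.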
Read with a triage mindset
You do not have to read every line with equal attention, and pretending you will is how review gets skipped entirely. Instead, triage. Sort generated code into three buckets and spend your attention accordingly.
The first bucket is load-bearing code: anything touching authentication, payments, user data, permissions, security, or money. Read every line of it, slowly, as if you wrote it and your name is on the breach. This is where a bug is not a bug; it is an incident. When I built a private speech-to-text app, the entire pitch was that audio never leaves the device, so I read every line that touched data flow myself. You cannot promise privacy you have not personally verified.
The second bucket is logic you will depend on: the core feature, the data model, the state management. Read it closely enough to understand it completely, because you will be extending and debugging it for as long as the app lives. Code you do not understand is code you cannot maintain, and “the AI wrote it and I am not sure how it works” is a sentence that ends projects.
The third bucket is boilerplate and scaffolding: a settings screen, a standard list view, glue code that follows a well-worn pattern. Skim it for obvious problems and move on. Spending an hour scrutinizing a stock layout is attention stolen from the code that could hurt you.
The questions to ask every time
For anything in the first two buckets, run the same checklist in your head. It becomes automatic fast.
- What happens on bad input? Empty string, null, a huge number, an emoji, a malicious string. Generated code loves the happy path and forgets the rest. Real users live in the rest.
- What happens on failure? The network drops, the disk is full, the request times out. Does the app handle it, or does it crash or silently lose data? Error handling is one of the things AI most often under-builds.
- Where does user data go? Trace it. Does anything get logged, sent to an analytics service, or stored unencrypted that should not be? A stray logging call can quietly turn a private app into a leaky one.
- Does this do more than I asked? Sometimes generated code pulls in a heavy dependency, adds a network call you did not request, or includes a “helpful” feature you never wanted. Delete what you did not ask for.
- Would I have written it this way? Not as ego, but as a smell test. If the approach surprises you, understand why before you accept it. Surprise is often the first symptom of a bug.
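As a rough illustration of the first three questions, here is a small Python sketch of a function that saves a user's note. The names (clean_note, save_note, InvalidNote), the length limit, and the write callable are all assumptions made up for this example, not part of any real app or library. The point is only to show what handling bad input, handling failure, and keeping user data out of logs can look like in a few lines.

```python
import logging

logger = logging.getLogger("notes")

MAX_NOTE_CHARS = 10_000  # assumed limit, chosen for illustration


class InvalidNote(ValueError):
    """Raised when input is rejected on purpose."""


def clean_note(raw):
    # Bad input: None, non-text, blank, or far too long.
    if not isinstance(raw, str):
        raise InvalidNote("note must be text")
    text = raw.strip()
    if not text:
        raise InvalidNote("note is empty")
    if len(text) > MAX_NOTE_CHARS:
        raise InvalidNote("note is too long")
    return text


def save_note(raw, write):
    """Return True if saved, False if storage failed.

    `write` is a hypothetical callable that persists the text and
    raises OSError (disk full, connection dropped, timeout) on failure.
    """
    text = clean_note(raw)
    try:
        write(text)
    except OSError as exc:
        # Failure: report it instead of crashing or silently losing data.
        # Data flow: log the error type and size, never the note itself.
        logger.warning("note save failed (%s), %d chars", type(exc).__name__, len(text))
        return False
    return True
```

The caller gets three distinct outcomes: an InvalidNote exception for input that should be rejected, False for a storage failure it can retry or surface to the user, and True for success. A generated version of this function will often collapse all three into a single call to write.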
Test the boundaries the AI ignored
Reading catches a lot, but some problems only show up when you run the code against inputs the AI never considered. So poke at the edges deliberately. Paste in text that is far too long. Leave the field blank. Lose the network mid-action. Tap the button twice fast. Rotate the phone. Background the app while it is working. These are the situations a generated demo glosses over and a real user hits in their first session. Five minutes of trying to break your own feature is worth more than an hour of admiring it.
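For anything that takes text input, part of that poking can be automated. Below is a minimal sketch of a probe that throws a handful of edge inputs at a function and reports which ones crash it. The input list, the probe helper, and the two first_word functions are hypothetical examples written for this article. A ValueError is treated here as a deliberate rejection, which is an assumption about how your own code signals bad input.

```python
# Edge inputs a happy-path demo rarely covers. Extend the list for your app.
EDGE_INPUTS = [
    "",                          # blank field
    " ",                         # whitespace only
    "a" * 100_000,               # far too long
    "👩‍👩‍👧‍👦",                        # multi-codepoint emoji
    "'; DROP TABLE users; --",   # hostile-looking string
    None,                        # missing value
]


def probe(func, inputs=EDGE_INPUTS):
    """Call func on each input and collect the ones that crash it."""
    failures = []
    for value in inputs:
        try:
            func(value)
        except ValueError:
            continue  # assumed to mean "rejected on purpose"
        except Exception as exc:
            failures.append((value, type(exc).__name__))
    return failures


def first_word(text):
    # Typical generated helper: correct for "hello world", nothing else.
    return text.split()[0]


def safe_first_word(text):
    # Same helper after review: bad input is rejected explicitly.
    if not isinstance(text, str) or not text.strip():
        raise ValueError("expected non-blank text")
    return text.split()[0]
```

Running probe(first_word) reports three crashes: an IndexError for the blank string, an IndexError for the whitespace-only string, and an AttributeError for None. Running probe(safe_first_word) returns an empty list. A probe like this does not replace tapping the button twice or dropping the network by hand, but it makes the text-input part of the exercise repeatable.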
Make the AI explain itself
One of the most underused review techniques is simply asking the assistant to walk you through its own code. “Explain what this function does line by line.” “What inputs would break this?” “Where could this leak data?” “What did you assume about the input here?” This does two things at once. It teaches you the code faster than reading cold, and it often makes the AI notice its own mistakes, because explaining forces a different kind of checking than generating did. Reviewing with the assistant, not just after it, is a core part of how to prompt an AI coding assistant well.
Keep changes small enough to review
The single biggest thing you can do to make review possible is to keep each change small. If you ask for an entire feature in one giant prompt, you get a wall of code that is too much to review properly, so you skim it, accept it, and hope. If you build the same feature in small steps, each step is a few dozen lines you can read and verify before moving on. Big-bang generation is one of the vibe coding mistakes worth avoiding because it makes real review impossible. Small steps keep you in control.
Own what you ship
Here is the mindset that ties it together: the moment you accept generated code, it is yours, not the AI’s. When a user hits a bug, “the assistant wrote it” means nothing to them and should mean nothing to you. The AI is a fast junior developer who produces a lot of plausible code and has no stake in whether it is right. You are the senior engineer whose job is to catch what the junior missed before it reaches a user. That framing keeps you honest. It is the division of labor I described in my vibe coding toolkit: the AI writes the volume, you own the judgment.
Reviewing generated code is not a tax on the speed vibe coding gives you. It is the thing that makes the speed safe to use. Skip it and you are not shipping faster, you are just moving the slow part to later, when it shows up as bugs, support tickets, and the kind of data incident that ends the trust you spent months building. Read the code that matters, trust the code that does not, and never confuse “it runs” with “it is done.”