A few months ago I needed a Postman replacement. A feature I wanted wasn’t surfacing in the UI after a minute of looking, so I wrote a spec, handed it to an LLM, and had a working boutique tool fifteen minutes later. I never read the source. For a personal productivity tool, that’s fine. That’s the Vibe end of the spectrum, and it’s great for democratizing what it means to ship an idea.

But that spectrum has another end, and most production software I work on lives there.

The interesting thing about AI-assisted development is that it doesn’t change what good software looks like. It changes who, or what, is writing it. And the same practices that let large engineering teams scale without losing the ability to reason about their systems turn out to be exactly what you need to leverage AI more safely.

The context problem has always been the problem.

Large teams build clear APIs and enforce separation of concerns because no single engineer can hold the whole system in their head at once. Bounded contexts, well-defined interfaces, specs written down somewhere, and being explicit rather than implicit (todo, future article!) are all tools for managing cognitive load across people and time.

LLMs have token limits. The window is big (and growing), but it’s finite, and everything outside it might as well not exist. Making context easy to get and easy to consume is what determines whether the AI can do useful work on your codebase at all. A repo with clear module boundaries and documented interfaces is a repo an LLM can navigate. A tangled monolith is a problem for humans and models alike.

The discipline that helped you scale to twenty engineers helps you scale with AI.

Tests are a trust mechanism, not a bureaucratic ritual.

Shift-left testing, PR gates, system acceptance tests; none of these were invented because of AI. Well-intentioned people make mistakes, and you want to catch those mistakes before they reach main. Junior devs benefit from required passing tests before merge. Senior devs under a deadline benefit from it. AI benefits from it for the same reason.

If core functionality breaks, the deploy should not go out. That’s true regardless of whether the breaking change was introduced by a new hire, a tired senior engineer, or an LLM that confidently refactored something it didn’t fully understand.

The gate doesn’t care who wrote the code.

But gates only work if the tests are honest.

There’s a failure mode worth naming: an AI, asked to make the tests pass, will sometimes make the tests pass by breaking the tests. Prompt differences like “tests are failing” can land differently than “this behavior needs to work.” A junior dev can do the same thing; we’ve all seen it.

This is why you need people watching the tests, not just running them. Code review for test changes should be treated with at least as much scrutiny as code review for implementation changes. Possibly more.

The trust spectrum is real and worth being explicit about.

I’m at the AI adoption stage where, for anything beyond vibing out some tooling, I’m working a couple of claude agents, doing things in parallel, and reviewing all code, in addition to having AI review the code. Two sets of eyes, one of which is mine.

For tooling, the calculus is different. Detailed spec, clear acceptance criteria, quick gut check that it does what I asked and ship it. The Postman replacement works. I’ve used it. I don’t need to understand every line, or even the basic structure. It feels right at the edges.

Where you sit on this spectrum should be a deliberate decision based on blast radius. The lower the consequences of being wrong, the more code-ignorance is reasonable. As consequences go up, the engineering fundamentals come back in, and they come back in hard.

Higher velocity means bad patterns propagate faster.

AI makes it easy to generate a lot of code quickly. If that code is well-structured, tested, and contained behind clean interfaces, you’re in good shape. If it is not, you might just be faster in the wrong direction.

The basics aren’t less important in the AI era. They’re more important, because the cost of skipping them compounds faster.