Coding Was Never the Whole Job. AI Is Proving It

“Teaching coding instead of programming is like teaching typing instead of writing… You can’t program without coding, but programming is much more than coding.”
Leslie Lamport

Leslie Lamport said that years before anyone was generating React components with a chat prompt. Over time, the developer internet distilled his observation into a pithier truth: coding is to programming what typing is to writing. He wasn’t talking about AI. He was talking about us, about the strange professional delusion that the physical act of producing code was the same thing as the intellectual act of solving problems. We nodded along when we heard it. We quoted it in conference talks. And then we went right back to arguing about tab widths.

What if the thing developers spent 80% of their time on (writing code character by character, chasing semicolons, memorising API signatures) was always the least important part of the job? Not unimportant, mind you. Typing isn’t unimportant to a writer. But nobody would confuse a fast typist with a great novelist. We made exactly that confusion, for decades, and got away with it because there was no alternative. The typing was the bottleneck, so the typing felt like the work.

In the age of AI-assisted development, we’re finding out what’s left when you remove the bottleneck. And what’s left isn’t nothing, it’s everything that actually mattered. The job hasn’t disappeared. It’s been refined down to its essence. The essence is: Specify, Validate, Verify.

The Old World, Briefly Mourned

I want to be respectful here, because I genuinely loved the old way. I’m not one of those people who thinks everything before 2024 was the dark ages. The canonical developer workflow (understand requirements, research, plan, implement, test, refine, release) had a rhythm to it. You’d get a ticket, read it three times, argue with the product manager about what it actually meant, then disappear into your editor for a few hours.
You’d emerge with something that mostly worked and a vague sense of accomplishment.

We didn’t just write code. We wrestled it. We had opinions about tabs versus spaces. We spent entire afternoons naming variables, convinced that userAccountDataManager was meaningfully better than userHandler. We were artisans of what Fred Brooks called “accidental complexity”: the complexity that comes not from the problem you’re solving but from the tools you’re using to solve it. We just didn’t realise we were the tools.

Here’s the uncomfortable truth I didn’t fully appreciate until recently: most of my time was spent on implementation. Not thinking about architecture. Not reasoning about edge cases. Not considering how my changes would interact with the rest of the system. Just… typing. Looking up syntax. Remembering whether it was array.includes() or array.contains(). Copying patterns from one file to another, changing the names. The intellectual content of my average Tuesday was maybe two hours of real problem-solving buried in six hours of mechanical translation.

I didn’t see it clearly then. You rarely do, when you’re inside the paradigm.

The Shift Nobody Warned Us About

When people talk about AI changing software development, they usually frame it as an efficiency story. “Now you can write code faster.” That framing is approximately as useful as describing the invention of the automobile as “now horses are faster.” It’s not wrong, exactly. It just misses the point so comprehensively that it wraps back around to being misleading.

The real shift hit me during a Wednesday morning pairing session that I expected to be routine. I was working with an AI agent on a moderately complex feature, the kind of thing I’d done a hundred times before. I gave it my usual level of instruction, the kind I’d give a competent colleague: “Add a caching layer to this service, use Redis, expire after five minutes.” I got back code that worked.
It also introduced a subtle race condition, chose a serialisation format I wouldn’t have picked, and structured the error handling in a way that would swallow failures in production. The code compiled. The tests passed. And it was wrong in ways that would have taken me hours to untangle.

The AI had silently made hundreds of decisions I never specified: which data structure, which error handling pattern, whether to log, how to name things, where to put the files. Each decision was reasonable in isolation. The combination was not. And if I’d run the same prompt tomorrow, I’d have got a different combination: equally reasonable, differently wrong.

Martin Fowler, in conversation with Gergely Orosz on The Pragmatic Engineer podcast, named the thing I was grappling with: “The biggest part of it is the shift from determinism to non-determinism.”

When I write for (let i = 0; i < arr.length; i++), I know exactly what will happen. Every time. Same input, same output, no surprises. That’s determinism, the property that made software engineering possible in the first place.

When I tell an AI agent to “add a caching layer,” I get a caching layer. One of thousands of possible caching layers, shaped by context I can’t see or control. That’s non-determinism. It’s not a flaw in the AI. It’s the fundamental nature of working with a system that interprets rather than executes. And it requires an entirely different set of skills than the ones I spent twenty-five years building.

Specify: Learning to Say What I Actually Mean

The first time I truly understood specification as a skill, I was staring at a disaster of my own making.

I had a multi-repository refactoring project, the kind of thing that makes experienced developers reach for the whisky. An API endpoint needed to change, and that change rippled out across half a dozen consuming services, each maintained by different teams, each with its own conventions and test suites.
In the old world, this was a week of careful, tedious work: trace every caller, understand every integration, make each change by hand, pray you didn’t miss one.

My first instinct was to do what felt natural: sit with the AI agent and walk it through the changes interactively, step by step, repo by repo. Vibe coding, essentially. “Okay, now open this file. Change this line. No, not that line, the one below it. No, put it back.” It felt productive in the moment. It was, in retrospect, me doing the old job with extra steps.

The breakthrough came when I stopped telling the agent what to do and started telling it what I wanted to happen. Instead of walking through code changes, I described the outcome: the endpoint’s new contract, the backwards compatibility requirements, the error handling expectations, the deployment constraints. I spent an hour writing what amounted to a detailed specification, not of code, but of intent.

The AI took that specification and did something I wouldn’t have thought to do. It crawled the calling repositories, mapped every dependency, identified integration patterns I’d missed, and produced a set of small, independent user stories, each one a self-contained change that could be reviewed and merged on its own. I documented them as GitHub issues and let the agent work through them at its own pace, producing small, reviewable pull requests.

The whole refactoring took a day. Not because the AI typed faster than I could. Because the specification, the act of clearly articulating what I wanted, eliminated the false starts, the backtracking, the “oh wait, I forgot about that service” moments that used to consume half the project.

“English is the hottest new programming language.”
Andrej Karpathy

This means natural language has become our specification language. Not in the formal, mathematical sense. In the practical sense that the precision of your description directly determines the quality of the output. Vague prompt, vague code.
Precise specification, precise code. It’s not magic. It’s communication, the same skill that made some developers better than others even in the old world, just stripped of the noise.

And here’s the part that keeps me up at night: architecture is specification. If you don’t specify the architecture you want, the patterns, the boundaries, the principles, you haven’t freed yourself from that decision. You’ve delegated it. To a system that has read every Stack Overflow answer ever written and has no ability to distinguish the brilliant ones from the terrible ones. That’s not liberation. That’s abdication.

Validate: The Art of Catching What You Didn’t Say

Specification is step one. But here’s the thing about working with a non-deterministic system: even when you specify clearly, you need to check that your specification was understood the way you intended. This is validation, not “does the code work?” but “is this what I meant?”

Think of it like working with a brilliant but slightly literal-minded colleague. The kind who, when you say “can you make this endpoint faster,” rewrites it in assembly language. They didn’t misunderstand your words. They misunderstood your intent. The gap between what you said and what you meant is where bugs are born. If this concept feels unfamiliar, ask your product manager, they’ve been living it for years.

The AI has an additional problem that your human colleagues mostly don’t: it’s a yes-man. Ask it “does this look right?” and it will explain, with great confidence and eloquence, why it looks right, even when it’s spectacularly wrong. I’ve watched it defend code that would have corrupted a database, using arguments that sounded convincing enough to make me doubt my own experience. Validation, then, isn’t asking the AI if it did a good job. It’s developing your own instinct for when the output feels off, even if you can’t immediately articulate why.

The data backs up the frustration.
According to Stack Overflow’s 2025 developer survey, 66% of developers cite “almost right, but not quite” output as their single biggest frustration with AI tools, and 45% say debugging AI-generated code takes more time than writing it themselves. I’ve seen these numbers used as evidence that AI coding tools don’t work. I think they’re evidence of something else entirely.

Those near-misses aren’t AI failures. They’re specification failures. The developer said something imprecise, got something imprecise back, and now has to reverse-engineer both the intent and the implementation to figure out where they diverged. Of course that’s slow. You’re debugging communication, not code. And in software, unlike horseshoes and hand grenades, close doesn’t count.

The moment the specify/validate/verify framework truly crystallised for me was when I built a system of chained agents. Not a single AI assistant I talked to, but a workflow, a pipeline, if you like, though that word undersells it. The idea was to separate how the AI works from what it works on. One agent understood the codebase and its conventions. Another wrote tests. A third reviewed from multiple perspectives (security, correctness, performance, maintainability), like having a panel of senior engineers who never get tired and never phone it in. A final agent assembled everything into a pull request.

The first time I ran a complex task through that pipeline and got back a clean, well-structured PR, two things struck me. First, the quality of the AI’s output was dramatically better, not because the model had improved, but because the workflow constrained and guided it. Multiple review passes caught issues that a single-shot prompt would have missed entirely. Second, and more importantly, the whole process had freed me to spend my time on what actually mattered: the what and the why.
Instead of line-by-line implementation and review, I was thinking about whether we were solving the right problem, whether the approach aligned with our broader architecture, whether the change would hold up under real-world conditions. The workflow hadn’t replaced my judgment. It had cleared the space for me to actually use it.

That’s when I realised something that should have been obvious all along: teams were always non-deterministic. When I handed a task to a junior developer, I didn’t get back exactly what I would have written. I got their interpretation, filtered through their experience, their habits, their understanding of the codebase. The good managers I’d worked under were already doing specify/validate/verify; they just called it “setting clear expectations,” “code review,” and “QA.” We invented entire management frameworks (waterfall, scrum, user stories, sprint reviews, retrospectives) around the fact that humans are non-deterministic systems. We just never used that word.

Verify: Trust, But Check the Receipts

Validation asks “is this what I meant?” Verification asks “does it actually work?”

They sound similar, but the difference matters enormously. I can validate that the AI understood my intent: the code matches my mental model, the architecture looks right, the approach makes sense. But verification means proving it actually does what it’s supposed to do in the real world, with real data, under real conditions.

This is where the stakes get real. Research has shown that AI-generated code contains roughly 2.74 times more security vulnerabilities than human-written code. Not because the AI is incompetent, but because it optimises for the thing you asked for and cheerfully ignores the things you didn’t. Ask for a login endpoint, get a login endpoint without rate limiting, without input sanitisation, without the paranoid defensive coding that experienced developers apply almost unconsciously. The AI doesn’t know what you forgot to mention.
It only knows what you said. Which raises the obvious question: who catches what the AI doesn’t know to look for?

You can’t verify code you don’t understand.

I want to be blunt about this, because I’ve seen a dangerous narrative forming in some corners of the industry: that AI will allow people who don’t know how to program to build software. Maybe it will, in the same way that Google Translate allows people who don’t speak Japanese to order dinner in Tokyo. You’ll get something. It might even be what you wanted. But when it’s not, when the translation says something offensive, or the code has a security hole, or the architecture won’t scale, you’ll have no way of knowing until it’s too late.

The specify/validate/verify framework isn’t a shortcut around competence. It requires competence. More of it, arguably, than the old way. When you were writing every line yourself, you understood the code because you had written it. When the AI writes it, you need to understand code you didn’t write, written in patterns you might not have chosen, using idioms that might be unfamiliar. That’s harder, not easier.

Verification became concrete for me during a release to a production system that had been, charitably, “temperamental.” The kind of system where deploys were followed by thirty minutes of anxious Datadog-refreshing, looking for the spike in error rates that meant something had gone sideways. Usually, you’d check a few dashboards, eyeball some graphs, and declare victory based on vibes and the absence of PagerDuty alerts.

I asked the agent to review the release, not just glance at a dashboard, but query Datadog directly to see if anything had obviously gone wrong. What came back surprised me. It offered to run repeated sweeps over the next twenty minutes, building a baseline from pre-release metrics and comparing post-release data against it. It found upstream and downstream services on its own and checked them for cascading effects.
It caught a latency regression in a downstream service that I would have missed entirely, not because I’m careless, but because it was in a service I wouldn’t have thought to check.

By the time it was done, it had effectively built a reusable release monitoring checklist, documented its methodology, and provided a summary I could paste into our release channel. The whole thing took twenty minutes. I would have spent the same twenty minutes staring at three dashboards and convincing myself everything was probably fine.

It’s not that humans can’t do what the AI did. Of course we can. We understand context, and we do a better job of understanding intent, at least for now. But we’re also distractible, impatient, and prone to cutting corners when we’re on our third deploy of the day and lunch is getting cold. The AI doesn’t get bored. It doesn’t skip steps. It does exactly what you specify, which brings us full circle.

There’s a data point from METR that’s been widely cited: in their 2025 randomised controlled trial, experienced open-source developers using AI tools were actually 19% slower on average. I’ve seen this used as a gotcha, proof that AI coding tools are overhyped. But when I read the study details, I noticed something. The developers were working on real-world implementation tasks from their own projects: bug fixes, feature requests, refactoring. They were being measured on the old job, the typing part, while using tools designed for the new job. Of course they were slower. They were using a race car to parallel park.

The gains from AI don’t show up when you’re still thinking in terms of implementation. They show up when you’re thinking in terms of specification, validation, and verification. When you invest in describing the outcome instead of dictating the steps. When you build systems that check the AI’s work instead of checking it yourself, one line at a time.
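
The release check from the Datadog story, building a baseline from pre-release metrics and comparing post-release data against it, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not what the agent actually produced: the function name find_regressions, the metric names, the sample values, and the three-standard-deviation limit are all hypothetical, and a real check would pull its samples from a monitoring API rather than from hard-coded lists.

```python
from statistics import mean, stdev


def find_regressions(baseline, post_release, sigmas=3.0):
    """Flag metrics whose post-release mean exceeds the baseline limit.

    Both arguments map a metric name to a list of samples where higher
    is worse (latency, error rate). The limit for each metric is the
    baseline mean plus `sigmas` sample standard deviations.
    """
    regressions = {}
    for name, before in baseline.items():
        after = post_release.get(name)
        if not after or len(before) < 2:
            continue  # too little data to compare, so skip the metric
        limit = mean(before) + sigmas * stdev(before)
        observed = mean(after)
        if observed > limit:
            regressions[name] = (round(limit, 1), round(observed, 1))
    return regressions


# Hypothetical samples: p95 latency in milliseconds for two services.
baseline = {
    "checkout.p95_ms": [120, 118, 125, 122, 119],
    "inventory.p95_ms": [80, 82, 79, 81, 80],
}
post_release = {
    "checkout.p95_ms": [121, 124, 119, 123, 120],
    "inventory.p95_ms": [95, 99, 97, 101, 98],
}

flagged = find_regressions(baseline, post_release)
for metric, (limit, seen) in flagged.items():
    print(f"{metric}: observed {seen} ms, limit {limit} ms")
```

With these made-up numbers only the downstream inventory metric is flagged, which mirrors the story: the regression sits in a service nobody thought to check. The comparison rule is deliberately crude; the point is that the rule is written down and applied to every metric, instead of being eyeballed on three dashboards.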
The Punchline

Here’s what I’ve come to believe after a year of working this way: the essential work was always specify, validate, verify. We just had to do the typing too.

The best developers I’ve ever worked with weren’t the fastest typists or the ones who memorised the most API calls. They were the ones who asked the best questions before writing a single line of code. The ones who could look at a pull request and spot not just bugs but misunderstandings. The ones who thought about deployment, observability, and failure modes while everyone else was still arguing about whether to use the var keyword in Java or semicolons in JavaScript.

They were already doing the new job. They just happened to also do the old one, because someone had to.

What’s changed isn’t the nature of good software engineering. It’s the ratio. The mechanical translation, the accidental complexity, the typing, the semicolons, used to consume 80% of our time and energy. Now it consumes 20%, or 10%, or on a good day something close to zero. What’s left is the hard part. The part that was always the hard part, even when we pretended the typing was hard too.

The job isn’t smaller. The busywork is. And for those of us who got into this profession because we loved solving problems, not because we loved typing, that’s not a loss. It’s the job we always wanted, finally arriving.

View original source — Hacker Noon ↗
