I Didn't Write a Single Line of Code

TL;DR: Over six months, I built two major projects without writing a single line of code, using Claude as my primary development tool. The biggest shifts weren’t about code generation. They were about flipping from writer to editor, compressing RFC timelines from months to days, and rethinking what engineers actually do. The generated code is good, which is exactly what makes it dangerous: the 1% it gets wrong looks correct at a glance. Software engineering was never about the code. AI is just making that impossible to ignore.


Over the past six months, I built two major projects without writing a single line of code.

That statement tends to get a reaction, but I think people react to the wrong part of it. The interesting question isn’t “how did AI write the code.” The interesting question is: what was I actually doing all that time? And does that change what we think software engineering is?

I believe it does. Software engineering has never been about code. It’s about repeatable patterns, building systems, scaffolding solutions with proven approaches, and translating business needs into technical implementations. Code is one tool in the toolbox, and AI is making that obvious in a way we can’t ignore anymore.

The engineers who will thrive in this shift are the ones who were already thinking at the systems and problems level. AI doesn’t create a new divide; it makes an existing one visible: the gap between “thinks about code” and “thinks about problems.”

On a personal note, as a Principal SWE covering everything from frontend to infra and devops, AI gave me back something I’d been slowly losing over the years: the power to actually ship things, not just architect them. Having Claude running during meetings, checking on it during downtime, watching a plan turn into working software while I focused on the next problem. That felt like a shift.

The Two Experiments

Two projects, very different in nature, same approach. Both happened to be solo experiments out of circumstance (capacity constraints and timing), not because this is meant to be a solo workflow. The lessons apply to how any engineer or team could work.

Project 1: Conversation Extension Context (Web Widget). A major improvement to how we pass context into Conversation Extensions, the mini-apps that run in iframes and webviews. This one spanned three repositories and touched a project that had been dormant for nearly eight years. I was the last person to work on it, so institutional context was mostly gone.

Project 2: Message Streaming (Messaging Platform). Implementing message streaming to improve perceived latency for bot responders. Through normal team process (freeing up a team, ramping, ceremonies), this was originally estimated at six to eight person-months. The AI team flagged it as a priority, but leadership couldn’t see where to free up capacity.

The implementation details of these projects don’t matter here. What matters is how they got built, and what that tells us about where engineering is heading.

Project 1: Learning the Ropes

I started with Claude and the Superpowers plugin, using the brainstorm-then-implement flow. I had a strong high-level mental model of the solution going in, so the AI had good direction to work with. Within a week, I had a working proof of concept.

Then I made a deliberate choice: I wrote the RFC myself, no AI assist. This was my first real AI-driven experiment, and I was blending old habits with new tools. The RFC felt like the one place where the human thinking had to happen: the requirements, the business value, the tradeoffs. My belief then, which has only gotten stronger since: if you nail the RFC, you can hand the implementation to anyone, human or AI.

Once the RFC was approved, I threw away the PoC and started implementation from scratch. You do not polish a turd. Proofs of concept are full of hacks and half-baked requirements. Their job is to prove the approach, not to produce shippable code. With AI, it’s actually more efficient to throw the prototype away and refine the prompts than to try to salvage what you hacked together. A well-written RFC combined with well-refined prompts can potentially get you to a one-shot implementation.

The PR Size Problem

Working solo with AI, PR size wasn’t on my radar. I was in the flow, moving fast, and the AI was keeping up.

The first external review came from the Web Widget tech lead, and her feedback was direct: “The PRs are too big for review.” She was right. A 2,000+ line PR is brutal to review, and things will slip through the cracks.

The root cause was scope creep. When you’re working with an LLM, it’s dangerously easy to say “just do one more thing” and watch the changeset balloon. Solo AI workflow doesn’t naturally produce review-friendly increments. Nobody is there to tap you on the shoulder and say “that’s enough for one PR.” The discipline has to come from you. This was a lesson I absorbed and carried into Project 2.

The project was eventually handed off to a team to complete. It included a Ruby-to-Go service migration outside the receiving team’s usual scope. AI made crossing language boundaries less intimidating for everyone involved.

Project 2: The System Clicks

Streaming was a priority, but nobody could solve the capacity puzzle. The ballpark estimate was six to eight person-months, and no one knew when to slot it in.

I had some free time on my hands. I started toying with a PoC on our existing infrastructure, got something working in about a week, and demoed it. That changed the conversation entirely. “Looks like we can make it work. I’ll go deeper.” This was before Christmas, with a workshop on streaming and related topics planned for mid-January.

From RFC to One-Day RFC

For my entire career, writing the RFC was my thinking tool. The act of writing forced me to process the problem, challenge my own assumptions, and find the gaps. A typical RFC takes me one to one and a half months. Projects span multiple groups, so there’s a lot of context-setting, plus the incubation time of thinking through the problem, plus limited focus blocks due to meetings.

For the streaming project, the thinking shifted. I used Claude’s deep research to study how the industry handles streaming, then generated a roughly 5,000-line implementation plan. I never built a second PoC. The plan itself became my way of thinking through the problem. But the plan was meant for an AI agent to pick up for implementation, not for human review. I still needed a shareable artifact.

So I spent a week building a Claude skill for writing RFCs. I fed it my previous RFCs, a template, and a detailed style guide, with heavy emphasis on avoiding “AI-looking” text. The next week was content development using the skill, and the total RFC time came to roughly two weeks instead of the usual month and a half.

The real shift was flipping from writer to editor. When you write something, you develop ownership and attachment, and it becomes harder to see the flaws in your own prose and your own arguments. When you edit something, even if all the ideas originated from you through brainstorming, the detachment creates critical distance. I find it much easier to criticize and refine ideas when I didn’t physically write them out. The ideas and the judgment are still yours, but by having AI handle the writing, you gain the ability to evaluate your own thinking with fresh eyes. That’s not just faster, it’s arguably a better cognitive process.

People who didn’t know the RFC was AI-written? “This is a well-written RFC.” Nobody flagged it. At least not to my face.

Then came the moment that really drove it home. During the streaming workshop, a new sub-topic came up: how to handle incremental markdown streamed to the client. It needed its own RFC. I ran the full loop (brainstorm, plan, RFC) in a single day and shared it the next afternoon. To be fair, this was a smaller, well-scoped topic. I’m not claiming every RFC can be compressed to 24 hours. But the system cuts down on the two things that used to dominate the timeline: incubation time and the focus time required to sit down and write. When the writing isn’t the bottleneck anymore, you can move as fast as your thinking allows.

Implementation and Review

For implementation, I applied the lessons from Project 1. I switched to the get-shit-done framework for better scope control and added the RFCs into a docs/ folder as the source of truth. PRs came out manageable.

Code review became a multi-layer process: my own comments first, then Claude addressing them, then the Anthropic code-review plugin for a second pass (which caught things I missed by cross-referencing the RFC), then human peer review. I paired with a Senior Staff engineer who did his own AI-assisted review, so both sides were AI-augmented.

The Hard Questions

The 99% Trust Problem

The generated code is actually good. That’s precisely what makes this dangerous. Because it’s good the vast majority of the time, it creates a false sense of security where you start trusting it, reviewing a little less carefully, and that’s when the 1% hits you.

If AI gets it right 99% of the time, the 1% it gets wrong can be catastrophic, and the failures are subtle.

Two examples from my own reviews. In one case, the AI mocked Redis entirely in a test that was specifically about testing a Lua script running on Redis. The mock made the test pass, but the test was testing nothing. In another, it asserted that a constant was equal to its own value. A tautological test that will never fail and will never catch a bug.
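To make the failure mode concrete, here is a minimal sketch of what those two anti-patterns look like. This is an illustrative reconstruction, not the actual code from the reviews: the names and values are hypothetical, and the real tests weren’t in Python.

```python
from unittest import mock

# Hypothetical constant standing in for whatever the real test asserted on.
MAX_RETRIES = 3

def test_constant_is_tautological():
    # Compares the constant to itself, so the assertion can never fail
    # and can never catch a regression.
    assert MAX_RETRIES == MAX_RETRIES

def test_lua_script_with_redis_mocked_away():
    # The test was meant to exercise a Lua script running *on Redis*,
    # but mocking the client means the script never executes at all.
    client = mock.Mock()
    client.eval.return_value = 1          # the mock happily reports success
    result = client.eval("return 1", 0)   # no Redis, no Lua, nothing exercised
    assert result == 1                    # passes while verifying nothing
```

Both tests run green, which is exactly the problem: a passing suite that proves nothing about the behavior it claims to cover.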

These aren’t syntax errors. They’re logical failures that look correct at a glance. This is the category of mistake that worries me most, because as AI output gets better on average, people will lower their guard on review. That’s when the 1% becomes dangerous.

Accountability Doesn’t Change

People have asked me, “Did you look at the code? What’s your feeling about it?”

My answer: it doesn’t really matter what I think about it stylistically. The same standard applies as reviewing any team’s pull request. Does it meet the requirements? Is it sound enough to maintain our quality bar?

What doesn’t change: your name is on the PR and on the commit, so if there’s an incident, you’re on the hook. AI doesn’t replace accountability. If you messed up using the tool, it’s still on you, and you better understand whatever it built.

Ownership and On-Call Readiness

A question that comes up a lot: “If you didn’t write the code, how do you handle on-call? How do you own it?”

Say two people on your team built a feature together, but the other three, including you, were working on something else. Maybe you were aware of the project. You reviewed their code, so at a high level, it went past you. You have a general idea of what it does, you saw some of the implementation. Now you’re on-call that night and that feature breaks. What do you do?

You read the code, check the logs, and trace the problem. You’ve never been expected to have personally written every line you’re on-call for. That’s not how teams work.

AI-generated code is no different. You reviewed it, you understand the intent, you know the architecture. When it breaks, you debug it the same way you’d debug any code a colleague wrote. The same applies to maintenance. If someone needs to modify this code six months from now, that’s the same situation as inheriting a codebase from a teammate who left the company. It’s not new. It just feels new because the original author isn’t human.

What This Means

For Engineers

The role is shifting, all of it, not just at the senior level. Engineers who already think about systems and problems will adapt naturally. Engineers who define themselves primarily through code will feel the ground moving under them.

AI collapses the gap between “I know what to build” and “it’s built.” For senior engineers who’ve spent years accumulating knowledge they couldn’t always act on fast enough, that’s exciting, and it was for me. Getting back the ability to build, not just advise on building, has been one of the most energizing shifts in my career.

The knowledge gap problem isn’t new. Juniors have always needed time to develop the judgment that seniors carry. AI puts a sharper edge on it: if you don’t have the experience to recognize when the AI is wrong, the 99% trust problem gets worse. But this is the same challenge engineering has always faced, just wearing a new outfit.

For Teams and Organizations

Team structures built around implementation capacity will need to evolve. The value shifts toward requirements, validation, operational readiness, and review. The question isn’t “who writes the code” but “who owns the thinking.” Specs, architecture decisions, tradeoff analysis, quality validation. Those are the high-leverage activities, and every engineer on a team should be doing more of them.

Making everything “AI-digestible” becomes important. If someone encounters unfamiliar code or a new domain, they should be able to point AI at it and get up to speed. That changes how we document, how we structure repos, how we think about onboarding new people.

We’re still early in this, figuring it out in real time, and I expect the answers to look different a year from now.

The Personal Question

Here’s what I’d ask anyone reading this: what does software engineering mean to you?

Is it writing beautiful code? Or is it building solutions to problems?

If it’s the first, this transition will feel like a loss. If it’s the second, it’s the most exciting time to be an engineer.
