[Diagram: the seven-phase workflow, brainstorm → ship, with TDD and review gates at every step]

When Slower Is Actually Faster in Claude Code

A structured Claude Code workflow that replaced plan-and-pray with repeatable, first-pass-clean features.

Tags: dev, ai, claude, workflow

One afternoon I built a clip storefront from my phone. A few weeks later, a proposals portal over Telegram in a single evening. Both shipped, both worked. Both times, the pattern was the same: describe what I want, let Claude Code plan it, watch the implementation roll out, fix the bugs that surface. Two steps and a lot of debugging.

But speed hides a problem. When you go straight from idea to code, you skip the phase where you figure out if your idea is actually the right one. You skip the phase where edge cases surface before they become bugs. You skip the part where someone asks “wait, what happens when a viewer deep-links to photo 742 in a shuffled gallery?”

Nobody asks that question during implementation. By then you’re already writing code.

Where Plan-and-Implement Falls Apart

My King of Hammers 2026 photo gallery has 1,528 photos. Every visitor sees them in the same order. First photo is always first. Last photo is always last. Share the link to ten people, they all scroll through the same sequence from the top.

Shuffling the order on every page load, so repeat visitors and new viewers alike would see something different each time, sounded simple.

Here’s the plan-and-implement version of this: “Add shuffle to the gallery. Randomize photo order on load.” Claude writes a plan, dispatches agents, ships code. Maybe it works for the basic case. But 1,528 photos means you can’t render them all at once without destroying load time. And if you’re loading progressively, what happens to the lightbox? What happens to deep links? What about the cart integration?

These aren’t obscure edge cases. They’re the first things a real user would hit. But in a two-step workflow, they surface during implementation, which means you’re redesigning on the fly while writing code. You patch one thing, break another, discover a third problem, back up, try again. Working code, accidental architecture.

Ask me how I know. Features that shipped but felt fragile, tests that passed but didn’t cover the scenarios that actually mattered, working code built on decisions made while writing it instead of before.

A Structured Claude Code Workflow

For the past few months, a set of skills called superpowers has enforced a specific sequence in my Claude Code workflow, not as guidelines but as hard requirements where each phase must complete before the next one starts.

idea → brainstorm → spec → spec review → plan → plan review → implement → code review

Seven phases instead of two, which sounds slower until you see what each one actually prevents. Here’s how it played out for the gallery shuffle.

Phase 1: Brainstorming

Nine words in the prompt: “I want to shuffle the King of Hammers gallery.” Enough to send Claude Code straight into writing a shuffle function.

Instead of jumping to code, the brainstorming skill asked questions. One at a time. Multiple choice where possible, open-ended when needed.

  • Should shuffle be automatic for all large galleries, or manually tagged?
  • Client-side shuffle or server-side on each request?
  • Load all photos at once (current behavior) or progressively?
  • What batch size keeps columns even across all responsive breakpoints?
  • How should deep links work when the order is random?

Each question forced me to make a design decision I would have otherwise made accidentally during implementation. By the end, I had a clear picture of what I actually wanted: client-side Fisher-Yates shuffle, progressive loading in batches of 48 (divisible by 1, 2, 3, and 4 column layouts), an opt-in flag per gallery, and deep links that resolve by photo ID rather than array index.
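Those decisions are concrete enough to sketch. Here's a minimal, hypothetical version of the shuffle and batching logic; the function names are mine, not the project's:

```javascript
// Illustrative sketch only; names are hypothetical, not the project's.
// Fisher-Yates walks the array backwards, swapping each slot with a
// uniformly random earlier slot (or itself).
function shuffle(photos) {
  const out = photos.slice(); // copy so the original order is untouched
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// 48 divides evenly by 1, 2, 3, and 4, so every fully loaded batch
// leaves each responsive column layout level.
const BATCH_SIZE = 48;

function nextBatch(shuffled, batchIndex) {
  const start = batchIndex * BATCH_SIZE;
  return shuffled.slice(start, start + BATCH_SIZE);
}
```

The copy-then-swap shape matters: shuffling a copy keeps the server-rendered order stable as a fallback, and the final partial batch (1,528 % 48 = 40 photos) is the only one that can leave columns uneven.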

Phase 2: Spec

After brainstorming, the workflow wrote a design spec covering the problem statement, data model, render flow, performance analysis, and testing strategy. No code blocks, no implementation details. Just what the system needs to do and why it needs to do it that way.

From the spec:

Photos load in batches as the viewer scrolls rather than all at once. Shuffle galleries render an empty grid with photo data as inline JSON; client JS shuffles and renders in batches of 48.

It called out that the lightbox photo array must be built from the shuffled order. That deep linking via #photo-742 resolves by photo ID, not array index. That if a deep-linked photo hasn’t been rendered yet because it’s in a later batch, the system must render all batches up to that photo before opening the lightbox.
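That deep-link rule can be sketched directly. This is a hypothetical version, not the project's actual code: resolve the hash by photo ID, never by array index, and report how many batches must render before the lightbox can open.

```javascript
// Hypothetical sketch of the spec's deep-link rule. Position changes on
// every shuffle, so #photo-742 must resolve by ID, not array index.
const BATCH_SIZE = 48;

function resolveDeepLink(hash, shuffledPhotos) {
  const match = /^#photo-(\d+)$/.exec(hash);
  if (!match) return null; // not a photo deep link
  const id = Number(match[1]);
  const index = shuffledPhotos.findIndex((photo) => photo.id === id);
  if (index === -1) return null; // unknown photo ID
  // Batches 0..floor(index / BATCH_SIZE) must all render before the
  // lightbox opens on this photo.
  return { index, batchesNeeded: Math.floor(index / BATCH_SIZE) + 1 };
}
```

A photo that lands at index 100 in the shuffled order, for example, sits in the third batch, so three batches must render before the lightbox opens.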

All of these decisions existed as written requirements before anyone wrote a line of code. In the plan-and-implement model, each would have been a surprise mid-implementation, discovered as a bug, patched with a quick fix, covered by a test added after the fact. Here they were baked in from the start.

Phase 3: Spec Review

Next, the spec went through a code-review agent, which sounds redundant until you realize it’s reviewing the design, not the code. A structured pass over the spec itself, checking for gaps and contradictions before any implementation begins.

It checked for internal consistency. Does the batch size align with the responsive column counts? Does the deep link behavior account for the progressive loading? Does the cart integration still work when photos move around?
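The first of those checks is mechanical enough to state directly. A toy version, assuming the 1- to 4-column responsive layouts from the brainstorm phase:

```javascript
// Toy version of the batch/column consistency check from spec review,
// assuming the 1- to 4-column layouts discussed during brainstorming.
const BATCH_SIZE = 48;
const COLUMN_COUNTS = [1, 2, 3, 4];

// True only if every fully loaded batch leaves every layout level.
const batchKeepsColumnsEven = COLUMN_COUNTS.every(
  (cols) => BATCH_SIZE % cols === 0
);
```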

Spec review caught that I hadn’t specified what the lightbox swipe order should be in a shuffled gallery. Left to implementation, the lightbox might have used DOM order (only rendered photos) while the user expected to swipe through all 1,528 in the shuffled sequence. A one-line addition to the spec prevented a UX bug that would have been painful to diagnose later.

Phase 4: Implementation Plan

Only after the spec was approved did the system generate an implementation plan. Eight tasks, each with exact file paths, test-first steps, and commit messages.

  • Task 1: Fisher-Yates shuffle utility with 11 unit tests.
  • Task 2: Add the shuffle flag to gallery data.
  • Task 3: Conditional server render.
  • Task 4: Client-side progressive loading script.
  • Task 5: PhotoSwipe lightbox updates.
  • Task 6: Image retry guards.
  • Task 7: Full test suite verification.
  • Task 8: Local dev testing checklist.

Every task started with “write failing tests” before touching implementation, because TDD isn’t a suggestion in this workflow. It’s a structural requirement baked into the plan format itself.

Phase 5: Plan Review

Another review checkpoint. A code-review agent verified the plan against the spec. Does every spec requirement map to at least one task? Are there tasks that don’t trace back to a spec requirement? Are the test assertions actually testing the behaviors we defined?

This is where I caught redundant work. An early draft of the plan had duplicate test coverage across tasks 3 and 7. Review consolidated it. Small thing, but the implementation would have been messier without it.

Phase 6: Subagent Implementation

Here’s where it gets fast. All eight tasks dispatched as parallel Claude Code subagents, each working on an independent task with its own isolated context and no shared state bleeding between them.

One agent wrote the shuffle utility and 11 tests while another added the data flag. A third modified the Astro template for conditional rendering at the same time a fourth built the progressive loading script and a fifth updated the lightbox. Each agent ran its tests, verified green, and committed independently.

Five agents in parallel, all following the plan’s exact instructions with tests written before implementation and no room for improvisation or scope creep. From plan approval to all tasks committed, the whole thing took less time than my previous approach of iterating through bugs one at a time, because the structure had already eliminated the rework.

Phase 7: Post-Implementation Review

After all tasks shipped, the code-review agent ran again against the actual committed code to verify that every spec requirement had a corresponding test, that no dead code snuck in, and that the implementation actually matched the design it was supposed to follow.

It caught a minor issue: the image retry logic in the existing script would run on page load for shuffle galleries and find zero images, because the progressive loader hadn’t rendered them yet. A one-line guard fixed it. Without the review phase, that would have been a silent performance waste on every shuffle gallery load.
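A hedged sketch of that kind of guard, with the retry pass reduced to a pure function for illustration; the real script works on DOM nodes:

```javascript
// Illustrative sketch only. The guard: a shuffle gallery renders zero
// images at page load because the progressive loader hasn't run yet,
// so the retry pass has nothing to do and should exit immediately.
function imagesToRetry(renderedImages) {
  if (renderedImages.length === 0) return []; // guard: nothing rendered yet
  return renderedImages.filter((img) => !img.loaded);
}
```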

Why Seven Phases Beat Two

Plan-and-implement feels fast because you start writing code immediately. But you end up writing code three times: once for the initial attempt, once to fix the edge cases you discover, once to refactor the parts that don’t fit together.

Brainstorm-spec-review-plan-review-implement-review feels slower because you don’t touch code until phase 6. But when you do touch code, the design is solid, the edge cases are documented, and the tests exist before the implementation. You write code once.

For the gallery shuffle, the spec identified 6 behaviors that needed tests, mapped to 8 tasks with 30+ test assertions. Implementation passed on the first run for 7 of the 8 tasks. One task needed a minor fix caught by its own tests.

Compare that to my storefront build, where I fixed 14 bugs across 6 follow-up sessions. Good code, working product, but the architecture emerged from debugging rather than from design.

There’s a less obvious cost too. When you’re debugging in a long Claude Code session, the context window fills up with error messages, failed attempts, stack traces, and incremental fixes. Eventually the conversation hits compaction, where earlier context gets compressed to make room for new messages. Claude starts losing track of decisions it made earlier in the session, which means its suggestions get worse exactly when the problems get harder. I’ve watched this happen: the fix quality degrades around the fourth or fifth bug in a session because the agent no longer has full visibility into the changes it already made. With the structured workflow, each subagent starts fresh with a clean context window, a focused task, and the full spec as its only reference point. No accumulated debugging noise, no compaction, no degraded recommendations halfway through.

When to Use Which

Low-stakes work still gets the two-step treatment: quick scripts, one-off experiments, prototypes destined for the trash. When exploring without a clear goal, jumping straight to code is the right call.

But for features that touch existing systems, have user-facing implications, or need to work reliably at scale, seven phases wins every time. The proposals portal used a lighter version of this flow and the gallery shuffle used the full pipeline, and both shipped clean on the first real deploy.

It isn’t about being more careful so much as front-loading the thinking so you don’t pay for it later with debugging time. Brainstorming turns gut feelings into actual decisions, specs turn those decisions into testable requirements, reviews catch the gaps between what you intended and what you described, and plans turn the whole thing into parallelizable work that Claude Code can execute without improvising.

My photo galleries now show 1,528 King of Hammers photos in a different order on every visit, with progressive loading, deep link support, lightbox integration, and full cart compatibility. All of it was tested, reviewed, and designed before the first line of implementation was written. I’ll take that over seven debugging sessions every time.

If you’re building with Claude Code and want to compare workflows, I wrote about the self-improving skills loop that keeps these phases sharp over time. Or get in touch if you want to talk shop.