I expected 300 lines of JSON.
I got 180,000.
Me: “AI, what just happened?!”
AI: “I’m not sure, but this unrelated thing could turn into a possible bug in the future.”
I just about flipped the metaphorical table.
Maybe this whole AI development workflow wasn’t going to work after all.
But I’d made real progress the previous week by switching to a spec doc + review process to guide the agents. Still, the same problems kept showing up—lack of focus, lost context, and a constant need for manual glue work on my part.
That’s when I took a hard look at my own behavior.
I was still doing too much manually. I wasn’t leaning on the tools available to me.
Two Core Problems
- Tooling context was missing. Things like package versions, app architecture, authentication strategies—I kept having to retype and re-explain them.
- Role-based prompting was effective—but exhausting. I had to remember what to say, when to say it, and to whom. It didn’t scale, even with good intentions.
So I fixed it.
I created a set of Cursor rules and updated our AGENTS.md file for OpenAI Codex. Each agent now has a persona—and a defined role in the workflow.
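For context, a persona entry in AGENTS.md might look something like this. It's a hypothetical sketch; the persona name matches the workflow below, but the template path and the specific rules are placeholders you'd adapt to your own repo:

```markdown
<!-- Hypothetical AGENTS.md excerpt; paths and rules are illustrative -->
## @Peter — Spec Writer

When addressed as @Peter, you write specification documents.

- Follow the project's spec template.
- State the problem, the proposed change, and the migration path.
- List affected packages and their current versions so context isn't retyped.
- Do not write implementation code; stop at the spec.
```

Baking the tooling context (versions, architecture, auth strategy) into the rules is what removes the retyping from the first problem above.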
Meet the Ghostbusters (Agent Edition)
Here's how it works, step by step:
1. @Peter
Writes the initial spec. For example:
@Peter write a spec on standardizing on undici usage and removing native fetch.
He follows a template and ruleset to ensure clarity.
2. @Egon
Reviews the spec with rigor:
@Egon review @undici-standardization.md
I assess Egon’s comments, incorporate the best suggestions, and backlog the rest.
3. @Winston / Codex
Implements the changes, guided by path-based rules that handle both code standards and app-specific logic.
Each PR is reviewed and submitted sequentially, which lets me manage the workflow from my phone—even while away from my desk.
4. @Ray
Gives final implementation feedback. I direct any final polish or fixes based on Ray’s review.
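The path-based rules from step 3 can be sketched as a Cursor project rule. Cursor rules live in `.cursor/rules/` as `.mdc` files whose frontmatter `globs` field scopes them to matching paths; the file below is illustrative (the glob, description, and guidance are assumptions, using the undici spec as the running example):

```markdown
---
description: HTTP client standards for server code
globs: ["src/server/**/*.ts"]
alwaysApply: false
---

<!-- Hypothetical rule file; adjust globs and guidance to your repo -->
- Use undici for all outbound HTTP requests; do not use native fetch.
- Keep the undici version pinned in package.json and note it in any spec that touches HTTP.
```

Because the rule only activates on matching paths, @Winston gets the code standards relevant to the files he's touching without every prompt restating them.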
The Results?
Fewer bugs.
Cleaner PRs.
Much better alignment on architecture and intent.
It’s working.
What’s Still Hard?
- Knowing when and where to load data in client apps
- Avoiding excessive API calls
- Building consistently with a design system
And the biggest gap?
UI impact.
Agents can write Tailwind all day—but they don’t run the code. They can’t feel the experience.
That’s where I still lead—UX, product decisions, and visual storytelling.
Coming Soon…
I’m prototyping two new personas:
- @Janine: Reviews specs for UX edge cases and challenges assumptions
- @Zuul: Checks layout consistency, visual polish, and Tailwind usage
We’ll see how they do 🤞
This agent workflow is evolving into something real.
Ghosts beware.