How to Build Effective Agent Skills with the Dry Run Workflow
Learn how to turn one-off tasks into reusable agent skills using a practical dry run workflow that improves consistency, efficiency, and automation.
Flavio Del Grosso•Mar 22, 2026
5 min read•1287 words
I’ve seen a lot of people overcomplicate “agent skills” before they’ve even built one.
They start by designing abstractions. Naming conventions. Folder structures. Some even sketch out entire taxonomies of capabilities like they’re building an operating system. And then—nothing actually ships.
The irony is that the most effective way to build agent skills is almost aggressively unglamorous: you just do the work first.
What I’ve come to think of as the “dry run” workflow flips the usual instinct on its head. And in practice, it’s the difference between skills that look good on paper and skills you actually use.
The real point of agent skills (and why most people miss it)
At a surface level, agent skills are straightforward: reusable, structured workflows that an AI agent can execute without you re-explaining everything each time.
But that definition undersells what’s actually happening.
A good skill isn’t just a saved prompt. It’s a compressed decision-making process. It captures not just what to do, but how you prefer it done—your defaults, your edge cases, your tolerance for mess.
That’s the part most people skip.
They treat skills like templates. Generic, vaguely useful, and ultimately forgettable.
The real value shows up when a skill reflects how you work. Not how a model was generally trained. Not how a tutorial suggests things should be done. Your version.
And that only emerges if you build the skill from reality, not from theory.
Why one-off prompting breaks down faster than you think
If you’re working with agents regularly, you’ve probably felt this already.
You solve a task once—say, cleaning up a dataset or setting up a service—and it goes well. Then a week later, you try to recreate it. You vaguely remember the steps, but not quite. You re-prompt. You tweak. You get something close, but not identical.
That drift is the problem.
Without skills, every interaction resets the context. You’re relying on memory and improvisation, which is fine for exploration but terrible for consistency. Small differences creep in. Outputs diverge. You spend time re-deciding things you already decided.
Skills are a way of saying: “No, this part is settled.”
They lock in a process so you can move on to harder problems.
The mistake: designing the skill before doing the work
Here’s where things usually go wrong.
People try to write the skill upfront.
They imagine the steps. They generalize too early. They optimize for reusability before they’ve even proven the workflow works end-to-end. The result is predictable: a clean-looking artifact that breaks the moment it touches reality.
I’ve done this myself. It feels productive. It isn’t.
You can’t abstract a process you haven’t actually executed. At best, you’re guessing. At worst, you’re encoding assumptions that will come back to bite you.
The dry run workflow
The fix is simple, and a little counterintuitive: delay the abstraction.
Instead of starting with the skill, start with a real task and treat the first execution as a live prototype.
1. Pick something you’ll actually repeat
This sounds obvious, but it matters.
Good candidates tend to have a few characteristics: they’re multi-step, a bit tedious, and show up more often than you’d like. Think environment setup, file normalization, deployment routines, data transformations.
If you’re only going to do it once, don’t bother. Skills pay off through repetition.
2. Run it manually—with the agent
This is the part people underestimate.
Work through the task step by step with your agent. Give instructions incrementally. Adjust as you go. When something doesn’t look right, fix it immediately.
You’re not documenting yet. You’re discovering the process.
This is where the real decisions happen:
- What order actually works?
- What assumptions break?
- Which steps are unnecessary?
- Where do you need explicit constraints?
By the end, you should have a result that feels “right,” not just “acceptable.”
3. Only then, turn it into a skill
Once you’ve completed the task, you package it.
At this point, the structure almost writes itself. The sequence of steps is already proven. The edge cases are visible. The important details aren’t hypothetical—they’re grounded in what just worked.
You’re not inventing a workflow. You’re capturing one.
That difference is everything.
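For concreteness, a packaged skill can be as plain as a markdown file with a short header. The sketch below assumes the SKILL.md-with-frontmatter convention that some agent frameworks use; the name, description, and every step are illustrative stand-ins for whatever your own dry run actually proved out:

```markdown
---
name: setup-dev-environment
description: Bootstrap a new service repo with my preferred tooling and defaults.
---

# Setup dev environment

1. Create a virtual environment and pin the Python version we standardized on.
2. Install dev dependencies; fail loudly if the lockfile is out of date.
3. Copy `.env.example` to `.env` and ask me for any secrets it cannot infer.
4. Run the test suite once; do not proceed if anything fails.
```

Notice that steps 2 and 4 encode tolerances ("fail loudly", "do not proceed"), not just actions. Those are exactly the decisions a dry run surfaces and a template invented upfront tends to miss.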
What this looks like in practice
Take something mundane: a messy directory full of inconsistently named files.
If you tried to design a “file cleanup skill” upfront, you’d probably write something generic—normalize names, remove duplicates, maybe generate an index. It would work, sort of.
But do a dry run and it gets more interesting.
You notice that timestamps matter more than you expected. That certain naming patterns should be preserved. That duplicate detection isn’t trivial. You refine rules on the fly. You correct the agent when it makes reasonable but wrong assumptions.
By the time you’re done, you haven’t just cleaned files—you’ve defined your standard for what “clean” means.
Now when you convert that into a skill, it carries those decisions forward.
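Decisions like these translate almost directly into code. Here's a minimal Python sketch of what captured normalization rules might look like; the specific rules (lowercase, dashes, preserved ISO date prefixes) and names like `normalize_filename` are hypothetical examples of what a dry run could surface, not a prescribed standard:

```python
import re
from pathlib import Path

# Illustrative rules "earned" in a dry run: lowercase everything,
# use dashes as separators, but preserve ISO date prefixes like 2026-03-22.
DATE_PREFIX = re.compile(r"^(\d{4}-\d{2}-\d{2})[ _-]*(.*)$")

def normalize_filename(name: str) -> str:
    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension at all
        stem, ext = name, ""
    prefix = ""
    m = DATE_PREFIX.match(stem)
    if m:  # keep the timestamp exactly, normalize only the rest
        prefix, stem = m.group(1) + "-", m.group(2)
    stem = re.sub(r"[ _]+", "-", stem.strip().lower())   # spaces/underscores -> dashes
    stem = re.sub(r"[^a-z0-9-]", "", stem)               # drop stray characters
    stem = re.sub(r"-{2,}", "-", stem).strip("-")        # collapse dash runs
    return prefix + stem + (("." + ext.lower()) if ext else "")

def clean_directory(path: Path, dry_run: bool = True) -> list[tuple[str, str]]:
    """Return (old, new) rename pairs; only touch disk when dry_run is False."""
    renames = []
    for f in sorted(path.iterdir()):
        if f.is_file():
            new = normalize_filename(f.name)
            if new != f.name:
                renames.append((f.name, new))
                if not dry_run:
                    f.rename(f.with_name(new))
    return renames
```

Fittingly, `clean_directory` defaults to a dry run of its own: it reports what it would rename before you let it act, mirroring the workflow this article describes.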
Same story with something like Dockerizing a project. Yes, an agent can do it instantly. But “a Dockerfile” isn’t the goal—your Dockerfile is.
Maybe you care about image size. Maybe you have strict version pinning. Maybe your port handling is nonstandard for good reasons. Those preferences only surface when you actually go through the process and push back on defaults.
The dry run forces that interaction.
Why this approach works (and keeps working)
There are three things I like about this method, and they compound over time.
First, it anchors everything in reality. You’re not guessing what might work—you’ve already seen it work.
Second, it naturally builds iteration into the process. You refine as you go, instead of discovering flaws after you’ve “finalized” a skill.
Third, it produces skills you’ll trust. And that’s not a small thing. If you don’t trust a skill, you won’t use it. You’ll fall back to manual prompting, and the whole exercise collapses.
The obvious objection
There’s a reasonable pushback here: isn’t this slower?
Yes, the first time.
Running a full dry run takes longer than sketching out a quick reusable template. If you’re under time pressure, it can feel like overkill.
But that’s only true if you ignore what happens next.
The second time you run the task, the skill saves you minutes. Then hours. Then cognitive load you didn’t realize you were carrying. More importantly, it eliminates variability. You stop rethinking solved problems.
So the trade-off isn’t speed vs. rigor. It’s when you pay the cost.
I’d rather pay it once, properly.
When a task deserves to become a skill
Not everything should be turned into a skill. That way lies a graveyard of unused automation.
But there’s a reliable signal I’ve learned to trust: repetition plus mild annoyance.
If you find yourself thinking, “I’ve done this before, and I don’t want to think about it again,” that’s the moment. Especially if the task has enough steps that you can’t hold it all in your head without effort.
That’s where skills shine—not in flashy demos, but in removing friction from the work you actually do.
Where this leaves us
The shift here is subtle but important.
Stop trying to design perfect, reusable systems upfront. Start by executing real work, paying attention to the decisions you make along the way. Then capture that.
In my experience, the best agent skills aren’t engineered—they’re distilled.
And once you start working this way, you notice something else: your “one-off” tasks start disappearing. They quietly turn into a library of capabilities you trust, because each one was earned the hard way—by doing the job first.