Workshop-DeckAuthoring Agent SkillsKeys: arrows · space · home/end · N notes · G guide1 / 17
Workshop · 60 minutes · hands-on
Authoring agent skills: a hands-on workshop
How to write a SKILL.md that an agent actually triggers, follows, and can verify — using the htmlify repo's own skills as the worked example.
Who this is for
Developers who use Claude Code, Codex, or any agentskills.io-compatible client and want to package a repeatable workflow as a skill.
What you leave with
A drafted frontmatter description, three rules rewritten to be testable, and a validator lab you ran yourself. Bring a laptop with Node 20+ and a clone of the repo.
3 acts · 3 checkpoints · printable handout under the Guide button (G)
Cold open
You have explained this workflow to your agent fourteen times
“Remember to inline the CSS… no external fonts… check it opens standalone… oh, and add print styles again.”
Every session starts from zero. The workflow lives in your head and leaks into chat one correction at a time.
A skill is that workflow written down once — with a trigger, rules, and a way to check the result.
The agent reads it when the request matches, follows the rules, and validates its own output. No re-explaining.
Today's worked example: the htmlify and deckify skills that produced the deck you are looking at.
Orientation
Three acts, three checkpoints
Fig 1 · The hour at a glance — checkpoints in orange are yours, not mine
Act 1: what a skill file is. Act 2: rules an agent will actually follow. Act 3: making the output check itself. Each act ends with you doing the thing.
---
name: htmlify
description: Create self-contained HTML artifacts from agent or repo
context, including operator briefs, build plans, implementation maps,
PR/release packets, incident timelines, … Use when the user asks
to turn dense text, code evidence, plans, reviews, or status into
browser-ready HTML instead of a markdown wall.
license: Apache-2.0
metadata:
version: "0.3.1"
source: "https://github.com/zakelfassi/htmlify"
---
Specific verbs — “create self-contained HTML artifacts”, not “helps with HTML”.
Artifact nouns — the things users actually ask for, by name.
“Use when…” — describes the request, from the user's side of the conversation.
Versioned and licensed — metadata.version is bumped by release automation, not by hand.
Every token in SKILL.md is paid on every triggered request. Reference files are paid only when the task needs them — so the skill says exactly when to load each one.
Checkpoint 1 · 7 min
Write the frontmatter description for your skill idea
Worksheet · also in the handout
name:
description: (3–5 sentences)
Scoring criteria
Specific verbs: what it produces or does, concretely.
Artifact nouns: the outputs named the way users name them.
“Use when…” clause: request shapes, quoted in user language.
Bounded: says what it is not for if a sibling skill exists.
Test: would a stranger reading nothing but your description know exactly which requests should fire it — and which should not?
## Operating Rules
1. Gather evidence first. Read the repo, docs, git state, PR/CI/deploy
state, logs, … Mark uncertain claims as `needs verification`.
2. Pick the smallest artifact mode that fits the request:
- `operator-brief`: what happened, what is next, blockers, risks…
- `build-plan`: problem, target shape, phases, owners, validation…
4. Keep it self-contained. Inline CSS and JS; no external fonts,
CDNs, assets, analytics, or build step unless the user explicitly asks.
11. Validate before final response: doctype, standalone
<html>/<body>, no missing local assets, expected sections…
Numbered, imperative, and ordered as a procedure: evidence → mode choice → constraints → validation. An agent can cite “rule 4” back to you; it cannot cite a vibe.
Act 2 · Rules
A mode menu is a decision procedure
deckify mode
Choose when the user wants…
talk-deck
a YouTube/live presentation with speaker notes and run-of-show
workshop-deck
a talk plus exercises, labs, checkpoints, and handouts — this deck
essay-deck
a presentation plus a downloadable long-form guide
demo-deck
a session centered on live demos with fallback screenshots
launch-deck
product narrative, proof, risks, roadmap, and a CTA
teaching-guide
a PDF-first guide with an optional slide mode
“Smallest mode that fits” converts a fuzzy judgment call into a lookup. The agent stops guessing scope; the user gets predictable shapes.
Act 2 · Rules
Anti-pattern: rules you cannot test
Vague vibe
Testable rule (from this repo's skills)
“Make it look professional”
Default to the Hardcopy tokens: one accent color, hairlines over shadows, border-radius ≤ 2px, serif display headings.
“Should work offline”
Inline all CSS/JS; no external fonts, CDNs, or remote assets — the validator errors on any remote reference.
“Help the presenter”
Every substantive slide contains <aside class="notes">; the validator reports each missing one by slide number.
“Keep it reasonable in size”
Warn above 512 KiB; hard error above 2 MiB.
Litmus test: could a script check it? If yes, it is a rule. If no, it is a hope. Unbounded scope (“and anything else useful”) fails the same test.
Checkpoint 2 · 7 min
Convert three vague instructions into testable rules
Rewrite each so a script — or a strict reviewer — could verify compliance
“Make the output look nice.”
“Don't make the file too big.”
“Add some interactivity if it helps.”
A good rewrite has
a named standard (a token set, a template, a contract file) instead of an adjective,
a number or enumerable condition where one exists,
a consequence: what happens, or what check fails, when it is violated.
Act 3 · Verify
A skill that can check its own output beats one that hopes
Fig 4 · The validation loop the deckify skill mandates before any final response
Act 3 · Verify
Case study: what --profile deck enforces
Contract item
Check
Severity
Standalone document
doctype, one <html>/<body>, non-empty title, viewport meta
error
Slides
≥ 2 <section class="slide"> elements
error
Keyboard nav
a script registering a keydown listener
error
Speaker notes
every substantive slide has <aside class="notes">, reported per slide
error
Script safety
inline script only; no handler attributes; no javascript: URLs
error
Self-contained
no remote assets, fonts, or stylesheet links
error
Print mode
@media print rules for the handout
warning
Size
> 512 KiB warns; > 2 MiB errors
both
The contract lives in references/deck-template.md; the code lives in src/validate.js. Doc and check ship together, in the same repo, versioned together.
lab/broken-deck.html: INVALID — 4 errors
[no-viewport] missing viewport meta tag
[no-slides] found 1, need ≥ 2
[no-keyboard-nav] no keydown listener
[missing-notes] slide 1 ("Demo")
Steps
Copy the broken snippet from the handout (Exercise 3) into lab/broken-deck.html.
Run the validator. Read each error code.
Fix one error at a time; re-run after each fix.
Stop at: valid — 0 errors.
Finished early? Run it against this deck file and read what a passing report looks like.
Field guide
Three ways skills die
Failure
Symptom
Fix
Never triggers
Description written from the implementation's side (“parses AST…”); users' actual words appear nowhere.
Rewrite from the requester's side: verbs, artifact nouns, quoted “use when” phrases.
Triggers too much
Grabby verbs and unbounded nouns (“helps with any web content”); fires on requests it handles badly.
Narrow the nouns; name what it is not for; point sideways at sibling skills.
Monolithic
2,000-line SKILL.md taxing every invocation; agents skim and miss the rules that matter.
Keep SKILL.md to rules and pointers; move depth to references/ with load-when conditions.
htmlify and deckify dodge the second failure by pointing at each other: htmlify's rule 8 sends full presentation requests to deckify.
Press G for the handout; print it from there for the PDF.
Workshop handout · keep after the session
Authoring Agent Skills — Workshop Guide
Modeworkshop-deck
Duration60 minutes
Source repozakelfassi/htmlify
Formatagentskills.io
This is not a transcript. It is the part of the workshop you will want at your desk next week: the three act summaries, the three exercises with hints and solution sketches, the skill-authoring checklist, and references.
1.0Act 1 — Anatomy of a skill
A skill is a single SKILL.md with three layers that fail in three different ways:
Frontmatter (name, description, license, metadata.version, metadata.source). The description is the trigger surface: the harness matches incoming requests against this text alone. Write it from the requester's side of the conversation — specific verbs, artifact nouns, and a literal “Use when…” clause quoting how users actually ask.
Body: numbered operating rules, a mode menu, validation orders, and a final-response contract. This is the behavior contract once the skill is loaded.
Pointers into references/: depth that is loaded on demand. Every token in SKILL.md is paid on every triggered request; reference files are paid only when needed. Each pointer should name the file and the condition: “Load references/hardcopy.md … before styling any artifact.”
2.0Act 2 — Operating rules agents follow
Rules that survive contact with an agent share three properties:
Numbered and imperative. “1. Gather evidence first.” An agent can cite rule numbers; it cannot cite vibes. The ordering encodes the workflow: evidence → mode → constraints → validation.
Smallest mode that fits. Replace “use judgment” with an enumerated menu plus a selection rule. htmlify names ten artifact modes; deckify names six deck modes. The menu turns scope guessing into a lookup.
Testable, not vague. The litmus test: could a script check it? “Look professional” is a hope; “one accent color, hairlines over shadows, radius ≤ 2px” is a rule. Unbounded scope (“and anything else useful”) fails the same test.
3.0Act 3 — Making output verifiable
A skill that can check its own output beats one that hopes. The htmlify repo demonstrates the full pattern:
The skill orders the loop: “Before the final response, run the bundled validator … Fix every reported error before responding; report remaining warnings.”
The contract is documented next to the code: references/deck-template.md describes the DOM contract; src/validate.js enforces it; both version together.
Two enforcement points: the agent validates before responding (skill rule), and CI re-validates every committed example (repo rule). The skill can be ignored; CI cannot.
Error messages are agent UX: “Substantive slide 7 has no speaker notes” is fixable in one step; “invalid deck” is not.
4.0Exercise 1 — Write a trigger description
Exercise 1 · 7 min
Task. Pick a workflow you repeat with your agent. Write the frontmatter name and a 3–5 sentence description.
Hints. Start with the verb and the artifact (“Generate release notes…”). List the output nouns the way users say them. End with a “Use when the user asks…” clause quoting 2–3 real request phrasings. If a sibling skill exists, say what this one is not for.
Solution sketch (for a release-notes skill):
yaml · sample answer
name: release-notes
description: Generate release notes and changelogs from merged PRs
and commit history, producing version-grouped markdown and a
paste-ready announcement. Use when the user asks to "write release
notes", "summarize what shipped", or turn git history into an
announcement. Not for live deploy status — use a status skill.
5.0Exercise 2 — Vague to testable
Exercise 2 · 7 min
Task. Rewrite each instruction so a script — or a strict reviewer — could verify compliance.
“Make the output look nice.”
“Don't make the file too big.”
“Add some interactivity if it helps.”
Hints. Name a standard instead of an adjective; attach a number where one exists; state which check fails on violation.
Solution sketches (each maps to a shipped rule in this repo):
“Default to the Hardcopy design tokens: serif display headings, mono-uppercase metadata, at most one accent color, hairlines instead of shadows, border-radius ≤ 2px. No gradients, no stock imagery.”
“Keep the artifact under 512 KiB; the validator warns above that and errors above 2 MiB.”
“For deck-style artifacts, add keyboard navigation (arrows, Home/End) registered with addEventListener in one inline script; inline handler attributes and external scripts are validator errors.”
6.0Exercise 3 — Validator lab
Exercise 3 · Lab · 8 min
Task. From a clone of github.com/zakelfassi/htmlify, save this deliberately broken snippet as lab/broken-deck.html:
Expect four errors, then fix one at a time, re-running after each:
no-viewport — add the viewport <meta> to <head>.
no-slides — a deck needs at least two <section class="slide"> elements; add a second slide.
no-keyboard-nav — add one inline <script> that registers a keydown listener with addEventListener and toggles the active class on arrow keys.
missing-notes — each substantive slide (200+ chars of text, or containing h2/h3/table/figure) needs <aside class="notes"> with real speaker notes.
Done when the report reads valid — 0 errors. Then run the same command against examples/deckify/workshop-deck.html to see a passing report for a full deck.
7.0Skill-authoring checklist
Frontmatter has name and a description written from the requester's side: specific verbs, artifact nouns, “use when” phrasings.
license, metadata.version, and metadata.source are set; version bumping is automated where possible.
Operating rules are numbered, imperative, and ordered as the actual workflow.
Rule 1 is evidence-first: read the source material before producing anything.
A mode menu (“smallest mode that fits”) replaces open-ended judgment.
Every rule passes the litmus test: a script could check compliance.
Output constraints are explicit: self-contained, size-bounded, format-contracted.
Depth lives in references/; each pointer names the file and the load-when condition; SKILL.md stays short.
A validation command exists, the skill orders the agent to run it before the final response, and errors must reach zero.
CI re-validates committed examples, so the contract holds even when nobody is watching.
The skill defines its final-response contract: what to report (paths, mode, validation results, gaps).
Sibling skills point at each other to resolve boundary disputes (htmlify rule 8 routes full decks to deckify).