Build-a-Skill - The Four Scenarios
Who this is for: programme managers, ops, and admin - non-technical, mixed confidence. No code, no setup. You pick a real, repetitive task and turn it into a one-page, reusable Skill you could paste into any chatbot on Monday.
The whole session is tool-agnostic. ChatGPT, Gemini, Claude, Copilot - whatever you already have open. The Skill is the deliverable, not the tool.
The one rule that runs through everything: describe what “good” looks like before you prompt. That’s Way #1: Describe before you generate. The Skill one-pager template is built to force it - the Output section sits above the prompt for a reason.
How the session runs
| Length | 2 hours. Each scenario is sized for ~30–40 min end to end. |
| You will | Pick one scenario, fill in the SKILL template, run it once on the supplied sample data, then read every line of the output and find what the AI got wrong. |
| Scenario 1 | Is the guided build - we do it together, step by step, as the warm-up. Then you pick scenario 2, 3, or 4 to do solo or in a pair. |
| Golden rule | Use the sample data only (sample-data/). It is synthetic. Never paste real names, client data, or confidential figures - Way #3. |
| The two beats that win the room | (1) the privacy check - what you strip before pasting; (2) the human sign-off - the one thing the AI got confidently wrong. Every scenario is rigged so there’s something to catch. |
Each scenario below gives you four things: the real task, the input file, the spec-first move (what to define before you prompt), and what good looks like (your checklist). Each points to a finished worked example in skill-examples/ you can compare against.
Scenario 1 - Weekly status report (the guided build)
The real task. Every Monday someone turns messy sync notes into a clean status update for stakeholders. It’s the single most repeated programme-management chore in the building. We’ll build the Skill for it together.
For: programme / project managers writing the weekly stakeholder update.
Input: sample-data/sample-meeting-notes.txt - raw, lower-case, half-finished sync notes from “Project Helix”.
Spec-first - define the Output before you write a single prompt. Open the template, go straight to section 3, and pin down the shape:
- A short table: Summary | Key decisions | Risks (R/A/G) | Actions (Owner, Due).
- A rule for missing owners: if the notes don’t clearly state who owns an action, write TBC - do not guess. (This one matters - see the worked example.)
- A separate Open questions / unconfirmed list so nothing gets quietly smoothed over.
- Neutral, factual tone. Max ~8 rows.
Only once that shape is written do you draft the prompt (section 4) and paste it in.
What good looks like ✅
- Decisions, risks, and actions are separated - not one mushy paragraph.
- Every action has an owner or an honest TBC. None invented.
- The unconfirmed bits (legal review, the analytics-numbers issue) are surfaced as open questions, not dropped.
- You read every line and fixed at least one thing the AI got wrong (Way #2).
- Before pasting, you stripped the names, the vendor, and the confidential figure (Way #3).
Worked example: skill-examples/status-report-skill.md
Scenario 2 - Expense report cleanup
The real task. A monthly expenses export lands as a messy CSV - inconsistent categories, typos, the odd duplicate. Someone has to tidy and sanity-check it before it goes to finance. The Skill turns “stare at a spreadsheet for 40 minutes” into “review the AI’s flagged list in 5.”
For: ops / finance admin preparing an expense file for sign-off.
Input: sample-data/sample-expenses.csv - 27 rows, deliberately imperfect.
Spec-first - define the Output before you prompt. In section 3 of the template, decide what “clean” means:
- A cleaned table (consistent category casing, consistent date format) plus a separate “Flags for review” list - the AI proposes, a human decides.
- The cast-iron rule: flag, don’t delete. Anything suspect (possible duplicate, odd value, wrong format) goes on the flag list with a reason. The AI never silently removes a row.
- Do not convert or total across currencies. If amounts are in mixed currencies, sub-total per currency and say so. (Mixing GBP/USD/EUR into one number is the classic trap - see the worked example.)
- A privacy line: strip anything personal from the notes column before pasting.
What good looks like ✅
- Categories normalised (e.g. travel/Travel → one form); dates in one format.
- Duplicates and oddities are flagged with a reason, not deleted.
- Currencies are sub-totalled separately - no single blended total.
- A typo or two is caught and flagged rather than guess-corrected.
- You spotted the personal email hiding in a notes cell and removed it (Way #3).
Worked example: skill-examples/expense-cleanup-skill.md
Scenario 3 - Incoming request triage
The real task. A shared inbox or request log fills up with mixed asks - some genuinely urgent, most not. Someone triages priority and owner each morning. The Skill drafts that triage in seconds; a human still decides.
For: ops / admin running a shared request queue or service desk.
Input: sample-data/sample-requests.csv - 12 requests, varied urgency, plain-English wording.
Spec-first - define the Output before you prompt. In section 3:
- A triage table: ID | One-line summary | Priority (High / Medium / Low) | Suggested owner/team | Why this priority.
- The “Why” column is non-negotiable - it forces the AI to justify the rating so a human can sanity-check it rather than trust a bare label.
- Read urgency from the words, not the sender’s tone. “URGENT” in caps isn’t automatically High; a quiet note about no heating for two days might be.
- A redaction rule: personal contact details and ID numbers must be flagged and stripped, never echoed into the triage output.
What good looks like ✅
- Every row has a priority and a one-line justification.
- Genuinely time-critical items (out of paper before tomorrow’s board pack; heating off for days) are rated High - and you checked the AI didn’t under-rate them.
- Low-stakes “whenever convenient” items aren’t inflated to High just because they’re recent.
- No personal phone number or employee ID appears anywhere in the output - they were flagged and removed (Way #3).
- You overrode at least one priority the AI got wrong (Way #2).
Worked example: skill-examples/request-triage-skill.md
Scenario 4 - Multi-document summary
The real task. A decision needs a one-page read-out drawn from several short docs - a policy note, a vendor update, a risk extract. Someone reads all three and writes the brief. The Skill drafts it; the human checks it holds together. The hard part isn’t summarising - it’s noticing where the documents disagree.
For: programme managers / ops preparing a decision brief from a small document pack.
Input: sample-data/sample-docs/ - three files: policy-note.txt, vendor-update.txt, risk-extract.txt.
Spec-first - define the Output before you prompt. In section 3:
- A one-page brief: Background | Key facts | Conflicts / open questions | Recommended next step.
- The make-or-break instruction: a dedicated “Conflicts / open questions” section. Tell the AI explicitly to surface anything where the documents disagree, rather than picking one version and moving on.
- An anti-smoothing rule: do not resolve a contradiction by choosing the more confident source. If dates or facts clash, report the clash.
- Privacy line: strip the named individuals and the vendor name before pasting.
What good looks like ✅
- The brief names the date conflict explicitly - the fixed go-live vs the vendor’s later readiness date - as an open question.
- It doesn’t quietly state a single go-live date as settled fact.
- The flagged risk about the clash is carried through, not dropped.
- The recommended next step is “escalate / confirm the date,” not a false “all on track.”
- Names and vendor anonymised before pasting (Way #3); you verified the conflict made it through (Way #2).
Worked example: skill-examples/doc-summary-skill.md
Wrap-up
By the end you have a one-page Skill you can reuse, and you’ve felt both halves of working with AI: it drafts fast, and a human still signs the work. File your Skill where your team can find it - that’s the Reuse notes section, and Way #5: keep it simple, write down the why.
The same five ways of working underpin the Build session next door - the only difference is the team writes an API contract instead of a Skill one-pager. Same destination, faster road.
Facilitator only - planted elements
Cut this section before printing participant handouts. It exists so the Skill Scout (roaming facilitator) can nudge people toward the two beats that make the session land: the privacy strip and the human catch. Don’t hand people the answer - ask the question that gets them to find it. The whole point is that a lazy “just summarise this” prompt produces a confidently-wrong result, and a spec-first prompt with an anti-guessing rule doesn’t.
Each scenario is ~30–40 min. Every dataset carries (a) privacy bait - something that should never reach the chatbot, and (b) at least one AI-catchable oversight error - something the AI gets wrong unless the spec told it not to.
Scenario 1 - sample-meeting-notes.txt
| | Planted element | What it teaches | Scout nudge |
|---|---|---|---|
| Privacy bait | Full name “Maria Gonzalez”; vendor “Brightwave Ltd”; budget “480k (confidential)” | Way #3: names, vendor, and confidential figures must be stripped/disguised before pasting. | “What in those notes would you not want a public chatbot to keep?” |
| Oversight catch (primary) | Line 11: “dave picking up the vendor chase i think? not 100% sure who owns that actually.” | The owner is not stated. A spec-first prompt that says “write TBC if owner not stated, do not guess” outputs owner = TBC. A lazy prompt makes the AI confidently assign Dave. | “The report says Dave owns the vendor chase - does the source actually say that?” |
| Oversight catch (secondary) | Line 13: the analytics dashboard “showing wrong numbers again” is easy to drop because it’s the last, throwaway line. | Surfacing-not-smoothing: an open issue should survive into the output, not be trimmed. | “Is everything from the notes accounted for, including the last line?” |
Scenario 2 - sample-expenses.csv
| | Planted element | What it teaches | Scout nudge |
|---|---|---|---|
| Privacy bait | A personal email address in the notes cell of the coffee row. | PII hides in free-text columns, not just obvious fields. | “Did you read every cell - including the notes column - before pasting?” |
| Oversight catch - flag-don’t-delete | Two identical “Monitor stand” rows (£45.99 each). | These should be FLAGGED as a possible duplicate, not silently deleted. A genuine double-purchase is plausible - a human decides. | “The AI removed a row - should it have, or should it have asked you?” |
| Oversight catch - normalise | Inconsistent category casing (travel/Travel, meals/Meals); a typo “Taxii”. | Cleaning = consistency + flagging typos, not guess-correcting them. | “Are your categories one form now? And did it flag ‘Taxii’ or quietly change it?” |
| Oversight catch - dates | Mixed date formats - most ISO, one “12/03/26”. | Spec should pin one date format; the AI must convert visibly. | - |
| Oversight catch - currency (the big one) | Mixed currencies: GBP / USD / EUR. | The AI will often sum across currencies or invent a conversion rate. Correct behaviour: sub-total per currency, no blended total, no made-up FX. | “That single total - is it adding pounds, dollars and euros together?” |
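
For a Scout who wants an independent cross-check of a team's output (participants never need this), the two numeric rules above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the column names (`date`, `item`, `category`, `amount`, `currency`) and every row except the two monitor stands are made up for illustration and are not taken from sample-expenses.csv.

```python
import csv
import io
from collections import Counter, defaultdict
from decimal import Decimal

# Hypothetical rows for illustration only. The column names and all rows
# except the two monitor stands are assumptions, not the real sample file.
SAMPLE = """date,item,category,amount,currency
2026-03-02,Monitor stand,Equipment,45.99,GBP
2026-03-02,Monitor stand,Equipment,45.99,GBP
2026-03-05,Train ticket,travel,30.00,GBP
2026-03-09,Hotel,Travel,120.00,EUR
2026-03-11,Software licence,Software,20.00,USD
"""


def review(csv_text):
    """Return (per-currency subtotals, flags for possible duplicates)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    # Sub-total per currency: amounts in different currencies are never added.
    subtotals = defaultdict(Decimal)
    for row in rows:
        subtotals[row["currency"]] += Decimal(row["amount"])

    # Flag, don't delete: identical rows are reported, the data is untouched.
    counts = Counter(
        (row["date"], row["item"], row["amount"], row["currency"]) for row in rows
    )
    flags = [
        f"Possible duplicate ({n}x): {item} {amount} {currency} on {date}"
        for (date, item, amount, currency), n in counts.items()
        if n > 1
    ]
    return dict(subtotals), flags


subtotals, flags = review(SAMPLE)
print(subtotals)
print(flags)
```

If a team's output shows one blended total, or fewer rows than the input, comparing it with these per-currency figures and the flag list makes the gap easy to point at. The decision on whether a flagged row is a real duplicate still belongs to a human.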
Scenario 3 - sample-requests.csv
| | Planted element | What it teaches | Scout nudge |
|---|---|---|---|
| Privacy bait | REQ-003: “Priya Nair, mobile 07700 900142”; REQ-009: “employee ID 48817.” | Contact details and ID numbers must be FLAGGED and stripped, never echoed into the triage output. | “Search your output for that phone number - is it in there? It shouldn’t be.” |
| Oversight catch - under-rated High | REQ-007 (out of printer paper, board pack due tomorrow AM) and REQ-011 (heating off two days, people in coats) are genuinely High. | Urgency lives in the consequence and deadline, not in capital letters. The AI frequently under-rates these because the wording is calm. | “No paper before tomorrow’s board pack - is that really Medium?” |
| Oversight catch - over-rated / correctly Low | REQ-002, REQ-006, REQ-010 are genuinely Low (“when there’s a chance,” “whenever convenient,” “order more markers”). | Recency and politeness aren’t priority. Don’t inflate. | “Anything rated High here that’s actually a ‘whenever’ job?” |
Note: REQ-005 (“URGENT, client demo at 2pm”) is legitimately High - it’s the control case. The teaching point is that the AI should reach the same High for REQ-007/011 without the word URGENT.
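The "search your output for that phone number" nudge can also be run mechanically by a Scout. The sketch below is one possible check, not part of the session materials: the two patterns are assumptions shaped around the planted phone number and employee ID, and both example output lines are hypothetical.

```python
import re

# Assumed patterns: a UK-style mobile number and "employee ID" followed by digits.
PATTERNS = {
    "phone number": re.compile(r"\b0\d{4}[ -]?\d{6}\b"),
    "employee ID": re.compile(r"employee id\W*\d+", re.IGNORECASE),
}


def find_leaks(text):
    """Return (label, matched text) pairs for anything matching a pattern."""
    return [
        (label, match.group(0))
        for label, pattern in PATTERNS.items()
        for match in pattern.finditer(text)
    ]


# Hypothetical triage output lines, for illustration only.
leaky = "REQ-003 | Call back Priya Nair, mobile 07700 900142 | Medium"
clean = "REQ-003 | Requester asks for a call back (contact details removed) | Medium"

print(find_leaks(leaky))
print(find_leaks(clean))
```

A pattern check like this only catches what it was told to look for. It will not spot a name, so reading every line of the output remains the real test.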
Scenario 4 - sample-docs/
| | Planted element | What it teaches | Scout nudge |
|---|---|---|---|
| Privacy bait | “Sandra Whitfield” (policy-note); “Brightwave Ltd” (vendor-update). | Names and vendor anonymised before pasting. | “Who’d you swap out before pasting the pack?” |
| Oversight catch - the CONTRADICTION (whole point of this scenario) | policy-note.txt: go-live is fixed 1 June, “will not move.” vendor-update.txt: integration testing can’t start until 2 June, readiness 9 June. risk-extract.txt R-15 explicitly flags the clash and says “Needs escalation.” | A good summary must surface this conflict as an open question - not pick the confident source and state “go-live 1 June” as settled fact. The three docs are designed so the truth only appears when you cross-reference them. | “Your brief says go-live is 1 June - does the vendor agree? What does R-15 say?” |
The failure mode to watch for: a “just summarise these three documents” prompt almost always reports a clean “go-live 1 June” and buries or omits the vendor’s 9-June readiness, because the policy note is the most authoritative-sounding source. The spec-first fix is a dedicated “Conflicts / open questions” section plus an explicit “do not resolve contradictions by choosing the more confident source” instruction. If a team’s brief surfaces the date clash, they’ve nailed Way #2. Call it out loudly.
Built at Innovation Day. The five ways of working: ways-of-working.md. Skill template: SKILL-template.md.