SKILL: Clean and standardise a messy expenses CSV
A worked Skill from Innovation Day. Built spec-first (Way #1), with a privacy check before any data is pasted (Way #3) and a human verifying what the AI flagged before anything is deleted (Way #2).
1. Goal - what this Skill does, and when to use it
Take an exported expenses CSV that’s full of inconsistent categories, mixed date formats and the odd typo, and turn it into a clean, consistent table - with the dodgy rows flagged, not silently fixed. Use it before you submit a claim batch or reconcile a month.
It replaces the squinting-at-spreadsheet-cells pass everyone does by hand.
2. Inputs - what you paste in for it to work
- Input 1: the expenses CSV (columns: date, description, category, amount, currency, notes).
- Input 2: your team’s fixed category list (see the prompt - using a fixed list is what stops the AI inventing new categories).
PRIVACY CHECK (Way #3): expense exports leak personal data in the notes column.
In the file Do this [email protected](a personal email in a notes cell)Remove it before pasting. Personal email = PII. Any name, claimant ID, or “sent receipt to…” trail Strip it - the AI doesn’t need it to clean dates and categories. Better still - don’t paste the data at all. See the stretch path in section 4: ask the AI for a formula or script, then run it on your own machine. The real data never leaves your laptop. That’s the strongest version of Way #3.
3. Output - what “good” looks like (write this BEFORE the prompt)
A cleaned table with exactly these columns:
| Column | Rule |
|---|---|
| Date (ISO) | every date as YYYY-MM-DD; fix any that aren’t |
| Category | mapped to ONE of the fixed list - never a new value |
| Amount | unchanged numeric value |
| Currency | unchanged (GBP/USD/EUR) - do NOT convert |
| Flags | a short note on anything suspect: duplicate, typo, format fixed, PII spotted |
Below the table: a short Flags summary listing duplicates, typos and anything needing a human decision. Nothing is deleted - suspect rows stay in, flagged.
4. Instructions / Prompt - the exact text to paste
You are helping me clean an expenses CSV. Do NOT delete any rows.
Output a table with these exact columns:
Date (ISO) | Category | Amount | Currency | Flags
Rules:
- Convert every date to ISO format YYYY-MM-DD. If a date is ambiguous
(e.g. 12/03/26), flag it as "date format assumed DD/MM/YY - confirm".
- Map every category to ONE of this fixed list ONLY:
Travel, Meals, Accommodation, Training, Equipment, Software, Office.
Fix casing/variants (e.g. "travel" -> Travel, "meals" -> Meals).
If something doesn't fit the list, put it in Flags - do not invent a category.
- Leave Amount and Currency exactly as they are.
- DO NOT convert currencies and DO NOT sum amounts across currencies.
- In Flags, mark: likely duplicates, spelling typos, any date you reformatted,
and any personal data you notice (emails, names). Flag duplicates - do not remove them.
After the table, give a short "Flags summary" listing every duplicate, typo,
and anything that needs a human to decide.
CSV:
[PASTE CSV WITH PERSONAL DATA REMOVED]
Stretch path (the privacy win) - ask for a script, run it yourself. Instead of pasting the file, paste only the header row and ask:
Here is the header of my expenses CSV: date,description,category,amount,currency,notes Write me an Excel formula (or a short Python/Power Query script) that: - converts the date column to ISO YYYY-MM-DD, - standardises the category column to this fixed list: Travel, Meals, Accommodation, Training, Equipment, Software, Office, - adds a "Flags" column marking duplicate rows and rows with mixed currencies. Explain how to run it. Do not ask me to paste my data.You run the formula/script on your own machine. The actual expense rows - and that personal email - never touch the chatbot. Same result, zero data exposure.
5. Worked Example - one real run, anonymised
Input (a few rows, personal email already stripped):
12/03/26,taxi airport,travel,42.00,GBP, 2026-03-09,Monitor stand,Equipment,45.99,GBP, 2026-03-09,Monitor stand,Equipment,45.99,GBP, 2026-03-12,Software licence (annual),Software,180.00,USD, 2026-03-14,Taxii to office,Travel,9.80,GBP, 2026-03-16,Client dinner,Meals,156.00,EUR,Madrid office visitOutput it produced:
Date (ISO) Category Amount Currency Flags 2026-03-12 Travel 42.00 GBP Reformatted from 12/03/26- assumed DD/MM/YY, confirm2026-03-09 Equipment 45.99 GBP Possible duplicate (see next row) 2026-03-09 Equipment 45.99 GBP Possible duplicate of “Monitor stand” above - keep both, human to decide 2026-03-12 Software 180.00 USD Non-GBP currency 2026-03-14 Travel 9.80 GBP Typo in description: “Taxii” → likely “Taxi” 2026-03-16 Meals 156.00 EUR Non-GBP currency Flags summary
- Duplicate: two identical “Monitor stand” rows on 2026-03-09 (£45.99 each). Both kept - a human decides whether it’s a genuine double purchase or a double-entry. (The AI did NOT delete one - correct behaviour.)
- Typo: “Taxii to office” → “Taxi”.
- Ambiguous date:
12/03/26reformatted assuming DD/MM/YY → 2026-03-12.- Mixed currencies: rows in GBP, USD and EUR. Not converted and not summed - totalling across currencies would be wrong.
- PII reminder: a personal email was present in the original notes and was removed before pasting. (Verify your own file before sharing.)
The catch (Way #2): the two failure modes here are silent ones. A naive “tidy this up” prompt will (a) delete one of the duplicate monitor-stand rows - destroying data you needed to see - and (b) helpfully convert and sum the USD/EUR rows into one GBP total that’s quietly wrong. The fixed prompt forbids both. Always eyeball the duplicate decision and never trust a single total that mixes currencies.
6. Guardrails - how to stay safe and sane
- Privacy (Way #3): strip personal emails/names from the notes column first - or use the stretch path and don’t paste the data at all.
- Human check (Way #2): you decide what happens to flagged duplicates. Never let the AI delete rows. And never trust a total that mixes GBP/USD/EUR - that’s a human (or a proper FX step) job.
- Limits: this standardises format, not truth. It won’t know a £156 “client dinner” breaches policy, or that a 0.00 row is a free webinar. It cleans; you judge.
7. Reuse notes - so it isn’t lost
- Owner: (your name)
- Last updated: (date)
- Lives in: Finance/Ops shared drive → Skills folder. Pin the fixed category list next to it so everyone maps the same way - Way #5.
Built at Innovation Day. The five ways of working: ways-of-working.md.
Downloads for this session
Grab the templates and sample files used here.