module 14 verify output

Claude QAs itself

System Text-to-Speech Ready

Slide: 0:00 / 0:00

Slide 1 of 0Interactive Deck

Full Lesson Reference

Claude built the deliverable. Claude can also review it - if you ask the right way. This lesson is the specific prompts that catch common issues before you do the final review.

Why Claude CAN partially QA itself

A review pass has different context than the generation pass. When Claude built the report, it was focused on pulling data + assembling sections. When asked to review, it can take a step back and check for:

Claims vs evidence (is every claim supported?)
Internal consistency (do numbers add up?)
Format compliance (does it match the template?)
Cross-references (does section 1 align with section 3?)

Why Claude CAN'T fully QA itself

Blind spots - if Claude hallucinated a number, it won't flag it as hallucinated
Confirmation bias - Claude tends to defend its own output
Can't verify external truth - Claude can't log into your ad platform + check

Claude's review catches 60-70% of issues. Your review catches the rest. Together = close to 100%.

The core QA prompts

Data source trace

For every metric in this deliverable, trace it back to the source. Which database query, MCP pull, or file did each number come from? Flag anything that traces only to "Claude calculated" or "inferred from context".

Forces Claude to show its work. Unsourced numbers get caught.

Brand compliance

Compare this deliverable against the brand guide at @brand-guide.md (or the client's project CLAUDE.md). Flag any colour, font, logo, voice, or tone inconsistencies.

Claude cross-references. Catches off-brand elements.

Responsiveness + mobile

Review this HTML for responsive design issues. Check it renders properly at 375px (phone), 768px (tablet), and 1440px (desktop). Flag any layout breakage at each size.

Claude imagines the layouts. Catches obvious mobile breaks.

Hallucination check

Read this deliverable critically. Flag any claim, number, statistic, or recommendation that you can't verify from the data you actually pulled this session. Be honest about what's made up vs what's sourced.

This is the honest one. Claude often catches its own hallucinations when prompted to be skeptical. It's easier to be self-critical when you're reviewing vs generating.

Client-name check

Search this file for any client names that aren't [correct client name]. Flag anything that looks like it was copied from another project or template.

Catches the classic mistake of client-A data in a client-B report.

Internal consistency

Check this deliverable for internal consistency. Do the numbers in section 1 match what's referenced in section 3? Does the headline finding align with the detailed analysis? Flag any contradictions. Catches cases where Claude pulls different data for different sections of the same report.

Link + functionality check

Review every link and button in this HTML. For each: does it have a valid href? Does it point to the right destination? Are any placeholder links (href="#" or href="about:blank") left in?

Catches broken links before deployment.

Stacking QA prompts

For high-stakes deliverables, run multiple prompts in sequence:

Run these 5 QA checks on this deliverable in order

Trace every metric to source
Check brand compliance vs the brand guide
Hallucination check - flag anything you can't verify
Client-name check - flag any cross-contamination
Internal consistency check

After each check, pause and show me the findings. I'll approve before moving to the next.

5 review passes. Each with fresh focus. Catches way more than one big review.

The red-team variant

Module 09 covered the red-team thinking-skill pattern. If you built one (or have it from a skill pack), apply it to your deliverable. If not, use a one-off prompt instead:

Play red-team on this report. Take the role of a sceptical analyst whose job is to poke holes in it.

Here's the report we just built. Attack it like you're a skeptical client or a competitor who wants to find problems. What would a sceptical eye find? What's the weakest claim? What would embarrass us? Claude shifts into critical mode. Finds issues a helpful Claude wouldn't flag.

Building a /qa skill

If you repeat these reviews often, skill it


terminal
/skill-creator

I want a /qa skill that takes a deliverable path and deliverable type (report / audit / email / landing page / proposal). It runs the appropriate checklist for that type + the generic QA prompts (data trace, hallucination check, consistency). Output: a pass/fail report with specific findings.

Now /qa report drafts/weekly-report.html runs the full review in 1 command.

When Claude's QA isn't enough

First-time deliverables for new clients - you know the client better than Claude does
Subjective tone/voice judgements - Claude can check against the guide but can't feel "off"
Strategic appropriateness - is this the right recommendation for THIS client's situation?
Legal / regulatory content - you or a lawyer, not Claude

Use Claude's QA for the mechanical pass. Use your judgement for the subjective pass.

Power-user tips

Run QA in a fresh session - sometimes - if the original session context is bloated, a fresh session reviewing the file alone is sharper
Save your QA prompts as skills - consistent review process every time
Red-team + standard QA together - helpful + critical Claude covers more ground than either alone
Keep a log of issues caught - patterns emerge, you can update rules to prevent recurrence

Action items

☐ Memorise the core QA prompts: trace, brand, hallucination, consistency, links

☐ For your next deliverable, stack 3-5 QA prompts before your own review

☐ Build a /qa skill once you've done this enough times

☐ Keep a log of issues caught - patterns become new rules

Next lesson: Never-let-Claude-do-X rules.

Exercises

Review the concepts covered in this lesson: Claude QAs itself.
Write down your key takeaway from this lesson.
Practice running any commands or prompts mentioned above inside your terminal.