What Are Tokens
Full Lesson Reference
What are tokens?
Tokens are the units Claude Code uses to process text. Every word you type, every response Claude gives, and every file it reads is measured in tokens. Understanding tokens is the difference between fast, cheap sessions and slow, expensive ones.
The basics
- 1 token ≈ 0.75 words in English
- A 1,000-word document ≈ 1,300 tokens
- A typical prompt + response round ≈ 500-2,000 tokens
- A full session can use 100,000+ tokens
- Long sessions with lots of file reads = 500,000+ tokens
Claude's context window has a limit measured in tokens. Bigger context = more tokens = more cost + slower responses.
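The rule of thumb above can be turned into a quick estimator. This is only a sketch: the function name `estimate_tokens` is hypothetical, and the 0.75 words-per-token ratio is the approximation from this lesson, not an exact tokenizer count.

```python
def estimate_tokens(word_count: int, words_per_token: float = 0.75) -> int:
    """Rough token estimate for English text (rule of thumb, not a real tokenizer)."""
    return round(word_count / words_per_token)


print(estimate_tokens(1000))  # 1333, in line with the ~1,300 figure above
```

Real counts vary with language, code, and formatting, so treat the result as a ballpark only.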
The compounding problem
Every message you send resends the ENTIRE conversation history. This is the single most important thing to understand about tokens.
Your first message costs ~20K tokens (with CLAUDE.md, skills, MCPs, your prompt). Your 5th message costs ~80K tokens (all 4 previous + new). Your 15th message costs ~250K tokens.
Extra token overhead compounds because it is resent with every message. 5K extra tokens per message × 15 messages = 75K extra tokens wasted.
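A minimal sketch of the compounding effect, assuming a simplified model in which the context grows by a fixed amount each turn. The function name `per_message_costs` and the 15K growth per turn are assumptions chosen so the output lands near the figures quoted above; real sessions grow unevenly.

```python
def per_message_costs(baseline: int, growth_per_turn: int, messages: int) -> list[int]:
    """Tokens sent with each message when the full history is resent every turn."""
    return [baseline + growth_per_turn * i for i in range(messages)]


costs = per_message_costs(baseline=20_000, growth_per_turn=15_000, messages=15)
print(costs[0])    # 20000  (1st message)
print(costs[4])    # 80000  (5th message)
print(costs[14])   # 230000 (15th message, near the ~250K quoted above)
print(sum(costs))  # 1875000 tokens processed across the whole session

# The same session with 5K of extra overhead in the baseline.
bloated = per_message_costs(baseline=25_000, growth_per_turn=15_000, messages=15)
print(sum(bloated) - sum(costs))  # 75000 extra tokens
```

The session total is far larger than any single message because every turn pays again for everything before it.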
What uses tokens before you type
You haven't typed a single prompt and Claude has already loaded 15K-30K tokens:
- System prompt + built-in tools - ~10K (can't change)
- CLAUDE.md files - ~1-3K (keep lean, Module 02)
- Skill metadata - up to 16K (the hard budget, Module 09)
- MCP schemas - deferred loading now, but still some overhead
Your lean baseline is ~15K. A bloated baseline is 50K+. Because the baseline is resent with every message, that gap adds up to a large cost over a week.
What uses tokens as you work
- Every prompt + response
- Every file Claude reads
- Every MCP tool call (schema loads when invoked)
- Every skill you invoke (full skill body loads)
- Every data query result
- Every command output
In a single session, growing from 20K to 200K tokens over 15-20 messages is normal. Watch how fast the context fills: that is your usage rate.
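The usage rate can be worked out from the figures above. The sketch below is illustrative: both function names are hypothetical, and the 100K threshold in the last call is an assumed number for the example, not a documented limit.

```python
def usage_rate(start_tokens: int, end_tokens: int, messages: int) -> float:
    """Average context growth per message."""
    return (end_tokens - start_tokens) / messages


def messages_until(threshold_tokens: int, current_tokens: int, rate: float) -> int:
    """Whole messages left before the context reaches a chosen threshold."""
    remaining = threshold_tokens - current_tokens
    if remaining <= 0:
        return 0
    return int(remaining // rate)


print(usage_rate(20_000, 200_000, 15))  # 12000.0 tokens per message
print(usage_rate(20_000, 200_000, 20))  # 9000.0 tokens per message

# Assumed threshold of 100K tokens, purely for illustration.
print(messages_until(100_000, 20_000, 12_000.0))  # 6
```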
The efficiency hierarchy (repeat from Module 08)
Most efficient to least:
- Database queries - pre-aggregated, one call, minimal tokens
- CLIs - zero idle overhead, lean per call
- MCPs with deferred loading - small idle, moderate per call
- Large file reads - one big load hits context hard
- Web scraping + raw HTTP - most expensive + fragile
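To illustrate why pre-aggregated database queries sit at the top of this list, here is a small sketch using Python's built-in `sqlite3` module. The `orders` table and its data are made up for the example, and output size in characters is used as a rough stand-in for token count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
rows = [("north" if i % 2 == 0 else "south", float(i)) for i in range(1000)]
conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)

# Raw dump: every row would land in the context.
raw = conn.execute("SELECT region, amount FROM orders").fetchall()

# Pre-aggregated: the database does the work and returns two rows.
summary = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY region ORDER BY region"
).fetchall()

print(len(raw))      # 1000
print(summary)       # [('north', 500, 249500.0), ('south', 500, 250000.0)]
print(len(str(summary)) < len(str(raw)))  # True
conn.close()
```

The aggregate answers the same question with a tiny fraction of the text, which is the property this hierarchy rewards.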
Practical impact
Token bloat doesn't usually break Claude. It just:
- Makes sessions feel slower
- Costs you more if you're paying per token
- Fills context faster - you hit the 50% rule sooner
- Reduces how much actual work fits in one session
Lean overhead = longer productive sessions.
Power-user tips
- Install ccstatusline - shows live context % + model + cost at the bottom of your terminal. Full walkthrough in Lesson 3.
- Ask Claude about overhead - "how much context have I used in this session?"
- Prefer DB queries over MCP calls for data you query often
- Read parts of files, not whole files - "read rows 1-100" not "read this 10MB CSV"
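The last tip can also be applied in a script that prepares data before it reaches Claude. A minimal sketch, assuming a CSV with one header row; `read_rows` is a hypothetical helper, and the demo file is generated only for illustration.

```python
import csv
import os
import tempfile
from itertools import islice


def read_rows(path: str, start: int, stop: int) -> list[list[str]]:
    """Return data rows start..stop (1-indexed, inclusive), skipping the header.

    Reading stops after row `stop`, so the rest of the file is never loaded.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        return list(islice(reader, start - 1, stop))


# Demo with a throwaway file: a header plus 500 data rows.
with tempfile.NamedTemporaryFile("w", newline="", suffix=".csv", delete=False) as tmp:
    writer = csv.writer(tmp)
    writer.writerow(["id", "value"])
    writer.writerows([i, i * 10] for i in range(1, 501))
    demo_path = tmp.name

first_100 = read_rows(demo_path, 1, 100)
print(len(first_100))  # 100
print(first_100[0])    # ['1', '10']
os.remove(demo_path)
```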
Action items
☐ Understand: tokens are the currency of every session
☐ Remember: every message resends full history - cost compounds
☐ Know your baseline - what loads before you type
☐ Prefer databases > CLIs > MCPs > file reads for efficiency
Next lesson: Trimming your token overhead.
Exercises
- Review the concepts covered in this lesson: What Are Tokens.
- Write down your key takeaway from this lesson.
- Practice running any commands or prompts mentioned above inside your terminal.