module 13 tokens performance

What Are Tokens

What are tokens?

Tokens are the units Claude Code uses to process text. Every word you type, every response Claude gives, and every file it reads is measured in tokens. Understanding tokens is the difference between fast, cheap sessions and slow, expensive ones.

The basics

1 token ≈ 0.75 words in English
A 1,000-word document ≈ 1,300 tokens
A typical prompt + response round ≈ 500-2,000 tokens
A full session can use 100,000+ tokens
Long sessions with lots of file reads = 500,000+ tokens
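
As a rough illustration of the ratio above, here is a minimal sketch that turns a word count into a token estimate. It assumes the 0.75 words-per-token rule of thumb holds; real tokenizers vary with language, code, and punctuation, so treat the result as a ballpark. The function names are made up for this example.

```python
# Rule of thumb from this lesson: 1 token is roughly 0.75 English words.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(word_count: int) -> int:
    """Estimate tokens from an English word count (ballpark only)."""
    return round(word_count / WORDS_PER_TOKEN)


def estimate_tokens_in_text(text: str) -> int:
    """Estimate tokens for a string by counting whitespace-separated words."""
    return estimate_tokens(len(text.split()))


print(estimate_tokens(1_000))  # 1333, in line with the ~1,300 figure above
```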

Claude's context window has a limit measured in tokens. Bigger context = more tokens = more cost + slower responses.

The compounding problem

Every message you send resends the ENTIRE conversation history. This is the single most important thing to understand about tokens.

Your first message costs ~20K tokens (CLAUDE.md, skills, MCPs, and your prompt). Your 5th message costs ~80K tokens (the 4 previous exchanges plus the new prompt). Your 15th message costs ~250K tokens.

Extra token overhead compounds because it is resent with every message. 5K extra tokens per message × 15 messages = 75K extra tokens wasted.
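
The arithmetic behind the compounding problem can be sketched in a few lines. This assumes the context grows by a fixed amount each turn; the 15K-per-turn figure is an assumption picked to land near the numbers above, not a measured value, and real sessions grow unevenly.

```python
def context_per_message(baseline: int, growth_per_turn: int, messages: int) -> list[int]:
    """Context size sent with each message, assuming linear growth per turn."""
    return [baseline + i * growth_per_turn for i in range(messages)]


def wasted_overhead(extra_per_message: int, messages: int) -> int:
    """Extra tokens resent across a session for a fixed per-message overhead."""
    return extra_per_message * messages


# Assumed: 20K baseline, 15K added per turn (hypothetical growth rate).
sizes = context_per_message(20_000, 15_000, 15)
print(sizes[0], sizes[4], sizes[14])  # 20000 80000 230000
print(sum(sizes))                     # 1875000 tokens processed over 15 messages
print(wasted_overhead(5_000, 15))     # 75000
```

The sum is the point: no single message looks large, but the whole history is counted again on every turn.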

What uses tokens before you type

You haven't typed a single prompt and Claude has already loaded 15K-30K tokens:

System prompt + built-in tools - ~10K (can't change)
CLAUDE.md files - ~1-3K (keep lean, Module 02)
Skill metadata - up to 16K (the hard budget, Module 09)
MCP schemas - deferred loading now, but still some overhead

Your lean baseline is ~15K. A bloated baseline is 50K+. That gap is resent with every message, so it adds up to a large cost over a week.
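
To put a number on the baseline gap, here is a small sketch. It assumes the baseline is resent with every message, as described in the compounding section, and ignores anything that might reduce the resent size. The 50 messages per day is a hypothetical workload, so substitute your own.

```python
def weekly_extra_tokens(lean_baseline: int, bloated_baseline: int,
                        messages_per_day: int, days: int = 7) -> int:
    """Extra tokens sent per week when every message carries the bloated baseline."""
    gap = bloated_baseline - lean_baseline
    return gap * messages_per_day * days


# 15K and 50K come from this lesson; messages_per_day is an assumption.
print(weekly_extra_tokens(15_000, 50_000, messages_per_day=50))  # 12250000
```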

What uses tokens as you work

Every prompt + response
Every file Claude reads
Every MCP tool call (schema loads when invoked)
Every skill you invoke (full skill body loads)
Every data query result
Every command output

Watch a single session: growing from 20K to 200K tokens over 15-20 messages is normal. How fast the context fills is your usage rate.
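
The usage rate is just tokens added per message, and it can be used to estimate how many messages remain before a budget is reached. The sketch below assumes steady growth; the 100K budget is a hypothetical threshold chosen for illustration.

```python
import math


def usage_rate(start_tokens: int, end_tokens: int, messages: int) -> float:
    """Average tokens added to the context per message."""
    return (end_tokens - start_tokens) / messages


def messages_until(threshold_tokens: int, current_tokens: int, rate: float) -> int:
    """Messages left before the context reaches the threshold at a steady rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    if current_tokens >= threshold_tokens:
        return 0
    return math.ceil((threshold_tokens - current_tokens) / rate)


rate = usage_rate(20_000, 200_000, 18)        # 20K -> 200K over 18 messages
print(rate)                                   # 10000.0
print(messages_until(100_000, 20_000, rate))  # 8
```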

The efficiency hierarchy (repeat from Module 08)

Most efficient to least

Database queries - pre-aggregated, one call, minimal tokens
CLIs - zero idle overhead, lean per call
MCPs with deferred loading - small idle, moderate per call
Large file reads - one big load hits context hard
Web scraping + raw HTTP - most expensive + fragile
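
To see why pre-aggregated database queries sit at the top, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table and its contents are invented for the example, and character count stands in as a rough proxy for tokens.

```python
import sqlite3

# Hypothetical in-memory table with 1,000 rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
rows = [("north" if i % 2 == 0 else "south", float(i)) for i in range(1000)]
conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)

# Raw dump: every row would land in the context.
raw = conn.execute("SELECT region, amount FROM orders").fetchall()

# Pre-aggregated: the database does the work and returns one row per region.
summary = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY region ORDER BY region"
).fetchall()
conn.close()

raw_chars = len(str(raw))
summary_chars = len(str(summary))
print(len(raw), len(summary))  # 1000 2
print(summary_chars < raw_chars)  # True
```

The same answer (totals per region) arrives in two rows instead of a thousand.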

Practical impact

Token bloat doesn't usually break Claude. It just:

Makes sessions feel slower
Costs you more if you're paying per token
Fills context faster - you hit the 50% rule sooner
Reduces how much actual work fits in one session

Keep overhead lean = longer productive sessions.

Power-user tips

Install ccstatusline - shows live context % + model + cost at the bottom of your terminal. Full walkthrough in Lesson 3.
Ask Claude about overhead - "how much context have I used in this session?"
Prefer DB queries over MCP calls for data you query often
Read parts of files, not whole files - "read rows 1-100" not "read this 10MB CSV"
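
The "read rows 1-100" tip can also be applied in a script you ask Claude to run. This is a sketch only: `read_rows` is a hypothetical helper, and the demo writes a throwaway CSV so the block runs on its own. It stops reading after the requested rows instead of loading the whole file.

```python
import csv
import os
import tempfile
from itertools import islice


def read_rows(path: str, start: int = 1, stop: int = 100):
    """Return the header plus data rows start..stop (1-indexed, inclusive)."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        rows = list(islice(reader, start - 1, stop))
    return header, rows


# Demo: build a temporary 1,000-row CSV, then read only the first 100 rows.
with tempfile.NamedTemporaryFile("w", newline="", suffix=".csv", delete=False) as tmp:
    writer = csv.writer(tmp)
    writer.writerow(["id", "value"])
    writer.writerows([i, i * i] for i in range(1, 1001))
    demo_path = tmp.name

header, rows = read_rows(demo_path, 1, 100)
os.remove(demo_path)
print(header, len(rows), rows[0], rows[-1])
# ['id', 'value'] 100 ['1', '1'] ['100', '10000']
```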

Action items

☐ Understand: tokens are the currency of every session

☐ Remember: every message resends full history - cost compounds

☐ Know your baseline - what loads before you type

☐ Prefer databases > CLIs > MCPs > file reads for efficiency

Next lesson: Trimming your token overhead.

Exercises

Review the concepts covered in this lesson: What Are Tokens.
Write down your key takeaway from this lesson.
Practice running any commands or prompts mentioned above inside your terminal.