From 4209421349cde0adc821c9c18ad71abb76bcf5a6 Mon Sep 17 00:00:00 2001
From: Affaan Mustafa
Date: Thu, 12 Feb 2026 15:37:48 -0800
Subject: [PATCH] docs: add token optimization guide with recommended settings (#175)

Adds a comprehensive Token Optimization section to the README with:
- Recommended settings (model, MAX_THINKING_TOKENS, AUTOCOMPACT_PCT)
- Daily workflow commands table (/model, /clear, /compact, /cost)
- Strategic compaction guidance (when to compact vs not)
- Context window management (MCP tool description costs)
- Agent Teams cost warning
---
 README.md | 70 +++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 63 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 37b10bb4..062e86c7 100644
--- a/README.md
+++ b/README.md
@@ -325,6 +325,7 @@ everything-claude-code/
 | |-- saas-nextjs-CLAUDE.md # Real-world SaaS (Next.js + Supabase + Stripe)
 | |-- go-microservice-CLAUDE.md # Real-world Go microservice (gRPC + PostgreSQL)
 | |-- django-api-CLAUDE.md # Real-world Django REST API (DRF + Celery)
+| |-- rust-api-CLAUDE.md # Real-world Rust API (Axum + SQLx + PostgreSQL) (NEW)
 |
 |-- mcp-configs/ # MCP server configurations
 | |-- mcp-servers.json # GitHub, Supabase, Vercel, Railway, etc.
@@ -883,18 +884,73 @@ These configs are battle-tested across multiple production applications.
 
 ---
 
-## ⚠️ Important Notes
+## Token Optimization
+
+Claude Code usage can be expensive if you don't manage token consumption. These settings significantly reduce costs without sacrificing quality.
+
+### Recommended Settings
+
+Add to `~/.claude/settings.json`:
+
+```json
+{
+  "model": "sonnet",
+  "env": {
+    "MAX_THINKING_TOKENS": "10000",
+    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50"
+  }
+}
+```
+
+| Setting | Default | Recommended | Impact |
+|---------|---------|-------------|--------|
+| `model` | opus | **sonnet** | ~60% cost reduction; handles 80%+ of coding tasks |
+| `MAX_THINKING_TOKENS` | 31,999 | **10,000** | ~70% reduction in hidden thinking cost per request |
+| `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` | 95 | **50** | Compacts earlier — better quality in long sessions |
+
+Switch to Opus only when you need deep architectural reasoning:
+```
+/model opus
+```
+
+### Daily Workflow Commands
+
+| Command | When to Use |
+|---------|-------------|
+| `/model sonnet` | Default for most tasks |
+| `/model opus` | Complex architecture, debugging, deep reasoning |
+| `/clear` | Between unrelated tasks (free, instant reset) |
+| `/compact` | At logical task breakpoints (research done, milestone complete) |
+| `/cost` | Monitor token spending during session |
+
+### Strategic Compaction
+
+The `strategic-compact` skill (included in this plugin) suggests `/compact` at logical breakpoints instead of relying on auto-compaction at 95% context. See `skills/strategic-compact/SKILL.md` for the full decision guide.
+
+**When to compact:**
+- After research/exploration, before implementation
+- After completing a milestone, before starting the next
+- After debugging, before continuing feature work
+- After a failed approach, before trying a new one
+
+**When NOT to compact:**
+- Mid-implementation (you'll lose variable names, file paths, partial state)
 
 ### Context Window Management
 
-**Critical:** Don't enable all MCPs at once. Your 200k context window can shrink to 70k with too many tools enabled.
+**Critical:** Don't enable all MCPs at once. Each MCP tool description consumes tokens from your 200k window, potentially reducing it to ~70k.
 
-Rule of thumb:
-- Have 20-30 MCPs configured
-- Keep under 10 enabled per project
-- Under 80 tools active
+- Keep under 10 MCPs enabled per project
+- Keep under 80 tools active
+- Use `disabledMcpServers` in project config to disable unused ones
 
-Use `disabledMcpServers` in project config to disable unused ones.
+### Agent Teams Cost Warning
+
+Agent Teams spawns multiple context windows. Each teammate consumes tokens independently. Only use for tasks where parallelism provides clear value (multi-module work, parallel reviews). For simple sequential tasks, subagents are more token-efficient.
+
+---
+
+## ⚠️ Important Notes
 
 ### Customization
 
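
Note (not part of the patch): the README change asks readers to add the recommended block to `~/.claude/settings.json`. For a file that already holds other keys, a recursive merge avoids overwriting unrelated settings. The sketch below is one possible way to do that. The names `merge_settings` and `apply_to_file` are hypothetical helpers introduced here for illustration; only the setting names and values are taken from the patch.

```python
import copy
import json
from pathlib import Path

# Values copied from the "Recommended Settings" block in the patch above.
RECOMMENDED = {
    "model": "sonnet",
    "env": {
        "MAX_THINKING_TOKENS": "10000",
        "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50",
    },
}


def merge_settings(existing: dict, recommended: dict) -> dict:
    """Return a new dict: `existing` with `recommended` merged in recursively.

    Hypothetical helper. Nested dicts (such as "env") are merged key by key,
    so unrelated entries survive; neither input is modified.
    """
    merged = copy.deepcopy(existing)
    for key, value in recommended.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def apply_to_file(path: Path) -> None:
    """Hypothetical helper: merge RECOMMENDED into the JSON file at `path`."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merge_settings(existing, RECOMMENDED), indent=2) + "\n")
```

Calling `apply_to_file(Path.home() / ".claude" / "settings.json")` would target the file named in the README. Backing the file up first is a sensible precaution, since the merge replaces any existing `model` value.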