mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-03-30 13:43:26 +08:00
docs: add token optimization guide with recommended settings (#175)
Adds a comprehensive Token Optimization section to the README with:
- Recommended settings (model, MAX_THINKING_TOKENS, AUTOCOMPACT_PCT)
- Daily workflow commands table (/model, /clear, /compact, /cost)
- Strategic compaction guidance (when to compact vs not)
- Context window management (MCP tool description costs)
- Agent Teams cost warning
70
README.md
@@ -325,6 +325,7 @@ everything-claude-code/
 | |-- saas-nextjs-CLAUDE.md # Real-world SaaS (Next.js + Supabase + Stripe)
 | |-- go-microservice-CLAUDE.md # Real-world Go microservice (gRPC + PostgreSQL)
 | |-- django-api-CLAUDE.md # Real-world Django REST API (DRF + Celery)
+| |-- rust-api-CLAUDE.md # Real-world Rust API (Axum + SQLx + PostgreSQL) (NEW)
 |
 |-- mcp-configs/ # MCP server configurations
 | |-- mcp-servers.json # GitHub, Supabase, Vercel, Railway, etc.
@@ -883,18 +884,73 @@ These configs are battle-tested across multiple production applications.
 
 ---
 
-## ⚠️ Important Notes
+## Token Optimization
 
+Claude Code usage can be expensive if you don't manage token consumption. These settings significantly reduce costs without sacrificing quality.
+
+### Recommended Settings
+
+Add to `~/.claude/settings.json`:
+
+```json
+{
+  "model": "sonnet",
+  "env": {
+    "MAX_THINKING_TOKENS": "10000",
+    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50"
+  }
+}
+```
+
+| Setting | Default | Recommended | Impact |
+|---------|---------|-------------|--------|
+| `model` | opus | **sonnet** | ~60% cost reduction; handles 80%+ of coding tasks |
+| `MAX_THINKING_TOKENS` | 31,999 | **10,000** | ~70% reduction in hidden thinking cost per request |
+| `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` | 95 | **50** | Compacts earlier — better quality in long sessions |
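For readers who already have a `~/.claude/settings.json`, the recommended values can be merged in rather than pasted over the file. The following is a minimal Python sketch of such a merge. The helper names `merge_settings` and `apply_to_file`, and the choice to preserve unrelated keys, are illustrative assumptions and not part of this commit.

```python
import copy
import json
from pathlib import Path

# Values copied from the "Recommended Settings" block in this commit.
RECOMMENDED = {
    "model": "sonnet",
    "env": {
        "MAX_THINKING_TOKENS": "10000",
        "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50",
    },
}


def merge_settings(existing: dict, recommended: dict) -> dict:
    """Return a new dict with `recommended` applied on top of `existing`.

    Nested dicts (such as "env") are merged key by key, so unrelated
    entries already present are kept. Hypothetical helper.
    """
    merged = dict(existing)
    for key, value in recommended.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def apply_to_file(path: Path) -> dict:
    """Read `path` if it exists, merge the recommended values, write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    merged = merge_settings(existing, RECOMMENDED)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merged, indent=2) + "\n")
    return merged
```

A call such as `apply_to_file(Path.home() / ".claude" / "settings.json")` would apply it to the path named in the README. Back up the file first, since the sketch rewrites it in place.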
+
+Switch to Opus only when you need deep architectural reasoning:
+```
+/model opus
+```
+
+### Daily Workflow Commands
+
+| Command | When to Use |
+|---------|-------------|
+| `/model sonnet` | Default for most tasks |
+| `/model opus` | Complex architecture, debugging, deep reasoning |
+| `/clear` | Between unrelated tasks (free, instant reset) |
+| `/compact` | At logical task breakpoints (research done, milestone complete) |
+| `/cost` | Monitor token spending during session |
+
+### Strategic Compaction
+
+The `strategic-compact` skill (included in this plugin) suggests `/compact` at logical breakpoints instead of relying on auto-compaction at 95% context. See `skills/strategic-compact/SKILL.md` for the full decision guide.
+
+**When to compact:**
+- After research/exploration, before implementation
+- After completing a milestone, before starting the next
+- After debugging, before continuing feature work
+- After a failed approach, before trying a new one
+
+**When NOT to compact:**
+- Mid-implementation (you'll lose variable names, file paths, partial state)
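To make the difference between the default of 95 and the recommended 50 concrete, the sketch below computes the approximate token count at which auto-compaction would start. It assumes the setting is a plain percentage of the 200k context window quoted in this README. The function name is hypothetical, and the actual trigger logic inside Claude Code may differ.

```python
CONTEXT_WINDOW_TOKENS = 200_000  # window size quoted in this README


def autocompact_trigger(pct: int, window: int = CONTEXT_WINDOW_TOKENS) -> int:
    """Approximate token usage at which auto-compaction starts.

    Assumes the setting is a simple percentage of the window
    (an illustrative assumption, not confirmed by the source).
    """
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range 1-100")
    return window * pct // 100


default_trigger = autocompact_trigger(95)      # 190_000 tokens
recommended_trigger = autocompact_trigger(50)  # 100_000 tokens
print(default_trigger, recommended_trigger)
```

Under these assumptions, the recommended value compacts with roughly half the window still free, which leaves room to pick a logical breakpoint manually before the automatic threshold is reached.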
+
 ### Context Window Management
 
-**Critical:** Don't enable all MCPs at once. Your 200k context window can shrink to 70k with too many tools enabled.
+**Critical:** Don't enable all MCPs at once. Each MCP tool description consumes tokens from your 200k window, potentially reducing it to ~70k.
 
-Rule of thumb:
-- Have 20-30 MCPs configured
-- Keep under 10 enabled per project
-- Under 80 tools active
+- Keep under 10 MCPs enabled per project
+- Keep under 80 tools active
+- Use `disabledMcpServers` in project config to disable unused ones
 
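The limits listed above can be turned into a small check that produces a `disabledMcpServers` list for a project. This is a sketch only: the helper name, the sample server names, and the shape of the emitted config are assumptions for illustration, so confirm the format your project config expects before relying on it.

```python
import json

MAX_ENABLED_SERVERS = 10  # "Keep under 10 MCPs enabled per project"


def disabled_servers(configured: list[str], keep: set[str]) -> list[str]:
    """Return the configured servers that should be listed as disabled."""
    unknown = keep - set(configured)
    if unknown:
        raise ValueError(f"not configured: {sorted(unknown)}")
    if len(keep) >= MAX_ENABLED_SERVERS:
        raise ValueError("keep fewer than 10 MCP servers enabled per project")
    # Preserve the configured order so the output is stable.
    return [name for name in configured if name not in keep]


# Hypothetical server names, loosely based on the mcp-servers.json comment.
configured = ["github", "supabase", "vercel", "railway"]
project_config = {
    "disabledMcpServers": disabled_servers(configured, {"github", "supabase"})
}
print(json.dumps(project_config, indent=2))
```

The sketch counts servers rather than tools. Checking the 80-tool limit would also need the number of tools each server exposes, which is not available from the server list alone.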
-Use `disabledMcpServers` in project config to disable unused ones.
+### Agent Teams Cost Warning
 
+Agent Teams spawns multiple context windows. Each teammate consumes tokens independently. Only use for tasks where parallelism provides clear value (multi-module work, parallel reviews). For simple sequential tasks, subagents are more token-efficient.
+
+---
+
+## ⚠️ Important Notes
+
 ### Customization
 