Token Optimizer
Context optimization plugin that identifies and eliminates wasted tokens across Claude Code, OpenCode, OpenClaw, and Codex environments while preserving work through compactions
Overview
Token Optimizer is a context optimization plugin that identifies and eliminates wasted tokens across Claude Code, OpenCode, OpenClaw, and Codex environments while preserving work through compactions. It analyzes three categories of token waste: structural waste (bloated configuration files, unused skills, duplicate system prompts), runtime waste (verbose command output that floods context mid-session), and behavioral waste (habits such as letting the cache expire prematurely or choosing an inefficient model for the task). It has zero runtime dependencies: depending on the platform, it is written in pure Python (standard library only) or TypeScript.
The Verdict
Who Should Use Token Optimizer?
Best For
- Heavy API users processing high token volumes
- Teams tracking AI coding costs
- Developers experiencing context overflow
- Multi-session workflows needing continuity
- Those wanting visibility into token usage
Not Ideal For
- Casual users with minimal API spend
- Small projects under context limits
- Commercial use (the PolyForm Noncommercial license does not permit it)
What's Great
- Zero runtime dependencies—pure stdlib
- Live dashboard with per-turn breakdowns
- Quality scoring with degradation detection
- Survives compaction with checkpoint/restore
- Subagent cost attribution
- No telemetry—all local SQLite
Watch Out For
- PolyForm Noncommercial license
- Requires plugin marketplace install
- Learning curve for optimization strategies
Pricing
Visibility & Measurement
- Live dashboard tracking tokens & costs
- Cost breakdowns across four pricing tiers
- Per-turn cost analysis
- Quality scoring (v6 dual-score)
- Cache hit rate analysis
- TTL distribution tracking
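To make the per-turn cost and cache hit rate items above concrete, here is a minimal sketch of how such figures could be computed from token counts. It is not Token Optimizer's code: the `Turn` fields, the function names, and the prices in `PRICE_PER_MTOK` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """Token counts for one turn (hypothetical field names)."""
    input_tokens: int    # prompt tokens billed at the full input rate
    cached_tokens: int   # prompt tokens served from the cache
    output_tokens: int   # tokens generated by the model


# Illustrative prices in dollars per million tokens; NOT real rates.
PRICE_PER_MTOK = {"input": 3.00, "cached": 0.30, "output": 15.00}


def turn_cost(turn: Turn) -> float:
    """Dollar cost of a single turn under the assumed prices."""
    return (
        turn.input_tokens * PRICE_PER_MTOK["input"]
        + turn.cached_tokens * PRICE_PER_MTOK["cached"]
        + turn.output_tokens * PRICE_PER_MTOK["output"]
    ) / 1_000_000


def cache_hit_rate(turns: list[Turn]) -> float:
    """Share of prompt tokens that were served from the cache."""
    cached = sum(t.cached_tokens for t in turns)
    total = cached + sum(t.input_tokens for t in turns)
    return cached / total if total else 0.0
```

A session whose turns keep most of the prompt in cache would show a hit rate close to 1.0; a falling rate is the kind of signal a dashboard could surface.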
Session Continuity
- Checkpoints before compaction
- Critical decision restoration
- Multi-session workflow support
- Zero baseline context overhead
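The checkpoint and restore idea above can be sketched with the standard library's `sqlite3` module, in keeping with the page's note that data is stored in local SQLite. This is an assumption-laden illustration, not the plugin's actual schema or API: the table name, columns, and function names are hypothetical.

```python
import json
import sqlite3
import time


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a local checkpoint store. Schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "session_id TEXT NOT NULL, "
        "created_at REAL NOT NULL, "
        "decisions TEXT NOT NULL)"
    )
    return conn


def save_checkpoint(conn: sqlite3.Connection, session_id: str,
                    decisions: list[str]) -> None:
    """Persist the key decisions of a session before compaction."""
    conn.execute(
        "INSERT INTO checkpoints (session_id, created_at, decisions) "
        "VALUES (?, ?, ?)",
        (session_id, time.time(), json.dumps(decisions)),
    )
    conn.commit()


def restore_latest(conn: sqlite3.Connection, session_id: str) -> list[str]:
    """Return the most recent checkpoint for a session, or an empty list."""
    row = conn.execute(
        "SELECT decisions FROM checkpoints WHERE session_id = ? "
        "ORDER BY id DESC LIMIT 1",
        (session_id,),
    ).fetchone()
    return json.loads(row[0]) if row else []
```

Storing the decisions outside the conversation is what would allow them to be restored after compaction without occupying context beforehand.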
Optimization
- Structural waste detection
- Runtime waste reduction
- Behavioral waste coaching
- Quality nudges & loop detection
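As an illustration of what loop detection might involve, the sketch below flags a tool call that repeats with identical arguments several times within a short window of recent calls. The class name, window size, and threshold are assumptions; the page does not describe how Token Optimizer actually detects loops.

```python
from collections import Counter, deque


class LoopDetector:
    """Flags identical tool calls repeated within a sliding window."""

    def __init__(self, window: int = 10, threshold: int = 3) -> None:
        # Only the most recent `window` calls are remembered.
        self.recent: deque[tuple[str, str]] = deque(maxlen=window)
        self.threshold = threshold

    def record(self, tool: str, args: str) -> bool:
        """Record a call; return True if it looks like a loop."""
        call = (tool, args)
        self.recent.append(call)
        return Counter(self.recent)[call] >= self.threshold
```

A caller could respond to a `True` result with a nudge rather than a hard stop, since repeated calls are sometimes legitimate (for example, re-running tests after each fix).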
Platforms
- Claude Code
- OpenCode
- OpenClaw
- Codex
- macOS, Linux, Windows
How It Compares
| Category | Token Optimizer | Headroom | RTK |
|---|---|---|---|
| Tool output compression | 99%+ per-output, progressive disclosure | 60-95% (cherry-picked benchmarks) | 60-90% (CLI only) |
| First-read file skeletons | Shadow-validated, fail-open | — | — |
| Bash/CLI output compression | Generic + git/ls/pytest patterns | Partial | Yes (main feature) |
| Tabular/JSON compression | Value-preserving columnar | Yes (main feature) | — |
| Delta reads (re-read = diff only) | Yes | — | — |
| Model routing (wrong model for task) | 9 waste detectors | — | — |
| Loop/spin detection | Yes | — | — |
| Context quality scoring | Per-session, cross-session average | — | — |
| Cache instability detection | Yes | — | — |
| Retry churn detection | Yes | — | — |
| Tool cascade waste | Yes | — | — |
| Code structure maps | Outlines on repeated reads | — | — |
| Conversation history (60-75% of cost) | Checkpoint + compaction awareness | Doesn’t touch it | Doesn’t touch it |
| Quality gates | 3-tier system, edit-rate proxies | “Same answers” (untested) | — |
| Measured dollar savings | Real bill reduction per category | Per-output ratios only | rtk gain analytics |
| Multi-platform | Claude Code, Codex, OpenClaw, OpenCode | Python library + proxy | macOS, Linux, WSL |
Summary: Token Optimizer covers all 16 categories in the table above. Headroom and RTK each focus on one form of output compression (tabular/JSON for Headroom, CLI output for RTK) and do not address conversation history (60-75% of cost), loop detection, model routing, or the other waste sources listed.
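The "Delta reads (re-read = diff only)" row in the table can be illustrated with a short sketch built on the standard library's `difflib`. It shows the general technique only, under the assumption that file contents are cached per path; `DeltaReader` and its behavior are hypothetical and not taken from Token Optimizer.

```python
import difflib


class DeltaReader:
    """Returns full content on first read and only a diff on re-reads."""

    def __init__(self) -> None:
        # Last seen content per path, split into lines with endings kept.
        self._seen: dict[str, list[str]] = {}

    def read(self, path: str, content: str) -> str:
        new_lines = content.splitlines(keepends=True)
        old_lines = self._seen.get(path)
        self._seen[path] = new_lines
        if old_lines is None:
            return content            # first read: send everything
        if old_lines == new_lines:
            return "(unchanged)"      # re-read with no edits: send a marker
        # Re-read after edits: send only a unified diff.
        return "".join(
            difflib.unified_diff(old_lines, new_lines,
                                 fromfile=path, tofile=path)
        )
```

For a large file with a one-line edit, the diff is a few lines long, so the second read costs far fewer tokens than resending the whole file.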