RTK (Rust Token Killer)
CLI proxy that reduces LLM token consumption by 60-90% on common dev commands with zero dependencies
- GitHub Stars: 60K
- Token Savings: 60-90%
- Overhead: <10ms
Overview
RTK (Rust Token Killer) is a CLI proxy that reduces LLM token consumption by 60-90% on common development commands. It filters and compresses command output before it reaches an AI's context window, using smart filtering, grouping, truncation, and deduplication. It ships as a single Rust binary with zero dependencies and adds less than 10ms of overhead. It works with Claude Code, GitHub Copilot, Cursor, Windsurf, Cline, and 9+ other AI coding assistants.
The Verdict
Who Should Use RTK?
Best For
- Developers using AI coding assistants daily
- Teams with high API token costs
- Projects needing extended context windows
- CLI-heavy development workflows
- Anyone running lots of git, test, build commands
Not Ideal For
- Non-CLI workflows
- Native Windows (limited support)
- Projects requiring exact command output
- Workflows with custom command formats
What's Great
- Widely adopted with strong community support
- Zero dependencies, single Rust binary
- Sub-10ms overhead, virtually invisible
- Auto-rewrite hook, so supported commands go through RTK without manual prefixing (100% adoption)
- Built-in analytics (rtk gain command)
- Supports 40+ common dev commands
- Works with 10+ AI coding assistants
Watch Out For
- Limited native Windows support (use WSL)
- No auto-rewrite hook on Windows
- May filter out occasionally needed details
- Requires learning command mappings
Supported Commands
- File: ls, read, find, grep, diff
- Git: status, log, diff, commit, push, pull
- Testing: Jest, Vitest, pytest, cargo test, go test
- Build: ESLint, TypeScript, cargo build, ruff
- Package: pnpm, pip, bundle, prisma
- Cloud: AWS CLI, Docker, kubectl
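To illustrate what an auto-rewrite hook might do with a command list like the one above, here is a minimal Python sketch. It is not RTK's implementation: the `SUPPORTED` set is a hand-picked subset of the commands listed, the `rewrite` function is hypothetical, and the `rtk <command>` prefix form is an assumption about how the proxy is invoked.

```python
import shlex

# Hypothetical subset of the commands listed above; the real tool's
# command table and matching rules may differ.
SUPPORTED = {"ls", "find", "grep", "diff", "git", "pytest", "cargo",
             "go", "pnpm", "pip", "docker", "kubectl"}


def rewrite(command: str, proxy: str = "rtk") -> str:
    """Prefix a supported command with the proxy name; leave others untouched.

    Assumes the proxy is invoked as `<proxy> <original command>`.
    """
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: do not try to rewrite.
        return command
    if not tokens:
        return command
    program = tokens[0]
    if program == proxy or program not in SUPPORTED:
        return command
    return f"{proxy} {command}"
```

Under these assumptions, `rewrite("git status")` yields `rtk git status`, while an unsupported command such as `make all` passes through unchanged. Passing unknown commands through untouched is one way to keep such a hook safe for workflows with custom command formats.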
Optimization Strategies
- Smart filtering (removing noise)
- Grouping (aggregating similar items)
- Truncation (preserving relevant context)
- Deduplication (collapsing repeated lines)
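The four strategies above can be sketched in a few lines of Python. This is an illustrative approximation, not RTK's actual code: the function names, the `NOISE` pattern, and the head/tail sizes are all assumptions chosen for the example.

```python
import re
from collections import Counter
from itertools import groupby

# Assumed noise pattern: progress lines that rarely matter to an LLM.
NOISE = re.compile(r"^\s*(Compiling|Downloading|Downloaded)\b")


def filter_noise(lines):
    """Smart filtering: drop blank lines and lines matching NOISE."""
    return [ln for ln in lines if ln.strip() and not NOISE.match(ln)]


def group_by_dir(paths):
    """Grouping: replace a list of paths with a per-directory count."""
    counts = Counter(p.split("/", 1)[0] if "/" in p else "." for p in paths)
    return [f"{d}/: {n}" for d, n in sorted(counts.items())]


def truncate(lines, head=3, tail=2):
    """Truncation: keep the first and last lines, note how many were cut."""
    if len(lines) <= head + tail:
        return list(lines)
    omitted = len(lines) - head - tail
    marker = f"... {omitted} lines omitted ..."
    return list(lines[:head]) + [marker] + list(lines[len(lines) - tail:])


def dedupe(lines):
    """Deduplication: collapse runs of identical consecutive lines."""
    out = []
    for line, run in groupby(lines):
        n = sum(1 for _ in run)
        out.append(line if n == 1 else f"{line} (x{n})")
    return out
```

Each function takes and returns a list of strings, so they can be chained, for example `truncate(dedupe(filter_noise(lines)))`. Which strategy applies to which command is a per-command decision this sketch leaves out.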
Sample Session Savings
- 10× ls/tree calls: 80% reduction
- 5× cargo test: 90% reduction
- 30-min session: 118K → 24K tokens (-80%)
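The session figure can be checked with a short calculation. The helper below is a hypothetical illustration, not part of RTK; it uses only the token counts quoted above.

```python
def reduction_pct(before: int, after: int) -> float:
    """Percentage of tokens removed, given counts before and after filtering."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# 30-minute session from the list above: 118K tokens before, 24K after.
session = reduction_pct(118_000, 24_000)
print(f"{session:.1f}%")
```

The exact value is about 79.7%, which the page rounds to 80%.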
Installation
- Homebrew (macOS/Linux)
- Pre-built binaries
- Cargo install
- Works on macOS, Linux, Windows (WSL)
How It Compares
| Category | Token Optimizer | RTK | Headroom |
|---|---|---|---|
| Tool output compression | 99%+ per-output, progressive disclosure | 60-90% (CLI only) | 60-95% (cherry-picked benchmarks) |
| First-read file skeletons | Shadow-validated, fail-open | — | — |
| Bash/CLI output compression | Generic + git/ls/pytest patterns | Yes (main feature) | Partial |
| Tabular/JSON compression | Value-preserving columnar | — | Yes (main feature) |
| Delta reads (re-read = diff only) | Yes | — | — |
| Model routing (wrong model for task) | 9 waste detectors | — | — |
| Loop/spin detection | Yes | — | — |
| Context quality scoring | Per-session, cross-session average | — | — |
| Cache instability detection | Yes | — | — |
| Retry churn detection | Yes | — | — |
| Tool cascade waste | Yes | — | — |
| Code structure maps | Outlines on repeated reads | — | — |
| Conversation history (60-75% of cost) | Checkpoint + compaction awareness | Doesn’t touch it | Doesn’t touch it |
| Quality gates | 3-tier system, edit-rate proxies | — | “Same answers” (untested) |
| Measured dollar savings | Real bill reduction per category | rtk gain analytics | Per-output ratios only |
| Multi-platform | Claude Code, Codex, OpenClaw, OpenCode | macOS, Linux, WSL | Python library + proxy |
Summary: Token Optimizer covers 16/16 categories. RTK excels at CLI output compression but misses 85-90% of actual token waste (conversation history, loops, model routing, etc.).