Codemogger

Open source, free

Code indexing library and MCP server that uses tree-sitter for semantic chunking and local embeddings, storing everything in a single SQLite file

318 GitHub Stars
13 Languages
1 SQLite file per codebase

Overview

Codemogger is a code indexing library and MCP server designed for AI coding agents. It parses source code with tree-sitter, chunks it into semantic units (functions, classes, impl blocks), embeds those chunks locally with the all-MiniLM-L6-v2 model, and stores everything in a single SQLite file that supports both vector and full-text search. It needs no Docker, no external server, and no API keys: just one .db file per codebase. A search returns the 5 most relevant definitions rather than thousands of raw matches, which keeps the context handed to an AI agent small and precise.
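
To make the "5 most relevant definitions" idea concrete, here is a minimal sketch of top-k retrieval by cosine similarity. It is not Codemogger's implementation: the function names, the chunk names, and the toy 3-dimensional vectors are all assumptions for illustration (all-MiniLM-L6-v2 produces 384-dimensional embeddings).

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=5):
    """Return the names of the k chunks most similar to query_vec.

    chunks maps a (hypothetical) chunk name to its embedding vector.
    """
    scored = ((cosine(query_vec, vec), name) for name, vec in chunks.items())
    return [name for _, name in heapq.nlargest(k, scored)]


# Toy vectors standing in for real embeddings (assumption for illustration).
chunks = {
    "parse_config": [0.9, 0.1, 0.0],
    "load_config": [0.8, 0.2, 0.1],
    "render_html": [0.0, 0.1, 0.9],
}
best = top_k([1.0, 0.0, 0.0], chunks, k=2)
print(best)
```

Capping the result at a small k is what keeps the context given to an agent short; a real index would use a vector search inside the database rather than a linear scan.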

The Verdict

Who Should Use Codemogger?

Best For

  • Developers building AI coding agents that need codebase context
  • Teams wanting a simple, portable code index (single SQLite file)
  • Users who need both keyword and semantic search combined
  • Projects requiring incremental indexing (only re-embed changed files)
  • Anyone integrating code intelligence via the Model Context Protocol (MCP)

Not Ideal For

  • Enterprise teams needing cross-repository search at scale
  • Projects requiring advanced code navigation (go-to-definition)
  • Languages not supported by tree-sitter grammars
  • Teams needing a managed, hosted solution

What's Great

  • No external services: no Docker, no server, no API keys needed
  • Single SQLite file stores everything (portable and simple)
  • Tree-sitter parsing extracts semantic units, not arbitrary chunks
  • Combines vector search and full-text search in one database
  • Incremental indexing re-processes only files whose SHA-256 hash has changed
  • MCP server exposes search, index, and reindex tools
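
The hash-based incremental indexing in the list above can be sketched as follows. This is a minimal illustration, assuming the index keeps one SHA-256 digest per file path; the function names and the in-memory dictionaries are hypothetical, not Codemogger's API.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def files_to_reindex(current, stored_hashes):
    """Decide which files need re-embedding.

    current:       path -> file contents (bytes) on disk right now
    stored_hashes: path -> hex digest recorded by the previous run
    Returns (changed_paths, removed_paths, new_hashes).
    """
    new_hashes = {path: sha256_hex(data) for path, data in current.items()}
    # New or modified files: digest missing or different from last run.
    changed = sorted(p for p, h in new_hashes.items() if stored_hashes.get(p) != h)
    # Files that were indexed before but no longer exist.
    removed = sorted(p for p in stored_hashes if p not in new_hashes)
    return changed, removed, new_hashes


# First run: nothing stored yet, so every file is indexed.
run1 = {"a.py": b"def a(): pass\n", "b.py": b"def b(): pass\n"}
changed1, removed1, hashes1 = files_to_reindex(run1, {})

# Second run: only b.py was edited, and a.py is untouched.
run2 = {"a.py": b"def a(): pass\n", "b.py": b"def b(): return 1\n"}
changed2, removed2, hashes2 = files_to_reindex(run2, hashes1)
print(changed1, changed2)
```

Because embedding is the expensive step, skipping files whose digest is unchanged is where most of the time is saved on a re-run.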

Watch Out For

  • Newer project with a smaller community
  • Limited to 13 languages with tree-sitter grammars
  • Large items (150+ lines) are subdivided, so a match may point at one piece of a definition rather than the whole
  • No cloud or team collaboration features
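
Semantic chunking with subdivision of large items can be pictured with the sketch below. It makes two loud assumptions: Python's built-in ast module stands in for tree-sitter, and an oversized class is split into its methods. How Codemogger actually subdivides large items is not described here; only the 150-line threshold comes from the list above.

```python
import ast

MAX_LINES = 150  # threshold from the "150+ lines" note; the rest is assumed


def span(node):
    """First and last source line of a definition (1-indexed, inclusive)."""
    return node.lineno, node.end_lineno


def chunk_source(source, max_lines=MAX_LINES):
    """Return (kind, name, start_line, end_line) for top-level definitions.

    A class longer than max_lines is replaced by one chunk per method.
    """
    funcs = (ast.FunctionDef, ast.AsyncFunctionDef)
    chunks = []
    for node in ast.parse(source).body:
        if isinstance(node, funcs):
            chunks.append(("function", node.name, *span(node)))
        elif isinstance(node, ast.ClassDef):
            start, end = span(node)
            if end - start + 1 <= max_lines:
                chunks.append(("class", node.name, start, end))
            else:
                for item in node.body:
                    if isinstance(item, funcs):
                        name = f"{node.name}.{item.name}"
                        chunks.append(("method", name, *span(item)))
    return chunks


source = '''def helper():
    return 1


class Big:
    def a(self):
        return 1

    def b(self):
        return 2
'''

whole = chunk_source(source)               # class fits: kept as one chunk
split = chunk_source(source, max_lines=3)  # tiny limit forces subdivision
print(whole)
print(split)
```

The trade-off is visible in the output: after subdivision a query can match a single method, but no chunk contains the class as a whole.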

Key Features

  • Tree-sitter semantic parsing
  • Local embeddings (all-MiniLM-L6-v2)
  • SQLite vector + FTS search
  • Incremental indexing
  • MCP server integration
  • .gitignore-aware scanning
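
One common way to merge a vector result list with a full-text result list is reciprocal rank fusion (RRF). The sketch below shows that technique only as an illustration: this page does not say how Codemogger combines its two searches, and the result names and the constant k=60 are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists into one.

    Each item scores 1 / (k + rank) per list it appears in (rank starts
    at 1); items are returned by descending total score, ties by name.
    """
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical results from the two search modes for the same query.
vector_hits = ["load_config", "parse_config", "read_env"]
keyword_hits = ["parse_config", "config_path", "load_config"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

RRF uses only ranks, so it avoids having to put cosine similarities and full-text relevance scores on a common scale.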

Supported Languages

  • Rust, C, C++, Go, Zig
  • Python, Java, Scala
  • JavaScript, TypeScript, TSX
  • PHP, Ruby

MCP Tools

  • codemogger_search - semantic/keyword search
  • codemogger_index - index a directory
  • codemogger_reindex - force reindex
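
An MCP client invokes a tool with a JSON-RPC 2.0 "tools/call" request. The sketch below builds such a request for codemogger_search. The envelope follows the MCP specification, but the "query" argument name and the helper function are assumptions; the tool's real input schema should be read from the server's tool listing.

```python
import json


def tool_call_request(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "query" is a hypothetical argument name used for illustration only.
request = tool_call_request(
    1, "codemogger_search", {"query": "where is the config parsed?"}
)
payload = json.dumps(request)
print(payload)
```

In practice an MCP client library or the agent host builds and sends this message over the server's transport; it is shown here only to make the shape of a tool call visible.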

Installation

  • npm install -g codemogger
  • npx -y codemogger (no install)
  • Library, CLI, or MCP server

How It Compares

Feature    | Codemogger                | GrepAI              | Sourcegraph
-----------+---------------------------+---------------------+-------------------
Parser     | Tree-sitter (semantic)    | Custom + embeddings | SCIP
Storage    | SQLite (single file)      | Custom index        | Database
Search     | Vector + FTS              | Semantic            | AI + Deterministic
MCP Server | Yes                       | Yes                 | Yes
Best For   | Simple, portable indexing | Privacy-first search| Enterprise scale
