Zep
Enterprise-grade memory layer for AI agents using knowledge graphs, delivering sub-200ms retrieval with SOC 2 compliance and production reliability.
- GitHub Stars: 2,500+
- Retrieval Time: Sub-200ms
- Compliance: SOC 2
Overview
Zep is an enterprise-grade memory layer for AI agents and LLM applications that uses knowledge graphs to provide fast, accurate context retrieval. Unlike traditional vector databases, Zep builds temporal knowledge graphs from conversation history, enabling semantic search, fact extraction, and relationship mapping. With sub-200ms query latency, SOC 2 Type II certification, and production-proven reliability, Zep powers memory for agents at companies like S&P Market Intelligence and enterprise AI platforms.
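To make the idea of a temporal knowledge graph concrete, here is a minimal sketch in plain Python. It is an illustration only and not Zep's schema or API: the names `Fact`, `TemporalGraph`, `assert_fact`, and `facts_at` are hypothetical, and the sketch assumes each (subject, relation) pair holds one value at a time, so a newer fact closes the older one instead of deleting it.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Fact:
    """One edge in the graph: subject -[relation]-> obj, with a validity window."""
    subject: str
    relation: str
    obj: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means the fact is still current


class TemporalGraph:
    """Hypothetical toy structure; not Zep's internal representation."""

    def __init__(self) -> None:
        self.facts: List[Fact] = []

    def assert_fact(self, subject: str, relation: str, obj: str, at: datetime) -> None:
        # Assumption: a relation is single-valued, so a new fact supersedes
        # the open one. The old fact is kept with an end time, not removed.
        for fact in self.facts:
            if (fact.subject == subject and fact.relation == relation
                    and fact.valid_to is None):
                fact.valid_to = at
        self.facts.append(Fact(subject, relation, obj, valid_from=at))

    def facts_at(self, subject: str, at: datetime) -> List[Fact]:
        # Return the facts about `subject` whose validity window contains `at`.
        return [
            fact for fact in self.facts
            if fact.subject == subject
            and fact.valid_from <= at
            and (fact.valid_to is None or at < fact.valid_to)
        ]


graph = TemporalGraph()
graph.assert_fact("alice", "works_at", "Acme", datetime(2023, 1, 1))
graph.assert_fact("alice", "works_at", "Globex", datetime(2024, 6, 1))

past = graph.facts_at("alice", datetime(2023, 6, 1))
current = graph.facts_at("alice", datetime(2025, 1, 1))
print([f.obj for f in past], [f.obj for f in current])
```

Because superseded facts are retained with an end time, a query can ask what was true at an earlier point in a conversation history, which a plain similarity lookup over embedded text does not express directly.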
The Verdict
Who Should Use Zep?
Best For
- Production AI agents requiring fast, reliable memory retrieval
- Enterprise applications needing SOC 2 and compliance certifications
- Teams building RAG systems with conversation history
- Companies wanting open-source with optional managed hosting
- Applications requiring fact extraction and entity relationships
Not Ideal For
- Simple prototypes without production requirements
- Teams seeking lowest-cost memory solutions
- Projects not needing graph-based knowledge representation
What's Great
- Knowledge graph architecture enables semantic search and fact extraction
- Sub-200ms retrieval latency optimized for real-time agent interactions
- SOC 2 Type II certified with enterprise security and compliance
- Open-core model allows self-hosting or managed cloud
- Production-proven at S&P Market Intelligence and enterprise customers
- Built-in conversation summarization and memory optimization
Watch Out For
- Open-core model: advanced features require paid cloud tier
- Credit-based pricing can become expensive at scale
- Graph complexity may be overkill for simple use cases
- Learning curve for optimizing graph-based retrieval
Pricing
Free
$0
1,000 credits/month for prototyping. Includes core memory features, limited storage, and community support.
Starter
$25/mo
50,000 credits included. Credits consumed based on episode size (~350 bytes per credit). Pay-as-you-go beyond included credits.
Flex
$125/mo
50,000 credits included, then $2.50 per 1,000 credits. Best for growing production applications with variable usage.
Flex Plus
$375/mo
Most popular. 200,000 credits included, then $1.87 per 1,000 credits. For high-volume production deployments.
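The credit arithmetic for the Flex and Flex Plus tiers can be sketched as below. This is a rough estimator under stated assumptions, not Zep's billing logic: `Plan`, `credits_for_episode`, and `monthly_cost` are hypothetical names, overage is assumed to be billed pro rata rather than in 1,000-credit blocks, and each episode is assumed to round up to a whole number of credits with a minimum of one. Starter is left out because the listing gives no overage rate for it.

```python
import math
from dataclasses import dataclass

# Approximate figure quoted in the Starter description; treat as an estimate.
BYTES_PER_CREDIT = 350


@dataclass(frozen=True)
class Plan:
    name: str
    base_usd: float
    included_credits: int
    overage_usd_per_1000: float


# Figures copied from the pricing tiers listed above.
FLEX = Plan("Flex", 125.0, 50_000, 2.50)
FLEX_PLUS = Plan("Flex Plus", 375.0, 200_000, 1.87)


def credits_for_episode(size_bytes: int) -> int:
    """Estimate credits for one episode.

    Assumption: sizes round up to whole credits, with a one-credit minimum.
    """
    return max(1, math.ceil(size_bytes / BYTES_PER_CREDIT))


def monthly_cost(plan: Plan, credits_used: int) -> float:
    """Base price plus overage, assuming pro-rata billing past the allowance."""
    overage = max(0, credits_used - plan.included_credits)
    return plan.base_usd + (overage / 1000) * plan.overage_usd_per_1000


if __name__ == "__main__":
    for credits in (50_000, 150_000, 300_000):
        print(credits, monthly_cost(FLEX, credits), monthly_cost(FLEX_PLUS, credits))
```

Under these assumptions the two tiers cost the same ($375) at 150,000 credits per month. Below that volume Flex is cheaper, and above it Flex Plus is.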
Key Features
- Knowledge graph memory architecture
- Sub-200ms retrieval latency
- Automatic fact extraction and entity linking
- Conversation summarization
- Multi-user session management
- SOC 2 Type II certified
Platforms
- Cloud SaaS (managed)
- Self-hosted (open-source)
- Python SDK
- TypeScript/JavaScript SDK
- LangChain integration
How It Compares
| Feature | Zep | Letta | Pinecone |
|---|---|---|---|
| Architecture | Knowledge graphs | Virtual context OS | Vector database |
| Retrieval Speed | Sub-200ms | Variable (memory-dependent) | 50-100ms (vectors only) |
| Compliance | SOC 2 Type II | None (self-hosted) | SOC 2, ISO 27001 |
| Deployment | Open-core + cloud | Fully open-source | Cloud only |
| Best For | Production RAG + memory | Research & unlimited context | Vector similarity search |