# How Zep stores facts, entities, and user preferences — and how it differs from Mem0 and LangGraph memory
## The memory problem
Most AI agents are stateless within a session and completely amnesiac across sessions. LangGraph checkpoints solve the within-session problem. But when a user returns a week later and says 'remember what we discussed about my budget?', a checkpointer cannot help — it only knows about the current thread.
Zep is a dedicated memory server that stores facts, entities, and conversation history across sessions, extracts structured knowledge from conversations automatically, and provides semantic search over that knowledge.
## What Zep stores
| Memory type | What it is | Example |
|---|---|---|
| Facts | Discrete facts extracted from conversations | 'User prefers monthly billing' |
| Entities | Named entities with relationships | 'Alice' is a 'Customer' with 'Account #12345' |
| Summaries | Auto-generated summaries of past sessions | Rolling summary of last 20 exchanges |
| Raw messages | Original conversation messages | Full message history for a session |
| Custom data | Arbitrary key-value data you add manually | User's subscription tier, onboarding status |
## Core data model: Users and Sessions
Zep organises memory around two concepts: Users (persistent identities) and Sessions (individual conversations). A user can have many sessions. Memories at the user level persist forever; session memories are associated with a specific conversation.
```python
from zep_cloud.client import AsyncZep
import asyncio

client = AsyncZep(api_key='YOUR_ZEP_API_KEY')

async def setup_user_and_session():
    # Create a user (idempotent — safe to call on every login)
    await client.user.add(
        user_id='user-123',
        email='alice@example.com',
        first_name='Alice',
        metadata={'plan': 'pro', 'signup_date': '2026-01-15'}
    )

    # Create a session for this conversation
    await client.memory.add_session(
        session_id='session-abc',
        user_id='user-123',
        metadata={'channel': 'web_chat'}
    )

asyncio.run(setup_user_and_session())
```
## Adding messages to memory

```python
from zep_cloud.types import Message

async def add_exchange(session_id: str, user_msg: str, ai_msg: str):
    await client.memory.add(
        session_id=session_id,
        messages=[
            Message(role='user', role_type='user', content=user_msg),
            Message(role='assistant', role_type='assistant', content=ai_msg),
        ]
    )

# Zep automatically:
# - Extracts facts ('user wants to cancel subscription')
# - Identifies entities ('subscription', 'user')
# - Updates rolling summary
# - Builds/updates knowledge graph edges
```
## Retrieving memory

```python
async def get_context_for_agent(session_id: str) -> str:
    memory = await client.memory.get(session_id=session_id)

    # memory.context is a pre-formatted string — ready to inject into system prompt
    print(memory.context)
    # Example output:
    # Facts about Alice:
    # - Prefers monthly billing
    # - Account number is 12345
    # - Last session: discussed Q1 budget review

    # Or access individual facts
    for fact in memory.facts:
        print(f' Fact: {fact.fact} (rating: {fact.rating})')

    return memory.context
```
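One way to use the retrieved context is to append it to your base system prompt before each LLM call. A minimal sketch of that step, where `memory_context` stands in for the string returned by `memory.context` above and the helper name and prompt wording are illustrative, not part of Zep's SDK:

```python
# Sketch: prepend retrieved memory to a base system prompt before calling the LLM.
# `build_system_prompt` is a hypothetical helper, not a Zep API.

BASE_PROMPT = 'You are a helpful support agent.'

def build_system_prompt(memory_context: str) -> str:
    if not memory_context:
        return BASE_PROMPT
    return f'{BASE_PROMPT}\n\nWhat you remember about this user:\n{memory_context}'

prompt = build_system_prompt('Facts about Alice:\n- Prefers monthly billing')
print(prompt)
```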
Use `memory.context` directly as your system prompt suffix — it is already formatted for LLM injection. Do not try to reconstruct it from individual facts unless you need custom formatting.

## Semantic search across all memories
```python
async def search_user_memory(user_id: str, query: str) -> list:
    results = await client.memory.search_sessions(
        user_id=user_id,
        text=query,
        search_scope='facts',  # 'facts', 'messages', or 'summary'
        limit=5,
    )
    for r in results.results:
        print(f'Score: {r.score:.3f} | {r.fact.fact}')
    return results.results

# Example: find memories relevant to a current query
relevant = asyncio.run(search_user_memory('user-123', 'budget preferences'))
```
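Search results come back ranked, and in practice you usually keep only hits above a score cutoff before injecting them into a prompt. A sketch of that filtering step, using plain `(score, fact)` tuples in place of Zep's result objects (the cutoff value is an assumption to tune per application):

```python
# Sketch: keep only high-confidence hits and render them as prompt bullet points.
# The (score, fact) tuples stand in for Zep's search result objects.

def select_relevant_facts(hits: list[tuple[float, str]], min_score: float = 0.7) -> list[str]:
    return [fact for score, fact in hits if score >= min_score]

hits = [
    (0.91, 'User prefers monthly billing'),
    (0.84, 'Account number is 12345'),
    (0.42, 'User mentioned the weather'),  # below cutoff, dropped
]
for fact in select_relevant_facts(hits):
    print(f'- {fact}')
```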
## Zep vs Mem0 vs LangGraph Store
| Feature | Zep | Mem0 | LangGraph Store |
|---|---|---|---|
| Automatic fact extraction | Yes — from messages | Yes — from messages | No — manual put() |
| Knowledge graph | Yes — entity relationships | Yes | No |
| Self-hostable | Yes (open source) | Yes (open source) | Yes |
| Cloud managed | Yes (Zep Cloud) | Yes (Mem0 Platform) | No |
| LangChain integration | First-class | First-class | Via LangGraph |
| Semantic search | Yes | Yes | Yes (with embeddings) |
| Best for | Conversational apps, CRM enrichment | Agent memory, multi-framework | LangGraph-native apps |
## When NOT to use Zep
- You are building with LangGraph and want to stay in the same ecosystem — use LangGraph Store
- You only need within-session memory — a simple list of messages is enough
- Your conversations are short and do not accumulate meaningful facts over time
- You cannot justify the extra dependency or hosting cost for your use case
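For the "simple list of messages" case above, within-session memory really can be this small. A sketch with no library at all, where the class name and size cap are illustrative:

```python
# Sketch: within-session memory as a plain message list, enough when nothing
# needs to survive past the current conversation.

class SessionMemory:
    def __init__(self, max_messages: int = 50):
        self.max_messages = max_messages
        self.messages: list[dict] = []

    def add(self, role: str, content: str) -> None:
        self.messages.append({'role': role, 'content': content})
        # Keep only the most recent messages to bound prompt size
        self.messages = self.messages[-self.max_messages:]

memory = SessionMemory(max_messages=2)
memory.add('user', 'What is my budget?')
memory.add('assistant', 'You set $500/month.')
memory.add('user', 'Thanks!')
print(len(memory.messages))
```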
## Zep Cloud vs self-hosted
| Aspect | Zep Cloud | Self-hosted |
|---|---|---|
| Setup | API key only | Docker compose or Kubernetes |
| Knowledge graph | Yes | Yes |
| Data residency | US/EU regions | Fully controlled |
| Pricing | Usage-based (free tier available) | Infrastructure costs only |
| Maintenance | None | You manage updates |