Two memory systems.
One seamless experience.
Memory Engine handles automatic recall. Memory Vault stores your curated knowledge. Together they give every AI model complete context.
Automatic recall from conversations. Searches past conversations using semantic similarity. No exact keywords needed. Finds relevant context from days or weeks ago.
Persistent facts and preferences. A curated knowledge base of saved facts, preferences, and instructions. Explicitly saved. Available across all future conversations.
Anuma Memory Engine
Automatic recall from every conversation you have ever had.
Semantic search.
Every message is embedded as a high-dimensional vector capturing its meaning. Queries find the most similar past messages using cosine similarity.
No exact keyword matches needed. A question about “deployment issues” surfaces past conversations about “CI pipeline failures.”
Query: “deployment issues” → embed & compare vectors → match (0.87): “CI pipeline failures”
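The idea can be sketched in Python. The `embed` function below is a stand-in trigram hash, not Anuma's actual embedding model; a real deployment would call an embedding API, but the cosine-similarity scoring and thresholding work the same way:

```python
import math

def embed(text: str) -> list[float]:
    # Stand-in embedding: a real system would call an embedding model.
    # Here character trigrams are hashed into a small fixed-size vector.
    vec = [0.0] * 64
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        vec[hash(padded[i:i + 3]) % 64] += 1.0
    return vec

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query: str, messages: list[str], top_k: int = 8, threshold: float = 0.3):
    # Embed the query, score every stored message, keep matches above
    # the threshold, and return the top_k by similarity. No keywords.
    q = embed(query)
    scored = [(cosine_similarity(q, embed(m)), m) for m in messages]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)[:top_k]
```

The `top_k = 8` and `threshold = 0.3` defaults mirror the Memory Engine search defaults listed in the comparison table further down.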
Smart chunking.
Long messages are automatically split into overlapping segments using sentence-boundary detection. Each chunk targets roughly 400 characters with a 50-character overlap.
Each chunk is embedded independently. Relevant fragments inside long messages still surface in search.
Input: long message (1,200 characters) → Output: 3 chunks, ~400 chars each with 50-char overlap
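A minimal sketch of this kind of sentence-boundary chunking, using the ~400-character target and 50-character overlap quoted above (the splitting regex and packing logic are illustrative, not Anuma's exact implementation):

```python
import re

CHUNK_TARGET = 400   # target size in characters, per the text above
CHUNK_OVERLAP = 50   # characters carried over between adjacent chunks

def chunk_message(text: str) -> list[str]:
    # Split on sentence boundaries (., !, ? followed by whitespace),
    # then pack sentences into ~400-character chunks with a 50-character
    # overlap so a thought spanning a boundary appears in both chunks.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > CHUNK_TARGET:
            chunks.append(current)
            current = current[-CHUNK_OVERLAP:]  # seed next chunk with overlap
        current = (current + " " + sentence).strip()
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then be embedded independently, so a relevant fragment deep inside a long message still surfaces in search.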
Diverse results.
Round-robin deduplication ensures results come from multiple conversations, not just one. The system over-fetches 3x to 9x as many candidates as requested before deduplication, producing a diverse result pool.
Example: 72 candidates fetched at 9x → round-robin deduplication → 8 results from 8 different conversations
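One way to implement round-robin deduplication over an over-fetched candidate pool (the `(score, conversation_id, message)` tuple shape here is assumed for illustration):

```python
def round_robin_dedupe(candidates, top_k=8):
    # candidates: (score, conversation_id, message) tuples from the
    # over-fetched pool (3x to 9x the requested count).
    # Group by conversation, best-scoring first within each group.
    by_convo = {}
    for item in sorted(candidates, reverse=True):
        by_convo.setdefault(item[1], []).append(item)
    # Take one result per conversation per round until top_k is reached,
    # so the final list spans as many conversations as possible.
    results = []
    while len(results) < top_k and any(by_convo.values()):
        for convo_id in list(by_convo):
            if by_convo[convo_id] and len(results) < top_k:
                results.append(by_convo[convo_id].pop(0))
    return results
```

With three conversations in the pool and `top_k=3`, each conversation contributes exactly one result, even if a single conversation holds all the highest-scoring chunks.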
Context expansion.
Finding a single chunk is not enough. The agent retrieves surrounding messages for full context; by default, it returns the full conversation session around each match.
Results are grouped by conversation with relevance scores and timestamps.
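A sketch of the expansion step, assuming each match carries an index into its conversation. With no window argument it returns the whole session, matching the default described above; the optional windowed variant is an assumption for illustration:

```python
def expand_to_session(match_index, conversation, window=None):
    # conversation: ordered list of messages; match_index points at the hit.
    # window=None mirrors the default described above: return the whole
    # session. A numeric window returns that many messages on each side.
    if window is None:
        return conversation
    start = max(0, match_index - window)
    return conversation[start:match_index + window + 1]
```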
Anuma Memory Vault
A knowledge base that grows with you.
Save, update, and organize.
The agent creates, updates, and organizes memories into folders. Each memory is scoped for access control. Mark memories as private or shared.
Memories are filterable by folder or scope. Unfiled entries are queryable separately.
Semantic search with caching.
Embedding-based retrieval by meaning, not keywords. An LRU cache holds up to 5,000 vectors, evicting the least-recently-accessed entries when full.
New memories are embedded eagerly and cached immediately. The system gets faster over time.
Vector capacity: 3,412 / 5,000 vectors (68%) · Avg retrieval: 12ms · LRU cache limit: 5,000 entries
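An LRU cache with this behavior can be sketched with Python's `OrderedDict`. The 5,000-vector capacity comes from the text above; everything else here is illustrative:

```python
from collections import OrderedDict

class EmbeddingCache:
    # LRU cache for embedding vectors: once capacity is reached, the
    # least-recently-accessed entry is evicted to make room.
    def __init__(self, capacity: int = 5000):
        self.capacity = capacity
        self._store = OrderedDict()

    def get(self, key):
        if key not in self._store:
            return None  # cache miss: caller re-embeds and calls put()
        self._store.move_to_end(key)  # mark as recently used
        return self._store[key]

    def put(self, key, vector):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = vector
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict least-recently-used
```

Embedding a new memory eagerly and calling `put` right away is what makes repeat lookups fast: the vector is already in cache the first time it is searched.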
Supersession detection.
When two memories are semantically similar (70%+ cosine similarity) and saved 30+ days apart, the newer one supersedes the older in search ranking. Greedy pairing prevents cascading adjustments.
Original memories stay intact. Ranking is adjusted at query time only.
Jan 2: “Prefers light mode” → Feb 16: “Prefers dark mode” (supersedes the older memory)
User control.
Confirmation callbacks let users approve what the agent saves. Multi-user support with user-scoped storage enforced at the database layer.
Bulk delete available for full data cleanup.
Agent wants to save: “User prefers dark mode in all editors” (user-scoped · only you can access)
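The confirmation flow can be sketched as a callback gate: nothing reaches the store unless the user approves. The function signature and list-based vault here are illustrative, not Anuma's API:

```python
def save_with_confirmation(memory_text, vault, confirm):
    # confirm: user-supplied callback; it is shown the proposed memory
    # text and returns True to approve or False to reject the save.
    if not confirm(memory_text):
        return False  # rejected: nothing is stored
    vault.append(memory_text)
    return True
```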
Better together.
Memory Engine and Memory Vault share the same embedding infrastructure. They serve different roles but work side by side.
1. A user mentions they prefer dark mode in a conversation.
2. Memory Engine indexes the conversation automatically.
3. The agent uses Memory Vault to save “User prefers dark mode” as a persistent fact.
4. In future conversations, the Vault retrieves the preference instantly, without searching history.
| | Memory Engine | Memory Vault |
|---|---|---|
| Source | Conversation history | Explicitly saved memories |
| Updates | Automatic | Save/update via agent |
| Organization | By conversation | By folder and scope |
| Search default | 8 results, 0.3 threshold | 5 results, 0.1 threshold |
| Chunking | Sentence-based, ~400 chars | Whole memory as single unit |
| Staleness handling | Round-robin dedup | Supersession detection |
| Caching | Optional per-embedding | LRU cache (5,000 entries) |
| Best for | Recalling past discussions | Persistent facts and preferences |
Memory questions.
Everything you need to know about how Anuma remembers.
What is the difference between Memory Engine and Memory Vault?
Memory Engine automatically indexes and searches past conversations. Memory Vault stores specific facts and preferences that are explicitly saved.
Do I control what gets saved?
Memory Engine indexes everything automatically. For Memory Vault, the AI suggests what to save, but you can approve or reject each memory.
How does semantic search work?
Every message and memory is converted into a vector that captures its meaning. Search finds the most similar vectors using cosine similarity, so you do not need exact keyword matches.
Can I organize my memories?
Yes. Memory Vault supports folders and scopes. You can filter searches by folder, mark memories as private or shared, and move entries between folders.
What happens when my preferences change?
Memory Vault uses supersession detection. When a newer memory is semantically similar to an older one and enough time has passed, the newer version is automatically prioritized in search results.
Is my memory encrypted?
Yes. All memory content is encrypted with your wallet-derived key using AES-GCM-256. Memory content, evidence, and keys are encrypted. Metadata like type and confidence scores are not.
Can I delete my memories?
Yes. You can delete individual memories or do a full bulk delete from your account. Deletion is permanent.
Does memory work across different AI models?
Yes. Your memory loads into every conversation regardless of which model you use. Memory Engine and Memory Vault work with ChatGPT, Claude, Gemini, and all other models on Anuma.
Is there a limit to how much can be remembered?
There is no hard limit. The Memory Vault cache holds 5,000 vectors for fast access. Memories beyond the cache are still searchable but may take slightly longer on first access.
AI that never forgets.
Memory Engine and Memory Vault give every conversation complete context.