Wallet-derived encryption.
Your encryption key is derived from your wallet. Never typed, never emailed, never stored on our servers. You hold it. We never see it.
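Memory Engine automatically recalls the information you need, and Memory Vault stores the knowledge you curate yourself. Together, the two systems give every AI model complete conversational context.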
Automatic recall from conversations. Searches past conversations using semantic similarity. No exact keywords needed. Finds relevant context from days or weeks ago.
Persistent facts and preferences. A curated knowledge base of saved facts, preferences, and instructions. Explicitly saved. Available across all future conversations.
How Anuma AI memory works
Every memory follows the same path. Captured on your device. Encrypted before it leaves. Stored as ciphertext. Decrypted only when you need it.
As you chat, Anuma quietly captures meaningful moments: messages worth recalling and facts worth saving.
Every message is chunked at sentence boundaries (~400 chars, 50-char overlap).
Before anything leaves your device, your memory is encrypted with a key derived from your wallet, a key only you hold.
AES-GCM-256 encryption with a wallet-derived key. Keys never touch our servers.
Encrypted memory syncs across your devices. We see only opaque ciphertext, never the words inside.
Zero-knowledge storage. Anuma servers cannot read your memory, even if compelled to.
When you ask anything in any AI model, your device decrypts the relevant memory and quietly threads it into context.
Semantic search finds meaning, not keywords. Decryption happens client-side, every time.
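The page says the key is wallet-derived but does not describe the derivation. As a minimal sketch only, assume the input is a wallet signature over a fixed message and the derivation is HKDF-SHA256 (RFC 5869). Both are assumptions, and `derive_memory_key`, the salt, and the info label are hypothetical names chosen for illustration. The resulting 32 bytes are the right size for a 256-bit AES-GCM key; the AES-GCM step itself is omitted because Python's standard library does not include it.

```python
import hashlib
import hmac


def hkdf_sha256(key_material: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a pseudorandom key, then expand it."""
    if length > 255 * 32:
        raise ValueError("requested output is too long for HKDF-SHA256")
    # Extract: condense the input key material into a fixed-size pseudorandom key.
    prk = hmac.new(salt, key_material, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output bytes are available.
    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def derive_memory_key(wallet_signature: bytes) -> bytes:
    """Derive a 32-byte (256-bit) key from a wallet signature.

    Assumption: the real derivation scheme is not documented on this page.
    """
    return hkdf_sha256(
        key_material=wallet_signature,
        salt=b"example-salt",          # hypothetical constant
        info=b"memory-encryption-v1",  # hypothetical context label
    )


# The same signature always yields the same key, so any device that can
# produce the signature can re-derive the key without a server storing it.
key = derive_memory_key(b"\x01" * 65)
print(len(key))  # 32
```

Because the derivation is deterministic, nothing about the key needs to be stored or transmitted, which is the property the steps above rely on.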
Automatic recall from every conversation you have ever had.
Every message is embedded as a high-dimensional vector capturing its meaning. Queries find the most similar past messages using cosine similarity.
No exact keyword matches needed. A question about “deployment issues” surfaces past conversations about “CI pipeline failures.”
Example: the query “deployment issues” is embedded and compared against stored vectors, and matches “CI pipeline failures” with a similarity of 0.87.
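The similarity search described above can be sketched in a few lines. This is an illustration, not Anuma's implementation: the three-dimensional vectors are toy stand-ins for real embeddings, and `search` is a hypothetical helper. The defaults of 8 results and a 0.3 threshold are the Memory Engine search defaults listed in the comparison table on this page.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, indexed, limit=8, threshold=0.3):
    """Return up to `limit` (score, text) pairs scoring at or above `threshold`."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    hits = [pair for pair in scored if pair[0] >= threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits[:limit]


# Toy embeddings: the first axis loosely stands for "software delivery".
indexed = [
    ("CI pipeline failures", [0.9, 0.1, 0.0]),
    ("Lunch plans", [0.0, 0.1, 0.9]),
]
results = search([1.0, 0.2, 0.0], indexed)  # query: "deployment issues"
print([text for _, text in results])  # ['CI pipeline failures']
```

The unrelated entry falls below the threshold and is dropped, even though no keyword in the query appears in the matching text.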
Long messages automatically split into overlapping segments using sentence-boundary detection. Each chunk targets roughly 400 characters with 50-character overlap.
Each chunk is embedded independently. Relevant fragments inside long messages still surface in search.
Example: a long message of 1,200 characters is split into Chunk 1, Chunk 2, and Chunk 3, each ~400 chars with 50-char overlap.
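A minimal sketch of sentence-boundary chunking with overlap follows. It assumes a simple punctuation-based sentence splitter and carries the last 50 characters of each chunk into the next one; the actual boundary detection and overlap strategy are not described on this page, and `chunk_message` is a hypothetical name.

```python
import re


def chunk_message(text: str, target: int = 400, overlap: int = 50) -> list[str]:
    """Split text into chunks of roughly `target` characters at sentence boundaries."""
    # Naive sentence splitter: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > target:
            chunks.append(current)
            # Seed the next chunk with the tail of the previous one (the overlap).
            tail = current[-overlap:] if overlap > 0 else ""
            current = f"{tail} {sentence}".strip()
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


message = " ".join(f"Sentence number {i} is here." for i in range(60))
chunks = chunk_message(message)
print(len(chunks), max(len(c) for c in chunks))
```

A single sentence longer than the target is kept whole in this sketch, so chunk length is a target rather than a hard cap. The overlap is cut by character count and may start mid-word.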
Round-robin deduplication ensures results come from multiple conversations, not just one. The system over-fetches 3x to 9x candidates before deduplication for a diverse result pool.
Example: 72 candidates fetched at 9x pass through round-robin deduplication, leaving 8 results from 8 different conversations.
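One way to read round-robin deduplication is sketched below: group the over-fetched candidates by conversation, then take the best remaining hit from each conversation in turn until the result limit is reached. The tuple layout and the function name are assumptions for illustration, and the real tie-breaking rules are not documented here.

```python
from collections import defaultdict, deque


def round_robin_dedup(candidates, limit=8):
    """Pick results across conversations in turn.

    candidates: list of (conversation_id, score, text) tuples.
    """
    by_conversation = defaultdict(list)
    for candidate in candidates:
        by_conversation[candidate[0]].append(candidate)
    # Best hit first within each conversation; conversations ordered by their best hit.
    queues = [
        deque(sorted(group, key=lambda c: c[1], reverse=True))
        for group in by_conversation.values()
    ]
    queues.sort(key=lambda q: q[0][1], reverse=True)
    results = []
    while queues and len(results) < limit:
        for queue in queues:
            results.append(queue.popleft())
            if len(results) == limit:
                break
        queues = [q for q in queues if q]  # drop exhausted conversations
    return results


candidates = [
    ("a", 0.95, "a1"), ("a", 0.90, "a2"), ("a", 0.85, "a3"),
    ("a", 0.80, "a4"), ("a", 0.75, "a5"),
    ("b", 0.70, "b1"), ("b", 0.40, "b2"),
    ("c", 0.60, "c1"),
]
picked = round_robin_dedup(candidates, limit=4)
print([conversation for conversation, _, _ in picked])  # ['a', 'b', 'c', 'a']
```

A plain top-4 by score would return four hits from conversation `a`. Over-fetching matters because the pool has to be large enough to contain hits from several conversations: at 9x with a limit of 8, that is 72 candidates.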
Finding a single chunk is not enough. The agent retrieves surrounding messages for full context. By default, it returns the full conversation session around each match.
Results are grouped by conversation with relevance scores and timestamps.
A knowledge base that grows with you.
The agent creates, updates, and organizes memories into folders. Each memory is scoped for access control. Mark memories as private or shared.
Memories are filterable by folder or scope. Unfiled entries are queryable separately.
Embedding-based retrieval by meaning, not keywords. An LRU cache holds up to 5,000 vectors and evicts the least recently used entries when full.
New memories are embedded eagerly and cached immediately. The system gets faster over time.
Example: vector capacity at 68% (3,412 / 5,000 vectors), 12 ms average retrieval, 5,000-entry LRU cache limit.
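An LRU (least recently used) cache with a fixed capacity can be sketched with an ordered dictionary. `VectorCache` is a hypothetical class for illustration; only the 5,000-entry limit comes from this page.

```python
from collections import OrderedDict


class VectorCache:
    """Fixed-capacity cache that evicts the least recently used vector."""

    def __init__(self, capacity: int = 5000):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, memory_id):
        if memory_id not in self._entries:
            return None
        # A read counts as a use: move the entry to the most-recent end.
        self._entries.move_to_end(memory_id)
        return self._entries[memory_id]

    def put(self, memory_id, vector):
        self._entries[memory_id] = vector
        self._entries.move_to_end(memory_id)
        if len(self._entries) > self.capacity:
            # The first item is the one used longest ago.
            self._entries.popitem(last=False)

    def __len__(self):
        return len(self._entries)


cache = VectorCache(capacity=2)
cache.put("a", [1.0])
cache.put("b", [2.0])
cache.get("a")          # "a" is now more recently used than "b"
cache.put("c", [3.0])   # evicts "b"
print(cache.get("b"))   # None
```

An evicted vector is not lost in this model. It only has to be recomputed or reloaded on the next access.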
When two memories are semantically similar (cosine similarity of 0.70 or higher) and were saved 30 or more days apart, the newer one supersedes the older in search ranking. Greedy pairing prevents cascading adjustments.
Original memories stay intact. Ranking is adjusted at query time only.
Example: “Prefers light mode” (Jan 2) is superseded by “Prefers dark mode” (Feb 16).
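The supersession rule can be sketched as follows. The 0.70 similarity and 30-day thresholds come from the text above. Everything else is an assumption: "greedy pairing" is read here as taking the most similar pairs first and using each memory in at most one pair, and the 0.5 ranking penalty is a hypothetical value. The year in the example is arbitrary.

```python
import math
from datetime import datetime, timedelta
from itertools import combinations


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def find_superseded(memories, min_similarity=0.70, min_gap=timedelta(days=30)):
    """Return {older_id: newer_id} for memories that a newer one supersedes."""
    pairs = []
    for first, second in combinations(memories, 2):
        similarity = cosine_similarity(first["vector"], second["vector"])
        gap = abs(first["saved_at"] - second["saved_at"])
        if similarity >= min_similarity and gap >= min_gap:
            older, newer = sorted((first, second), key=lambda m: m["saved_at"])
            pairs.append((similarity, older["id"], newer["id"]))
    # Greedy pairing (assumed meaning): most similar pairs first, each memory
    # used in at most one pair, so one match cannot trigger a chain of others.
    pairs.sort(key=lambda p: p[0], reverse=True)
    used, superseded = set(), {}
    for _, older_id, newer_id in pairs:
        if older_id in used or newer_id in used:
            continue
        used.update((older_id, newer_id))
        superseded[older_id] = newer_id
    return superseded


def rerank(results, superseded, penalty=0.5):
    """Down-weight superseded memories at query time; stored data is untouched."""
    adjusted = [(mid, score * penalty if mid in superseded else score)
                for mid, score in results]
    return sorted(adjusted, key=lambda r: r[1], reverse=True)


memories = [
    {"id": "light", "vector": [1.0, 0.1], "saved_at": datetime(2025, 1, 2)},
    {"id": "dark", "vector": [0.9, 0.2], "saved_at": datetime(2025, 2, 16)},
    {"id": "other", "vector": [0.0, 1.0], "saved_at": datetime(2025, 3, 1)},
]
superseded = find_superseded(memories)
print(superseded)  # {'light': 'dark'}
print(rerank([("light", 0.8), ("dark", 0.7)], superseded)[0][0])  # dark
```

Because the adjustment happens only in `rerank`, the older memory remains stored and can still be returned, just lower in the list.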
Confirmation callbacks let users approve what the agent saves. Multi-user support with user-scoped storage enforced at the database layer.
Bulk delete available for full data cleanup.
Example: the agent wants to save “User prefers dark mode in all editors.” The memory is user-scoped, so only you can access it.
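A confirmation callback with user-scoped storage might look like the sketch below. The class and method names are hypothetical. An in-memory dictionary stands in for the database layer, where the page says user scoping is actually enforced.

```python
from typing import Callable, Optional


class MemoryVault:
    """Toy vault: every read and write is keyed by user ID."""

    def __init__(self, confirm: Optional[Callable[[str], bool]] = None):
        self._confirm = confirm   # called before each save; return False to reject
        self._by_user = {}        # user_id -> list of saved memories

    def save(self, user_id: str, content: str) -> bool:
        if self._confirm is not None and not self._confirm(content):
            return False          # the user declined, so nothing is stored
        self._by_user.setdefault(user_id, []).append(content)
        return True

    def memories_for(self, user_id: str) -> list:
        return list(self._by_user.get(user_id, []))

    def delete_all(self, user_id: str) -> int:
        """Bulk delete: remove every memory for one user and return the count."""
        return len(self._by_user.pop(user_id, []))


# Stand-in for a user prompt: approve only memories that mention "dark mode".
vault = MemoryVault(confirm=lambda content: "dark mode" in content)
saved = vault.save("alice", "User prefers dark mode in all editors")
rejected = vault.save("alice", "User had pasta for lunch")
print(saved, rejected)             # True False
print(vault.memories_for("bob"))   # []
```

In a real client the callback would show the proposed memory to the user and return their choice.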
Memory Engine and Memory Vault share the same embedding infrastructure. They serve different roles but work side by side.
User mentions they prefer dark mode in a conversation.
Memory Engine indexes this conversation automatically.
Agent uses Memory Vault to save “User prefers dark mode” as a persistent fact.
Future conversations: Vault retrieves the preference instantly without searching history.
| | Memory Engine | Memory Vault |
|---|---|---|
| Source | Conversation history | Explicitly saved memories |
| Updates | Automatic | Save/update via agent |
| Organization | By conversation | By folder and scope |
| Search default | 8 results, 0.3 threshold | 5 results, 0.1 threshold |
| Chunking | Sentence-based, ~400 chars | Whole memory as single unit |
| Staleness handling | Round-robin dedup | Supersession detection |
| Caching | Optional per-embedding | LRU cache (5,000 entries) |
| Best for | Recalling past discussions | Persistent facts and preferences |
Local-first architecture
Most AI tools store your memory in plaintext on their servers, readable by their staff, by their training pipelines, and by anyone who breaches them. Anuma takes a different path. Your memory is end-to-end encrypted with keys that only you hold.
Your encryption key is derived from your wallet. Never typed, never emailed, never stored on our servers. You hold it. We never see it.
Memory content is encrypted in your browser before a single byte leaves your device. What lands on our infrastructure is ciphertext we cannot read.
Anuma servers store opaque blobs. Even if someone with the keys to our infrastructure tried, they would see encrypted noise, not your words.
Your encrypted memory follows you across ChatGPT, Claude, Gemini, and every model on Anuma, and across every device you sign in from.
Data flow
Your device: plaintext memory + your wallet key
Anuma servers: encrypted ciphertext only. Unreadable.
Your other devices: decrypted client-side with your key
Encryption happens before sync. Decryption happens after. We are just the pipe in between, and the pipe cannot see what flows through it.
Traditional AI memory vs Anuma
Traditional AI memory lives on vendor servers, in plaintext, locked to one model. Anuma's private AI memory is encrypted on your device, unified across every AI model, and yours to control.
| | Traditional AI memory | Anuma memory |
|---|---|---|
| Storage location | Vendor cloud. Locked inside one provider, beside billions of other users. | Local-first. Encrypted on your device, then synced as ciphertext. |
| Encryption at rest | Vendor-managed. Encrypted, but the vendor holds the keys. They can read it. | AES-GCM-256. End-to-end encryption with keys derived from your wallet. |
| Key custody | The vendor. Decryption is in their environment, on their schedule. | You. Wallet-derived keys. Decryption only happens on your devices. |
| Cross-model portability | Locked to one model. Memory works in one assistant. Switch tools and it is gone. | Every model. One memory layer across ChatGPT, Claude, Gemini, and more. |
| User-controlled deletion | Maybe, sometimes. Deletion requests, retention windows, and backup ambiguity. | Always. Delete any memory, or all of them, in one tap. Permanent. |
| Used for training | Often by default. Your conversations feed the next model release. Opt-out only. | Never. Your encrypted memory cannot be used for training. We cannot read it. |
| Audit transparency | Opaque. Closed-source pipelines. You take the policy on faith. | Inspectable. Open architecture. The crypto and data flow are documented. |
AI persistent memory privacy
Privacy is easier to verify by what is absent than by what is promised. Here is what Anuma servers do not, and cannot, hold.
Plaintext memory content.
Your messages, notes, and saved facts are encrypted on your device before they ever touch our servers. We store ciphertext we cannot read.
Your decryption keys.
Keys are derived from your wallet, on your device. They never reach our infrastructure. Even our most senior engineers cannot fetch them.
Behavioral profiles for ad targeting.
We do not build advertising profiles from your conversations. We are not in the ads business and never plan to be.
Training data sourced from your memory.
Your encrypted memory is never used to train models, ours or anyone else’s. We could not extract it if we wanted to.
Cross-user search indexes.
Embeddings are scoped to you. No combined index across users exists. Your memory is searchable by your device, by your key, and no one else’s.
Plaintext backups.
Backups stay encrypted end-to-end with the same wallet-derived key. If you delete a memory, no plaintext copy survives in cold storage.
The proof is structural, not verbal. We cannot read your memory because we never receive a key. The architecture makes the promise, not a policy page.
Everything you need to know about how Anuma remembers.
Memory Engine automatically indexes and searches past conversations. Memory Vault stores specific facts and preferences that are explicitly saved.
Memory Engine indexes everything automatically. For Memory Vault, the AI suggests what to save, but you can approve or reject each memory.
Every message and memory is converted into a vector that captures its meaning. Search finds the most similar vectors using cosine similarity, so you do not need exact keyword matches.
Yes. Memory Vault supports folders and scopes. You can filter searches by folder, mark memories as private or shared, and move entries between folders.
Memory Vault uses supersession detection. When a newer memory is semantically similar to an older one and enough time has passed, the newer version is automatically prioritized in search results.
Yes. All memory content is encrypted with your wallet-derived key using AES-GCM-256. Memory content, evidence, and keys are encrypted. Metadata like type and confidence scores are not.
Yes. You can delete individual memories or do a full bulk delete from your account. Deletion is permanent.
Yes. Your memory loads into every conversation regardless of which model you use. Memory Engine and Memory Vault work with ChatGPT, Claude, Gemini, and all other models on Anuma.
There is no hard limit. The Memory Vault cache holds 5,000 vectors for fast access. Memories beyond the cache are still searchable but may take slightly longer on first access.
Memory Engine and Memory Vault give every AI model complete context. Encrypted, private, and yours.