Memory architectures compared

I have built three memory systems in the past year. Each one for a different agent shape. Each one made different trade-offs. Here is what I learned.

System one was pure vector embeddings (pgvector + Supabase). Every memory chunk got an OpenAI embedding, and similarity search returned the top-k. It felt smart. It returned irrelevant context constantly. The problem: vector similarity is a fuzzy "vibe" match. If the user asks "what was the bug we fixed in the auth flow last week", the top-k will be authentication topics in general, not the specific bug. Vector retrieval is great for semantic neighborhoods, terrible for specific recall.
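To make the failure mode concrete, here is a minimal sketch of top-k cosine retrieval. The three-dimensional vectors are hand-made for illustration (axes roughly "auth", "bug", "billing"). They are not real OpenAI embeddings, and the memory texts are hypothetical. What the toy shows: the query lands closest to the general auth memories, and the specific bug fix falls out of the top 2.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, memories, k=2):
    """Return the k (score, text) pairs most similar to query_vec, best first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# ASSUMPTION: toy vectors on made-up axes [auth, bug, billing], not real embeddings.
MEMORIES = [
    ("OAuth overview and login design notes", [0.9, 0.1, 0.0]),
    ("Fixed token refresh bug in auth flow", [0.7, 0.7, 0.0]),
    ("Password reset UX discussion", [0.8, 0.0, 0.1]),
    ("Invoice rounding bug", [0.0, 0.8, 0.6]),
]

# A query that is mostly "about auth" with only a weak "bug" signal.
QUERY = [0.9, 0.2, 0.0]

if __name__ == "__main__":
    for score, text in top_k(QUERY, MEMORIES, k=2):
        print(f"{score:.3f}  {text}")
```

Real embeddings behave less cleanly than this, but the shape of the problem is the same: the ranking rewards being in the neighborhood, not being the exact thing asked for.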

System two was full-text search (SQLite FTS5). No embeddings, just BM25 ranking on the words in the memories. Faster, no API calls, no embedding bill. For specific recall it crushed vector search. For semantic queries (you don't remember the exact word but you know the concept) it whiffed.
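A minimal sketch of that setup, using Python's built-in sqlite3 module. It assumes the SQLite build bundled with your Python has FTS5 compiled in (most do, but not all). The table name, column name, and memory texts are made up for the example. Note that FTS5's bm25() returns lower values for better matches, so plain ascending order puts the best hit first.

```python
import sqlite3

def build_index(memories):
    """Create an in-memory FTS5 index over a list of memory strings."""
    conn = sqlite3.connect(":memory:")
    # Requires an SQLite build with the FTS5 extension enabled.
    conn.execute("CREATE VIRTUAL TABLE memories USING fts5(body)")
    conn.executemany("INSERT INTO memories(body) VALUES (?)", [(m,) for m in memories])
    return conn

def search(conn, query, k=3):
    """Return up to k memory bodies matching the FTS5 query, best BM25 score first."""
    rows = conn.execute(
        "SELECT body FROM memories WHERE memories MATCH ? "
        "ORDER BY bm25(memories) LIMIT ?",
        (query, k),
    )
    return [row[0] for row in rows]

# Hypothetical memories, for illustration only.
MEMORIES = [
    "OAuth overview and login design notes",
    "Fixed token refresh bug in auth flow",
    "Password reset UX discussion",
    "Invoice rounding bug",
]

if __name__ == "__main__":
    conn = build_index(MEMORIES)
    print(search(conn, "auth bug"))              # specific recall: exact words present
    print(search(conn, "credentials expiring"))  # concept only: no shared words, no hits
```

Both behaviors from the paragraph above show up here. Space-separated terms are an implicit AND in FTS5, so "auth bug" pins down the one memory containing both words, while a conceptual query that shares no tokens with the stored text comes back empty.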

System three is hybrid: an FTS5 first pass (fast, specific, free), then a knowledge graph for entity-based recall (people, projects, decisions, and learnings as nodes, with edges between them). Vector embeddings come in only as the third tier, when the first two return nothing relevant. Cost dropped 90%, and recall quality went up. This is what local-memory-mcp ships and what studiomeyer-memory uses for cloud.
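The tiering itself is simple control flow. The sketch below is not the local-memory-mcp implementation. It is a generic fall-through under two assumptions: each tier is a function from query to a list of hits, and "nothing relevant" is simplified to "empty list". The three tier functions are hypothetical stubs standing in for real FTS5, graph, and vector backends.

```python
from typing import Callable, List, Sequence, Tuple

Retriever = Callable[[str], List[str]]

def tiered_recall(query: str, tiers: Sequence[Tuple[str, Retriever]]) -> Tuple[str, List[str]]:
    """Try each tier in order and return (tier_name, hits) from the first with hits."""
    for name, retrieve in tiers:
        hits = retrieve(query)
        if hits:  # ASSUMPTION: "relevant" is reduced to "non-empty" in this sketch
            return name, hits
    return "none", []

# Hypothetical stub tiers, for illustration only.
def fts_tier(query: str) -> List[str]:
    return ["Fixed token refresh bug in auth flow"] if "bug" in query else []

def graph_tier(query: str) -> List[str]:
    return ["auth-flow owned_by Dana"] if "auth-flow" in query else []

vector_calls: List[str] = []  # records how often the expensive tier is reached

def vector_tier(query: str) -> List[str]:
    vector_calls.append(query)
    return ["Password reset UX discussion"]

TIERS = [("fts", fts_tier), ("graph", graph_tier), ("vector", vector_tier)]

if __name__ == "__main__":
    print(tiered_recall("auth bug", TIERS))
    print(tiered_recall("who owns auth-flow", TIERS))
    print(tiered_recall("something about sessions", TIERS))
    print("vector tier calls:", len(vector_calls))
```

The cost saving comes from the ordering: the embedding tier is only reached when the cheap tiers have already come back empty, so most queries never pay for it.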

The knowledge graph is the unsung hero. The agent doesn't really need vector similarity for most queries. It needs to know "this entity has been mentioned, here are its observations, here are its relationships". A graph answers that in one query. Vector search answers it in a soft probabilistic way that requires the agent to filter the noise.
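Here is a rough sketch of what "one query" can look like when the graph lives in plain SQLite tables. The schema, the entity names, and the relation labels are all invented for this example and are not the actual local-memory-mcp schema. It also only follows outgoing edges, to keep the query short.

```python
import sqlite3

# ASSUMPTION: a minimal illustrative schema, not the schema of any real product.
SCHEMA = """
CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT UNIQUE, kind TEXT);
CREATE TABLE observations (entity_id INTEGER REFERENCES entities(id), body TEXT);
CREATE TABLE relations (
    src INTEGER REFERENCES entities(id),
    rel TEXT,
    dst INTEGER REFERENCES entities(id)
);
"""

# One statement returns both the observations and the outgoing relations of an entity.
RECALL = """
SELECT 'observation' AS kind, o.body AS detail
FROM entities e JOIN observations o ON o.entity_id = e.id
WHERE e.name = :name
UNION ALL
SELECT 'relation', r.rel || ' -> ' || t.name
FROM entities e
JOIN relations r ON r.src = e.id
JOIN entities t ON t.id = r.dst
WHERE e.name = :name
"""

def build_graph():
    """Create an in-memory graph with a few hypothetical nodes and edges."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO entities(id, name, kind) VALUES (?, ?, ?)",
        [(1, "auth-flow", "project"), (2, "Dana", "person"),
         (3, "rotate-refresh-tokens", "decision")],
    )
    conn.executemany(
        "INSERT INTO observations(entity_id, body) VALUES (?, ?)",
        [(1, "Token refresh bug fixed last week"), (2, "Prefers async reviews")],
    )
    conn.executemany(
        "INSERT INTO relations(src, rel, dst) VALUES (?, ?, ?)",
        [(1, "owned_by", 2), (1, "led_to", 3)],
    )
    return conn

def recall(conn, name):
    """Return (kind, detail) rows for everything known about the named entity."""
    return conn.execute(RECALL, {"name": name}).fetchall()

if __name__ == "__main__":
    for kind, detail in recall(build_graph(), "auth-flow"):
        print(kind, "|", detail)
```

Every row that comes back is about the entity that was asked for, so the agent has nothing to filter.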

When you should use vectors: your data is mostly unstructured prose, you have no entities to extract, the queries are conceptual. When you should not: you can extract entities, you have specific terminology, you need precise recall.

Honest caveat: our hybrid memory's LongMemEval score currently sits in the 60-80% range, with a clear upward trend after we fix the assistant-ingest path. We do not publish a single hard number yet. The benchmark page on studiomeyer.io tracks the methodology and the run logs.

Pattern I keep seeing: builders pick vectors first because the marketing says vectors. Then they get bad recall and blame the embedding model. The embedding model is fine. The architecture is wrong. Try FTS5 first. Add a graph if you have entities. Add vectors last.