Sam Bhagwat, the co-founder and CEO of Mastra, discusses the LongMemEval benchmark for agent memory and the process of optimizing Mastra's memory layers. He defines memory as the compression of chat messages and the ability to search them effectively. Sam explains the subtasks within memory, including information extraction, multi-session reasoning, temporal reasoning, knowledge updates, and recognizing when information is missing. He details Mastra's two main memory types, semantic recall and working memory, and how each was implemented and improved. Sam shares the initial benchmark results and the iterative steps taken to enhance performance, such as generating tailored templates, refining working memory updates, correcting date-related bugs, and restructuring how data is presented to the model. The improvements led to state-of-the-art accuracy, demonstrating the importance of continuous evaluation and iteration when developing AI agent frameworks.
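As a rough illustration of the two layers discussed, here is a minimal TypeScript sketch of how semantic recall and working memory might be configured on a Mastra agent. The import paths, option names (`semanticRecall`, `topK`, `messageRange`, `workingMemory`, `template`), and the example agent reflect Mastra's public docs as best recalled and are assumptions that may differ from the current API; treat this as a sketch, not the episode's exact setup.

```ts
// Hypothetical sketch: wiring Mastra's two memory layers onto an agent.
// Option names and import paths are assumptions based on Mastra's docs
// and may differ from the version discussed in the episode.
import { Memory } from "@mastra/memory";
import { Agent } from "@mastra/core/agent";
import { openai } from "@ai-sdk/openai";

const memory = new Memory({
  options: {
    // Semantic recall: embed past messages and retrieve the most relevant
    // ones (plus surrounding context) into the prompt for the current turn.
    semanticRecall: {
      topK: 3,          // number of similar past messages to retrieve
      messageRange: 2,  // neighboring messages included around each hit
    },
    // Working memory: a continuously updated summary the agent maintains,
    // guided by a template describing which facts to track.
    workingMemory: {
      enabled: true,
      template: `# User profile
- Name:
- Preferences:
- Open questions:`,
    },
  },
});

const agent = new Agent({
  name: "support-agent",
  instructions: "Answer questions using what you remember about the user.",
  model: openai("gpt-4o-mini"),
  memory,
});
```

The tailored templates and working-memory update refinements mentioned above would, in a setup like this, amount to iterating on the `template` contents and on when and how the summary gets rewritten between turns.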