The agent memory benchmark story is now clearer: Hindsight leads BEAM at 10M tokens, while common alternatives break down or rely on weaker retrieval patterns.