arxiv
Published May 11, 2026 at 4:00 AM

When Stored Evidence Stops Being Usable: Scale-Conditioned Evaluation of Agent Memory

Source: arxiv.org
Publisher summary (verbatim)

arXiv:2605.07313v1 (announce type: new)

Abstract: Memory-agent evaluations report fixed-snapshot accuracy or retrieval quality, but these scores do not show whether evidence remains usable as irrelevant sessions (sessions not annotated as task-relevant evidence for the query) accumulate. We present a

Mentioned models
  • HippoRAG
  • LiCoMemory
  • Qwen3-8B
  • Qwen3-32B
  • Qwen3-235B
Tags: #evaluation #memory #agents #scalability
