DataBubble
arxiv
Published April 10, 2026 at 4:00 AM
▼bearish

Before We Trust Them: Decision-Making Failures in Navigation of Foundation Models

Source
arxiv.org
Publisher summary (verbatim)

arXiv:2601.05529v5 Announce Type: replace Abstract: High success rates on navigation-related tasks do not necessarily translate into reliable decision making by foundation models. To examine this gap, we evaluate current models on six diagnostic tasks spanning three settings: reasoning under complet

Mentioned models
  • GPT-5
  • Gemini-2.5 Flash
  • Gemini-2.0 Flash
Tags
#navigation #decision making #safety #evaluation

Originally published on arxiv