Published June 11, 2026 at 4:00 AM

Geometric Metrics and LLMs: What They Measure and When They Work

Source: arxiv.org

Publisher summary (verbatim)

arXiv:2509.25359v2 (announce type: replace-cross)

Abstract: We present a systematic stress-test of geometric metrics for LLM evaluation. Rank-based geometric properties of internal representations have shown promise as reference-free quality signals, but the conditions under which they are reliable re

Originally published on arxiv.