arxiv
Published May 28, 2026 at 4:00 AM

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding

Source: arxiv.org

Publisher summary (verbatim)

arXiv:2605.27365v2

Abstract: Vision-language models (VLMs) commonly formulate visual grounding and detection as a coordinate-token generation problem, serializing each 2D box into multiple 1D tokens that are learned and decoded largely independently. This token-by-token
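To make the coordinate-token formulation mentioned in the summary concrete, here is a minimal Python sketch of how one 2D box might be serialized into four discrete 1D tokens and read back. It illustrates the generic baseline the summary describes, not LocateAnything's own parallel box decoding. The bin count, the `<loc_N>` token spelling, and the function names `box_to_tokens` and `tokens_to_box` are assumptions chosen for illustration and do not come from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels

NUM_BINS = 1000  # hypothetical quantization resolution, not taken from the paper


def box_to_tokens(box: Box, width: float, height: float,
                  num_bins: int = NUM_BINS) -> List[str]:
    """Serialize one 2D box into four 1D coordinate tokens."""
    x_min, y_min, x_max, y_max = box
    normalized = (x_min / width, y_min / height, x_max / width, y_max / height)
    tokens = []
    for value in normalized:
        clamped = min(max(value, 0.0), 1.0)
        # Map [0, 1] onto integer bins 0 .. num_bins - 1.
        bin_index = min(int(clamped * num_bins), num_bins - 1)
        tokens.append(f"<loc_{bin_index}>")  # hypothetical token spelling
    return tokens


def tokens_to_box(tokens: List[str], width: float, height: float,
                  num_bins: int = NUM_BINS) -> Box:
    """Invert box_to_tokens, using the center of each quantization bin."""
    bins = [int(token[len("<loc_"):-1]) for token in tokens]
    centers = [(b + 0.5) / num_bins for b in bins]
    return (centers[0] * width, centers[1] * height,
            centers[2] * width, centers[3] * height)


tokens = box_to_tokens((100.0, 50.0, 300.0, 200.0), width=400.0, height=400.0)
recovered = tokens_to_box(tokens, width=400.0, height=400.0)
```

In an autoregressive decoder, each of the four tokens would be emitted in a separate step, so a single box costs several sequential decoding steps. The round trip is lossy by at most one bin width per coordinate.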
