DataBubble
Latest
  • Theker just raised $85M to build the factory robot that doesn’t specialize in anything (1h)
  • Jeff Bezos’s Prometheus raises $12B to build an ‘artificial general engineer’ for the physical world (1h)
  • SpaceX officially prices shares at $135 in the largest IPO ever (6h)
  • Our new community investments in Virginia support local jobs and expand energy affordability. (6h)
  • SpaceX SPV investors won’t know their true holdings until post-IPO lock-ups lift (6h)
  • Amazon’s data centers used 2.5 billion gallons of water last year (9h)
  • Deezer’s new tool can identify AI music from Spotify, Apple Music, and others (10h)
  • Pool’s new app turns your screenshots into something useful (11h)
  • DoorDash’s new AI chatbot lets you order with prompts and photos (12h)
  • Anthropic apologizes for invisible Claude Fable guardrails (15h)
  • Google DeepMind is worried about what happens when millions of agents start to interact (15h)
  • Deezer launches an AI music detector for other streaming services (18h)
  • Opendoor’s India exit is fueling a bigger conversation about AI and outsourcing (22h)
  • MODF-SIR: A Multi-agent Omni-modal Distilled Framework for Social Intelligence Reasoning (22h)
  • Position: Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces! (22h)
  • ARGUS: Stacked Multi-View Identity Mosaic Injection for Subject-Preserving Video Generation (22h)
  • Generalizing Beyond Suboptimality: Offline Reinforcement Learning Learns Effective Scheduling through Random Solutions (22h)
  • The Impossibility of Eliciting Latent Knowledge (22h)
  • Mapping Scientific Literature with Large Language Models and Topic Modeling (22h)
  • Grounding Computer Use Agents on Human Demonstrations (22h)
Tag

#llms

6 articles tagged #llms

arxiv · 5d ago

CLASH: Evaluating Language Models on Judging High-Stakes Dilemmas from Multiple Perspectives

arXiv:2504.10823v4 Announce Type: replace-cross Abstract: Navigating dilemmas involving conflicting values is challenging even for humans in high-stakes domains, let alone for AI, yet prior work has been limited to everyday scenarios. To close this gap, we introduce CLASH (Character perspective-base

GPCL · 2 models · #value-based decision-making #llms #benchmark · Read on arxiv →
arxiv · 5d ago · bullish

Benchmark Everything Everywhere All at Once

arXiv:2606.06462v1 Announce Type: new Abstract: Benchmarks are fundamental for evaluating and advancing LLMs and MLLMs by providing standardized and explicit measures of performance. However, their construction is labor-intensive and hard to reuse, raising concerns about sustainability and scalabili

#benchmark #llms #autonomous-systems · Read on arxiv →
arxiv · May 25 · bullish

AGZO: Activation-Guided Zeroth-Order Optimization for LLM Fine-Tuning

arXiv:2601.17261v4 Announce Type: replace Abstract: Zeroth-Order (ZO) optimization has emerged as a promising solution for fine-tuning LLMs under strict memory constraints, as it avoids the prohibitive memory cost of storing activations for backpropagation. However, existing ZO methods typically emp

QWPA · 2 models · #optimization #llms #fine-tuning · Read on arxiv →
arxiv · May 25 · bullish

Task-Awareness Improves LLM Generations and Uncertainty

arXiv:2601.21500v2 Announce Type: replace Abstract: In many applications of LLMs, natural language responses often have an underlying structure such as representing discrete labels, numerical values, or graphs. Yet, existing decoding and uncertainty estimation methods operate only in language space

#llms #machine-learning #uncertainty-estimation · Read on arxiv →
mit-tech-review · May 21 · bullish

Roundtables: Can AI Learn to Understand the World?

Listen to the session or watch below. AI companies want to build systems that understand the external world and overcome the limitations of LLMs. Recent developments have brought world models to the forefront of the AI discussion. Watch a conversation with editor in chief Mat Honan, senior AI editor

#world-models #llms #ai-development · Read on mit-tech-review →
arxiv · Apr 13 · bullish

ConvoLearn: A Learning Sciences Grounded Dataset for Fine-Tuning Dialogic AI Tutors

arXiv:2601.08950v4 Announce Type: replace Abstract: Despite their growing adoption in education, LLMs remain misaligned with the core principle of effective tutoring: the dialogic construction of knowledge. We introduce ConvoLearn, a dataset of 2,134 semi-synthetic tutor-student dialogues operationa

MI · 1 model · #education #dialogic #tutoring · Read on arxiv →