DataBubble
Tag

#speech-processing

3 articles tagged #speech-processing

arxiv · May 8 · bullish

Minimizing Modality Gap from the Input Side: Your Speech LLM Can Be a Prosody-Aware Text LLM

arXiv:2605.05927v1 Announce Type: new Abstract: Speech large language models (SLMs) are typically built from text large language model (TLM) checkpoints, yet they still suffer from a substantial modality gap. Prior work has mainly attempted to reduce this gap from the output side by making speech ge

TEWH · 2 models · #speech-processing #language-models #modality-gap · Read on arxiv →

arxiv · May 7

Deepfake Audio Detection Using Self-supervised Fusion Representations

arXiv:2605.03420v1 Announce Type: cross Abstract: This paper describes a submission to the Environment-Aware Speech and Sound Deepfake Detection Challenge (ESDD2) 2026, which addresses component-level deepfake detection using the CompSpoofV2 dataset, where speech and environmental sounds may be inde

FABEAA · 3 models · #deepfake-detection #speech-processing #environmental-sounds · Read on arxiv →

arxiv · Apr 24 · bullish

Basic syntax from speech: Spontaneous concatenation in unsupervised deep neural networks

arXiv:2305.01626v4 Announce Type: replace-cross Abstract: Computational models of syntax are predominantly text-based. Here we propose that the most basic first step in the evolution of syntax can be modeled directly from raw speech in a fully unsupervised way. We focus on one of the most ubiquitous

CIFICN · 3 models · #speech-processing #neural-networks #language-modeling · Read on arxiv →