arxiv
PublishedMay 29, 2026 at 4:00 AM
—neutral
MELD: Mel-Spectrogram-Based Speech Language Modeling with Discrete Latent Variables
Publisher summary· verbatim
arXiv:2605.29859v1 Announce Type: cross Abstract: Recent speech language models rely on encoders that are optimized separately from autoregressive models. Since these encoders are unaware of the downstream objectives, the extracted representations may not be optimal for downstream tasks. To address
Stay posted· Newsletter
A 5-min weekly brief — top movers, price watch, story of the week.
Discussion
No replies yet. Be first.
Related coverage
More from ARXIV
arxivSFMambaNet: Spectral-Frequency Enhanced Selective State Space Model for Correspondence Pruning14harxivOptical-Guided Neural Collapse for SAR Few-Shot Class Incremental Learning14harxivDynamic Infilling Anchors for Format-Constrained Generation in Diffusion Large Language Models14harxivTemporal Order Matters for Agentic Memory: Segment Trees for Long-Horizon Agents14hThe Bubble Brief
WEEKLYRead AI insights every Tuesday — top movers, new releases, story of the week.
Originally published on arxiv ↗