arxivMay 4bullish
arXiv:2512.16762v3 Announce Type: replace Abstract: Generative Pre-trained Transformer (GPT) architectures are the most popular design for language modeling. Energy-based modeling is a different paradigm that views inference as a dynamical process operating on an energy landscape. We propose a minim
arxivApr 24bullish
arXiv:2305.01626v4 Announce Type: replace-cross Abstract: Computational models of syntax are predominantly text-based. Here we propose that the most basic first step in the evolution of syntax can be modeled directly from raw speech in a fully unsupervised way. We focus on one of the most ubiquitous
arxivApr 16
arXiv:2512.10877v4 Announce Type: replace Abstract: Discrete diffusion models (DMs) have achieved strong performance in language and other discrete domains, offering a compelling alternative to autoregressive modeling. Yet this performance typically depends on large training datasets, challenging th