arxiv
Published June 4, 2026 at 4:00 AM
LoopMoE: Unifying Iterative Computation with Mixture-of-Experts for Language Modeling
Publisher summary · verbatim
arXiv:2606.04438v1 Announce Type: cross Abstract: Mixture-of-Experts (MoE) and looped architectures scale models along two orthogonal axes, namely parameter capacity and effective depth. However, mainstream looped architectures rely on dense backbones that couple parameter count with per-token FLOPs
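The coupling the summary mentions can be illustrated with rough cost accounting for one feed-forward block. This is a minimal sketch under stated assumptions, not the paper's architecture: the function names, the two-matrix feed-forward layout, the linear router, and the "2 FLOPs per weight per token" rule of thumb are all illustrative choices that do not come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FFNCost:
    params: int           # weights stored for the block
    flops_per_token: int  # approximate FLOPs per token, summed over all loop iterations


def dense_looped(d_model: int, d_ff: int, loops: int) -> FFNCost:
    """Dense feed-forward block (hypothetical layout) reused `loops` times."""
    params = 2 * d_model * d_ff  # up-projection + down-projection
    # Every weight is used on every pass, so FLOPs scale with params * loops.
    return FFNCost(params, 2 * params * loops)


def moe_looped(d_model: int, d_ff: int, loops: int,
               n_experts: int, top_k: int) -> FFNCost:
    """MoE feed-forward block (hypothetical layout) reused `loops` times."""
    expert = 2 * d_model * d_ff   # one expert has the same shape as the dense block
    router = d_model * n_experts  # assumed linear router
    params = n_experts * expert + router
    # Only the top_k selected experts (plus the router) run for each token.
    active = top_k * expert + router
    return FFNCost(params, 2 * active * loops)


if __name__ == "__main__":
    print(dense_looped(d_model=1024, d_ff=4096, loops=4))
    print(moe_looped(d_model=1024, d_ff=4096, loops=4, n_experts=8, top_k=2))
```

In this toy accounting, the dense block's per-token FLOPs are a fixed multiple of its parameter count, so adding capacity always adds compute. In the MoE variant, raising `n_experts` grows stored parameters while per-token FLOPs change only through the small router term.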
Originally published on arxiv ↗