arxiv
Published April 27, 2026 at 4:00 AM

Beyond N-gram: Data-Aware X-GRAM Extraction for Efficient Embedding Parameter Scaling

Source: arxiv.org

Publisher summary (verbatim)

arXiv:2604.21724v2. Abstract: Large token-indexed lookup tables provide a compute-decoupled scaling path, but their practical gains are often limited by poor parameter efficiency and rapid memory growth. We attribute these limitations to Zipfian under-training of the long tail,
