arxiv

Published May 5, 2026 at 4:00 AM

▲bullish

FG$^2$-GDN: Enhancing Long-Context Gated Delta Networks with Doubly Fine-Grained Control

Source: arxiv.org

Publisher summary (verbatim)

arXiv:2604.19021v2

Abstract: Linear attention mechanisms have emerged as promising alternatives to softmax attention, offering linear-time complexity during inference. Recent advances such as Gated DeltaNet (GDN) and Kimi Delta Attention (KDA) have demonstrated that the delta
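The linear-time claim in the first sentence of the abstract can be illustrated with a minimal sketch of plain causal linear attention. This is not FG$^2$-GDN, GDN, or KDA: it leaves out the delta rule and all gating, and it assumes an identity feature map and no normalization. The function names are hypothetical and chosen only for illustration. The recurrent form keeps a fixed-size state and does constant work per token, and it returns the same outputs as the quadratic form that revisits every earlier token.

```python
def linear_attention_recurrent(queries, keys, values):
    """Causal linear attention in recurrent form (illustrative assumption:
    identity feature map, no normalization, no gating, no delta rule).

    The state is a d_k x d_v matrix updated once per token, so the cost
    per token does not depend on how many tokens came before.
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]
    outputs = []
    for q, k, v in zip(queries, keys, values):
        # State update: S_t = S_{t-1} + k_t v_t^T
        for i in range(d_k):
            for j in range(d_v):
                state[i][j] += k[i] * v[j]
        # Readout: o_t = S_t^T q_t
        outputs.append(
            [sum(q[i] * state[i][j] for i in range(d_k)) for j in range(d_v)]
        )
    return outputs


def linear_attention_quadratic(queries, keys, values):
    """Reference form: o_t = sum over s <= t of (q_t . k_s) v_s.

    Revisits every earlier token for each query, so total work grows
    quadratically with sequence length.
    """
    d_v = len(values[0])
    outputs = []
    for t, q in enumerate(queries):
        out = [0.0] * d_v
        for s in range(t + 1):
            score = sum(a * b for a, b in zip(q, keys[s]))
            for j in range(d_v):
                out[j] += score * values[s][j]
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    Q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    K = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(linear_attention_recurrent(Q, K, V))
    print(linear_attention_quadratic(Q, K, V))
```

The two functions agree because the sum over earlier tokens can be folded into the running state, which is the property that makes inference cost linear in sequence length.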

Mentioned models

01 Gated DeltaNet (GDN)
02 Kimi Delta Attention (KDA)
03 FG$^2$-GDN
04 FG$^2$-GDN+

Tags

#machine learning #attention mechanisms #optimization

Originally published on arxiv ↗
