

arxiv

Published June 10, 2026 at 4:00 AM

—neutral

Alignment Collapse Under KV Cache Quantization: Diagnosis and Mitigation

Source: arxiv.org

Publisher summary (verbatim)

arXiv:2606.09864v1 Announce Type: cross Abstract: Key-value (KV) cache quantization is widely used to reduce Large Language Model (LLM) inference memory, yet existing evaluations solely focus on measuring perplexity and accuracy without assessing the safety impact. In this study, we explore alignmen


Mentioned models

Mistral-7B

Tags

#quantization #safety #large-language-models

Mentioned companies

NVIDIA

