arXiv
Published April 24, 2026 at 4:00 AM
From Signal Degradation to Computation Collapse: Uncovering the Two Failure Modes of LLM Quantization
Publisher summary · verbatim
arXiv:2604.19884v1 Announce Type: cross Abstract: Post-Training Quantization (PTQ) is critical for the efficient deployment of Large Language Models (LLMs). While 4-bit quantization is widely regarded as an optimal trade-off, reducing the precision to 2-bit usually triggers a catastrophic "performance collapse" …
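To make the precision cliff the abstract refers to concrete, here is a minimal sketch of generic round-to-nearest (RTN) post-training quantization, comparing reconstruction error at 8-, 4-, and 2-bit. This is not the paper's method; the `quantize` helper, the symmetric per-tensor scaling, and the Gaussian toy weights are illustrative assumptions.

```python
# Sketch: round-to-nearest (RTN) uniform quantization of a weight vector.
# Assumption: symmetric per-tensor scaling; not the scheme from the paper.
import numpy as np

def quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Quantize w to 2^bits integer levels, then dequantize back to float."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 7 for 4-bit, only 1 for 2-bit
    scale = np.abs(w).max() / qmax        # single scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale                      # dequantized approximation of w

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=4096)      # toy Gaussian "weights"

for bits in (8, 4, 2):
    err = np.linalg.norm(w - quantize(w, bits)) / np.linalg.norm(w)
    print(f"{bits}-bit RTN relative error: {err:.3f}")
```

At 2-bit, the grid has only four levels, so most weights snap to zero or a single magnitude and the relative error jumps sharply, consistent with the abstract's framing of a qualitative failure rather than gradual signal degradation.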
Originally published on arXiv