arxiv | April 7, 2026 at 4:00 AM | 1 min read
A Unified Stability Analysis of SAM vs SGD: Role of Data Coherence and Emergence of Simplicity Bias
arXiv:2511.17378v2 Announce Type: replace Abstract: Understanding the dynamics of optimization in deep learning is increasingly important as models scale. While stochastic gradient descent (SGD) and its variants reliably find solutions that generalize well, the mechanisms driving this generalization