arxiv · Jun 2
arXiv:2606.00371v1 Announce Type: new Abstract: Muon optimizers improve neural-network training by replacing ill-conditioned momentum updates with approximately semi-orthogonal updates. This motivates a practical question: how much orthogonalization does Muon actually require? We study this question
arxiv · Jun 1 · bullish
arXiv:2605.30372v1 Announce Type: cross Abstract: Reservoir computing, a type of recurrent neural network, is a promising approach for temporal learning as it separates dynamic processing from the trained readout layer. However, classical Echo State Networks (ESNs) often require task-specific tuning
arxiv · May 28 · bullish
arXiv:2605.28007v1 Announce Type: cross Abstract: Deep networks are powerful function approximators, but they typically store many different computations in shared weight matrices, making it difficult to selectively reuse or adapt parts of them when a familiar structure appears in novel combinations
arxiv · May 26 · bullish
arXiv:2502.06018v3 Announce Type: replace-cross Abstract: Although Kolmogorov-Arnold-based interpretable networks (KANs) possess strong theoretical expressiveness, they suffer from severe parameter explosion and limited ability to capture high-frequency features in high-dimensional tasks. To address
arxiv · May 18
arXiv:2605.15530v1 Announce Type: new Abstract: Neural networks are typically trained with a single learning rate across all layers. While recent empirical evidence suggests that assigning layer-specific learning rates can accelerate training, a principled understanding of the conditions and mechani
arxiv · May 16
arXiv:2602.14881v2 Announce Type: replace-cross Abstract: We introduce a novel numerical framework for the exploration of Blaschke–Santaló diagrams, which are efficient tools characterizing the possible inequalities relating some given shape functionals. We introduce a parametrization of convex b
arxiv · May 6
arXiv:2603.18066v2 Announce Type: replace-cross Abstract: Backpropagation has enabled modern deep learning but is difficult to realize as an online, fully distributed hardware learning system due to global error propagation, phase separation, and heavy reliance on centralized memory. Predictive codi
arxiv · Apr 30
arXiv:2604.26807v1 Announce Type: new Abstract: Despite being resource-intensive to train, 3D convolutional neural networks (CNNs) have been the standard approach to classify CT and MRI scans. Recent work suggests that deep multiple instance learning (MIL) may be a more efficient alternative for 3D
arxiv · Apr 27 · bullish
arXiv:2604.22293v1 Announce Type: cross Abstract: Lookup-table (LUT) based neural networks can deliver ultra-low latency and excellent hardware efficiency on FPGAs by mapping arithmetic operations directly onto the logic primitives. However, state-of-the-art LUT-aware training (LAT) approaches remai
arxiv · Apr 24 · bullish
arXiv:2305.01626v4 Announce Type: replace-cross Abstract: Computational models of syntax are predominantly text-based. Here we propose that the most basic first step in the evolution of syntax can be modeled directly from raw speech in a fully unsupervised way. We focus on one of the most ubiquitous
arxiv · Apr 15
arXiv:2505.23737v2 Announce Type: replace-cross Abstract: The majority of parameters in neural networks are naturally represented as matrices. However, most commonly used optimizers treat these matrix parameters as flattened vectors during optimization, potentially overlooking their inherent structu
arxiv · Apr 14 · bullish
arXiv:2511.20577v3 Announce Type: replace Abstract: Real-world time series often exhibit strong non-stationarity, complex nonlinear dynamics, and behavior expressed across multiple temporal scales, from rapid local fluctuations to slow-evolving long-range trends. However, many contemporary architect
arxiv · Apr 13 · bullish
arXiv:2511.17687v2 Announce Type: replace Abstract: The brain's Path Integration (PI) mechanism offers substantial guidance and inspiration for Brain-Inspired Navigation (BIN). However, the PI capability constructed by the Continuous Attractor Neural Networks (CANNs) in most existing BIN studies exh
arxiv · Apr 10
arXiv:2604.08204v1 Announce Type: new Abstract: For applications on the extreme edge, minimal networks of only a few dozen artificial neurons for event detection and classification in discrete time signals would be highly desirable. Feed-forward networks, RNNs, and CNNs evolved through evolutionary
arxiv · Apr 9
arXiv:2512.19253v3 Announce Type: replace-cross Abstract: We present the first empirical study of machine unlearning (MU) in hybrid quantum-classical neural networks. While MU has been extensively explored in classical deep learning, its behavior within variational quantum circuits (VQCs) and quantu
arxiv · Apr 9 · bullish
arXiv:2604.07292v1 Announce Type: new Abstract: Real-time supervisory control of advanced reactors requires accurate forecasting of plant-wide thermal-hydraulic states, including locations where physical sensors are unavailable. Meeting this need calls for surrogate models that combine predictive fi