arxiv · 19h ago
arXiv:2606.05636v1 Announce Type: new Abstract: Root-Cause Analysis (RCA) seeks to identify the variables responsible for abnormal system behavior in complex domains such as manufacturing, cloud computing, and healthcare. Existing approaches face a critical bottleneck: graph-based causal methods can
arxiv · 19h ago
arXiv:2602.19373v3 Announce Type: replace Abstract: Deep reinforcement learning systems often suffer from unstable training dynamics due to non-stationarity, where learning objectives and data distributions evolve over time. We show that under non-stationary targets, isotropic Gaussian embeddings ar
arxiv · 19h ago
arXiv:2601.09719v3 Announce Type: replace-cross Abstract: Pre-Layer Normalization (Pre-LN) is the de facto choice for large language models (LLMs) and is crucial for stable pretraining and effective transfer learning. However, Pre-LN incurs repeated statistical-computation overhead and remains vulne
arxiv · 19h ago
arXiv:2603.19312v3 Announce Type: replace Abstract: Joint Embedding Predictive Architectures (JEPAs) offer a compelling framework for learning world models in compact latent spaces, yet existing methods remain fragile, relying on complex multi-term losses, exponential moving averages, pre-trained en
arxiv · 2d ago
arXiv:2602.00423v3 Announce Type: replace Abstract: Single-cell integration workflows often construct low-dimensional cell embeddings and then refine them with post-hoc methods to reduce batch effects. This refinement process can become unstable when cell-type compositions vary across batches, with
arxiv · 3d ago
arXiv:2606.02515v1 Announce Type: new Abstract: Optimal transport (OT) provides a principled framework for mapping between probability distributions. Despite extensive progress, applying OT to large-scale data remains computationally demanding, and the resulting pointwise transport plans are often d
arxiv · 3d ago
arXiv:2606.01908v1 Announce Type: new Abstract: Test-time adaptation (TTA) can reduce error on new and different data by updating the model on these inputs during inference. However, these updates raise the issue of privacy w.r.t. the testing data, because the model parameters now depend on all past
arxiv · 3d ago
arXiv:2606.00700v1 Announce Type: cross Abstract: Online link recommendation on evolving graphs is performative: by choosing which candidate links to show users, the system changes which links form and what feedback it later observes. Consequently, fairness estimates from logged outcomes can be misl
arxiv · 3d ago
arXiv:2410.09737v2 Announce Type: replace Abstract: A popular way to improve the expressive power of graph neural networks (GNNs) is to use Laplacian eigenvectors as additional node features, since they can serve both as structural identifiers and global coordinates of nodes. Properly handling the o
arxiv · 3d ago
arXiv:2601.17952v2 Announce Type: replace-cross Abstract: Interpretability remains a key challenge for deploying language models (LM) in clinical settings such as progression diagnosis of Alzheimer disease, where early and trustworthy predictions are essential. Existing attribution methods exhibit h
arxiv · 3d ago
arXiv:2512.16167v3 Announce Type: replace-cross Abstract: Decentralized LLM-based multi-agent service economies face three vulnerabilities that undermine traditional trust mechanisms: reduced cost of fraud, difficulty in evaluating service quality, and instability of service content. These compoundi
arxiv · 3d ago
arXiv:2506.01226v3 Announce Type: replace-cross Abstract: We study parameterizations of stabilizing nonlinear policies for learning-based control. We propose a structure based on a nonlinear version of the Youla-Kucera parameterization combined with robust neural networks such as the recurrent equil
arxiv · 3d ago
arXiv:2606.00426v1 Announce Type: new Abstract: Federated continual learning (FCL) lets distributed clients adapt language-model heads to evolving NLP tasks without sharing raw text. Under user-level differential privacy (DP), replay-based continual learning faces a structural obstacle: clients can
arxiv · 3d ago
arXiv:2605.02122v2 Announce Type: replace-cross Abstract: Human evaluation remains the primary standard for assessing modern AI systems, yet annotator disagreement, bias, and variability make system rankings fragile under standard majority vote aggregation. Majority vote discards annotator reliabili
arxiv · May 29
arXiv:2601.14855v3 Announce Type: replace Abstract: Black-box variational inference (BBVI) with Gaussian mixture families offers a flexible approach for approximating complex posterior distributions without requiring gradients of the target density. However, standard numerical optimization methods o
arxiv · May 29
arXiv:2604.18518v4 Announce Type: replace-cross Abstract: Uniform Discrete Diffusion Model (UDM) has recently emerged as a promising paradigm for discrete generative modeling; however, its integration with reinforcement learning remains largely unexplored. We observe that naively applying GRPO to UD
arxiv · May 29
arXiv:2605.29547v1 Announce Type: cross Abstract: Deep learning optimization relies heavily on the assumption of smooth loss landscapes, a condition systematically violated by modern architectures due to non-smooth components such as ReLU activations and quantization operators. In such non-smooth re
arxiv · May 29
arXiv:2605.29673v1 Announce Type: new Abstract: Reconstruction-based inference assigns a class by comparing class-wise reconstruction residuals; Sparse Representation Classification (SRC) is a canonical instance whose reliability depends on the geometry of the learned representation. We adopt a stri
arxiv · May 29
arXiv:2605.14373v3 Announce Type: replace-cross Abstract: Zeroth-Order (ZO) optimization is pivotal for scenarios where backpropagation is unavailable, such as memory-constrained on-device learning and black-box optimization. However, existing methods face a stark trade-off: they are either sample-i
arxiv · May 29
arXiv:2605.00553v2 Announce Type: replace Abstract: Large Language Model (LLM) Red-Teaming, which proactively identifies vulnerabilities of LLMs, is an essential process for ensuring safety. Finding effective and diverse attacks in red-teaming is important, but achieving both is challenging. Generat
arxiv · May 29
arXiv:2605.30201v1 Announce Type: cross Abstract: We investigate a narrow but common failure mode of GRPO-style reinforcement learning in the context of sparse verifiable rewards: early updates contain more responses with negative advantages than those with positive advantages, while response-level
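The imbalance described in this abstract can be illustrated with the group-normalized advantage commonly used in GRPO-style methods, A_i = (r_i - mean(r)) / std(r). What follows is a minimal sketch, assuming binary verifiable rewards and a hypothetical group of eight sampled responses with a single success. The function name `group_advantages`, the `eps` term, and the example rewards are illustrative assumptions, not taken from the paper.

```python
from statistics import mean, pstdev

def group_advantages(rewards, eps=1e-8):
    """Standardize rewards within one sampled group (GRPO-style)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical sparse-reward group: one correct response out of eight.
rewards = [1, 0, 0, 0, 0, 0, 0, 0]
adv = group_advantages(rewards)

n_pos = sum(a > 0 for a in adv)
n_neg = sum(a < 0 for a in adv)
print(f"positive advantages: {n_pos}, negative advantages: {n_neg}")
```

With a low success rate, most responses in the group fall below the group mean, so negative advantages outnumber positive ones even though the advantages sum to approximately zero.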
arxiv · May 28
arXiv:2605.28802v1 Announce Type: new Abstract: Free-text explanations extend human label variation (HLV) beyond label disagreement by revealing the reasoning and preferences behind annotators' decisions. We study whether large language models (LLMs) can learn and reproduce such annotator-specific l
arxiv · May 28
arXiv:2605.28517v1 Announce Type: cross Abstract: Stochastic gradient descent with momentum (SGDM) is one of the most widely used optimization algorithms in machine learning. While optimization properties of SGDM have been extensively studied in the literature, it remains insufficiently understood w
arxiv · May 28
arXiv:2605.27986v1 Announce Type: new Abstract: Messenger RNA (mRNA) sequences as therapeutics require optimized design to ensure efficient translation, structural stability, and minimal immunogenicity. This study presents a two-stage in-silico framework that integrates deep learning and evolutionar
arxiv · May 28
arXiv:2605.19729v3 Announce Type: replace-cross Abstract: We demonstrate that in knowledge distillation for diffusion models, the teacher network's highly complex denoising process - stemming from its substantially larger capacity - poses a significant challenge for the student model to faithfully m
arxiv · May 27
arXiv:2605.26789v1 Announce Type: new Abstract: Post-training is routinely evaluated through aggregate benchmark scores that treat multi-hop reasoning as a single capability -- as if a model that answers more questions correctly must be better at assembling facts. We show that this assumption can be
arxiv · May 26
arXiv:2605.25848v1 Announce Type: cross Abstract: Concept probes extracted from transformer residual streams are only as reliable as the layer from which they are extracted. The common practice of probing at a fixed late layer or at the peak of a separation score function ignores a fundamental struc
arxiv · May 26
arXiv:2605.24136v1 Announce Type: cross Abstract: We study the problem of identifying dynamically distinct basins of attraction in high dimensional time-homogeneous Markov processes using only trajectory sampling. This problem is fundamental in the analysis of metastable dynamical systems, where the
arxiv · May 26
arXiv:2605.25488v1 Announce Type: cross Abstract: Audio-driven talking-head generation has achieved remarkable progress with recent models such as AniTalker, FLOAT, and Sonic. Despite their success, most existing approaches rely on a single static reference image to condition the entire video genera
arxiv · May 26
arXiv:2510.14925v4 Announce Type: replace Abstract: High-confidence errors in large language models are often treated as fragile failures. We study an alternative: some errors may be false fixed points, locally stable, internally coherent, and confidently wrong. This separates robustness from truth-
arxiv · May 26
arXiv:2510.15514v3 Announce Type: replace Abstract: Reinforcement Learning from AI Feedback (RLAIF) relies on LLM judges as preference measurement instruments, yet these instruments are fundamentally limited by random measurement errors -- stochastic fluctuations that manifest as preference cycles (
arxiv · May 26
arXiv:2605.25704v1 Announce Type: new Abstract: In contemporary large language models (LLMs), the swish-gated linear unit (SwiGLU) activation function is widely adopted to regulate the information flow and introduce non-linearity. For large positive inputs, SwiGLU approximates the quadratic function
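The quadratic behaviour mentioned in this abstract can be checked numerically. What follows is a minimal sketch, assuming the scalar case in which the gate and value projections are both the identity, so that SwiGLU reduces to swish(x) * x = x^2 * sigmoid(x). This simplification and the function names are illustrative assumptions, not the paper's setup.

```python
import math

def swish(x):
    """Swish (SiLU) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def swiglu_scalar(x):
    """Scalar SwiGLU with identity gate and value projections (assumed)."""
    return swish(x) * x

# Compare against x^2 as the input grows.
for x in (1.0, 5.0, 20.0):
    print(x, swiglu_scalar(x), x * x)
```

Because sigmoid(x) approaches 1 for large positive x, the gap between the two columns shrinks relative to x^2 as x increases.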
arxiv · May 25
arXiv:2604.28048v2 Announce Type: replace Abstract: Large Language Models (LLMs) are increasingly used as proxies for human perception in urban analysis, yet it remains unclear whether persona prompting produces meaningful and reproducible behavioral diversity. We investigate whether distinct person
arxiv · May 25
arXiv:2605.23458v1 Announce Type: cross Abstract: Recent advances have substantially improved real-time interactive video generation in the autoregressive regime. However, most existing few-step autoregressive video generation methods, often distilled from a corresponding many-step teacher, default
arxiv · May 22
arXiv:2602.10894v2 Announce Type: replace Abstract: Two-player games such as board games have long been used as traditional benchmarks for reinforcement learning. This work revisits a policy optimization method with reverse Kullback-Leibler regularization and entropy regularization and analyzes this
arxiv · May 22
arXiv:2605.21492v1 Announce Type: new Abstract: No feature ranking can be simultaneously faithful, stable, and complete when features are collinear. For collinear pairs, ranking reduces to a coin flip. We prove this impossibility, quantify it for four model classes, resolve it via ensemble averaging
arxiv · May 22
arXiv:2605.21800v1 Announce Type: new Abstract: World models are central to building agents that can reason, plan, and generalize beyond their training data. However, research on world models is currently fragmented, with disparate codebases, data pipelines, and evaluation protocols hindering reprod
arxiv · May 22
arXiv:2605.22432v1 Announce Type: new Abstract: Modern deep learning commonly relies on AdamW with prescribed learning rate schedules, but recent works challenge both components: Schedule-Free optimization removes explicit schedules via iterate averaging, and Muon improves the update geometry by ort
arxiv · May 22
arXiv:2605.22338v1 Announce Type: new Abstract: Reconstructing continuous physical fields from sparse measurements is a central inverse problem, but data-driven generative models can produce states that violate governing dynamics. We introduce a physics-informed generative solver that separates stab
arxiv · May 22
arXiv:2605.20069v2 Announce Type: replace Abstract: Competitive selection processes, from scientific funding to admissions and hiring, use evaluations to score candidates, and eventually choose a subset of them based on those scores. Recently, many organizations have adopted partial lotteries, which
arxiv · May 21
arXiv:2605.18809v1 Announce Type: cross Abstract: General-sum multi-agent learning is often governed by a stacked update field in which each agent's policy update changes the optimization landscape faced by the others. This coupling can entangle an integrable component of collective improvement with
arxiv · May 21
arXiv:2605.21325v1 Announce Type: new Abstract: Linear attention has emerged as a cornerstone for efficient long-context architectures, as evidenced by its integration into state-of-the-art open-source models including Qwen3.5/3.6, Kimi Linear, and RWKV-7. Models that incorporate linear attention la
arxiv · May 21
arXiv:2604.05002v3 Announce Type: replace-cross Abstract: Learning from weak, proxy, or relative supervision is common when ground-truth labels are unavailable, but robustness under distribution shift remains poorly understood because the supervision mechanism itself may change across environments.
arxiv · May 21
arXiv:2605.19856v1 Announce Type: cross Abstract: Training very deep neural networks requires controlling the propagation of magnitudes across depth. Without such control, activations and gradients may vanish, explode, or enter unstable regimes that make optimization fail. Modern architectures often
arxiv · May 20
arXiv:2603.18396v4 Announce Type: replace Abstract: Bus holding control is challenging due to stochastic traffic and passenger demand. While deep reinforcement learning (DRL) shows promise, standard actor-critic algorithms suffer from Q-value instability in volatile environments. A key source of thi