arxivMay 29bullish
arXiv:2605.29843v1 Announce Type: cross Abstract: Post-training quantization (PTQ) is essential for deploying LLMs under memory and bandwidth constraints. However, extreme low-bit quantization remains highly sensitive to activation outliers and anisotropic weight curvature. Existing incoherence-base
arxivMay 28
arXiv:2605.28418v1 Announce Type: new Abstract: With the rise of tabular foundation models alongside traditional models still performing well on many tasks, choosing the right model for a tabular dataset remains difficult. We investigate whether dataset meta-features can explain performance gaps bet
arxivMay 19
arXiv:2605.17246v1 Announce Type: cross Abstract: We introduce fidelity probes: natural-language questions generated from a reference artifact with code-derived ground-truth answers, answered from a candidate specification. The fraction of agreeing probes, which we call the fidelity, decomposes into
arxivMay 15bearish
arXiv:2605.14381v1 Announce Type: cross Abstract: Recent advancements in generative AI facilitate large-scale synthetic data generation for model evaluation. However, without targeted approaches, these datasets often lack the sociotechnical nuance required for sensitive domains. We introduce NodeSyn
arxivMay 14bullish
arXiv:2605.13784v1 Announce Type: new Abstract: Conventional transformer inference engines are request-driven, paying an O(n) prefill cost on every query. In streaming workloads, where data arrives continuously and queries probe an ever-growing context, this cost is prohibitive. We introduce a data-
arxivMay 13
arXiv:2602.08606v3 Announce Type: replace-cross Abstract: Motivated by applications in conditional sampling, given a probability measure $\mu$ and a diffeomorphism $\phi$, we consider the problem of simultaneously approximating $\phi$ and the pushforward $\phi_{\#}\mu$ by means of the flow of a cont
arxivMay 11
arXiv:2605.06821v1 Announce Type: cross Abstract: Cohen et al. (arXiv:2207.14484) observed that adaptive gradient methods such as Adam operate at the edge of stability. While there has been significant work on continuous-time modeling of gradient descent at the edge of stability, extending these mod
arxivMay 5bullish
arXiv:2604.19021v2 Announce Type: replace Abstract: Linear attention mechanisms have emerged as promising alternatives to softmax attention, offering linear-time complexity during inference. Recent advances such as Gated DeltaNet (GDN) and Kimi Delta Attention (KDA) have demonstrated that the delta
arxivMay 1bullish
arXiv:2506.23964v3 Announce Type: replace-cross Abstract: Generative ML models are increasingly popular in networking for tasks such as telemetry imputation, prediction, and synthetic trace generation. Despite their capabilities, they suffer from two shortcomings: \emph{(i)} their output is often vi
arxivApr 30bullish
arXiv:2507.01544v2 Announce Type: replace Abstract: Predictive applications of machine learning often rely on small (sub 1 Bn parameter) specialized models tuned to particular domains or modalities. Such models often achieve excellent performance, but lack flexibility. LLMs and VLMs offer versatilit
arxivApr 28
arXiv:2604.23135v1 Announce Type: new Abstract: Natural-language variation poses a key challenge in Lean autoformalization: semantically equivalent paraphrases of the same theorem statements can induce divergent formal outputs, yet it remains unclear whether this variation reflects semantic disagree
arxivApr 27
arXiv:2604.22730v1 Announce Type: cross Abstract: We investigate whether neural models trained exclusively on modern morphological data can recover cross-lingual lexical structure consistent with historical reconstruction. Using BantuMorph v7, a transformer over Bantu morphological paradigms, we ana
arxivApr 24
arXiv:2604.21629v1 Announce Type: cross Abstract: We compare lightweight automata-based models (n-grams) with neural architectures (LSTM, Transformer) for next-activity prediction in streaming event logs. Experiments on synthetic patterns and five real-world process mining datasets show that n-grams
arxivApr 16
arXiv:2604.13928v1 Announce Type: new Abstract: Industrial time-series data from real production environments exhibits substantially higher complexity than commonly used benchmark datasets, primarily due to heterogeneous, multi-stage operational processes. As a result, anomaly detection methods vali
arxivApr 13
arXiv:2510.22293v4 Announce Type: replace Abstract: Background: Metabolic dysfunction-associated steatotic liver disease (MASLD) affects 30-40% of US adults and is the most common chronic liver disease. Although often asymptomatic, progression can lead to cirrhosis. The objective of the study was to
arxivApr 3
arXiv:2603.27134v2 Announce Type: replace Abstract: Are there still barriers to generalization once all relevant variables are known? We address this question via a framework that casts compositional generalization as a variational inference problem over latent variables with parametric interactions
arxivApr 3bullish
arXiv:2603.19136v2 Announce Type: replace Abstract: Stock markets exhibit regime-dependent behavior where prediction models optimized for stable conditions often fail during volatile periods. Existing approaches typically treat all market states uniformly or require manual regime labeling, which is
arxivApr 2
arXiv:2604.00911v1 Announce Type: new Abstract: In this work, we study whether enforcing strict compositional structure in sequence embeddings yields meaningful geometric organization when applied to protein-protein interaction networks. Using Event2Vec, an additive sequence embedding model, we trai