Learning Stable Predictors from Weak Supervision under Distribution Shift
Abstract: Learning from weak or proxy supervision is common when ground-truth labels are unavailable, yet robustness under distribution shift remains poorly understood, especially when the supervision mechanism itself changes. We formalize this as supervision drift, defined as changes in P(y | x, c) across contexts c, and study it in CRISPR-Cas13d experiments where guide efficacy is inferred indirectly from RNA-seq responses. Using data from two human cell lines and multiple time points, we build a controlled non-IID benchmark with explicit domain and temporal shifts while keeping the weak-label construction fixed. Models achieve strong in-domain performance (ridge R^2 = 0.356, Spearman rho = 0.442) and partial cross-cell-line transfer (rho ≈ 0.40). Temporal transfer, however, fails across all models, with negative R^2 and near-zero correlation (e.g., XGBoost R^2 = -0.155, rho = 0.056). Additional analyses confirm this pattern: feature-label relationships remain stable across cell lines but change sharply over time, indicating that the failures arise from supervision drift rather than model limitations. These findings highlight feature stability as a simple diagnostic for detecting non-transferability before deployment.

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2604.05002 [cs.LG] (or arXiv:2604.05002v1 [cs.LG] for this version), https://doi.org/10.48550/arXiv.2604.05002
Submission history: [v1] Sun, 5 Apr 2026 23:46:49 UTC (1,750 KB), from Mehrdad Shoeibi.
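The feature-stability diagnostic the abstract proposes can be sketched as follows: compute the per-feature correlation with the (weak) label in each data split, then compare the two correlation profiles. A profile correlation near 1 suggests stable feature-label relationships (transfer plausible); near zero or negative suggests supervision drift. This is a minimal illustrative sketch, assuming a simple Spearman-based profile comparison; function names and the toy setup are hypothetical, not the paper's actual pipeline.

```python
# Hedged sketch of a feature-stability diagnostic for supervision drift.
# All names (feature_label_correlations, stability_score) are illustrative
# assumptions, not the paper's code.
import numpy as np
from scipy.stats import spearmanr


def feature_label_correlations(X, y):
    """Spearman correlation of each feature column of X with the label y."""
    cors = []
    for j in range(X.shape[1]):
        rho, _ = spearmanr(X[:, j], y)
        cors.append(rho)
    return np.array(cors)


def stability_score(X_a, y_a, X_b, y_b):
    """Spearman correlation between the two per-feature correlation profiles.

    Near 1: feature-label relationships agree across splits (e.g., the
    cross-cell-line case in the abstract). Near 0 or negative: the
    relationships have changed (e.g., the temporal case), so transfer
    is likely to fail regardless of the model.
    """
    r_a = feature_label_correlations(X_a, y_a)
    r_b = feature_label_correlations(X_b, y_b)
    rho, _ = spearmanr(r_a, r_b)
    return rho
```

On synthetic data generated with the same feature weights in both splits, the score is close to 1; flipping the sign of the weights in the second split (simulating drift in P(y | x, c)) drives it strongly negative.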