Tag

#adaptation

5 articles tagged #adaptation

arxiv Jul 10 bullish

TTHE: Test-Time Harness Evolution

arXiv:2607.08124v1 Announce Type: cross Abstract: The behavior of an LLM agent is determined not only by the underlying model, but also by its harness: the executable program that constructs context, invokes tools, verifies intermediate results, and recovers from failures. Existing approaches optimi

LL1 model #adaptation #machine-learning #software-engineering Read on arxiv →

arxiv Jun 5 bullish

ADAPTOOD: Uncertainty-Aware Fine-Tuning for Out-of-Distribution ECG Time Series Models

arXiv:2606.04164v1 Announce Type: cross Abstract: Data samples used for training often differ from those encountered during fine-tuning and deployment, and while ML models show promise, their performance remains limited when only small annotated datasets are available. Performance often degrades und

#machine-learning #adaptation #robustness Read on arxiv →

arxiv May 16

FutureSim: Replaying World Events to Evaluate Adaptive Agents

arXiv:2605.15188v1 Announce Type: cross Abstract: AI agents are being increasingly deployed in dynamic, open-ended environments that require adapting to new information as it arrives. To efficiently measure this capability for realistic use-cases, we propose building grounded simulations that replay

#benchmark #adaptation #machine-learning Read on arxiv →

arxiv Apr 17 bullish

Preconditioned Test-Time Adaptation for Out-of-Distribution Debiasing in Narrative Generation

arXiv:2603.13683v2 Announce Type: replace Abstract: Although debiased large language models (LLMs) excel at handling known or low-bias prompts, they often fail on unfamiliar and high-bias prompts. We demonstrate via out-of-distribution (OOD) detection that these high-bias prompts cause a distributio

#debiasing #optimization #language-models Read on arxiv →

arxiv Apr 17 bullish

Training-Free Test-Time Contrastive Learning for Large Language Models

arXiv:2604.13552v1 Announce Type: cross Abstract: Large language models (LLMs) demonstrate strong reasoning capabilities, but their performance often degrades under distribution shift. Existing test-time adaptation (TTA) methods rely on gradient-based updates that require white-box access and need s

#adaptation #reasoning #language-models Read on arxiv →
