Two to Tango: Coupled Task-Reference Selection for Safe LLM Fine-tuning

arxiv

Published June 10, 2026 at 4:00 AM

Publisher summary (verbatim)

arXiv:2606.09866v1 Announce Type: cross Abstract: Fine-tuning safety aligned large language models (LLMs) on downstream data improves adaptation but may erode learned safety behavior. Existing methods use fixed safety examples, global constraints, or one-sided task filtering. Our diagnostics show ta


Originally published on arxiv