Steering Language Models Before They Speak: Logit-Level Interventions

Source

arxiv.org

Publisher summary (verbatim)

arXiv:2601.10960v2. Abstract: Controllable generation requires language models to realize output characteristics such as reading level, politeness, and toxicity. Existing steering methods are often indirect, require access to internal activations, or depend on auxiliary t


