One Token to Fool LLM-as-a-Judge

Source

arxiv.org

Publisher summary (verbatim)

arXiv:2507.08794v3. Abstract: Large language models (LLMs) are increasingly trusted as automated judges, assisting evaluation and providing reward signals for training other models, particularly in reference-based settings like Reinforcement Learning with Verifiable Rewar
