arxiv
Published April 24, 2026 at 4:00 AM
Adaptive Instruction Composition for Automated LLM Red-Teaming
Publisher summary · verbatim
arXiv:2604.21159v1 Announce Type: cross Abstract: Many approaches to LLM red-teaming leverage an attacker LLM to discover jailbreaks against a target. Several of them task the attacker with identifying effective strategies through trial and error, resulting in a semantically limited range of success
Originally published on arXiv