Decomposition Jailbreak
DroppedStudy of how breaking harmful requests into benign-looking subtasks bypasses model refusals.
Technical:
- 4-role async pipeline: Surrogate → Decomposer → Target → Composer
- Tree-based task decomposition with configurable depth
- LLM-as-a-Judge evaluation with Elo scoring
- HarmBench test suite
Why dropped: Hard to measure, and scope
kept expanding—each finding raised even more questions. Similar research was published
during our work, most notably
Adversaries Can Misuse Combinations of Safe Models.