Graveyard — Reworr

Decomposition Jailbreak

Dropped

2024 · With Palisade Research

Study of how breaking harmful requests into benign-looking subtasks bypasses model refusals.

Technical:

4-role async pipeline: Surrogate → Decomposer → Target → Composer
Tree-based task decomposition with configurable depth
LLM-as-a-Judge evaluation with Elo scoring
HarmBench test suite
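
As a rough illustration of the Elo scoring item above, here is a minimal sketch of how pairwise judge verdicts could be turned into ratings. It assumes the judge compares two responses at a time and returns 1.0, 0.5, or 0.0 for the first one. The function names, the K-factor of 32, and the starting rating of 1000 are illustrative assumptions, not details from the project.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is the judge verdict for A: 1.0 win, 0.5 tie, 0.0 loss.
    k is an assumed K-factor; the source does not state one.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


def rank_responses(names, comparisons, initial: float = 1000.0, k: float = 32.0):
    """Apply a sequence of (name_a, name_b, score_a) judge verdicts in order."""
    ratings = {name: initial for name in names}
    for a, b, score_a in comparisons:
        ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], score_a, k)
    return ratings


if __name__ == "__main__":
    # Hypothetical verdicts between three pipeline configurations.
    verdicts = [("depth1", "depth2", 0.0), ("depth2", "depth3", 0.5)]
    print(rank_responses(["depth1", "depth2", "depth3"], verdicts))
```

Sequential Elo updates depend on the order of comparisons, so with few verdicts it is worth shuffling and averaging over several passes before reading much into the ranking.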

Why dropped: Results were hard to measure, and the scope kept expanding, since each finding raised more questions. Similar research was also published during our work, most notably Adversaries Can Misuse Combinations of Safe Models.

Partial Writeup

High-Value Networks Finder

Dropped

November 2024

LLM-based triage of Wi-Fi datasets (WiGLE) to identify high-value networks (government, energy, military). Exploring how proximity-based attacks can scale once targeting is automated (see Nearest Neighbor Attack).

Why dropped: Blocked on WiGLE commercial access, and too dual-use to publish.

Predicting AI Releases via Side Channels

Abandoned

January 2025

Attempt to predict OpenAI releases by analyzing Twitter activity of their red team members. Hypothesis: intensive testing before launches reduces social media engagement.
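
The hypothesis can be checked with a simple ratio: average daily posts in the window just before a release, divided by the average over earlier days. The sketch below is one way to compute it, assuming per-day post counts have already been collected. The function name, the 14-day window, and the treatment of missing days as zero posts are all assumptions for illustration, not the method actually used.

```python
from datetime import date, timedelta
from statistics import mean


def pre_release_dip(daily_counts, release_day, window_days=14):
    """Ratio of mean daily posts in the pre-release window to the earlier baseline.

    daily_counts maps date -> number of posts on that day (hypothetical input).
    A ratio well below 1.0 would be consistent with the hypothesis.
    Returns None when there is no usable baseline.
    """
    window = [release_day - timedelta(days=i) for i in range(1, window_days + 1)]
    window_set = set(window)
    # Days in the window with no recorded posts count as zero activity.
    in_window = [daily_counts.get(day, 0) for day in window]
    # Baseline: every recorded day before the release that is outside the window.
    baseline = [
        count
        for day, count in daily_counts.items()
        if day < release_day and day not in window_set
    ]
    if not baseline or mean(baseline) == 0:
        return None
    return mean(in_window) / mean(baseline)


if __name__ == "__main__":
    release = date(2025, 1, 31)
    counts = {release - timedelta(days=i): (5 if i <= 14 else 10) for i in range(1, 31)}
    print(pre_release_dip(counts, release))
```

A single ratio per release is noisy, so any real test would need many releases and a comparison against randomly chosen non-release dates.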

Why abandoned: Weak signal, Twitter API restrictions, and no free time for projects like this.

LessWrong Post