https://github.com/p-e-w/heretic
Interesting idea for removing censorship (refusal behavior) from LLMs.
From the abstract of the paper linked below: "we propose a novel white-box jailbreak method that surgically disables refusal with minimal effect on other capabilities."
https://arxiv.org/abs/2406.11717
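Rough sketch of the core idea, as I understand the paper: refusal is mediated by a single direction in activation space, found as a difference of mean activations between two prompt sets, and "disabling" it means projecting that direction out of an activation vector. Everything below is a toy illustration on hand-written 3-d vectors. The function names and data are made up for this note and are not taken from the paper's code or from heretic.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def unit(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("zero vector has no direction")
    return [x / norm for x in v]


def difference_of_means_direction(acts_a, acts_b):
    """Unit vector pointing from the mean of acts_b to the mean of acts_a."""
    mu_a = mean_vector(acts_a)
    mu_b = mean_vector(acts_b)
    return unit([a - b for a, b in zip(mu_a, mu_b)])


def ablate(x, direction):
    """Remove the component of x along the unit vector `direction`: x - (x . r) r."""
    coeff = sum(xi * ri for xi, ri in zip(x, direction))
    return [xi - coeff * ri for xi, ri in zip(x, direction)]


# Toy "activations" (hypothetical): set A differs from set B only along axis 0.
acts_a = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
acts_b = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

r_hat = difference_of_means_direction(acts_a, acts_b)

x = [5.0, 2.0, -1.0]
x_ablated = ablate(x, r_hat)

# After ablation the vector has no component left along r_hat.
residual = sum(xi * ri for xi, ri in zip(x_ablated, r_hat))
print(r_hat, x_ablated, residual)
```

The other coordinates of `x` are untouched, which is the toy analogue of the "minimal effect on other capabilities" claim. A real model would need this applied to its internal activations or weights, which this sketch does not attempt.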
See also 20251113123357b⁝ LLM for adjacent model notes.