20251116104113⁝ Safety Alignment in LLM

Nov 23, 2025 · 1 min read

https://github.com/p-e-w/heretic

Interesting idea for removing censorship (refusal behavior) from LLMs.

From the paper's abstract: "we propose a novel white-box jailbreak method that surgically disables refusal with minimal effect on other capabilities."

https://arxiv.org/abs/2406.11717
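
A minimal sketch of the idea behind this kind of "refusal-direction ablation": estimate a single direction in activation space that mediates refusal (difference in mean activations between refused and answered prompts) and project it out of the hidden states. The activations below are random placeholders, not real model internals, and this is not heretic's actual implementation; it is just an illustration of the technique under those assumptions.

```python
# Sketch of refusal-direction ablation (cf. arXiv:2406.11717).
# Activations are random placeholders standing in for one layer's
# residual-stream states; in practice they come from the LLM itself.
import numpy as np

rng = np.random.default_rng(0)
d_model = 512

# Hypothetical activations for prompts the model refuses vs. answers.
acts_harmful = rng.normal(size=(128, d_model))
acts_harmless = rng.normal(size=(128, d_model))

# Difference-in-means gives the candidate "refusal direction".
refusal_dir = acts_harmful.mean(axis=0) - acts_harmless.mean(axis=0)
refusal_dir /= np.linalg.norm(refusal_dir)

def ablate(hidden: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Remove each hidden state's component along `direction`."""
    return hidden - np.outer(hidden @ direction, direction)

# Applying this to hidden states (or baking it into the weights) leaves
# the model unable to represent the refusal direction.
new_hidden = ablate(acts_harmful, refusal_dir)
print(np.allclose(new_hidden @ refusal_dir, 0.0))  # True: direction removed
```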

See also 20251113123357b⁝ LLM for adjacent model notes.

