alignment-forum.com - AI Alignment Forum

Description: A community blog devoted to technical AI alignment research

Example domain paragraphs

Recommended Sequences: AGI safety from first principles by Richard Ngo; Embedded Agency by Abram Demski; 2022 MIRI Alignment Discussion by Rob Bensinger.

AI Alignment Posts: Welcome & FAQ! (Ruben Bloom, Oliver Habryka); Some background for reasoning about dual-use alignment research (Charlie Steiner); The Unexpected Clanging (Chris_Leong); $500 Bounty/Prize Problem: Channel Capacity Using "Insens

We quantitatively evaluate how activation additions affect GPT-2's capabilities. For example, we find that adding a "wedding" vector decreases perplexity on wedding-related sentences,...
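To make the two quantities in that excerpt concrete, here is a minimal sketch of what "adding a vector to activations" and "perplexity" mean numerically. It is not the post's code. The function names, the choice to add the vector at every position, and the log-probability values are all hypothetical, chosen only for illustration.

```python
import math


def perplexity(token_log_probs):
    """Perplexity is exp of the mean negative log-likelihood per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def add_steering_vector(activations, steering_vector, coefficient=1.0):
    """Return activations with coefficient * steering_vector added at each position.

    Assumption: `activations` is a list of per-position vectors (lists of
    floats) with the same width as `steering_vector`. Adding at every
    position is a simplification made for this sketch.
    """
    return [
        [a + coefficient * s for a, s in zip(position, steering_vector)]
        for position in activations
    ]


# Made-up per-token log-probabilities for one sentence, before and after
# steering. These are illustrative numbers, not measured results.
baseline_log_probs = [math.log(0.10), math.log(0.20), math.log(0.05)]
steered_log_probs = [math.log(0.20), math.log(0.40), math.log(0.10)]

baseline_ppl = perplexity(baseline_log_probs)
steered_ppl = perplexity(steered_log_probs)

# Toy activations: one position, two dimensions.
shifted = add_steering_vector([[1.0, 2.0]], [0.5, -1.0], coefficient=2.0)
```

Lower perplexity means the model assigns higher probability to the observed tokens, so a drop on topic-related sentences is what one would expect if the added vector shifts the model toward that topic.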

I'm also not able to evaluate the object-level question of "was this post missing obvious stuff it'd have been good to improve", but there is something I want to note about my own guess of how an ideal process would go, from my current perspective: