owainevans.github.io - Owain Evans, AI Alignment researcher

Description: Owain Evans is an AI Alignment researcher leading a new research group in Berkeley and affiliated with Oxford University. Discover his publications, blog posts, and collaborative opportunities on AI alignment, AGI risk, and related topics.



Research Lead (new AI Safety group in Berkeley); Research Associate, Oxford University. I have a broad interest in AI alignment and AGI risk. My current focus is on evaluating situational awareness and deception in LLMs, and on truthfulness and honesty in AI systems. I am leading a new research group based in Berkeley. In the past, I worked on AI Alignment at the University of Oxford (FHI) and earned my PhD at MIT. I also worked at Ought, where I still serve on the Board of Directors. I post regular research updates.

I mentor researchers through the SERI MATS program. If you are interested in working with me, consider applying. I also hire research assistants and collaborators outside of SERI MATS: please email me with your resume. My previous mentees are listed here.

Current collaborators: Daniel Kokotajlo, Lorenzo Pacchiardi, Asa Stickland, Mykyta Baliesnyi, Lukas Berglund, Meg Tong, Max Kaufmann, Alex Chan, Dane Sherburn.

CV | Email | Scholar | LinkedIn | Twitter

Highlights

Teaching Models to Express Their Uncertainty in Words
We show that GPT-3 can learn to express uncertainty about its own answers in natural language -- and is moderately calibrated even under distribution shift.
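As a rough illustration of what "moderately calibrated" means here (this is a minimal sketch, not the paper's evaluation code), one common way to quantify calibration is expected calibration error (ECE): bin answers by stated confidence and compare the average confidence in each bin with the actual accuracy in that bin. The data and numeric confidence mapping below are hypothetical.

# Illustrative sketch only -- not the paper's evaluation code.
# Assumes each verbal confidence ("about 90% sure") has already been
# mapped to a number in [0, 1], paired with whether the answer was correct.

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: average gap between stated confidence and actual accuracy,
    weighted by how many answers fall into each confidence bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(ok for _, ok in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece

# Hypothetical data: stated confidences and whether each answer was right.
confs = [0.9, 0.8, 0.6, 0.95, 0.5, 0.7]
hits = [1, 1, 0, 1, 0, 1]
print(f"ECE = {expected_calibration_error(confs, hits):.3f}")  # lower is better

A perfectly calibrated model would have an ECE of zero; "moderately calibrated under distribution shift" means this gap stays reasonably small even on question types unlike those seen in training.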
