AXRP - the AI X-risk Research Podcast

33 - RLHF Problems with Scott Emmons

1:41:24 | Jun 12th, 2024

Reinforcement Learning from Human Feedback, or RLHF, is one of the main ways that makers of large language models make them 'aligned'. But people have long noted that there are difficulties with this ...
