1:41:24 | Jun 12th, 2024
Reinforcement Learning from Human Feedback, or RLHF, is one of the main ways that makers of large language models make them 'aligned'. But people have long noted that there are difficulties with this ...