AXRP - the AI X-risk Research Podcast
1) 46 - Tom Davidson on AI-enabled Coups
Could AI enable a small group to gain power over a large country, and lock in their power permanently? Often, people worried about catastrophic risks from AI have been concerned with misalignment risk...
2) 45 - Samuel Albanie on DeepMind's AGI Safety Approach
In this episode, I chat with Samuel Albanie about the Google DeepMind paper he co-authored called "An Approach to Technical AGI Safety and Security". It covers the assumptions made by the approach, as...
3) 44 - Peter Salib on AI Rights for Human Safety
In this episode, I talk with Peter Salib about his paper "AI Rights for Human Safety", arguing that giving AIs the right to contract, hold property, and sue people will reduce the risk of their trying...
4) 43 - David Lindner on Myopic Optimization with Non-myopic Approval
In this episode, I talk with David Lindner about Myopic Optimization with Non-myopic Approval, or MONA, which attempts to address (multi-step) reward hacking by myopically optimizing actions against a...
5) 42 - Owain Evans on LLM Psychology
Earlier this year, the paper "Emergent Misalignment" made the rounds on AI x-risk social media for seemingly showing LLMs generalizing from 'misaligned' training data of insecure code to acting comically...
6) 41 - Lee Sharkey on Attribution-based Parameter Decomposition
What's the next step forward in interpretability? In this episode, I chat with Lee Sharkey about his proposal for detecting computational mechanisms within neural networks: Attribution-based Parameter Decomposition...
7) 40 - Jason Gross on Compact Proofs and Interpretability
How do we figure out whether interpretability is doing its job? One way is to see if it helps us prove things about models that we care about knowing. In this episode, I speak with Jason Gross about...
8) 38.8 - David Duvenaud on Sabotage Evaluations and the Post-AGI Future
In this episode, I chat with David Duvenaud about two topics he's been thinking about: firstly, a paper he wrote about evaluating whether or not frontier models can sabotage human decision-making or...
9) 38.7 - Anthony Aguirre on the Future of Life Institute
The Future of Life Institute is one of the oldest and most prominent organizations in the AI existential safety space, working on such topics as the AI pause open letter and how the EU AI Act can be...
10) 38.6 - Joel Lehman on Positive Visions of AI
Typically this podcast talks about how to avert destruction from AI. But what would it take to ensure AI promotes human flourishing as well as it can? Is alignment to individuals enough, and if not, what...