Content provided by Joe Carlsmith.
This is section 2.3.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?”
Text of the report here: https://arxiv.org/abs/2311.08379
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
Chapters
1. Non-classic stories about scheming (Section 2.3.2 of "Scheming AIs") (00:00:00)
2. 2.3.2 Non-classic stories (00:00:36)
3. 2.3.2.1 AI coordination (00:00:55)
4. 2.3.2.2 AIs with similar values by default (00:05:57)
5. 2.3.2.3 Terminal values that happen to favor escape/takeover (00:07:51)
6. 2.3.2.4 Models with false beliefs about whether scheming is a good strategy (00:11:59)
7. 2.3.2.5 Self-deception (00:13:33)
8. 2.3.2.6 Goal-uncertainty and haziness (00:15:46)
9. 2.3.2.7 Overall assessment of the non-classic stories (00:18:19)
10. 2.4 Take-aways re: the requirements of scheming (00:20:08)
11. 2.5 Path dependence (00:20:51)