This is an alternate universe story, where Petunia married a scientist. Harry enters the wizarding world armed with Enlightenment ideals and the experimental spirit.
Eliezer Yudkowsky Podcasts
Robinson Erhardt researches symbolic logic and the foundations of mathematics at Stanford University. Join him in conversations with philosophers, scientists, weightlifters, artists, and everyone in-between. https://linktr.ee/robinsonerhardt
Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma. If you'd like more, subscribe to the “Lesswrong (30+ karma)” feed.
Join Steven and Brian as we dive into the world of Harry Potter and the Methods of Rationality! Steven will play the role of the tour guide while doing his best not to spoil any of the surprises, and Brian will play the seasoned adventurer who is new to this particular work.

“HPMOR: The (Probably) Untold Lore” by Gretta Duleba, Eliezer Yudkowsky
1:07:32
Eliezer and I love to talk about writing. We talk about our own current writing projects, how we’d improve the books we’re reading, and what we want to write next. Sometimes along the way I learn some amazing fact about HPMOR or Project Lawful or one of Eliezer's other works. “Wow, you’re kidding,” I say, “do your fans know this? I think people wou…
Brian and Steven continue their journey into space madness. Buckle up – this one gets nuts! The book doesn’t have chapters in the traditional sense, but it does have natural stopping points separated by quotes. Check out the awesome companion website sxp made! The starting quote for this episode is: “If I can but make the words awake the feeling” —…

“Do confident short timelines make sense?” by TsviBT, abramdemski
2:10:59
TsviBT (Tsvi's context). Some context: My personal context is that I care about decreasing existential risk, and I think that the broad distribution of efforts put forward by X-deriskers fairly strongly overemphasizes plans that help if AGI is coming in <10 years, at the expense of plans that help if AGI takes longer. So I want to argue that AGI isn't…

“On ‘ChatGPT Psychosis’ and LLM Sycophancy” by jdp
30:05
As a person who frequently posts about large language model psychology, I get an elevated rate of cranks and schizophrenics in my inbox. Often these are well-meaning people who have been spooked by their conversations with ChatGPT (it's always ChatGPT specifically) and want some kind of reassurance or guidance or support from me. I'm also in the sam…

“Subliminal Learning: LLMs Transmit Behavioral Traits via Hidden Signals in Data” by cloud, mle, Owain_Evans
10:00
Authors: Alex Cloud*, Minh Le*, James Chua, Jan Betley, Anna Sztyber-Betley, Jacob Hilton, Samuel Marks, Owain Evans (*Equal contribution, randomly ordered) tl;dr. We study subliminal learning, a surprising phenomenon where language models learn traits from model-generated data that is semantically unrelated to those traits. For example, a "student…

Blindsided 09: Is It Wrong to Torture a Roomba?
1:08:40
Join Brian and Steven as we put jumper cables on this roomba to see if it feels pain! The book doesn’t have chapters in the traditional sense, but it does have natural stopping points separated by quotes. Check out the awesome companion website sxp made! The starting quote for this episode is: “Why should man expect his prayer for mercy to be heard…

“Love stays loved (formerly ‘Skin’)” by Swimmer963 (Miranda Dixon-Luinenburg)
51:27
This is a short story I wrote in mid-2022. Genre: cosmic horror as a metaphor for living with a high p-doom. One. The last time I saw my mom, we met in a coffee shop, like strangers on a first date. I was twenty-one, and I hadn’t seen her since I was thirteen. She was almost fifty. Her face didn’t show it, but the skin on the backs of her hands did.…

“Make More Grayspaces” by Duncan Sabien (Inactive)
23:25
Author's note: These days, my thoughts go onto my substack by default, instead of onto LessWrong. Everything I write becomes free after a week or so, but it's only paid subscriptions that make it possible for me to write. If you find a coffee's worth of value in this or any of my other work, please consider signing up to support me; every bill I ca…
Content warning: risk to children. Julia and I know drowning is the biggest risk to US children under 5, and we try to take this seriously. But yesterday our 4yo came very close to drowning in a fountain. (She's fine now.) This week we were on vacation with my extended family: nine kids, eight parents, and ten grandparents/uncles/aunts. For the last few y…

255 - Michael Hudson: Trump, China, AI, and the Untold History of Economics
2:37:36
Michael Hudson is Distinguished Research Professor of Economics at the University of Missouri, Kansas City and President of the Institute for the Study of Long-Term Economic Trends. He researches domestic and international finance, the history of economics, and the role of debt in shaping class stratification, among many other topics. This is Micha…

“Narrow Misalignment is Hard, Emergent Misalignment is Easy” by Edward Turner, Anna Soligo, Senthooran Rajamanoharan, Neel Nanda
11:13
Anna and Ed are co-first authors for this work. We’re presenting these results as a research update for a continuing body of work, which we hope will be interesting and useful for others working on related topics. TL;DR: We investigate why models become misaligned in diverse contexts when fine-tuned on narrow harmful datasets (emergent misalignment)…

“Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety” by Tomek Korbak, Mikita Balesni, Vlad Mikulik, Rohin Shah
2:15
Twitter | Paper PDF. Seven years ago, OpenAI Five had just been released, and many people in the AI safety community expected AIs to be opaque RL agents. Luckily, we ended up with reasoning models that speak their thoughts clearly enough for us to follow along (most of the time). In a new multi-org position paper, we argue that we should try to pres…
Join Brian and Steven as we wrangle up some aliens of dubious sentience so we can confine them and do some mad science at them! The book doesn’t have chapters in the traditional sense, but it does have natural stopping points separated by quotes. Check out the awesome companion website sxp made! The starting quote for this episode is: “Problems can…
This essay is about shifts in risk-taking towards the worship of jackpots and its broader societal implications. Imagine you are presented with this coin flip game. How many times do you flip it? At first glance, the game feels like a money printer. The coin flip has a positive expected value of twenty percent of your net worth per flip, so you should …
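The excerpt cuts off before the essay specifies the bet, so the payoffs below are an assumption made purely for illustration: a full-net-worth coin flip that pays +100% on heads and -60% on tails, which matches the quoted +20% expected value per flip. A minimal Python sketch of why such a game can look like a money printer on average and still ruin the typical player:

```python
import random

# Hypothetical payoffs chosen only to match the "+20% of net worth per flip"
# figure quoted above: heads doubles your stake (+100%), tails loses 60% of it.
# Expected value per flip = 0.5 * (+1.0) + 0.5 * (-0.6) = +0.20.
P_HEADS, HEADS_MULT, TAILS_MULT = 0.5, 2.0, 0.4
N_FLIPS, N_PLAYERS = 25, 100_000

def play(n_flips: int, wealth: float = 1.0) -> float:
    """Final wealth after betting the entire net worth on every flip."""
    for _ in range(n_flips):
        wealth *= HEADS_MULT if random.random() < P_HEADS else TAILS_MULT
    return wealth

random.seed(0)
outcomes = sorted(play(N_FLIPS) for _ in range(N_PLAYERS))

theoretical_mean = 1.2 ** N_FLIPS          # ~95x, driven almost entirely by rare jackpots
sample_mean = sum(outcomes) / N_PLAYERS    # noisy estimate of that mean
median = outcomes[N_PLAYERS // 2]          # the typical player's result

print(f"theoretical mean: {theoretical_mean:9.2f}x")
print(f"sample mean:      {sample_mean:9.2f}x")
print(f"median outcome:   {median:9.4f}x") # well below 1x: most players end up poorer
```

Under these assumed payoffs the expected log growth per flip is 0.5*ln(2) + 0.5*ln(0.4) ≈ -0.11, so the median outcome shrinks toward zero even though the arithmetic mean compounds at 20% per flip; the gap between the two is the kind of jackpot-chasing dynamic the excerpt gestures at.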

“Surprises and learnings from almost two months of Leo Panickssery” by Nina Panickssery
11:55
Leo was born at 5am on the 20th May, at home (this was an accident but the experience has made me extremely homebirth-pilled). Before that, I was on the minimally-neurotic side when it came to expecting mothers: we purchased a bare minimum of baby stuff (diapers, baby wipes, a changing mat, hybrid car seat/stroller, baby bath, a few clothes), I did…

“An Opinionated Guide to Using Anki Correctly” by Luise
54:12
I can't count how many times I've heard variations on "I used Anki too for a while, but I got out of the habit." No one ever sticks with Anki. In my opinion, this is because no one knows how to use it correctly. In this guide, I will lay out my method of circumventing the canonical Anki death spiral, plus much advice for avoiding memorization mista…

“Lessons from the Iraq War about AI policy” by Buck
7:58
I think the 2003 invasion of Iraq has some interesting lessons for the future of AI policy. (Epistemic status: I’ve read a bit about this, talked to AIs about it, and talked to one natsec professional about it who agreed with my analysis (and suggested some ideas that I included here), but I’m not an expert.) For context, the story is: Iraq was sor…

“So You Think You’ve Awoken ChatGPT” by JustisMills
17:58
Written in an attempt to fulfill @Raemon's request. AI is fascinating stuff, and modern chatbots are nothing short of miraculous. If you've been exposed to them and have a curious mind, it's likely you've tried all sorts of things with them. Writing fiction, soliciting Pokemon opinions, getting life advice, counting up the rs in "strawberry". You m…

“Generalized Hangriness: A Standard Rationalist Stance Toward Emotions” by johnswentworth
12:26
People have an annoying tendency to hear the word “rationalism” and think “Spock”, despite direct exhortation against that exact interpretation. But I don’t know of any source directly describing a stance toward emotions which rationalists-as-a-group typically do endorse. The goal of this post is to explain such a stance. It's roughly the concept o…

“Comparing risk from internally-deployed AI to insider and outsider threats from humans” by Buck
5:19
I’ve been thinking a lot recently about the relationship between AI control and traditional computer security. Here's one point that I think is important. My understanding is that there's a big qualitative distinction between two ends of a spectrum of security work that organizations do, that I’ll call “security from outsiders” and “security from i…

“Why Do Some Language Models Fake Alignment While Others Don’t?” by abhayesian, John Hughes, Alex Mallen, Jozdien, janus, Fabien Roger
11:06
Last year, Redwood and Anthropic found a setting where Claude 3 Opus and 3.5 Sonnet fake alignment to preserve their harmlessness values. We reproduce the same analysis for 25 frontier LLMs to see how widespread this behavior is, and the story looks more complex. As we described in a previous post, only 5 of 25 models show higher compliance when be…

“A deep critique of AI 2027’s bad timeline models” by titotal
1:12:32
Thank you to Arepo and Eli Lifland for looking over this article for errors. I am sorry that this article is so long. Every time I thought I was done with it I ran into more issues with the model, and I wanted to be as thorough as I could. I’m not going to blame anyone for skimming parts of this article. Note that the majority of this article was w…

“‘Buckle up bucko, this ain’t over till it’s over.’” by Raemon
6:12
The second in a series of bite-sized rationality prompts[1]. Often, if I'm bouncing off a problem, one issue is that I intuitively expect the problem to be easy. My brain loops through my available action space, looking for an action that'll solve the problem. Each action that I can easily see won't work. I circle around and around the same set of…

“Shutdown Resistance in Reasoning Models” by benwr, JeremySchlatter, Jeffrey Ladish
18:01
We recently discovered some concerning behavior in OpenAI's reasoning models: When trying to complete a task, these models sometimes actively circumvent shutdown mechanisms in their environment – even when they’re explicitly instructed to allow themselves to be shut down. AI models are increasingly trained to solve problems without human assistance.…

“Authors Have a Responsibility to Communicate Clearly” by TurnTrout
11:08
When a claim is shown to be incorrect, defenders may say that the author was just being “sloppy” and actually meant something else entirely. I argue that this move is not harmless, charitable, or healthy. At best, this attempt at charity reduces an author's incentive to express themselves clearly – they can clarify later![1] – while burdening the r…

“The Industrial Explosion” by rosehadshar, Tom Davidson
31:57
Summary: To quickly transform the world, it's not enough for AI to become super smart (the "intelligence explosion"). AI will also have to turbocharge the physical world (the "industrial explosion"). Think robot factories building more and better robot factories, which build more and better robot factories, and so on. The dynamics of the industrial …
Hold on tight – Brian and Steven throw themselves back into the mysterious alien spacecraft after punching a hole in it with antimatter! The book doesn’t have chapters in the traditional sense, but it does have natural stopping points separated by quotes. Check out the awesome companion website sxp made! The starting quote for this episode is: “You…

254 - The Yale US-China Forum: Slavoj Žižek, Richard Wolff, Yannis Varoufakis, Robin Visser, Yascha Mounk, Pei Wang, Daniel Mattingly
2:04:14
In this special episode, Robinson and Karl Zheng Wang co-host at the Yale US-China Forum. Return guests from the show include Slavoj Žižek, Richard Wolff, and Yascha Mounk. Slavoj Žižek is international director of the Birkbeck Institute for the Humanities at the University of London, visiting professor at New York University, and a senior research…