
Eliezer Yudkowsky Podcasts

Robinson's Podcast

Robinson Erhardt

 
Robinson Erhardt researches symbolic logic and the foundations of mathematics at Stanford University. Join him in conversations with philosophers, scientists, weightlifters, artists, and everyone in-between. https://linktr.ee/robinsonerhardt
We Want MoR + Audiobook

Steven Zuber & Brian Deacon + Eneasz Brodski

Join Steven and Brian as we dive into the world of Harry Potter and the Methods of Rationality! Steven plays the role of the tour guide, doing his best not to spoil any of the surprises, while Brian plays the seasoned adventurer who is new to this particular work.
Eliezer and I love to talk about writing. We talk about our own current writing projects, how we’d improve the books we’re reading, and what we want to write next. Sometimes along the way I learn some amazing fact about HPMOR or Project Lawful or one of Eliezer's other works. “Wow, you’re kidding,” I say, “do your fans know this? I think people wou…
Brian and Steven continue their journey into space madness. Buckle up – this one gets nuts! The book doesn’t have chapters in the traditional sense, but it does have natural stopping points separated by quotes. Check out the awesome companion website sxp made! The starting quote for this episode is: “If I can but make the words awake the feeling” —…
TsviBT. Tsvi's context: My personal context is that I care about decreasing existential risk, and I think that the broad distribution of efforts put forward by X-deriskers fairly strongly overemphasizes plans that help if AGI is coming in <10 years, at the expense of plans that help if AGI takes longer. So I want to argue that AGI isn't…
As a person who frequently posts about large language model psychology I get an elevated rate of cranks and schizophrenics in my inbox. Often these are well meaning people who have been spooked by their conversations with ChatGPT (it's always ChatGPT specifically) and want some kind of reassurance or guidance or support from me. I'm also in the sam…
Authors: Alex Cloud*, Minh Le*, James Chua, Jan Betley, Anna Sztyber-Betley, Jacob Hilton, Samuel Marks, Owain Evans (*Equal contribution, randomly ordered) tl;dr. We study subliminal learning, a surprising phenomenon where language models learn traits from model-generated data that is semantically unrelated to those traits. For example, a "student…
Join Brian and Steven as we put jumper cables on this roomba to see if it feels pain! The book doesn’t have chapters in the traditional sense, but it does have natural stopping points separated by quotes. Check out the awesome companion website sxp made! The starting quote for this episode is: “Why should man expect his prayer for mercy to be heard…
This is a short story I wrote in mid-2022. Genre: cosmic horror as a metaphor for living with a high p-doom. One The last time I saw my mom, we met in a coffee shop, like strangers on a first date. I was twenty-one, and I hadn’t seen her since I was thirteen. She was almost fifty. Her face didn’t show it, but the skin on the backs of her hands did.…
Author's note: These days, my thoughts go onto my substack by default, instead of onto LessWrong. Everything I write becomes free after a week or so, but it's only paid subscriptions that make it possible for me to write. If you find a coffee's worth of value in this or any of my other work, please consider signing up to support me; every bill I ca…
Content warning: risk to children. Julia and I know drowning is the biggest risk to US children under 5, and we try to take this seriously. But yesterday our 4yo came very close to drowning in a fountain. (She's fine now.) This week we were on vacation with my extended family: nine kids, eight parents, and ten grandparents/uncles/aunts. For the last few y…
Michael Hudson is Distinguished Research Professor of Economics at the University of Missouri, Kansas City and President of the Institute for the Study of Long-Term Economic Trends. He researches domestic and international finance, the history of economics, and the role of debt in shaping class stratification, among many other topics. This is Micha…
Anna and Ed are co-first authors for this work. We’re presenting these results as a research update for a continuing body of work, which we hope will be interesting and useful for others working on related topics. TL;DR We investigate why models become misaligned in diverse contexts when fine-tuned on narrow harmful datasets (emergent misalignment)…
Twitter | Paper PDF. Seven years ago, OpenAI Five had just been released, and many people in the AI safety community expected AIs to be opaque RL agents. Luckily, we ended up with reasoning models that speak their thoughts clearly enough for us to follow along (most of the time). In a new multi-org position paper, we argue that we should try to pres…
Join Brian and Steven as we wrangle up some aliens of dubious sentience so we can confine them and do some mad science at them! The book doesn’t have chapters in the traditional sense, but it does have natural stopping points separated by quotes. Check out the awesome companion website sxp made! The starting quote for this episode is: “Problems can…
This essay is about shifts in risk taking towards the worship of jackpots and their broader societal implications. Imagine you are presented with this coin flip game. How many times do you flip it? At first glance the game feels like a money printer. The coin flip has a positive expected value of twenty percent of your net worth per flip, so you should …
Leo was born at 5am on the 20th May, at home (this was an accident but the experience has made me extremely homebirth-pilled). Before that, I was on the minimally-neurotic side when it came to expecting mothers: we purchased a bare minimum of baby stuff (diapers, baby wipes, a changing mat, hybrid car seat/stroller, baby bath, a few clothes), I did…
I can't count how many times I've heard variations on "I used Anki too for a while, but I got out of the habit." No one ever sticks with Anki. In my opinion, this is because no one knows how to use it correctly. In this guide, I will lay out my method of circumventing the canonical Anki death spiral, plus much advice for avoiding memorization mista…
I think the 2003 invasion of Iraq has some interesting lessons for the future of AI policy. (Epistemic status: I’ve read a bit about this, talked to AIs about it, and talked to one natsec professional about it who agreed with my analysis (and suggested some ideas that I included here), but I’m not an expert.) For context, the story is: Iraq was sor…
Written in an attempt to fulfill @Raemon's request. AI is fascinating stuff, and modern chatbots are nothing short of miraculous. If you've been exposed to them and have a curious mind, it's likely you've tried all sorts of things with them. Writing fiction, soliciting Pokemon opinions, getting life advice, counting up the rs in "strawberry". You m…
People have an annoying tendency to hear the word “rationalism” and think “Spock”, despite direct exhortation against that exact interpretation. But I don’t know of any source directly describing a stance toward emotions which rationalists-as-a-group typically do endorse. The goal of this post is to explain such a stance. It's roughly the concept o…
I’ve been thinking a lot recently about the relationship between AI control and traditional computer security. Here's one point that I think is important. My understanding is that there's a big qualitative distinction between two ends of a spectrum of security work that organizations do, that I’ll call “security from outsiders” and “security from i…
Last year, Redwood and Anthropic found a setting where Claude 3 Opus and 3.5 Sonnet fake alignment to preserve their harmlessness values. We reproduce the same analysis for 25 frontier LLMs to see how widespread this behavior is, and the story looks more complex. As we described in a previous post, only 5 of 25 models show higher compliance when be…
Thank you to Arepo and Eli Lifland for looking over this article for errors. I am sorry that this article is so long. Every time I thought I was done with it I ran into more issues with the model, and I wanted to be as thorough as I could. I’m not going to blame anyone for skimming parts of this article. Note that the majority of this article was w…
The second in a series of bite-sized rationality prompts[1]. Often, if I'm bouncing off a problem, one issue is that I intuitively expect the problem to be easy. My brain loops through my available action space, looking for an action that'll solve the problem. Each action I can easily see won't work. I circle around and around the same set of…
We recently discovered some concerning behavior in OpenAI's reasoning models: When trying to complete a task, these models sometimes actively circumvent shutdown mechanisms in their environment, even when they're explicitly instructed to allow themselves to be shut down. AI models are increasingly trained to solve problems without human assistance.…
When a claim is shown to be incorrect, defenders may say that the author was just being “sloppy” and actually meant something else entirely. I argue that this move is not harmless, charitable, or healthy. At best, this attempt at charity reduces an author's incentive to express themselves clearly – they can clarify later![1] – while burdening the r…
Summary To quickly transform the world, it's not enough for AI to become super smart (the "intelligence explosion"). AI will also have to turbocharge the physical world (the "industrial explosion"). Think robot factories building more and better robot factories, which build more and better robot factories, and so on. The dynamics of the industrial …
Hold on tight – Brian and Steven throw themselves back into the mysterious alien spacecraft after punching a hole in it with antimatter! The book doesn’t have chapters in the traditional sense, but it does have natural stopping points separated by quotes. Check out the awesome companion website sxp made! The starting quote for this episode is: “You…
In this special episode, Robinson and Karl Zheng Wang co-host at the Yale US-China Forum. Return guests from the show include Slavoj Žižek, Richard Wolff, and Yascha Mounk. Slavoj Žižek is international director of the Birkbeck Institute for the Humanities at the University of London, visiting professor at New York University, and a senior research…