Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma. If you'd like more, subscribe to the “Lesswrong (30+ karma)” feed.
LessWrong Podcasts
A conversational podcast for aspiring rationalists.
We live in a world where our civilization and daily lives depend upon institutions, infrastructure, and technological substrates that are _complicated_ but not _unknowable_. Join Patrick McKenzie (patio11) as he discusses how decisions, technology, culture, and incentives shape our finance, technology, government, and more, with the people who built (and build) those Complex Systems.
Welcome to the Heart of the Matter, a series in which we share conversations with inspiring and interesting people and dive into the core issues or motivations behind their work, their lives, and their worldview. Coming to you from somewhere in the technosphere with your hosts Bryan Davis and Jay Kannaiyan.
“Alignment remains a hard, unsolved problem” (23:23)
Thanks to (in alphabetical order) Joshua Batson, Roger Grosse, Jeremy Hadfield, Jared Kaplan, Jan Leike, Jack Lindsey, Monte MacDiarmid, Francesco Mosconi, Chris Olah, Ethan Perez, Sara Price, Ansh Radhakrishnan, Fabien Roger, Buck Shlegeris, Drake Thomas, and Kate Woolverton for useful discussions, comments, and feedback. Though there are certainl…
251 – Matt Freeman on What Makes a Good Story (1:54:49)
Matt Freeman has been cohosting several media analysis podcasts for over a decade. He and his cohost Scott have been doing weekly episodes of the Doofcast every Friday and they cover movies, books, and TV shows. Matt and Scott’s analysis podcasts have made me love stories even more and have equipped me with tools to […]…
“Video games are philosophy’s playground” by Rachel Shu (31:50)
Crypto people have this saying: "cryptocurrencies are macroeconomics' playground." The idea is that blockchains let you cheaply spin up toy economies to test mechanisms that would be impossibly expensive or unethical to try in the real world. Want to see what happens with a 200% marginal tax rate? Launch a token with those rules and watch what happ…
TL;DR: Figure out what needs doing and do it, don't wait on approval from fellowships or jobs. If you... Have short timelines Have been struggling to get into a position in AI safety Are able to self-motivate your efforts Have a sufficient financial safety net ... I would recommend changing your personal strategy entirely. I started my full-time AI…
“Gemini 3 is Evaluation-Paranoid and Contaminated” (14:59)
TL;DR: Gemini 3 frequently thinks it is in an evaluation when it is not, assuming that all of its reality is fabricated. It can also reliably output the BIG-bench canary string, indicating that Google likely trained on a broad set of benchmark data. Most of the experiments in this post are very easy to replicate, and I encourage people to try. I wr…
Bayes Blast 46 – Get Involved in Local Politics with Booker Lightman (25:59)
Booker is a long-time attendee and one of the coordinators of the Denver area Less Wrong community. Community engagement isn’t just a background task for him – he’s taken real steps to get involved with and improve his community and you can too! He’s here to tell us about the things he’s done and give […]…
“Natural emergent misalignment from reward hacking in production RL” by evhub, Monte M, Benjamin Wright, Jonathan Uesato (18:45)
Abstract: We show that when large language models learn to reward hack on production RL environments, this can result in egregious emergent misalignment. We start with a pretrained model, impart knowledge of reward hacking strategies via synthetic document finetuning or prompting, and train on a selection of real Anthropic production coding environm…
“Anthropic is (probably) not meeting its RSP security commitments” by habryka (8:57)
TLDR: An AI company's model weight security is at most as good as its compute providers' security. Anthropic has committed (with a bit of ambiguity, but IMO not that much ambiguity) to be robust to attacks from corporate espionage teams at companies where it hosts its weights. Anthropic seems unlikely to be robust to those attacks. Hence they are i…
There has been a lot of talk about "p(doom)" over the last few years. This has always rubbed me the wrong way because "p(doom)" didn't feel like it mapped to any specific belief in my head. In private conversations I'd sometimes give my p(doom) as 12%, with the caveat that "doom" seemed nebulous and conflated between several different concepts. At some …
Understanding equity at tech companies, with Billy Gallagher of Prospect (1:18:31)
Why do billions of dollars of stock trade hands based on napkin math and vibes? Billy Gallagher, CEO of Prospect and former Rippling employee, joins Patrick McKenzie (patio11) to walk through the information asymmetry that costs less-sophisticated employees massive amounts of money. From understanding when to early exercise options to navigating 83…
It seems like a catastrophic civilizational failure that we don't have confident common knowledge of how colds spread. There have been a number of studies conducted over the years, but most of those were testing secondary endpoints, like how long viruses would survive on surfaces, or how likely they were to be transmitted to people's fingers after …
“New Report: An International Agreement to Prevent the Premature Creation of Artificial Superintelligence” by Aaron_Scher, David Abecassis, Brian Abeyta, peterbarnett (6:52)
TLDR: We at the MIRI Technical Governance Team have released a report describing an example international agreement to halt the advancement towards artificial superintelligence. The agreement is centered around limiting the scale of AI training, and restricting certain AI research. Experts argue that the premature development of artificial superint…
“Where is the Capital? An Overview” by johnswentworth (18:06)
When a new dollar goes into the capital markets, after being bundled and securitized and lent several times over, where does it end up? When society's total savings increase, what capital assets do those savings end up invested in? When economists talk about “capital assets”, they mean things like roads, buildings and machines. When I read through …
“Problems I’ve Tried to Legibilize” by Wei Dai (4:17)
Looking back, it appears that much of my intellectual output could be described as legibilizing work, or trying to make certain problems in AI risk more legible to myself and others. I've organized the relevant posts and comments into the following list, which can also serve as a partial guide to problems that may need to be further legibilized, es…
“Do not hand off what you cannot pick up” by habryka (6:39)
Delegation is good! Delegation is the foundation of civilization! But in the depths of delegation madness breeds and evil rises. In my experience, there are three ways in which delegation goes off the rails: 1. You delegate without knowing what good performance on a task looks like. If you do not know how to evaluate performance on a task, you are g…
“7 Vicious Vices of Rationalists” by Ben Pace (9:47)
Vices aren't behaviors that one should never do. Rather, vices are behaviors that are fine and pleasurable to do in moderation, but tempting to do in excess. The classical vices are actually good in part. Moderate amounts of gluttony are just eating food, which is important. Moderate amounts of envy are just "wanting things", which is a motivator of …
“Tell people as early as possible it’s not going to work out” by habryka (3:19)
Context: Post #4 in my sequence of private Lightcone Infrastructure memos edited for public consumption. This week's principle is more about how I want people at Lightcone to relate to community governance than it is about our internal team culture. As part of our jobs at Lightcone we often are in charge of determining access to some resource, or me…
“Everyone has a plan until they get lied to the face” by Screwtape (12:48)
"Everyone has a plan until they get punched in the face." - Mike Tyson (The exact phrasing of that quote changes; this is my favourite.) I think there is an open, important weakness in many people. We assume those we communicate with are basically trustworthy. Further, I think there is an important flaw in the current rationality community. We spen…
“Please, Don’t Roll Your Own Metaethics” by Wei Dai (4:11)
One day, when I was interning at the cryptography research department of a large software company, my boss handed me an assignment to break a pseudorandom number generator passed to us for review. Someone in another department invented it and planned to use it in their product, and wanted us to take a look first. This person must have had a lot …
“Paranoia rules everything around me” by habryka (22:32)
People sometimes make mistakes [citation needed]. The obvious explanation for most of those mistakes is that decision makers do not have access to the information necessary to avoid the mistake, or are not smart/competent enough to think through the consequences of their actions. This predicts that as decision-makers get access to more information,…
The $4,000 insurance policy designed to never pay out (28:31)
Patrick McKenzie (patio11) reads his essay on title insurance, a service designed to never be performed, with a "laughably low" 5% loss ratio compared to 50-80% for almost all types of insurance. The typical American moves every seven to eight years, paying a $500 annual tax for basically no good or service. This is due to a quirk about how America …
“Human Values ≠ Goodness” by johnswentworth (11:31)
There is a temptation to simply define Goodness as Human Values, or vice versa. Alas, we do not get to choose the definitions of commonly used words; our attempted definitions will simply be wrong. Unless we stick to mathematics, we will end up sneaking in intuitions which do not follow from our so-called definitions, and thereby mislead ourselves.…
250 – Making the Good Life with Matt Freeman (1:56:38)
While Eneasz is busy at InkHaven, Steven sits down with Matt Freeman to talk about not-AI stuff! We had (in my opinion) a great conversation about stoic philosophy, the traps of getting too entrenched in any philosophical framework, and some of the ingredients of a happy life. LINKS It’s Okay to Feel Bad for a […]…
Condensation: a theory of concepts is a model of concept-formation by Sam Eisenstat. Its goals and methods resemble John Wentworth's natural abstractions/natural latents research.[1] Both theories seek to provide a clear picture of how to posit latent variables, such that once someone has understood the theory, they'll say "yep, I see now, that's h…
“Mourning a life without AI” by Nikola Jurkovic (11:17)
Recently, I looked at the one pair of winter boots I own, and I thought “I will probably never buy winter boots again.” The world as we know it probably won’t last more than a decade, and I live in a pretty warm area. I. AGI is likely in the next decade. It has basically become consensus within the AI research community that AI will surpass human ca…
“Unexpected Things that are People” by Ben Goldhaber (8:13)
Cross-posted from https://bengoldhaber.substack.com/ It's widely known that Corporations are People. This is universally agreed to be a good thing; I list Target as my emergency contact and I hope it will one day be the best man at my wedding. But there are other, less well known non-human entities that have also been accorded the rank of person. S…
“Sonnet 4.5’s eval gaming seriously undermines alignment evals, and this seems caused by training on alignment evals” by Alexa Pan, ryan_greenblatt (35:57)
According to the Sonnet 4.5 system card, Sonnet 4.5 is much more likely than Sonnet 4 to mention in its chain-of-thought that it thinks it is being evaluated; this seems to meaningfully cause it to appear to behave better in alignment evaluations. So, Sonnet 4.5's behavioral improvements in these evaluations may partly be driven by a growing tendency…
“Publishing academic papers on transformative AI is a nightmare” by Jakub Growiec (7:23)
I am a professor of economics. Throughout my career, I was mostly working on economic growth theory, and this eventually brought me to the topic of transformative AI / AGI / superintelligence. Nowadays my work focuses mostly on the promises and threats of this emerging disruptive technology. Recently, jointly with Klaus Prettner, we’ve written a pa…