Episode 53: Human-Seeded Evals & Self-Tuning Agents: Samuel Colvin on Shipping Reliable LLMs (Vanishing Gradients podcast)


25 subscribers

published 11h ago


Content provided by Hugo Bowne-Anderson. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Hugo Bowne-Anderson or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://staging.podcastplayer.com/legal.

Demos are easy; durability is hard. Samuel Colvin has spent a decade building guardrails in Python (first with Pydantic, now with Logfire), and he’s convinced most LLM failures have nothing to do with the model itself. They appear where the data is fuzzy, the prompts drift, or no one bothered to measure real-world behavior. Samuel joins me to show how a sprinkle of engineering discipline keeps those failures from ever reaching users.

We talk through:
• Tiny labels, big leverage: how five thumbs-ups/thumbs-downs are enough for Logfire to build a rubric that scores every call in real time
• Drift alarms, not dashboards: catching the moment your prompt or data shifts instead of reading charts after the fact
• Prompt self-repair: a prototype agent that rewrites its own system prompt—and tells you when it still doesn’t have what it needs
• The hidden cost curve: why the last 15 percent of reliability costs far more than the flashy 85 percent demo
• Business-first metrics: shipping features that meet real goals instead of chasing another decimal point of “accuracy”
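To make the "drift alarms, not dashboards" idea concrete, here is a minimal sketch of a rolling pass-rate alarm. It is an illustration only and not Logfire's API or the method discussed in the episode: the `DriftAlarm` class, its parameters, and the thresholds are all hypothetical, and it assumes each call has already been scored pass/fail by some rubric.

```python
from collections import deque


class DriftAlarm:
    """Hypothetical rolling pass-rate monitor (not part of Logfire)."""

    def __init__(self, baseline: float, window: int = 50, tolerance: float = 0.15):
        self.baseline = baseline      # pass rate observed when things were healthy
        self.tolerance = tolerance    # how far below baseline we accept
        self.scores = deque(maxlen=window)  # most recent pass/fail scores

    def record(self, passed: bool) -> bool:
        """Add one scored call; return True if the alarm should fire."""
        self.scores.append(1.0 if passed else 0.0)
        if len(self.scores) < self.scores.maxlen:
            return False  # wait until the window is full
        rate = sum(self.scores) / len(self.scores)
        return (self.baseline - rate) > self.tolerance


alarm = DriftAlarm(baseline=0.9, window=10, tolerance=0.15)
healthy = [alarm.record(True) for _ in range(10)]    # ten passing calls
degraded = [alarm.record(False) for _ in range(3)]   # then three failures
```

The point of the design is that the check runs on every scored call, so a drop is flagged as it happens rather than noticed later on a chart.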

If you’re past the proof-of-concept stage and staring down the “now it has to work” cliff, this episode is your climbing guide.

LINKS

🎓 Learn more:

Hugo's course: Building LLM Applications for Data Scientists and Software Engineers — next cohort starts July 8: https://maven.com/s/course/d56067f338

📺 Watch the video version on YouTube: YouTube link


53 episodes
