Google I/O 2025 Special Edition - #733
Today, I’m excited to share a special crossover edition of the podcast, recorded live at Google I/O 2025! In this episode, I join Shawn Wang (aka Swyx) of the Latent Space Podcast to interview Logan Kilpatrick and Shrestha Basu Mallick, PMs at Google DeepMind working on AI Studio and the Gemini API, along with Kwindla Kramer, CEO of Daily and creator of the Pipecat open source project. We cover the highlights from the event, including enhancements to the Gemini models such as thinking budgets and thought summaries, native audio output for expressive voice AI, and the new URL Context tool for research agents. The discussion also digs into the Gemini Live API, covering its architecture, the challenges of building real-time voice applications (such as latency and voice activity detection), and new features like proactive audio and asynchronous function calling. Finally, don’t miss our guests’ wish lists for next year’s I/O!
The complete show notes for this episode can be found at https://twimlai.com/go/733.
757 episodes
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
All episodes
Building the Internet of Agents with Vijoy Pandey - #737 56:13
LLMs for Equities Feature Forecasting at Two Sigma with Ben Wellington - #736 59:31
Zero-Shot Auto-Labeling: The End of Annotation for Computer Vision with Jason Corso - #735 56:45
Grokking, Generalization Collapse, and the Dynamics of Training Deep Neural Networks with Charles Martin - #734 1:25:21
RAG Risks: Why Retrieval-Augmented LLMs are Not Safer with Sebastian Gehrmann - #732 57:09
From Prompts to Policies: How RL Builds Better AI Agents with Mahesh Sathiamoorthy - #731 1:01:25
How OpenAI Builds AI Agents That Think and Act with Josh Tobin - #730 1:07:27
CTIBench: Evaluating LLMs in Cyber Threat Intelligence with Nidhi Rastogi - #729 56:18
Generative Benchmarking with Kelly Hong - #728 54:17
Exploring the Biology of LLMs with Circuit Tracing with Emmanuel Ameisen - #727 1:34:06
Teaching LLMs to Self-Reflect with Reinforcement Learning with Maohao Shen - #726 51:45
Waymo's Foundation Model for Autonomous Driving with Drago Anguelov - #725 1:09:07
Dynamic Token Merging for Efficient Byte-level Language Models with Julie Kallini - #724 50:32
Scaling Up Test-Time Compute with Latent Reasoning with Jonas Geiping - #723 58:38