LLM Evaluation Podcasts

 
All Things LLM is your go-to podcast for demystifying Large Language Models! We break down their core concepts—like tokens, embeddings, and the self-attention that powers GPT-4 and Llama. Learn how LLMs are built, trained, and fine-tuned (SFT, RLHF, PEFT) on massive datasets. Discover real-world use cases in healthcare, finance, chatbots, code, RAG, and more. We explore the LLM ecosystem, covering open-source vs. closed models, LLMaaS, LangChain, and LLMOps tools. Plus, we tackle challenges— ...
 
Step into the world of tomorrow with AI News Daily – your go-to podcast for cutting-edge updates, trends, and breakthroughs in artificial intelligence and language models. Whether you’re a tech enthusiast, developer, startup founder, or just curious about how AI is shaping our daily lives, this podcast delivers sharp, insightful, and digestible news—every single day. From OpenAI’s latest model releases to industry-shaking innovations in machine learning, natural language processing, robotics ...
 
Deep Papers

Arize AI

Monthly+
 
Deep Papers is a podcast series featuring deep dives on today’s most important AI papers and research. Hosted by Arize AI founders and engineers, each episode profiles the people and techniques behind cutting-edge breakthroughs in machine learning.
 
The Everyday AI podcast is a daily livestream, podcast and free newsletter where we help everyday people grow their careers with AI. The Everyday AI podcast is hosted by Jordan Wilson, a former journalist who's now the owner of a boutique digital strategy company with 20 years of martech experience. Our main focus is to help you keep up with AI trends to make your job easier. Get your work done faster. Increase your output. - Sign up for our free Prime Prompt Polish ChatGPT course: https://p ...
 
AWS Podcast

Amazon Web Services

Weekly
 
The Official AWS Podcast is a podcast for developers and IT professionals looking for the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Hawn Nguyen-Loughren for regular updates, deep dives, launches, and interviews. Whether you’re training machine learning models, developing open source projects, or building cloud solutions, the Official AWS Podcast has something for you.
 
Machine learning and artificial intelligence are dramatically changing the way businesses operate and people live. The TWIML AI Podcast brings the top minds and ideas from the world of ML and AI to a broad and influential community of ML/AI researchers, data scientists, engineers and tech-savvy business and IT leaders. Hosted by Sam Charrington, a sought-after industry analyst, speaker, commentator and thought leader. Technologies covered include machine learning, artificial intelligence, de ...
 
AXRP (pronounced axe-urp) is the AI X-risk Research Podcast where I, Daniel Filan, have conversations with researchers about their papers. We discuss the paper, and hopefully get a sense of why it's been written and how it might reduce the risk of AI causing an existential catastrophe: that is, permanently and drastically curtailing humanity's future potential. You can visit the website and read transcripts at axrp.net.
 
Capabilities? Through the roof? Usage? Ground floor. Claude Agent Skills might be one of the most useful features of any front-end LLM. Yet....it's crickets in terms of chat around it. For this 'AI at Work on Wednesday' episode, we're breaking it down for beginners and will have you spinning up your own Claude Agent Skills in no time. Claude Skills…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki AI News Daily — Dec 10, 2025 Summary Top Highlights: Major AI companies (OpenAI, Anthropic, Micro…
 
In this episode of the podcast, I speak with Anindya Ghose from NYU and Vilma Todri from Emory University about their recent paper, The Impact of Visual Generative AI on Advertising Effectiveness, which is available in pre-print. In the paper, Anindya, Vilma, and the other authors assess the performance efficacy of three types of ad creative: Creat…
 
In this episode, we’re joined by Munawar Hayat, researcher at Qualcomm AI Research, to discuss a series of papers presented at NeurIPS 2025 focusing on multimodal and generative AI. We dive into the persistent challenge of object hallucination in Vision-Language Models (VLMs), why models often discard visual information in favor of pre-trained lang…
 
OpenAI is (reportedly) in full panic mode. 🚨 All hands on deck, Code Red status. So.... what happened? How did OpenAI go from defining the AI category to getting beat by competitors they once trounced? And, is it too late for them to turn it around? Or will Google permanently take the AI crown? Tune in... we've got hot takes. OpenAI's Code Red: Is …
 
OpenAI has launched a code red. 🚨 After increasing pressure from Google, OpenAI is reportedly in ‘all hands on deck’ mode to reclaim the LLM crown. Meanwhile, Google quietly released an EVEN MORE powerful version of Gemini 3 that hardly anyone noticed. And Perplexity? They got hit with a massive lawsuit. You take one week off of AI, and you could b…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Top Highlights: NeurIPS 2025 emphasized attention limits, compositional generalization, and rigor…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Top Highlights: Google Gemini 3 with Deep Think strengthens reasoning; OpenAI pushes GPT-5.2 amid…
 
How can you scale AI at the enterprise, yet still hit your climate goals? And can heavy AI usage and an enterprise's ESG mission co-exist? Ashutosh Ahuja lays it out for us. Aligning AI With Climate And Business Goals -- An Everyday AI Chat with Jordan Wilson and Ashutosh Ahuja Newsletter: Sign up for our free daily newsletter More on this Episode:…
 
You're probably using AI agents without even knowing it. 🤯 Crazier yet? It's very possible that there may already be more AI agent instances than humans in the world. Was that a bold claim we made a year ago? Yeah. But did Cloudflare's Tech Lead of AI Agents agree? Also, yeah. (See, we're not that crazy.) So, what do you need to know about the futu…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Top Highlights: OpenAI acquired Neptune.ai for ML workflow optimization; Google launched Workspac…
 
FYI -- Today's LinkedIn livestream broke, so you can access the custom instructions here. This is Vibe Coding 001. Have you ever wanted to build your own software or apps that can just kinda do your work for you inside of the LLM you use but don't know where to start? Start here. We're giving it all away and making it as simple as possible, while a…
 
In this episode, Zain Asgar, co-founder and CEO of Gimlet Labs, joins us to discuss heterogeneous AI inference across diverse hardware. Zain argues that the current industry standard of running all AI workloads on high-end GPUs is unsustainable for agents, which consume significantly more tokens than traditional LLM applications. We explore Gim…
 
My guest on this episode of the podcast is Rishabh Jain, the CEO and co-founder of FERMÀT Commerce, an eCommerce advertising optimization platform. Rishabh most recently joined the podcast in June for an episode of the MDM Mailbag. In this episode, Rishabh and I discuss the impact of chatbot discovery on eCommerce sales, including over Black Friday…
 
You've been lied to about AI. 🤥 A lot. So on today's Hot Take Tuesday episode, we're breaking down 3 of the most viral AI half-truths of 2025 and setting the record straight. Did Anthropic overtake OpenAI? Do 95% of AI pilots fail? Is half of the internet AI slop? Tune in LIVE and find out. 3 AI Lies most people believed in 2025 (but you shouldn’t)…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Top Highlights: DeepSeek V3.2 achieves medal-level math/coding at lower cost. Hugging Face launch…
 
Claude Opus 4.5 has entered the Chat. 🗣️ A week after OpenAI, Grok and Google released their most powerful AI models to date, Anthropic joined the party with their major drop in Claude Opus 4.5. But that probably wasn't even the biggest AI news of the week. That's because OpenAI isn't just building AI hardware that can hear/know everything, they're…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Top Highlights: Google unveiled Nested Learning and 2.3kW TPU Rubin; Gemini 3 sees surging adopti…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Top Highlights: Google launched Gemini 3, advancing agentic automation and multimodal reasoning. …
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Top Highlights: Google Gemini 3 and Anthropic Claude Opus 4.5 launch major upgrades with price cu…
 
"... best model in the world..." 🤔 Wait, again? Days after Gemini 3 Pro splashed on the scene, Anthropic snuck in a low-key drop in Claude Opus 4.5. And Anthropic pulled no punches, calling its new model the "best model in the world for coding, agents and computer use." So, should you be hot swapping your Gemini or ChatGPT use out for the new Opus 4…
 
My guest on this episode of the podcast is Mikołaj Barczentewicz, a professor of law at the University of Surrey and the author of EU Tech Reg, a blog dedicated to following developments in the EU regulatory machinery. In this episode, Mikołaj and I discuss the digital omnibus package that was recently proposed by the European Commission and which …
 
Even if you've banned AI, your employees are 100% using it. 🥵 To make matters worse? Even if you've approved a certain AI system, your teams are probably using whatever they want. And those choices are likely putting your enterprise data at risk. So, how do you reel in and manage the AI sprawl? Kevin Kiley, the CEO of Airia, is laying out the playbook.…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Major Model Releases: Google launched Gemini 3 Pro with deepfake detection via SynthID and superi…
 
We dive into the latest paper from Google and a team of academic researchers: "TUMIX: Multi-Agent Test-Time Scaling with Tool-Use Mixture." One of the paper's authors, Research Scientist Yongchao Chen, walks through the research and its implications. The paper proposes Tool-Use Mixture (TUMIX), an ensemble framework that runs multiple …
 
Wildest week in AI since December 2024. 🤯 ↳ Gemini 3 is out and it's REALLY good. ↳ GPT-5.1 Pro might end up being better. (Even though no one is talking about it.) ↳ Microsoft is releasing agents where people will actually use them. ↳ Nano Banana Pro will probably be more impactful than Gemini 3 (as banana as that sounds). Whew. What a week in AI. Don…
 
Discover how AWS leverages automated reasoning to enhance AI safety, trustworthiness, and decision-making. Byron Cook (Vice President and Distinguished Scientist) explains the evolution of reasoning tools from limited, PhD-driven solutions to scalable, user-friendly systems embedded in everyday business operations. He highlights real-world examples…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Top Headlines: California approves Waymo's statewide driverless rides (San Diego mid-2026); UK la…
 
Yeah, agentic browsers can do your work for you. 💅 But..... should they? How do we tip-toe the fine line between the upside productivity of agentic browsers and the potential security nightmares they bring with them? Tune in and let's chat about it. AI Agents in your browser Work Cheat Code or too Risky? An Everyday AI Chat with Jordan Wilson and A…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Major Model Releases: Google launched Gemini 3 Pro with TPU acceleration, topping coding benchmar…
 
Richard Seroter is a Chief Evangelist at Google. 📢 So it’s LITERALLY his job to help people use Google’s AI products. So with him joining the Everyday AI show, you KNOW he’s gonna be dropping some time-saving and business building strategies. And a bit of future of work knowledge along the way. This is one you DO NOT wanna miss. 5 Simple AI Strate…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Top Highlights: Google Gemini 3 launches with advanced reasoning and multimodality, intensifying …
 
You ever see a new AI model drop and be like.... it's so good OMG how do I use it? 🤔 Same. And yeah.... Gemini 3 is THAT good. So if you're wondering what's new, why it matters and how to use it, this episode is for you. AI at Work on Wednesdays: let's get it with the world's most powerful model in Gemini 3. Gemini 3 Deep Dive and 3 Upgraded Use Cas…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Top Highlights: Google launched Gemini 3 across Search, apps, and dev tools with rapid deployment…
 
Today, we're joined by Devi Parikh, co-founder and co-CEO of Yutori, to discuss browser use models and a future where we interact with the web through proactive, autonomous agents. We explore the technical challenges of creating reliable web agents, the advantages of visually-grounded models that operate on screenshots rather than the browser’s mor…
 
My guest on this episode of the podcast is Simon Whitcombe, the Vice President, Global Business Group at Meta. We discuss Meta's Business AI and the Meta AI business assistant, both of which were announced ahead of this year's AdWeek. I unpacked the potential of these tools to help advertisers cross the "ad-product divide" in Can Meta cross the ad-…
 
Gemini 3 is officially here. ✨ ✨ ✨ For about 8 months, Gemini 2.5 Pro has mostly maintained its standing as the top LLM in the world yet Google just unleashed its successor in Gemini 3.0. So, what's new in Gemini 3? And whether you're a developer or casual user, what does Google's new model unlock? Join us as we chat with Google's Logan Kilpatrick'…
 
Willy Lulciuc (@wslulciuc) is a pioneer in data engineering and one of the creators of OpenLineage, the open-source framework for data lineage collection and analysis. It enables consistent collection of lineage metadata, giving engineers a better perspective on how data is produced and used, so they can better solve complex problems. Join us to le…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Top Headlines (Nov 18): xAI's Grok 4.1 leads arena leaderboards with record Elo and MoE transpare…
 
Buckle up AI world. OpenAI released a new model, and apparently they’re not done. Google is reportedly dropping Gemini 3 in hours. Jeff Bezos is going back hands-on building a new AI company. And that’s just the tip of the AI iceberg this week. Don’t get drowned out in the noise. On Monday, we cut it straight with the AI news that matters. Gemini 3…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Top Highlights: Google's Gemini 3 is imminent, beating coding benchmarks to challenge ChatGPT. AM…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Top Highlights: Anthropic stopped the first autonomous, state-linked AI cyber-espionage campaign.…
 
Everyone knows AI needs your data to truly work. But, what about your company's reasoning? 🤔 Buried beneath the modes and models, features and agents is something so fundamental that we almost always overlook it: the friggin gold that is your company's conversations. It's your expertise. Your secret sauce. Your decision making. Your competitive adv…
 
Send us a text 🌍 INAI • The Open AI Hub The Intelligence Atlas → the world’s most comprehensive, open hub of AI knowledge. 2 Million+ tools, models, agents, tutorials & daily news—free for all, updated every day. https://github.com/inai-sandy/inAI-wiki Top Highlights: DeepMind's SIMA 2 achieves human-level performance in unseen 3D environments thro…
 
What happens when the web is all bots and AI? 🤖 And more importantly, what happens to your company's online presence when AI search completely takes over? Big questions. So we're bringing in the big gun for the answers. Michael Walrath is the Chairman and CEO of Yext Inc, a global leader in brand management and search experience. Michael will dish the…
 
Today, we're joined by Robin Braun, VP of AI business development for hybrid cloud at HPE, and Luke Norris, co-founder and CEO of Kamiwaza, to discuss how AI systems can be used to automate complex workflows and unlock value from legacy enterprise data. Robin and Luke detail high-impact use cases from HPE and Kamiwaza’s collaboration on an “Agentic…
Copyright 2025