In this episode, we explore how engineers are embedding powerful AI directly into hardware – no cloud connection required.
Michaël Uyttersprot from Avnet Silica and Cedric Vincent from Tria Technologies reveal how they run ChatGPT-quality language models on resource-constrained embedded devices. What once demanded data centre infrastructure now fits onto chips with just 2GB of RAM.
The conversation covers the technical challenges of cramming billion-parameter models into embedded systems, real-world applications from conference rooms to industrial robotics, and the three compelling reasons driving this shift: data privacy, power efficiency, and cost control.
Michaël and Cedric discuss hardware platforms from AMD, NXP, and Qualcomm, explain techniques like quantisation and mixture of experts, and demonstrate applications including a vintage telephone box that lets you call avatars from different time periods.
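Quantisation is worth a quick back-of-envelope check. The sketch below is not from the episode; it is illustrative arithmetic with an assumed 1-billion-parameter model, showing why lower-precision weights are what make the "2GB of RAM" figure plausible.

```python
# Back-of-envelope memory footprint for an assumed 1-billion-parameter model
# at different weight precisions (illustrative only, not a specific chip or model).

PARAMS = 1_000_000_000  # ~1B parameters

def footprint_gb(bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes, ignoring activations and KV cache."""
    return PARAMS * bits_per_weight / 8 / 1e9

for label, bits in [("FP32", 32), ("FP16", 16), ("INT8 quantised", 8), ("4-bit quantised", 4)]:
    print(f"{label:>16}: ~{footprint_gb(bits):.2f} GB")

# FP16 weights alone (~2 GB) would saturate a 2 GB device; 4-bit quantisation
# brings them down to ~0.5 GB, leaving headroom for the runtime and activations.
```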
Tune in to learn why the future of AI might not be in the cloud at all – and what that means for industries from manufacturing to healthcare.
#AI #LLM #embeddedsystems #IoT #privacy #wetalkiot
Summary of this week's episode:
02:48 What makes large language models special
05:27 Why run LLMs locally on embedded devices
07:42 Real-world applications: Vision LLMs and OCR
11:12 Technical deep dive: How to fit billions of parameters into tiny devices
18:52 Understanding temperature: Making AI creative or accurate
22:41 Industries moving fastest: OCR, security, and robotics
24:52 Future applications: Robotic arms and time series analysis
28:00 The biggest technical hurdle: Power consumption
30:55 Advice for engineers: Start with llama.cpp (see the sketch after this list)
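If you want to act on the llama.cpp advice, here is a minimal sketch using the llama-cpp-python bindings; the model path and generation settings are placeholders, not values recommended in the episode.

```python
# Minimal local-inference sketch with the llama-cpp-python bindings
# (pip install llama-cpp-python). The GGUF path is a placeholder:
# use any small quantised model that fits your device's RAM.
from llama_cpp import Llama

llm = Llama(
    model_path="models/tiny-llm.Q4_K_M.gguf",  # hypothetical quantised model file
    n_ctx=2048,      # context window; keep modest on constrained devices
    n_threads=4,     # match your CPU core count
)

out = llm(
    "Summarise why running LLMs at the edge helps with data privacy.",
    max_tokens=128,
    temperature=0.7,  # lower = more deterministic, higher = more creative
)
print(out["choices"][0]["text"])
```

Everything runs on the local CPU, so no data leaves the device: the same privacy, power, and cost argument made in the episode.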
Show notes:
Michaël Uyttersprot: https://www.linkedin.com/in/micha%C3%ABl-uyttersprot-aaa971211/
Cedric Vincent: https://www.linkedin.com/in/cedric-vincent-19222910/
Tria Technologies: https://www.tria-technologies.com/
Generative AI at the Edge: https://my.avnet.com/silica/solutions/technologies/artificial-intelligence/generative-ai/
The podcast episode where the generative AI examples were discussed: https://www.podbean.eu/ep/pb-9juiy-d4dec4
How to enhance embedded systems with Generative AI and Local LLMs | Michael Uyttersprot at HWPMAX25: https://www.youtube.com/watch?v=wL9g2wJ1a7c
Listen to the "We Talk IoT" Soundtrack on:
Spotify: https://open.spotify.com/playlist/05MOV4OV2MH2in2txsAGtG?si=ad08112cb8d443f4
YouTube: https://www.youtube.com/watch?v=D-NvQ6VJYtE&list=PLLqgVFfZhDRVYmpEqbgajzDvGL4kACRDp
The Llama song: https://youtu.be/JavZh3y1ue0
About Avnet Silica:
This podcast is brought to you by Avnet Silica — the Engineers of Evolution.
Subscribe to our newsletters here: https://my.avnet.com/silica/resources/newsletter/
You can connect with us on LinkedIn: https://www.linkedin.com/company/silica-an-avnet-company/. Or find us at www.avnet-silica.com.