Content provided by Pierson Marks and Bilal Tahir. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Pierson Marks and Bilal Tahir or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://staging.podcastplayer.com/legal.
ElevenLabs V3 and the Future of Text to Speech

46:33
 

In Episode 2 of Creative Flux, Pierson Marks (@piersonmarks) and Bilal Tahir (@deepwhitman) explore recent text-to-speech advancements from model providers such as ElevenLabs (with its V3 model) and OpenAI. They highlight where generative voice is today, where it is going, and how the industry is adopting this new technology.

Chapters

00:00 Introduction to Creative Flux

03:05 Exploring Text-to-Speech Technology

06:08 The Evolution of Speech Recognition

08:45 Advancements in Text-to-Speech Models

11:52 Comparing Text-to-Speech Providers

15:09 Voice Agents vs. Creative Applications

17:52 The Future of Conversational AI

23:33 Exploring API Access and New Features

24:38 Innovations in Audio Inpainting

28:06 Gemini TTS and AI Studio Overview

30:53 The Role of AI Studio for Developers

34:22 The Future of Local Models and Open Source

39:25 The Need for Abstraction Layers in TTS

44:02 The Rapid Evolution of Media Generation

Links:
- ElevenLabs: https://elevenlabs.io/app/speech-synthesis/text-to-speech
- ElevenLabs v3 intro: https://elevenlabs.io/docs/models#eleven-v3-alpha
- OpenAI.fm: https://www.openai.fm/
- PlayAI inpaint & dialog: https://fal.ai/models/fal-ai/playai/inpaint/diffusion
- Kokoro: https://fal.ai/models/fal-ai/kokoro/american-english
- Google AI Studio: https://aistudio.google.com/prompts/new_chat
- Google AI Studio for TTS: https://aistudio.google.com/generate-speech
- MiniMax Hailuo-02: https://fal.ai/models/fal-ai/minimax/hailuo-02/standard/image-to-video
- Seedance Lite: https://fal.ai/models/fal-ai/bytedance/seedance/v1/lite/text-to-video

