Producing AI-Narrated Audiobooks Using ElevenLabs With Simon Patrick
Is the high cost of audiobook production holding you back? What if you could create a high-quality audiobook for a fraction of the traditional cost?
In this conversation, Simon Patrick explores the world of AI narration with ElevenLabs, discussing how you can gain complete creative control, and even license your own voice clone for a new stream of income.
This episode is supported by my patrons. Join my Community at Patreon.com/thecreativepenn
Simon Patrick founded Ten Times Better Books to support his daughter, Abby Patrick, as one of ElevenLabs' first users and beta testers. He has produced several of their most popular AI voices. He now develops courses and AI audiobook solutions for independent authors at Novel Productions.
You can listen above or on your favorite podcast app, or read the notes and links below. The highlights come first, followed by the full transcript.
Show Notes
- Costs vs benefits of human vs AI narration
- Features of ElevenLabs: realistic and expressive voices, creative control, and ownership of final audio files for wide distribution to platforms like Spotify
- Practical tips for AI narration
- ElevenLabs v3 and emotion tags
- Creating and monetizing a voice clone
- Publishing on ElevenReader
You can find Simon at Novel.Productions or 10xb.com.
Transcript of the interview
Joanna: Simon Patrick founded Ten Times Better Books to support his daughter, Abby Patrick, as one of ElevenLabs' first users and beta testers. He has produced several of their most popular AI voices. He now develops courses and AI audiobook solutions for independent authors at Novel Productions. Welcome to the show, Simon.
Simon: Thank you, Joanna. It's such a joy to be speaking with you. Your podcast and books were foundational to my daughter, Abby, becoming an author and me learning to be her publisher and all that's happened since.
I love your Patreon @thecreativepenn. It's the best money I spend every month, frankly. It's just a great community to be part of, so it's such a joy to be sharing some of what I've learned.
Joanna: Oh, thank you so much. Behind the scenes on the Patreon, Simon has done a video demo of ElevenLabs. Today, obviously, we're doing audio-only. So first up—
Tell us a bit more about your background and why you decided to get into AI-narrated audiobooks.
Simon: Okay. Well, I've got 25 years of experience in marketing and design. I am still half-time head of communications for an international charity, but we've always had our own businesses too.
My wife and I ran a small home education tuition publishing business. We've home-educated our three kids, which brings me to Abby, my daughter who brought me into your world of book publishing.
She was going to college, studying early years education, and was just bored out of her mind. She asked if she could drop out of college to be a writer instead. She'd been writing a book since she was 15. To the astonishment of her friends and some of ours too, we said yes.
Let me add, it was responsible parenting. We made her finish the term, stick it out, and do the work experience. By Christmas 2019, she'd left to pursue finishing her book based on the deal that —
If she learned to write, I would learn to publish for her.
Joanna: Wow!
Simon: So I attended the first Self Publishing Show in that crazy spring of 2020. I think you were there too, just a few days before the pandemic shut us all down. I've listened to hundreds of your podcasts, read your books, done some of the Self-Publishing Formula courses, and learned to be Abby's publisher.
Since then, I have used those skills and connected with a few other authors, so I probably publish a book or two a month, something like that.
Audio has always been the stumbling block. I love audiobooks. As a family, we must consume hundreds of hours a month of them. There are incredible narrators like Ray Porter and Daniel Rigby, who self-narrates his own Audible exclusives, and my absolute favourite, a guy called Jeff Hayes, who narrates incredibly.
They're amazing talents, and I don't think AI is going to touch them because they bring so much humanity to the performance.
But to ordinary authors and publishers, those narrators are inaccessible. I don't even want to think about what they cost.
For Abby, who is still just starting out, any professional narration would cost her three to four thousand dollars for her books. The math just doesn't work. While there are options like a royalty split with ACX, Audible's publishing platform, I struggle with that.
Firstly, you're tied in exclusively to Audible for seven years, and we're big fans of going wide.
Secondly, you're only getting 20% of the royalties when it's being split. I just don't think for us, they're ever going to make that money back. So all of that is what led me in early 2023 to be searching for AI audio options.
ChatGPT was going crazy, you were demoing all of that at the time, and I figured there must be some kind of AI audio option that would let me take control of the process and hopefully produce good audiobooks way more cheaply than current options. That's when I discovered ElevenLabs.
Joanna: There's lots to unpack there. First of all, as you mentioned, there are some incredible human narrators, and we want to acknowledge them. I'm also a human narrator myself.
For most authors, especially indie authors or new authors, it's not a choice between human or AI; it's AI or nothing because they can't afford the fees.
As you said, a lot of the time you don't know if you're going to make the money back. So I think that's really important to acknowledge.
There are lots of AI narration options now. It is hard for authors to decide which platform to use.
So what is ElevenLabs, and why do you think it's the best option for quality and also for publishing reach?
You mentioned ACX, and there's obviously AVV, the Audible Virtual Voice. Most people might think, “Well, maybe I should just do that.”
Give us an overview on why you made that decision to go with ElevenLabs.
Simon: Absolutely. ElevenLabs continues to be the most realistic AI platform out there. They kicked off about two and a half years ago. I was one of their first users, and even back then, they were so much better than everything else.
There were lots of programmers wanting to do clever things with APIs and websites, but I just wanted to make audiobooks with these things. They were actually listening, which is remarkable in the publishing industry sometimes.
About a year and a half ago, and for reference, we're in June 2025 right now, they launched ElevenLabs Studio. It can take a whole book, like the ePub that I've worked on for Abby or a Word document. You can drop it in and have it converted chapter by chapter, paragraph by paragraph, into a great-sounding audiobook.
The high quality and natural-sounding elements of it are why I was first attracted to them. The expressiveness is just another step above.
The comparison with Amazon's Virtual Voice is that it's so much more pleasant to listen to, but it doesn't just sound better —
What I love about it is the complete creative control it gives me. There are thousands of voices I can pick from, a whole library of voices.
They're real people, people like me, actually, who have recorded their voice and then licensed it to ElevenLabs and get paid a small amount. Then when it's used, there's actually compensation to those who've licensed their voices to it.
It’s not like the large language models like ChatGPT where the whole universe seems to have been scraped and compiled into this thing. They're being super diligent about making sure it's all kosher, that it's real people's voices and they're getting compensated.
Beyond that, the tools they're building give you control. They're incredibly open to listening to feedback, which has been brilliant. I'm talking to the programmers regularly. They've got a great Discord where they're asking for feedback.
With the tools, I can spend time perfecting the book. I can get the dialogue just the way I want it. I can create a duet audiobook with a male narrator for male POVs and a female for female POVs. I can even do multi-cast and assign different voices to each character in the book.
Probably most importantly, I can download the whole thing as WAV files or MP3s.
The big difference with something like Amazon Virtual Voice is that I own what I've created with ElevenLabs.
It's a commercial license, so I can put them into BookFunnel's audio delivery service, put them on my website, add them to a Kickstarter, stick them on YouTube, or just give them away for free if I want to.
In terms of publishing reach, they're doing a lot. We were kind of stuck with either self-publishing, YouTube, or Kobo, who are superstars and super open. But one of the game changers that's happened in the last few months is you can now add them to Spotify, which has come in as the big disruptor for Audible and Amazon.
You've done that recently with the book that we produced together. How's that been?
Joanna: Death Valley, which has been on the feed, you can listen to a couple of chapters, and that's using my voice clone. We'll come back to the voice clone in a minute.
As you mentioned, I think it's mainly the ownership of the files and the Spotify distribution.
At the moment, it really is only Google's auto-narration and ElevenLabs that you can use legitimately on Spotify through Findaway Voices. You cannot use the AVV files anywhere else.
So I think that's incredibly important because, of course—
We can talk forever about how to make audio, but it's also about selling audio, isn't it?
Simon: And for anyone who's dealt with KDP or Audible customer services, I probably don't have to say what the experience was like. So another reason I love ElevenLabs is their support has been brilliant.
There's this Discord I mentioned where there are dozens of super helpful and patient people giving input. Their customer service team replies quickly, it's personal, they're helpful, and they've got amazing documentation.
Stepping back a little bit, the fact that we can create well-narrated audiobooks for a hundred to two hundred dollars plus a few days of learning and production on each one is just incredible.
I took my two boys to a local Comic-Con recently, and there was a self-published author there with a single beautiful book. He'd clearly poured his heart and money into this thing.
There were beautiful cover bookmarks and giveaways, and then I saw he had an audiobook. We got talking about it. He'd got it professionally narrated, and he opened up and said it cost £7,000.
I honestly wanted to cry. I genuinely get emotional about it even now. I want us as authors and publishers to put our time, energy, and money into creating incredible stories and getting our words out into the world and just make everything around that as simple as possible, using tech where we can.
Joanna: I just want to comment on this because one of the reasons we timestamp these episodes is because I'll have people email me and say, “Oh, but you said this,” and I'm like, “Yeah, but when did I say that?”
For example, in 2014 when I started audiobook publishing on ACX, they were the only thing out there, and they were the bee's knees. We had a much higher royalty rate, there were very few audiobooks around, and you could make that money back. The amount of money you mentioned, you could make back quite quickly.
Now, I know some people will be saying, “Oh, but I make that money back.” And I'm like, “Well, yeah, if you are an established author, absolutely.” If you have a popular series, if you know that you already make that kind of money from audiobooks, then you can.
We are in a different era in 2025. There's a lot more audio, and of course, AI is a double-edged sword. There is going to be more audio than ever before.
The question is, how do we make that money back?
If we lower the costs, then we also lower the amount of revenue we need to make to offset that.
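To put rough numbers on that point, here is a minimal sketch of the break-even arithmetic. The two production costs are midpoints of the ranges quoted in this conversation. The $10 sale price and the 25% royalty rate are placeholder assumptions, not real platform terms, so swap in your own figures.

```python
import math


def break_even_sales(production_cost: float, price_per_sale: float,
                     royalty_rate: float) -> int:
    """Return how many sales are needed before royalties cover the cost."""
    if price_per_sale <= 0 or not 0 < royalty_rate <= 1:
        raise ValueError("price must be positive and royalty_rate in (0, 1]")
    earned_per_sale = price_per_sale * royalty_rate
    return math.ceil(production_cost / earned_per_sale)


# Assumed values for illustration only.
PRICE_PER_SALE = 10.00
ROYALTY_RATE = 0.25

# Midpoint of "three to four thousand dollars" for professional narration.
human_sales = break_even_sales(3500, PRICE_PER_SALE, ROYALTY_RATE)
# Midpoint of "a hundred to two hundred dollars" for AI narration.
ai_sales = break_even_sales(150, PRICE_PER_SALE, ROYALTY_RATE)

print(f"Human narration breaks even after {human_sales} sales")
print(f"AI narration breaks even after {ai_sales} sales")
```

The function ignores your own time, which matters because AI narration still needs hours of editing.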
Simon: And you know, it's going to move on fast, but now is an extraordinary time. I love good audiobooks, and the fact that AI can help me make those now is very exciting to me.
Joanna: It's super fun. You and I both have a reasonably technical background, so we can use these tools. To be fair, you said wonderful documentation. I am terrible at reading documentation. I just jump in and give it a go.
There are people who don't know anything about AI audio. How does it work?
Can you give a few key elements and tips for authors if they want to use ElevenLabs for AI narration?
Simon: Yeah, I've got five tips for you. First, go in and check it out. There is a creator package that you can get for half price for the first month. I would say for exploration, it is worth getting for $11 just to have a little bit of a play with it.
Getting familiar with the platform can be a little intimidating because it does lots of different things, like voice changing, sound effects, and dubbing video.
We are really only interested in the Studio tool. As soon as you go into that Studio tool, it will start to feel familiar. You can click “Create an audiobook,” drop your ePub in there, and basically instantly see how this thing works, breaking it into chapters, applying a voice, and clicking play.
The warning though is this creator package, at $22 a month, is not good enough to create professional audiobooks. This is my first tip: you need the Pro package, which is $99 a month, because that is what outputs 192 kilobits per second.
That's the technical specification that you need to go on BookFunnel or Spotify. You only get that by using the $99 a month package. You get about 10 hours of audio creation in that, so for a lot of people, that could make a book. The hours roll over, so if your book is longer, you can wait for month two and have enough hours to do it.
As soon as you're done with your book, you can downgrade to a $5 a month package, so don't worry, it's not trapping you in there. Just know that you need the $99 a month Pro package to produce your audiobooks.
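The subscription arithmetic in this tip can be sketched in a few lines. The $99 price and the roughly 10 hours a month come from the conversation. The `regeneration_factor` parameter is a hypothetical allowance for re-generating passages you are not happy with, and the sketch assumes unused hours roll over as described.

```python
import math

PRO_MONTHLY_USD = 99   # Pro plan price quoted in the interview
HOURS_PER_MONTH = 10   # "about 10 hours of audio creation" per month


def months_needed(audio_hours: float, regeneration_factor: float = 1.0) -> int:
    """Months of Pro needed, assuming unused hours roll over.

    regeneration_factor is a hypothetical multiplier: 1.0 means every
    passage is generated once, 1.5 means half the book is generated twice.
    """
    if audio_hours <= 0 or regeneration_factor < 1.0:
        raise ValueError("audio_hours must be > 0 and regeneration_factor >= 1")
    generated_hours = audio_hours * regeneration_factor
    return math.ceil(generated_hours / HOURS_PER_MONTH)


def production_cost(audio_hours: float, regeneration_factor: float = 1.0) -> int:
    """Subscription cost in USD for the months needed to finish the book."""
    return months_needed(audio_hours, regeneration_factor) * PRO_MONTHLY_USD


for hours in (6.5, 12):
    print(f"{hours} h of audio: {months_needed(hours)} month(s), "
          f"${production_cost(hours)}")
```

Under these assumptions a six-and-a-half-hour novella fits in one month and a twelve-hour novel needs two, which is consistent with the hundred to two hundred dollars mentioned earlier.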
My second tip is to —
Really spend time choosing or making your voice.
You had an experience with this, Joanna, where you try out a voice, commit to it, and then realize two or three chapters in that you don't like it. I've had that experience too.
So use that first month on the creator package to really play with voices. Generate your first chapter in five or six different voices. Really get familiar and comfortable with a voice that you want to use so that you're not wasting time and credits when you get into producing something.
Third, don't get overwhelmed; have fun with it. It's amazing hearing your book come to life in audio. I feel if you give it an hour, the Studio tool is pretty intuitive. If you have the level of tech ability to do something like typesetting in Atticus or Vellum or use Scrivener, you can absolutely master using Studio.
My fourth tip, and a warning, is that it still takes time. This isn't some one-button wonder. Your novella, Death Valley, was six and a half hours long. That took 18 hours of editing.
Joanna: And this is where people get confused because with AVV, the Audible Virtual Voice, there is no control. You literally do click one button and it goes live. There's almost no point in proof-listening to it because you can't actually change it.
With Studio, you have such fine control that you can add pauses, a breath in the middle of a sentence, or change the emphasis.
You kind of direct it with Studio, don't you?
Simon: That's the word I use, yes. Directing. It's like you're directing an audiobook. If you are doing non-fiction, it is borderline a one-click wonder. It will deliver it amazingly, and you need to listen to it once, and you're good to go.
If you've spent a year or two writing a book, think about the effort we put into making it look good in the typesetting and the covers. A day or two to listen to it, refine it, and make it represent your vision is not time wasted. I'm only interested in high-quality audiobooks that do the story justice.
I want to be proud of it. I want Abby to be proud of it. I highly encourage people, particularly fiction writers, to be prepared to spend two or three days working on the book. It is so rewarding to get something that comes out the other end that you are proud of.
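For planning purposes, the Death Valley figures mentioned above (six and a half hours of audio, 18 hours of editing) can be turned into a rough estimate. This is a sketch based on that single data point, and the eight-hour working day is an assumption, so treat the output as a ballpark for fiction only.

```python
# Figures from the interview: Death Valley ran 6.5 hours and took 18 hours to edit.
REFERENCE_AUDIO_HOURS = 6.5
REFERENCE_EDIT_HOURS = 18.0


def estimate_editing(audio_hours: float,
                     working_day_hours: float = 8.0) -> tuple[float, float]:
    """Return (editing_hours, working_days) scaled from the reference book."""
    if audio_hours <= 0 or working_day_hours <= 0:
        raise ValueError("audio_hours and working_day_hours must be positive")
    ratio = REFERENCE_EDIT_HOURS / REFERENCE_AUDIO_HOURS
    editing_hours = audio_hours * ratio
    return editing_hours, editing_hours / working_day_hours


hours, days = estimate_editing(10)
print(f"A 10-hour audiobook: about {hours:.0f} hours of editing, "
      f"or {days:.1f} working days")
```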
Joanna: And just on the proofing, if you work with a human narrator, you will be doing proofing. You listen to the audio, find the timestamp, explain what you want changed, and send it back to the human to rerecord.
The process is probably pretty similar in terms of the amount of time taken, but you can do it yourself, and there are areas that help.
For example, if there's a character name, you can fix that once for the whole audio, can't you?
Simon: Correct. It's a pronunciation dictionary for any words. It really struggled with “croissant.” It does little random things. I think our favorite was when it pronounced “desert” as “dessert.”
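The idea of a pronunciation dictionary can be illustrated with a simple find-and-replace over the script. This is only a sketch of the concept, not how ElevenLabs Studio implements it, and the name and respellings below are made up for the example. It also shows why "desert" versus "dessert" is harder: a word-level substitution cannot tell which meaning a sentence intends.

```python
import re


def apply_pronunciations(text: str, dictionary: dict[str, str]) -> str:
    """Replace each whole-word match with its respelling, ignoring case."""
    if not dictionary:
        return text
    # Longest entries first, so multi-word entries win over their parts.
    keys = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in dictionary.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


# Hypothetical entries: one character name and one tricky word.
respellings = {"Siobhan": "Shi-vawn", "croissant": "kwah-son"}

print(apply_pronunciations("Siobhan ate a croissant.", respellings))
```

Defining the fix once and applying it everywhere is what saves time compared with asking for each occurrence to be re-recorded.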
Joanna: It just would not stop wanting some dessert! What are some other tips?
Simon: My fifth and final tip right now, and this is only pertinent to those listening as this is broadcast, is if you are wanting to do an audiobook for your fiction book, you should wait.
If you're doing non-fiction, the existing models are amazing. But last week, their Version 3 model was released, and it is a game-changer.
The initial reactions are, “I can never go back to Version 2.”
Version 3, from an expression and liveliness perspective, but also from a control and direction perspective, is changing the game. It wasn't even supposed to come out for a couple of months, so they're moving forward with this fast.
The real reason to wait is it's got one massive feature upgrade that I've been waiting for for at least a year: You can add emotion tags. Previously, if we wanted someone to whisper, sometimes it would figure it out from the text.
Other times, we would literally be adding, “he whispered,” “she shouted,” “he said excitedly.” We were kind of gaming the system.
Now, we can add tags in square brackets to the text like [whispers], [shouts], [says thoughtfully], [says in a British accent].
There is this whole world of things it can do that allows us to work much more effectively as a director, particularly for dialogue and emphasis. There is even a button that will read the text and put in suggested tags throughout the book. The AI is reading those instructions but not reading them out loud.
So it is the big breakthrough in terms of us creating audiobooks that sound exactly how we want them to.
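The square-bracket tags described above sit in the script alongside the words to be spoken. As a rough illustration of that split, here is a small sketch that separates the tags from the spoken text. It does not call ElevenLabs at all; it is a hypothetical helper you might use to check word counts or proofread a tagged manuscript.

```python
import re

# Matches one tag such as [whispers] or [says thoughtfully].
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def split_tags(line: str) -> tuple[str, list[str]]:
    """Return (spoken_text, tags) for one line of a tagged script."""
    tags = [tag.strip() for tag in TAG_PATTERN.findall(line)]
    spoken = TAG_PATTERN.sub("", line)
    # Removing a tag can leave doubled spaces behind, so tidy them up.
    spoken = re.sub(r"\s{2,}", " ", spoken).strip()
    return spoken, tags


line = '[whispers] "Don\'t move," he said. [shouts] "Run!"'
spoken, tags = split_tags(line)
print(spoken)
print(tags)
```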
Joanna: That is really good. I'm looking forward to that as well. Let's wind it back for people. You mentioned non-fiction quite quickly.
For non-fiction, what do I do about the table of contents, URLs, or images in my text?
Simon: When you upload the ePub, you can just delete those bits.
I feel like people forget that you have control. You can completely change the front matter, the back matter, and the bits around it to be something that's going to work most effectively when it's delivered on the platforms you want. And you can create different versions.
Joanna: And I think it's really important for people to remember with audiobooks that it is an adaptation, however you're doing it. It is a different product.
With Death Valley, for example, I would say to you, “Oh, well, let's just rewrite that sentence,” because it would be easier for me to rewrite it and it will keep the same meaning.
Simon: Exactly. You have that luxury as the author, which is why people doing it themselves is wonderful. When producing your book for you, I can't take those liberties.
Joanna: So let's come to the voice clone idea because, of course, you mentioned earlier that you've licensed your voice. We used my voice clone for Death Valley, and I am still on the fence as to whether or not to license that publicly.
What are some tips for people who want to license their voice or do a voice clone?
Simon: For me, it's been amazing getting this bonus income that I totally didn't expect. For Abby, it's been life-changing. She is the most popular British English female voice. She's called Amelia on ElevenLabs. She's earning enough from her voice that she could quit a toxic job and go full-time writing. It's extraordinary.
So, in terms of tips, if you are recording your own voice, whether you are going to use it yourself or think about sharing it with others, first of all, the quality of the recording is essential.
You want to be using a good microphone in a quiet place. There are lots of tools to clean it up, but nothing is going to compare to something that's recorded well.
When you are delivering your voice, the delivery needs to be varied but consistent. I generally get authors to read their own book. You want to give variations in terms of tone and volume, from whispering through to high energy, as though you are reading to an engaged audience.
You do not want to put on character voices. That's really important. The AI will pick up on the variations in your delivery, but it gets very confused if you've done character voices because it doesn't know how those fit in with how you speak.
A cheat code for improving the quality if you don't have a really good mic or a quiet area is Adobe Podcast. It's a free service with an enhanced speech function. You can put your recording in there and massively improve how it sounds.
The tip is to not put it out at a 100% treatment; you want kind of 70% to 90% of their enhanced speech applied, or else it sounds too obviously affected by AI.
Joanna: And right now, my J.F. Penn voice is my voice, and I'm the only one who can use it.
There's another step if you want to license it and put it in the voice market, isn't there?
Simon: Yes, and the first challenge of that is genuinely a moral evaluation. If you want to monetize your voice, you have to decide if you are prepared for your voice to be used to say almost anything.
ElevenLabs has controls to stop things like hate speech or sexual content, but to really monetize it, you have to switch off a feature called “live moderation,” which prevents things like swearing.
As soon as you turn that live moderation on, your voice becomes unavailable for most uses that would make money, like audiobooks or conversational AI.
The second option to consider is the notice period. You can choose to have the right to instantly withdraw your voice or set a notice period of up to two years. They pay more if you're prepared to have a longer minimum period.
As a producer, I am not going to start using someone's voice for an audiobook series if I might not have it to use in three months' time. I instantly filter out anything with less than a year's notice period and generally only pick two years.
If you want to monetize your voice, you have to turn live moderation off and give a two-year notice period, in my opinion.
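Seen from the producer's side, those two settings act as a filter. The sketch below models it with a hypothetical `VoiceListing` record. The field names and sample voices are invented for illustration and are not taken from the ElevenLabs voice library or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceListing:
    """Hypothetical record for one licensed voice."""
    name: str
    notice_period_months: int  # 0 means the owner can withdraw instantly
    live_moderation: bool      # True means live moderation is switched on


def suitable_for_series(voices: list[VoiceListing],
                        min_notice_months: int = 12) -> list[VoiceListing]:
    """Keep voices a producer could rely on for a multi-book series."""
    kept = [
        v for v in voices
        if not v.live_moderation and v.notice_period_months >= min_notice_months
    ]
    # Longest notice period first, matching a preference for two years.
    return sorted(kept, key=lambda v: -v.notice_period_months)


# Invented sample data.
library = [
    VoiceListing("Voice A", 24, False),
    VoiceListing("Voice B", 3, False),
    VoiceListing("Voice C", 24, True),
    VoiceListing("Voice D", 12, False),
]

for voice in suitable_for_series(library):
    print(voice.name, voice.notice_period_months)
```

Only the voices with moderation off and at least a year's notice survive the filter.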
A final tip would be to be safe. Do not publicly share your voice's name and connect it with you as a person. If you do, forget about using voice recognition for telephone banking, for example.
Also, do your research. See what voices are most popular, what descriptions work best, and think about the sample you provide.
Joanna: As we head towards a close, we do need to quickly come back to —
ElevenReader. It's an emerging place to publish audiobooks, too. You can also upload e-books, and then listeners can choose the voice.
Back in 2020, I wrote in my book on AI that at some point there will be an app where listeners can choose whatever voice they want to listen to my book in, and this is it.
Simon: It's super exciting. It's an app you'll find on your iPhone or Android store. It's the consumer-facing side of ElevenLabs. You can drop in pretty much any content, like PDFs, e-books, and webpages, and it turns any text into speech. Right from the beginning, it's also offered books for direct sale.
Joanna: We have to mention that Melania Trump has used a voice clone of her quite distinctive voice to do her memoir, also called Melania. She has basically said this is the future of publishing. “Here's my AI voice clone, and it's on ElevenReader.”
I thought that was a tipping point for me because it means that it's going mainstream.
Simon: So you can see it like Audible or Spotify, except you can choose what voice you want to narrate it. For authors, it's an amazingly simple way to offer an audiobook.
You don't even have to go through the studio production process. You can just sign up to ElevenReader publishing and upload your book. Boom, they'll review it and publish it.
Joanna: I would say to people, you must —
Read the terms and conditions of any site that you ever upload anything to.
Also, if your e-book is in Kindle Unlimited and exclusive to Amazon, you can't upload that e-book to ElevenReader because it's exclusive.
Simon: And we have just taken Abby's books out of Kindle Unlimited so we can put them in ElevenReader this week.
Joanna: Before we go, you have courses coming and you also offer services to authors.
Tell us about those and where people can find you online.
Simon: Wonderful, thank you, Joanna. First, I'd be a very neglectful father if I didn't mention that Abby's latest book, Stolen Legacy, went live yesterday. You can find Abby Hope Patrick and her Deadly Ever After series on Amazon and, very soon, ElevenReader.
You can find my voice on ElevenReader; I'm “Christopher” on there.
The courses are something new. We've started a new website called Novel Productions. The first course will be “AI Audio for Authors” and will cover everything people need to know to get themselves not just onto ElevenLabs, but all platforms.
It's also going to have training on how to record your own voice clones and monetize them if you want to. I was about to publish it, and then Version 3 of ElevenLabs came out, so I don't want to train anyone on anything that's not going to be the best in a couple of months.
So right now, if you go to Novel.Productions, there will be a waiting list that you can sign up to.
Regarding services, you were my first beta tester outside the books that I publish myself. We're still weighing up how affordable we can make it. I'd rather teach people first, and if they don't want to then do it themselves, we'll see how we can help.
I'm beta testing that with authors, so you can email me at [email protected].
Joanna: Brilliant. Well, thank you so much for your time, Simon. That was great.
Simon: Thank you, Joanna. It has been such a pleasure.
The post Producing AI-Narrated Audiobooks Using ElevenLabs With Simon Patrick first appeared on The Creative Penn.
568 episodes
Manage episode 491104065 series 1567480
Is the high cost of audiobook production holding you back? What if you could create a high-quality audiobook for a fraction of the traditional cost?
In this conversation, Simon Patrick explores the world of AI narration with ElevenLabs, discussing how you can gain complete creative control, and even license your own voice clone for a new stream of income.
This episode is supported by my patrons. Join my Community at Patreon.com/thecreativepenn
Simon Patrick founded Ten Times Better Books to support his daughter, Abby Patrick, as one of ElevenLabs' first users and beta testers. He has produced several of their most popular AI voices. He now develops courses and AI audiobook solutions for independent authors at Novel Productions.
You can listen above or on your favorite podcast app or read the notes and links below. Here are the highlights and the full transcript is below.
Show Notes
- Costs vs benefits of human vs AI narration
- Features of ElevenLabs — realistic and expressive voices, creative control, ownership of final audio files for wide distribution to platforms like Spotify.
- Practical tips for AI narration
- ElevenLabs v3 and emotion tags
- Creating and monetizing a voice clone
- Publishing on ElevenReader
You can find Simon at Novel.Productions or 10xb.com.
Transcript of the interview
Joanna: Simon Patrick founded Ten Times Better Books to support his daughter, Abby Patrick, as one of ElevenLabs' first users and beta testers. He has produced several of their most popular AI voices. He now develops courses and AI audiobook solutions for independent authors at Novel Productions. Welcome to the show, Simon.
Simon: Thank you, Joanna. It's such a joy to be speaking with you. Your podcast and books were foundational to my daughter, Abby, becoming an author and me learning to be her publisher and all that's happened since.
I love your Patreon @thecreativepenn. It's the best money I spend every month, frankly. It's just a great community to be part of, so it's such a joy to be sharing some of what I've learned.
Joanna: Oh, thank you so much. Behind the scenes on the Patreon, Simon has done a video demo of ElevenLabs. Today, obviously, we're doing audio-only. So first up—
Tell us a bit more about your background and why you decided to get into AI-narrated audiobooks.
Simon: Okay. Well, I've got 25 years of experience in marketing and design. I still am halftime head of communications for an international charity, but we've always had our own businesses too.
My wife and I ran a small home education tuition publishing business. We've home-educated our three kids, which brings me to Abby, my daughter who brought me into your world of book publishing.
She was going to college, studying early years education, and was just bored out of her mind. She asked if she could drop out of college to be a writer instead. She'd been writing a book since she was 15. To the astonishment of her friends and some of ours too, we said yes.
Let me add, it was responsible parenting. We made her finish the term, stick it out, and do the work experience. By Christmas 2019, she'd left to pursue finishing her book based on the deal that —
If she learned to write, I would learn to publish for her.
Joanna: Wow!
Simon: So I attended the first Self Publishing Show in that crazy spring of 2020. I think you were there too, just a few days before the pandemic shut us all down. I've listened to hundreds of your podcasts, read your books, done some of the Self-Publishing Formula courses, and learned to be Abby's publisher.
Since then, I have used those skills and connected with a few other authors, so I probably publish a book or two a month, something like that.
Audio has always been the stumbling block. I love audiobooks. As a family, we must consume hundreds of hours a month of them. There are incredible narrators like Ray Porter and Daniel Rigby, who self-narrates his own Audible exclusives, and my absolute favourite, a guy called Jeff Hayes, who narrates incredibly.
They're amazing talents, and I don't think AI is going to touch them because they bring so much humanity to the performance.
But to ordinary authors and publishers, those narrators are inaccessible. I don't even want to think about what they cost.
For Abby, who is still just starting out, any professional narration would cost her three to four thousand dollars for her books. The math just doesn't work. While there are options like a royalty split with ACX, Audible's publishing platform, I struggle with that.
Firstly, you're tied in exclusively to Audible for seven years, and we're big fans of going wide.
Secondly, you're only getting 20% of the royalties when it's being split. I just don't think for us, they're ever going to make that money back. So all of that is what led me in early 2023 to be searching for AI audio options.
ChatGPT was going crazy, you were demoing all of that at the time, and I figured there must be some kind of AI audio option that would let me take control of the process and hopefully produce good audiobooks way more cheaply than current options. That's when I discovered ElevenLabs.
Joanna: There's lots to unpack there. First of all, as you mentioned, there are some incredible human narrators, and we want to acknowledge them. I'm also a human narrator myself.
For most authors, especially indie authors or new authors, it's not a choice between human or AI; it's AI or nothing because they can't afford the fees.
As you said, a lot of the time you don't know if you're going to make the money back. So I think that's really important to acknowledge.
There are lots of AI narration options now. It is hard for authors to decide which platform to use.
So what is ElevenLabs, and why do you think it's the best option for quality and also for publishing reach?
You mentioned ACX, and there's obviously AVV, the Audible Virtual Voice. Most people might think, “Well, maybe I should just do that.”
Give us an overview on why you made that decision to go with ElevenLabs.
Simon: Absolutely. ElevenLabs continues to be the most realistic AI platform out there. They kicked off about two and a half years ago. I was one of their first users, and even back then, they were so much better than everything else.
There were lots of programmers wanting to do clever things with APIs and websites, but I just wanted to make audiobooks with these things. They were actually listening, which is remarkable in the publishing industry sometimes.
About a year and a half ago, and for reference, we're in June 2025 right now, they launched ElevenLabs Studio. It can take a whole book, like the ePub that I've worked on for Abby or a Word document. You can drop it in and have it converted chapter by chapter, paragraph by paragraph, into a great-sounding audiobook.
The high quality and natural-sounding elements of it are why I was first attracted to them. The expressiveness is just another step above.
The comparison with Amazon's Virtual Voice is that it's so much more pleasant to listen to, but it doesn't just sound better. What I love about it is the complete creative control it gives me. There are thousands of voices I can pick from, a whole library of voices.
They're real people, people like me, actually, who have recorded their voice and then licensed it to ElevenLabs and get paid a small amount. Then when it's used, there's actually compensation to those who've licensed their voices to it.
It’s not like the large language models like ChatGPT where the whole universe seems to have been scraped and compiled into this thing. They're being super diligent about making sure it's all kosher, that it's real people's voices and they're getting compensated.
Beyond that, the tools they're building give you control. They're incredibly open to listening to feedback, which has been brilliant. I'm talking to the programmers regularly. They've got a great Discord where they're asking for feedback.
With the tools, I can spend time perfecting the book. I can get the dialogue just the way I want it. I can create a duet audiobook with a male narrator for male POVs and a female for female POVs. I can even do multi-cast and assign different voices to each character in the book.
Probably most importantly, I can download the whole thing as WAV files or MP3s.
The big difference with something like Amazon Virtual Voice is that I own what I've created with ElevenLabs.
It's a commercial license, so I can put them into BookFunnel's audio delivery service, put them on my website, add them to a Kickstarter, stick them on YouTube, or just give them away for free if I want to.
In terms of publishing reach, they're doing a lot. We were kind of stuck with either self-publishing, YouTube, or Kobo, who are superstars and super open. But one of the game changers that's happened in the last few months is you can now add them to Spotify, which has come in as the big disruptor for Audible and Amazon.
You've done that recently with the book that we produced together. How's that been?
Joanna: Death Valley, which has been on the feed, you can listen to a couple of chapters, and that's using my voice clone. We'll come back to the voice clone in a minute.
As you mentioned, I think it's mainly the ownership of the files and the Spotify distribution.
At the moment, it really is only Google's auto-narration and ElevenLabs that you can use legitimately on Spotify through Findaway Voices. You cannot use the AVV files anywhere else.
So I think that's incredibly important because, of course, we can talk forever about how to make audio, but it's also about selling audio, isn't it?
Simon: And for anyone who's dealt with KDP or Audible customer services, I probably don't have to say what the experience was like. So another reason I love ElevenLabs is their support has been brilliant.
There's this Discord I mentioned where there are dozens of super helpful and patient people giving input. Their customer service team replies quickly, it's personal, they're helpful, and they've got amazing documentation.
Stepping back a little bit, the fact that we can create well-narrated audiobooks for a hundred to two hundred dollars plus a few days of learning and production on each one is just incredible.
I took my two boys to a local Comic-Con recently, and there was a self-published author there with a single beautiful book. He'd clearly poured his heart and money into this thing.
There were beautiful cover bookmarks and giveaways, and then I saw he had an audiobook. We got talking about it. He'd got it professionally narrated, and he opened up and said it cost £7,000.
I honestly wanted to cry. I genuinely get emotional about it even now. I want us as authors and publishers to put our time, energy, and money into creating incredible stories and getting our words out into the world and just make everything around that as simple as possible, using tech where we can.
Joanna: I just want to comment on this because one of the reasons we timestamp these episodes is because I'll have people email me and say, “Oh, but you said this,” and I'm like, “Yeah, but when did I say that?”
For example, in 2014 when I started audiobook publishing on ACX, they were the only thing out there, and they were the bee's knees. We had a much higher royalty rate, there were very few audiobooks around, and you could make that money back. The amount of money you mentioned, you could make back quite quickly.
Now, I know some people will be saying, “Oh, but I make that money back.” And I'm like, “Well, yeah, if you are an established author, absolutely.” If you have a popular series, if you know that you already make that kind of money from audiobooks, then you can.
We are in a different era in 2025. There's a lot more audio, and of course, AI is a double-edged sword. There is going to be more audio than ever before.
The question is, how do we make that money back?
If we lower the costs, then we also lower the amount of revenue we need to make to offset that.
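To put rough numbers on that point, here is a minimal sketch of the break-even arithmetic. The production costs are midpoints of the ranges mentioned in this conversation (three to four thousand dollars for human narration, a hundred to two hundred dollars for AI narration). The earnings-per-sale figure is purely hypothetical and was not given in the interview, so treat the results as an illustration rather than a forecast.

```python
import math


def sales_to_break_even(production_cost: float, earnings_per_sale: float) -> int:
    """Return the number of sales needed to recover the production cost."""
    if earnings_per_sale <= 0:
        raise ValueError("earnings_per_sale must be positive")
    return math.ceil(production_cost / earnings_per_sale)


# Hypothetical amount the author keeps per audiobook sale (not from the interview).
EARNINGS_PER_SALE = 5.00

# Midpoints of the cost ranges quoted in the conversation.
human_narration = sales_to_break_even(3500, EARNINGS_PER_SALE)
ai_narration = sales_to_break_even(150, EARNINGS_PER_SALE)

print(f"Human narration: {human_narration} sales to break even")
print(f"AI narration: {ai_narration} sales to break even")
```

Whatever the real earnings per sale turn out to be, the ratio between the two results stays the same, because it depends only on the two production costs.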
Simon: And you know, it's going to move on fast, but now is an extraordinary time. I love good audiobooks, and the fact that AI can help me make those now is very exciting to me.
Joanna: It's super fun. You and I both have a reasonably technical background, so we can use these tools. To be fair, you said wonderful documentation. I am terrible at reading documentation. I just jump in and give it a go.
There are people who don't know anything about AI audio. How does it work?
Can you give a few key elements and tips for authors if they want to use ElevenLabs for AI narration?
Simon: Yeah, I've got five tips for you. First, go in and check it out. There is a creator package that you can get for half price for the first month. I would say for exploration, it is worth getting for $11 just to have a little bit of a play with it.
Getting familiar with the platform can be a little intimidating because it does lots of different things, like voice changing, sound effects, and dubbing video.
We are really only interested in the Studio tool. As soon as you go into that Studio tool, it will start to feel familiar. You can click “Create an audiobook,” drop your ePub in there, and basically instantly see how this thing works, breaking it into chapters, applying a voice, and clicking play.
The warning though is this creator package, at $22 a month, is not good enough to create professional audiobooks. This is my first tip: you need the Pro package, which is $99 a month, because that is what outputs 192 kilobits per second.
That's the technical specification that you need to go on BookFunnel or Spotify. You only get that by using the $99 a month package. You get about 10 hours of audio creation in that, so for a lot of people, that could make a book. The hours roll over, so for a longer book you can wait for month two and have enough hours to do it.
As soon as you're done with your book, you can downgrade to a $5 a month package, so don't worry, it's not trapping you in there. Just know that you need the $99 a month Pro package to produce your audiobooks.
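As a rough planning aid, the pricing described above can be sketched like this. It uses only the figures quoted in this conversation ($99 a month, about 10 hours of audio a month, unused hours rolling over) and assumes, as a simplification, that every generated hour ends up in the finished book. In practice, regenerating passages while editing would presumably use extra hours, so read the result as a lower bound. The function name is made up for this example.

```python
import math

PRO_PRICE_PER_MONTH = 99   # USD, as quoted in the interview
PRO_HOURS_PER_MONTH = 10   # approximate hours of audio per month, as quoted


def plan_pro_subscription(book_hours: float) -> tuple[int, int]:
    """Return (months on the Pro plan, total cost in USD) for a book.

    Assumes unused hours roll over and that no hours are spent on
    regenerating audio, which is a simplification.
    """
    if book_hours <= 0:
        raise ValueError("book_hours must be positive")
    months = math.ceil(book_hours / PRO_HOURS_PER_MONTH)
    return months, months * PRO_PRICE_PER_MONTH


# A 6.5-hour novella fits in one month; a 15-hour novel needs two.
print(plan_pro_subscription(6.5))
print(plan_pro_subscription(15))
```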
My second tip is to really spend time choosing or making your voice.
You had an experience with this, Joanna, where you try out a voice, commit to it, and then realize two or three chapters in that you don't like it. I've had that experience too.
So use that first month on the creator package to really play with voices. Generate your first chapter in five or six different voices. Really get familiar and comfortable with a voice that you want to use so that you're not wasting time and credits when you get into producing something.
Third, don't get overwhelmed; have fun with it. It's amazing hearing your book come to life in audio. I feel if you give it an hour, the Studio tool is pretty intuitive. If you have the level of tech ability to do something like typesetting in Atticus or Vellum or use Scrivener, you can absolutely master using Studio.
My fourth tip, and a warning, is that it still takes time. This isn't some one-button wonder. Your novella, Death Valley, was six and a half hours long. That took 18 hours of editing.
Joanna: And this is where people get confused because with AVV, the Audible Virtual Voice, there is no control. You literally do click one button and it goes live. There's almost no point in proof-listening to it because you can't actually change it.
With Studio, you have such fine control that you can add pauses, a breath in the middle of a sentence, or change the emphasis.
You kind of direct it with Studio, don't you?
Simon: That's the word I use, yes. Directing. It's like you're directing an audiobook. If you are doing non-fiction, it is borderline a one-click wonder. It will deliver it amazingly, and you need to listen to it once, and you're good to go.
If you've spent a year or two writing a book, think about the effort we put into making it look good in the typesetting and the covers. A day or two to listen to it, refine it, and make it represent your vision is not time wasted. I'm only interested in high-quality audiobooks that do the story justice.
I want to be proud of it. I want Abby to be proud of it. I highly encourage people, particularly fiction writers, to be prepared to spend two or three days working on the book. It is so rewarding to get something that comes out the other end that you are proud of.
Joanna: And just on the proofing, if you work with a human narrator, you will be doing proofing. You listen to the audio, find the timestamp, explain what you want changed, and send it back to the human to rerecord.
The process is probably pretty similar in terms of the amount of time taken, but you can do it yourself, and there are areas that help.
For example, if there's a character name, you can fix that once for the whole audio, can't you?
Simon: Correct. It's a pronunciation dictionary for any words. It really struggled with “croissant.” It does little random things. I think our favorite was when it pronounced “desert” as “dessert.”
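For anyone curious what "fix it once for the whole audio" means in practice, here is a small sketch of the idea behind a pronunciation dictionary: one lookup table of words and respellings, applied everywhere in the text. This is not how ElevenLabs implements the feature and it does not call their API. It is a plain text substitution for illustration, and the example name and respellings are invented.

```python
import re


def apply_pronunciations(text: str, respellings: dict[str, str]) -> str:
    """Replace each whole-word match (case-insensitive) with its respelling."""
    if not respellings:
        return text
    # Build one pattern that matches any dictionary word as a whole word.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in respellings) + r")\b",
        flags=re.IGNORECASE,
    )
    lookup = {word.lower(): spoken for word, spoken in respellings.items()}
    return pattern.sub(lambda match: lookup[match.group(1).lower()], text)


# Hypothetical entries: a made-up character name and a phonetic respelling.
respellings = {"croissant": "kwah-sahn", "Kaelith": "KAY-lith"}

result = apply_pronunciations(
    "Kaelith ate a croissant. Croissant crumbs fell.", respellings
)
print(result)
```

A simple table like this cannot tell "desert" the place from "desert" the verb, which is why some words still need fixing by ear during proof-listening.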
Joanna: It just would not stop wanting some dessert! What are some other tips?
Simon: My fifth and final tip right now, and this is only pertinent to those listening as this is broadcast, is if you are wanting to do an audiobook for your fiction book, you should wait.
If you're doing non-fiction, the existing models are amazing. But last week, their Version 3 model was released, and it is a game-changer.
The initial reactions are, “I can never go back to Version 2.”
Version 3, from an expression and liveliness perspective, but also from a control and direction perspective, is changing the game. It wasn't even supposed to come out for a couple of months, so they're moving forward with this fast.
The real reason to wait is it's got one massive feature upgrade that I've been waiting for for at least a year: You can add emotion tags. Previously, if we wanted someone to whisper, sometimes it would figure it out from the text.
Other times, we would literally be adding, “he whispered,” “she shouted,” “he said excitedly.” We were kind of gaming the system.
Now, we can add tags in square brackets to the text like [whispers], [shouts], [says thoughtfully], [says in a British accent].
There is this whole world of things it can do that allows us to work much more effectively as a director, particularly for dialogue and emphasis. There is even a button that will read the text and put in suggested tags throughout the book. The AI is reading those instructions but not reading them out loud.
So it is the big breakthrough in terms of us creating audiobooks that sound exactly how we want them to.
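To illustrate the idea that the tags are read as instructions but never spoken, here is a minimal sketch that separates square-bracket tags from the words a listener would hear. It is only a text-processing illustration under the assumption that every bracketed phrase is a tag. It does not reproduce how the Version 3 model handles tags internally, and the function name is made up.

```python
import re

# Matches one square-bracket tag such as [whispers] and captures its contents.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def split_tags(script: str) -> tuple[list[str], str]:
    """Return (list of tags, text with the tags removed)."""
    tags = TAG_PATTERN.findall(script)
    # Remove each tag plus surrounding whitespace, leaving single spaces.
    spoken = re.sub(r"\s*\[[^\[\]]+\]\s*", " ", script).strip()
    return tags, spoken


tags, spoken = split_tags('[whispers] "Stay close." [shouts] "Run!"')
print(tags)
print(spoken)
```

One caution with this approach: a manuscript that already uses square brackets for something else, such as editorial notes, would have those treated as tags too.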
Joanna: That is really good. I'm looking forward to that as well. Let's wind it back for people. You mentioned non-fiction quite quickly.
For non-fiction, what do I do about the table of contents, URLs, or images in my text?
Simon: When you upload the ePub, you can just delete those bits.
I feel like people forget that you have control. You can completely change the front matter, the back matter, and the bits around it to be something that's going to work most effectively when it's delivered on the platforms you want. And you can create different versions.
Joanna: And I think it's really important for people to remember with audiobooks that it is an adaptation, however you're doing it. It is a different product.
With Death Valley, for example, I would say to you, “Oh, well, let's just rewrite that sentence,” because it would be easier for me to rewrite it and it will keep the same meaning.
Simon: Exactly. You have that luxury as the author, which is why people doing it themselves is wonderful. When producing your book for you, I can't take those liberties.
Joanna: So let's come to the voice clone idea because, of course, you mentioned earlier that you've licensed your voice. We used my voice clone for Death Valley, and I am still on the fence as to whether or not to license that publicly.
What are some tips for people who want to license their voice or do a voice clone?
Simon: For me, it's been amazing getting this bonus income that I totally didn't expect. For Abby, it's been life-changing. She is the most popular British English female voice. She's called Amelia on ElevenLabs. She's earning enough from her voice that she could quit a toxic job and go full-time writing. It’s extraordinary.
So, in terms of tips, if you are recording your own voice, whether you are going to use it yourself or think about sharing it with others, first of all, the quality of the recording is essential.
You want to be using a good microphone in a quiet place. There are lots of tools to clean it up, but nothing is going to compare to something that's recorded well.
When you are delivering your voice, the delivery needs to be varied but consistent. I generally get authors to read their own book. You want to give variations in terms of tone and volume, from whispering through to high energy, as though you are reading to an engaged audience.
You do not want to put on character voices. That's really important. The AI will pick up on the variations in your delivery, but it gets very confused if you've done character voices because it doesn't know how those fit in with how you speak.
A cheat code for improving the quality if you don't have a really good mic or a quiet area is Adobe Podcast. It's a free service with an enhanced speech function. You can put your recording in there and massively improve how it sounds.
The tip is to not put it out at a 100% treatment; you want kind of 70% to 90% of their enhanced speech applied, or else it sounds too obviously affected by AI.
Joanna: And right now, my J.F. Penn voice is my voice, and I'm the only one who can use it.
There's another step if you want to license it and put it in the voice market, isn't there?
Simon: Yes, and the first challenge of that is genuinely a moral evaluation. If you want to monetize your voice, you have to decide if you are prepared for your voice to be used to say almost anything.
ElevenLabs has controls to stop things like hate speech or sexual content, but to really monetize it, you have to switch off a feature called “live moderation,” which prevents things like swearing.
As soon as you turn that live moderation on, your voice becomes unavailable for most uses that would make money, like audiobooks or conversational AI.
The second option to consider is the notice period. You can choose to have the right to instantly withdraw your voice or set a notice period of up to two years. They pay more if you're prepared to have a longer minimum period.
As a producer, I am not going to start using someone's voice for an audiobook series if I might not have it to use in three months' time. I instantly filter for anything less than a year's notice period and generally only pick two years.
If you want to monetize your voice, you have to turn live moderation off and give a two-year notice period, in my opinion.
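The producer-side filter described here can be sketched as a simple rule: skip any voice with live moderation on, and skip any voice whose notice period is under a year. The `Voice` record, its field names, and the sample voices below are all hypothetical and exist only for this example. They are not taken from the ElevenLabs voice library or its API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    """Hypothetical record describing a licensed voice."""
    name: str
    notice_period_months: int
    live_moderation: bool


def usable_for_series(voices: list[Voice], min_notice_months: int = 12) -> list[Voice]:
    """Keep voices with moderation off and a long enough notice period."""
    return [
        voice
        for voice in voices
        if not voice.live_moderation
        and voice.notice_period_months >= min_notice_months
    ]


# Made-up sample voices for illustration.
candidates = [
    Voice("Voice A", notice_period_months=24, live_moderation=False),
    Voice("Voice B", notice_period_months=3, live_moderation=False),
    Voice("Voice C", notice_period_months=24, live_moderation=True),
]

shortlist = [voice.name for voice in usable_for_series(candidates)]
print(shortlist)
```

Raising `min_notice_months` to 24 would mirror the stricter preference for two-year notice periods.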
A final tip would be to be safe. Do not publicly share your voice's name and connect it with you as a person. Forget about voice recognition for telephone banking, for example.
Also, do your research. See what voices are most popular, what descriptions work best, and think about the sample you provide.
Joanna: As we head towards a close, we do need to quickly come back to ElevenReader. It's an emerging place to publish audiobooks, too. You can also upload e-books, and then listeners can choose the voice.
Back in 2020, I wrote in my book on AI that at some point there will be an app where listeners can choose whatever voice they want to listen to my book in, and this is it.
Simon: It's super exciting. It's an app you'll find on your iPhone or Android store. It's the consumer-facing side of ElevenLabs. You can drop in pretty much any content, like PDFs, e-books, and webpages, and it turns any text into speech. Right from the beginning, it's also offered books for direct sale.
Joanna: We have to mention that Melania Trump has used a voice clone of her quite distinctive voice to do her memoir, also called Melania. She has basically said this is the future of publishing. “Here's my AI voice clone, and it's on ElevenReader.”
I thought that was a tipping point for me because it means that it's going mainstream.
Simon: So you can see it like Audible or Spotify, except you can choose what voice you want to narrate it. For authors, it's an amazingly simple way to offer an audiobook.
You don't even have to go through the studio production process. You can just sign up to ElevenReader publishing and upload your book. Boom, they'll review it and publish it.
Joanna: I would say to people, you must read the terms and conditions of any site that you ever upload anything to.
Also, if your e-book is in Kindle Unlimited and exclusive to Amazon, you can't upload that e-book to ElevenReader because it's exclusive.
Simon: And we have just taken Abby's books out of Kindle Unlimited so we can put them in ElevenReader this week.
Joanna: Before we go, you have courses coming and you also offer services to authors.
Tell us about those and where people can find you online.
Simon: Wonderful, thank you, Joanna. First, I'd be a very neglectful father if I didn't mention that Abby's latest book, Stolen Legacy, went live yesterday. You can find Abby Hope Patrick and her Deadly Ever After series on Amazon and, very soon, ElevenReader.
You can find my voice on ElevenReader; I'm “Christopher” on there.
The courses are something new. We've started a new website called Novel Productions. The first course will be “AI Audio for Authors” and will cover everything people need to know to get themselves not just onto ElevenLabs, but all platforms.
It's also going to have training on how to record your own voice clones and monetize them if you want to. I was about to publish it, and then Version 3 of ElevenLabs came out, so I don't want to train anyone on anything that's not going to be the best in a couple of months.
So right now, if you go to Novel.Productions, there will be a waiting list that you can sign up to.
Regarding services, you were my first beta tester outside the books that I publish myself. We're still weighing up how affordable we can make it. I'd rather teach people first, and if they don't want to then do it themselves, we'll see how we can help.
I'm beta testing that with authors, so you can email me at [email protected].
Joanna: Brilliant. Well, thank you so much for your time, Simon. That was great.
Simon: Thank you, Joanna. It has been such a pleasure.
The post Producing AI-Narrated Audiobooks Using ElevenLabs With Simon Patrick first appeared on The Creative Penn.