Stars from Hollywood’s golden age are being reborn as artificial intelligence voice clones licensed by celebrity estates, suggesting a new business model that addresses some of the “Wild West” concerns about unauthorized artificial intelligence imitation.
ElevenLabs, an audio technology startup backed by venture capital firms including Andreessen Horowitz and Sequoia, has inked multiple deals with the estates of legendary actors for its IconicVoices tool, which lets users have an AI-generated voice read to them through an audiobook app. Stars include Burt Reynolds, Judy Garland, James Dean and Sir Laurence Olivier.
Launched in 2023, ElevenLabs creates voices for books and news articles, video game characters, film pre-production, and social media and advertising. The company already works with publishers such as The New York Times and The Washington Post, and earlier this year it was selected by Disney to join its accelerator program.
“You need about 30 minutes of high-quality audio to create a professional voice clone,” said Sam Sklar, a member of the ElevenLabs development team. The voices are generated from the celebrities’ catalogs of work. Once created, a voice can be used to read text (articles, PDFs, ePubs, newsletters, or other text content). However, the speech and content cannot be exported; all listening happens in the reading app.
For example, a user can have an article narrated to them in James Dean’s voice within the app, but cannot use that voice for anything outside the app.
Such deals could help set boundaries for a future in which AI-generated speech becomes less controversial and more of a controlled, curated realm. Google Play and Apple Books already make some use of AI-generated voices, although there are significant barriers to reproducing the rhythm, intonation and emotion of human speech.
The artificial intelligence industry has been dogged by concerns over the use of celebrity voices, with OpenAI accused of imitating actress Scarlett Johansson’s voice after she refused to license it.
“We are very aware of the risks associated with synthetic media and take the safe use of our tools very seriously,” Sklar said. Safeguards include active content moderation, enforcing accountability through bans, and special provisions to guard against the misuse of AI voices in the 2024 election.
There is still a lot of anxiety among the current generation of actors about using artificial intelligence to generate voice content. Voice actors in video games have raised concerns, and last year’s film and television strikes stemmed in large part from anxiety over the use of artificial intelligence. Licensing the signature voices that estates are willing to sell is a market niche that potentially avoids these pitfalls and represents a new revenue stream from artificial intelligence, rather than one that is lost because of it.
The problem of soundalike celebrity voices long predates artificial intelligence. In 1988, Frito-Lay used a voice similar to that of Tom Waits in an advertisement, and another case involving Waits followed in 2007, after he had long rejected advertising deals. AI offers an easier way to create voices, and a recent lawsuit filed against AI startup Lovo, alleging it used voice actors’ voices improperly and without compensation when generating AI voices, is a reminder that the world of AI voice generation may remain a complicated one. (Lovo denies the allegations in the lawsuit and points to its revenue-sharing model for actors who provide cloned voices.)
Steve Cohen, a partner at Pollock & Cohen, said it’s difficult to assess the protections in place without reviewing the specific language of IconicVoices’ contract.
ElevenLabs pointed to how its IconicVoices tool obtains permission and controls how the voices are used.
“Allowing the use of one’s voice is one of the fundamental principles,” Cohen said. “I think the key elements are permission, compensation and control.”
Cohen said clearer new laws could also curb those who try to use other people’s voices inappropriately, “not for hardcore bad guys, but for extreme cases.” But he quoted Bette Davis in “All About Eve,” saying, “Buckle up; it’s going to be a bumpy ride.”
How realistic cloned voices will be is also an evolving question. Many experts say performance quality is limited because artificial intelligence doesn’t “know” what it’s talking about. Sklar said ElevenLabs’ latest voices are of a quality indistinguishable from real human speech. “ElevenLabs’ text-to-speech tool understands the context of individual words,” he said.
Artificial intelligence is only as good as the data used to train it, and actors’ voice data is incorporated as part of that process.
“The power of neural models comes from imitating/memorizing nuances and patterns that exist in the training material,” said Nauman Dawalatabad, a postdoc in MIT’s Computer Science and Artificial Intelligence Laboratory who has conducted extensive research on artificial intelligence speech generation. “The quality and diversity of training data significantly affects model performance.”
Movie star voices can enhance AI imitation and learning by providing “a high-quality speech dataset for training and fine-tuning large models,” which Dawalatabad said is critical to the process. But he has reservations about “sounding like a human” as the correct test in the field of artificial intelligence speech, because it may exacerbate the antagonistic relationship between human and synthetic voices.
Voice actors remain divided over the technology, with some refusing to consider any deal, but others saying the opportunity to clone their voices to make some forms of audiobooks faster and cheaper cannot be ignored. “Artificial intelligence technology can help with workflow,” said Michele Cobb, executive director of the Audio Publishers Association. “AI is not a new tool for voiceover talent, producers and publishers; many people use it to improve quality control in post-production.”
Dawalatabad says that recent generative models have shown huge improvements compared to earlier iterations, making it increasingly difficult to distinguish fake voices from real ones by ear alone. He added that AI voice licensing could ease the workload of voice actors but would not replace them, as they would “mediate by focusing on correcting or enhancing ineffable aspects such as intonation, warmth and accent,” which remain challenging.