ElevenLabs, one of the more popular startups working in the field of AI audio, said Thursday that it has raised a Series C round of $180 million, valuing the company at $3.3 billion post-money. a16z and ICONIQ Growth are co-leading the investment.
Rumors of the fundraise were first reported by TechCrunch. The final numbers confirm some but not all of the details we previously reported (specifically, the overall size of the round is smaller than we had heard; the valuation and lead investors are the same).
The funding will be used to continue building out ElevenLabs’ audio tools and for business development.
Mati Staniszewski, the CEO who co-founded the company with childhood friend Piotr Dabkowski, said in an interview that the startup is focusing its research on building audio AI models that are more expressive and offer more control. Staniszewski added that the company is also focusing on “omni-models”: combining text-based models with its audio models for multimodal interactions.
There has been a frenzy of investor interest in ElevenLabs going back several months, on the back of two main currents. First, there has been a huge wave of hype around generative AI that has been catching a lot of companies in its wake. Second, ElevenLabs has emerged as a major player among those providing synthetic voice technology. Dozens of major publishers and content creators across verticals like media and gaming, as well as a number of other tech startups, are all using ElevenLabs’ technology to power their voice and audio features.
Unsurprisingly, that has translated into a very crowded funding round with a lot of prominent names.
New investors in this Series C include NEA, World Innovation Lab (WiL), Valor, Endeavor Catalyst Fund, and Abu Dhabi investment firm Lunate. Previous investors also participating include Sequoia Capital, Salesforce Ventures, Smash Capital, SV Angel, NFDG, and BroadLight Capital.
Alongside these, ElevenLabs is also picking up a number of new strategic backers, that is, companies using its technology that are now investing in it, too. These include Deutsche Telekom, LG Technology Ventures, HubSpot Ventures, NTT DOCOMO Ventures, and RingCentral Ventures.
ICONIQ partner Seth Pierrepont will join the company’s board, alongside existing board members Jennifer Li from a16z and the co-founders of the company.
ICONIQ has been ramping up its activities around generative AI startups. Tapping into written output, the firm also co-led a $200 million round in Writer last November.
“We have always felt that audio is an important modality, and we thought there would be a very big company built in this category,” Pierrepont told TechCrunch. “We have observed ElevenLabs from its launch, and we were impressed by the quality of the technology, how quickly it ascended in terms of mindshare and momentum, and the depth of domain expertise of the founders.”
Pierrepont added that, in his role as a board member, many of the conversations with the company will center on developing new use cases for audio and finding the right markets for it.
At a time when startups are still finding it challenging to close growth rounds, it’s notable that ElevenLabs raised its Series B round of $80 million, which valued it at $1 billion, just a year ago. ElevenLabs has raised a total of $281 million to date.
The product roadmap
In addition to a focus on improving its AI models, the company plans to use the funding to expand its conversational AI builder, with an ambition to reach more consumers directly and through partnerships.
Last year, the company debuted an AI conversational agent platform, and a key part of that product was developing a speech-to-text component. Staniszewski noted that the company wants to improve in that area even more.
“We want to understand what’s being said by you in a conversation better. We are working on ways to move away from only generating content and understanding and transcribing speech,” Staniszewski said. “Many people say that speech-to-text is a solved problem. But for many languages, it’s pretty bad. We think we can build better speech detection models because we have in-house teams to annotate data and give us quick feedback.”
The company also wants to double down on creating AI-powered conversational agents by supporting legacy communications like telephony and better integrating different kinds of data sources. This is partly why it’s partnering with telcos in this round.
The conversational technology is also being used by customers to tap into their own archives. Last year, ElevenLabs partnered with TIME magazine to deploy a conversational bot that lets users ask questions about TIME Person of the Year.
Staniszewski said the company envisions more conversational AI agents on sites: on news sites, for example, users would be able to ask questions about stories or ask the bot to summarize them.
The CEO also noted that while the quality of AI-powered voice bots has improved, the problem of sounding natural while reacting to humans speaking or emoting in different ways has not been solved yet.
“The way I speak to you impacts how you react or respond to me. Sometimes, I’ll be excited, or sometimes, I’ll be calm, and at times, I’ll interrupt you. You’ll respond to me accordingly. Current-gen AI solutions are on the verge of being good, but they’re not as good as humans,” Staniszewski said.
ICONIQ’s Pierrepont also emphasized that if AI doesn’t understand you well when you are talking, machine communication breaks down and users immediately lose interest.
ElevenLabs has mostly grown its reach (and revenue funnel) through B2B partnerships. But it is also going out on a direct limb.
In 2024, the startup launched its first purely consumer-facing product, ElevenLabs Reader, an app that reads out articles, text, and documents. Later, the company added the ability to create a podcast with generative AI voices from documents and web pages, not unlike what you can do with Google’s NotebookLM. Staniszewski said that the company wants to expand into more consumer experiences.
It may actually already be doing that. TechCrunch noticed that the company has been testing a program on the ElevenLabs Reader app inviting users to publish audiobooks on the platform. The company also wants to give creators tools to have multiple voices read out an audiobook in the future, while also creating better localization.
Staniszewski noted that the company is figuring out ways for users and companies to better distribute their content, including on its own app. Whether that brings it into actual direct competition with its customers will be something to watch. (That has been one reason why many B2B tech companies prefer to steer clear of direct-to-consumer plays.) Notably, ElevenLabs powers voice technology for audio content platforms like Lightspeed-backed Pocket FM and Google-backed Kuku FM.
ElevenLabs already powers AI-generated audio on products and platforms like Perplexity, Rabbit R1, Chess.com, ESPN, the Lex Fridman podcast, The Atlantic, and Synthesia. The goal for the company is to be in more places and also own an end-to-end conversation stack so it can generate more experiences and insights for its customers.
Safety
Not all of ElevenLabs’ silver linings have been without clouds: its tech has been implicated in a few notable misinformation campaigns. A recent report from threat intelligence company Recorded Future found that the company’s product was used in a Russian propaganda operation. Last year, someone used the company’s voice platform to create an audio deepfake of Joe Biden. In 2023, Motherboard reported that 4chan members allegedly used the AI audio generation tool to create voices that sounded like Joe Rogan, Ben Shapiro, and Emma Watson to spread problematic content.
But the company has been quick to respond. Today, it has a policy prohibiting “unauthorized, harmful, or deceptive impersonation.” Plus, it uses a combination of machine-led and human moderation to weed out such content. However, as the company grows its set of tools and adds more direct consumer touchpoints, it opens the door to more opportunities for malicious actors to look for ways to misuse the technology.
“As one of the frontrunners of AI audio work, we do treat it as our responsibility to build the right safety mechanism as we build out the technology. We will often make choices to prioritize safety over speed of deployment or commercial benefit,” Staniszewski said.
Staniszewski added that while the company follows C2PA, a standard to track content using metadata, it also has a public tool that allows anyone to check whether audio was generated through ElevenLabs technology, using digital signatures it places in the audio during generation. That is also a track that continues to develop over time as approaches for misuse become more sophisticated.