Just two years after its founding, ElevenLabs, a startup specializing in AI voice technology, has achieved unicorn status with a valuation of $1.1 billion. The company, co-founded by former Google and Palantir technologists, recently announced an $80 million Series B funding round. The round, which values the company at roughly ten times its Series A valuation of $100 million, was led by Andreessen Horowitz (a16z), Nat Friedman (former GitHub CEO), and Daniel Gross (former Apple AI leader), with participation from Sequoia Capital and SV Angel.
Originally established to address the challenges of voice cloning and synthesis across multiple languages, ElevenLabs has advanced rapidly in this field. The new capital is earmarked for further research and development, improvements to existing products, and new features. Among these features are a tool for dubbing full-length movies and a marketplace where users can monetize their cloned voices.
These features are expected to roll out in the coming weeks, marking a significant step forward in content accessibility. In a world of diverse dialects and languages, localizing content for all audiences has been a persistent challenge. Traditionally, producers have focused on English or other mainstream languages and relied on dubbing artists for other markets. However, this approach often yields content that falls short of the original in quality and authenticity. Moreover, scaling such content for wider distribution has been logistically and financially difficult, particularly for smaller production teams.
Piotr Dabkowski, a former Google machine learning engineer, and Mati Staniszewski, a former deployment strategist at Palantir, saw this challenge firsthand. Both founders, who are from Poland, were inspired to create ElevenLabs after witnessing the limitations of traditional dubbing methods in movies. Their vision was clear: harness the power of AI to make all content universally accessible in any language and voice.
Since its debut in 2022, ElevenLabs has shown consistent growth, initially making a mark with a text-to-speech model that produced natural-sounding AI voices in English. The model then expanded to include support for multiple languages, including Polish, German, Spanish, French, Italian, Portuguese, and Hindi. Concurrently, the company developed a Voice Lab, enabling users to clone their own voices or create new synthetic ones. This feature allowed for the conversion of text, such as podcast scripts, into audio content in the desired voice and language.
ElevenLabs’ proprietary technology stands out for its context awareness and high compression capabilities, which together deliver ultra-realistic speech. Unlike traditional models that generate sentences in isolation, ElevenLabs’ model comprehends the relationships between words and adapts its delivery based on the broader context. Its dynamic prediction capability enables it to generate thousands of voice characteristics while producing speech, a point Staniszewski emphasized in an interview with VentureBeat.
The company’s tools quickly gained traction, amassing over a million users shortly after their beta launch. ElevenLabs also expanded its AI voice research, introducing AI Dubbing, a speech-to-speech conversion tool that translates audio and video content into 29 languages while preserving the original speaker’s voice and emotional nuances. The technology has attracted a significant portion of Fortune 500 companies as clients, along with notable publishers such as Storytel, The Washington Post, and TheSoul Publishing.
ElevenLabs has established over 100 B2B partnerships, applying AI voices across sectors including content creation, education, publishing, entertainment, and accessibility. The company continues to add features for its users. A notable addition is the Dubbing Studio workflow, which builds on the AI Dubbing product. This new tool offers professional users a comprehensive suite for dubbing movies, including the ability to generate and edit transcripts, translations, and timecodes. The Dubbing Studio workflow supports 29 languages and enhances content localization, though it currently lacks lip-syncing capabilities.
As ElevenLabs continues to grow, it remains committed to breaking barriers in voice technology and content accessibility. The company’s journey from a startup to a unicorn in a remarkably short span is a testament to the transformative potential of AI in the realm of voice synthesis and localization. With its innovative approach and dedication to research, ElevenLabs is poised to redefine how we experience and interact with content in a multilingual world.