Date: 2025-03-19

Google is diving deeper into voice AI, announcing that its Chirp 3 speech-to-text and HD text-to-speech models will be added to its Vertex AI development platform next week. This marks a significant step for the tech giant as it continues to expand the capabilities of its AI tools, which are already available to developers.

Alongside this announcement, Google revealed that Chirp 3 will roll out eight new voices in 31 languages, allowing developers to create voice assistants, audiobooks, and voice-overs for videos. Google’s platform aims to cater to industries ranging from customer support to content creation, with an emphasis on natural-sounding speech and multilingual support. This move also comes as part of Google’s broader strategy to bolster its generative AI offerings, which now include its flagship language model, Gemini, and its image-generation tool, Imagen.

Expanding the AI Voice Landscape

This comes at a time when other companies are also ramping up their work in the voice AI space. Sesame, for instance, recently launched its own platform to allow developers to build customized apps using its advanced AI voices, which have already garnered attention for their realism. As Google pushes ahead with its own advancements, it faces competition from the likes of ElevenLabs, a startup that has raised significant funding to accelerate its AI voice offerings.

However, Google has made it clear that Chirp 3 will come with usage restrictions and that its use will be monitored closely for potential misuse. Thomas Kurian, CEO of Google Cloud, mentioned that the platform’s safety features are still in progress as the company navigates the challenges of scaling AI voice models responsibly.

Chirp 3 is still considered experimental, and its capabilities will be tested in the coming months. The voice AI technology is part of Google’s larger effort to bring generative AI features into its suite of tools, including video generation tools like Veo 2. While it’s yet to be seen whether Chirp 3 can rival the more “realistic” voice models of competitors like Sesame, Demis Hassabis, CEO of DeepMind, emphasized that AI’s progress is not instantaneous. He believes it will be years before AI reaches the capabilities associated with artificial general intelligence (AGI).

Google’s move with Chirp 3 is significant, but it’s just another step in the ongoing development of voice AI. The company originally began its voice services under the codename “Chirp,” seeking to compete with Amazon’s Alexa, and it now looks set to enhance its offerings in response to the increasing demand for voice-driven experiences.

What The Author Thinks

Voice AI, especially platforms like Chirp 3, could revolutionize content creation across industries. With the power to create voice-overs and audiobooks, and even to assist with customer service, the potential for this technology is vast. However, it raises concerns about the future of voice actors and content creators, as well as ethical issues regarding the use of AI-generated voices. As this technology advances, it will be important to balance innovation with responsibility.

Featured image credit: PickPik

