DMR News

OpenAI Rolls Out Advanced Voice Mode for ChatGPT to Select Users

By Hilary Ong

Jul 31, 2024

OpenAI has begun rolling out an advanced Voice Mode for ChatGPT, designed to enhance the user experience with more natural, real-time conversations.

Starting today, July 31, a select group of paid ChatGPT users will have access to this new feature, and the full rollout to all ChatGPT Plus members is planned for the fall.

The company announced on X that the advanced Voice Mode lets users interrupt the chatbot at any point in a conversation. It can also sense and respond to users’ emotions, which is intended to make conversations feel more interactive.

This development builds on OpenAI’s earlier support for voice conversations, introduced in September 2023. The advanced Voice Mode, first publicly demoed in May, uses a single multimodal model for its voice capabilities.

This is a departure from the prior version, which relied on three separate models, and aims to reduce latency, thereby improving the responsiveness and fluidity of conversations with the chatbot.

Addressing Concerns Over Voice Similarities

However, the May demo drew criticism because one of the voice options closely resembled the voice of actress Scarlett Johansson. Johansson voiced the AI character Samantha in Spike Jonze’s film Her, and the comparison fueled the controversy.

Although OpenAI said the voice was not intended to imitate Johansson, the similar-sounding voice was subsequently removed, and the release of the advanced Voice Mode was delayed to address these concerns.

OpenAI spokesperson Taya Christianson said that the alpha release, originally scheduled for late June, was pushed back by a month to ensure the model met safety standards. The safety measures include:

  • Content Detection: Improved the model’s ability to detect and refuse inappropriate or sensitive content.
  • External Testing: Collaborated with over 100 external red teamers to identify and address potential weaknesses.
  • Copyright Filters: Added filters to block requests for generating copyrighted audio, such as music.

The advanced Voice Mode now includes four preset voices developed in collaboration with voice actors. Christianson emphasized that ChatGPT’s new mode cannot impersonate other individuals’ voices, including public figures, and will restrict outputs to these preset voices.

