OpenAI demonstrated technology for creating synthetic voices


OpenAI has officially presented Voice Engine, a neural network model for voice generation that has been in development since late 2022. It needs only a 15-second audio sample to create a synthetic voice. The model can then generate speech in that voice from any given text, including in other languages.


OpenAI Voice Engine

Voice Engine technology is already used in the ChatGPT chatbot to read generated text aloud, but only with preset voices, whereas the new technology can potentially imitate any voice. For this reason, OpenAI is not yet ready to deploy the model at scale, fearing that it could be misused.

“We hope to start a dialogue about the responsible use of synthetic voices and how society can adapt to these new capabilities. Based on these conversations and the results of small-scale trials, we will make a more informed decision about whether to implement this technology on a large scale,” the company said in a blog post.

The OpenAI website provides examples of the Voice Engine in action and several potential use cases for the technology:

  • Helping children and people who cannot read, using natural, emotive voices that represent a wider range of speakers than is possible with preset voices.
  • Translating content such as videos and podcasts, allowing authors and companies to reach more people around the world using their own voices.
  • Reaching global communities by improving the delivery of essential services in remote areas.
  • Supporting people with conditions that affect speech.
  • Helping people with sudden or degenerative speech disorders recover their voices.

OpenAI partners who have access to the Voice Engine have agreed to the company’s policy not to impersonate any person or entity without consent or legal authority. 

“We believe that any widespread adoption of synthetic voice technology should be accompanied by voice authentication, which verifies that the original speaker is knowingly adding their voice to the service, and a list of banned voices, which identifies and prevents the creation of voices that are too similar to famous people,” the company emphasizes.