
A year later, OpenAI still has not released its voice cloning tool


In late March last year, OpenAI announced a “small-scale preview” of an AI service, Voice Engine, which the company claimed could clone a person’s voice with only 15 seconds of speech. Almost a year later, the tool remains in preview, and OpenAI has given no indication of when it might launch, or whether it will launch at all.

The company’s reluctance to roll out the service widely may point to fears of misuse, but it could also reflect an effort to avoid regulatory scrutiny. OpenAI has historically been accused of prioritizing “shiny products” at the expense of safety, and of rushing releases to beat rival companies to market.

In a statement, an OpenAI spokesperson told TechCrunch that the company is continuing to test Voice Engine with a limited group of “trusted partners.”

“We are learning from how [our partners] are using the technology so that we can improve the model’s usefulness and safety,” the spokesperson said. “We have been excited to see the different ways it is being used, from speech therapy, to language learning, to customer support, to video game characters, to AI avatars.”

Pushed back

Voice Engine, which powers the voices available in OpenAI’s text-to-speech API as well as ChatGPT’s Voice Mode, generates natural-sounding speech that closely resembles the original speaker. The tool converts written characters into speech, limited only by certain guardrails on content. But it was subject to delays and shifting release windows from the start.

As OpenAI explained in a June 2024 blog post, the Voice Engine model learns to predict the most probable sounds a speaker will make for a given text, taking into account different voices, accents, and speaking styles. After this, the model can generate not just spoken versions of text, but also “spoken utterances” that reflect how different types of speakers would read text aloud.

OpenAI initially intended to bring Voice Engine, originally called Custom Voices, to its API on March 7, 2024, according to a draft blog post seen by TechCrunch. The plan was to give a group of up to 100 “trusted developers” access ahead of a wider debut, prioritizing devs building apps that provided a “social benefit” or showed “innovative and responsible” uses of the technology. OpenAI had even trademarked and priced it: $15 per million characters for “standard” voices and $30 per million characters for “HD quality” voices.

Then, at the eleventh hour, the company postponed the announcement. OpenAI ended up unveiling Voice Engine a few weeks later without a sign-up option. Access to the tool would remain limited to a group of about 10 devs the company began working with in late 2023, OpenAI said.

“We hope to start a dialogue on the responsible deployment of synthetic voices and how society can adapt to these new capabilities,” OpenAI wrote in the Voice Engine announcement blog post in late March 2024.

Long in the works

Voice Engine has been in the works since 2022, according to OpenAI. The company claims it demoed the tool to “global policy makers at the highest levels” in the summer of 2023 to showcase its capabilities and its risks.

Several partners have access to Voice Engine today, including the startup Livox, which builds devices that enable people with disabilities to communicate more naturally. CEO Carlos Pereira told TechCrunch that while Livox was ultimately unable to build Voice Engine into a product because of the tool’s online requirement (many Livox customers do not have internet access), he found the technology to be “really great.”

“The quality of the voice and the possibility of having the voices speak in different languages are unique, especially for people with disabilities, our customers,” Pereira told TechCrunch via email. “It is really the most impressive and easy-to-use [tool] to create voices that I have seen […] We hope that OpenAI develops an offline version soon.”

Pereira says he has not received guidance from OpenAI about a possible Voice Engine launch, nor seen any signs that the company plans to start charging for the service. So far, Livox has not had to pay for its usage.

In the June 2024 post mentioned above, OpenAI hinted that one of its considerations in delaying Voice Engine was the potential for abuse during last year’s U.S. election cycle. Informed by discussions with stakeholders, Voice Engine has several mitigating safety measures, including watermarking to trace the provenance of generated audio.

Developers must obtain “explicit consent” from the original speaker before using Voice Engine, according to OpenAI, and they must make “clear disclosures” to their audience that the voices are AI-generated. The company has not said how it is enforcing these policies, however. Doing so at scale could prove very difficult, even for a company with OpenAI’s resources.

In its blog posts, OpenAI also indicated that it hopes to build a “voice authentication experience” to verify speakers and a “no-go” list that prevents the creation of voices that sound too similar to prominent figures. Both are technically ambitious projects, and getting them wrong would reflect badly on a company that is often accused of sidelining safety initiatives.

Effective filtering and identity verification have become baseline requirements for releasing voice cloning technology responsibly. AI voice cloning was the third fastest-growing scam of 2024, according to one source. It has led to fraud and to bank security checks being bypassed, as privacy and copyright laws struggle to keep pace. Malicious actors have used voice cloning to create incendiary deepfakes of celebrities and politicians, and those deepfakes have spread like wildfire across social media.

OpenAI could launch Voice Engine next week, or never. The company has repeatedly said that it is weighing keeping the service small in scope. But one thing is clear: for the sake of optics, or safety, or both, Voice Engine’s limited preview has become one of the longest in OpenAI’s history.


