Baidu’s Wenxin AI debuts 2-second voice cloning

On April 9, Baidu officially announced the launch of the Wenxin AI Voice Customization feature. According to the company, the AI can replicate anyone’s voice in just 2 seconds, giving every user their own AI voice actor.

Users open the Wenxin AI App, choose to create an intelligent entity, and tap the option to create their own voice. The system then provides a sentence for them to read aloud in their usual tone.

In about 2 seconds, users get synthesized speech that is smooth, natural, and comparable to a real person’s voice. The synthesized audio preserves the emotion, style, and naturalness with which the sentence was read.

Users can also build a personalized voice library and pair it with a virtual image to quickly create a digital avatar.

The feature works for speakers of different genders and ages. It reportedly performs especially well with children’s voices and with various accents, retaining each speaker’s style and accent. That makes it a good fit for China’s wide geographic spread and many regional accents.

Compared with traditional speech synthesis technologies from academia, Baidu’s new technology is also strongly resistant to noise. Even when the original recording contains background noise, it can still produce smooth, clean synthesized speech.

According to reports, the 2-second replication is made possible by Baidu’s new speech synthesis technology, which enables the AI to understand the correspondence between text and voice.

In many cases it can also understand the emotion in the text, preserving the original emotion, style, and naturalness as far as possible. As a result, only a very short sample of a few seconds is needed.

Source: 快科技

