Generative audio

{{Short description|Creation of audio files from databases of audio clips}}

Generative audio refers to the creation of audio files from databases of audio clips.{{cn|date=January 2024}} This technology differs from synthesized voices such as Apple's Siri or Amazon's Alexa, which use a collection of fragments that are stitched together on demand.

Generative audio works by using neural networks to learn the statistical properties of an audio source and then generating new audio that reproduces those properties.{{Cite news|url=https://www.economist.com/news/science-and-technology/21724370-generating-convincing-audio-and-video-fake-events-fake-news-you-aint-seen|title=Fake news: you ain't seen nothing yet|newspaper=The Economist|date=July 2017|access-date=2017-07-01}}
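
The learn-then-reproduce idea can be illustrated with a much simpler model than a neural network. The following Python sketch is an assumption made purely for illustration and is not drawn from any cited system: a two-coefficient linear predictor stands in for the network, and the hypothetical helpers `fit_ar2` and `generate` first estimate how each sample of a 440 Hz tone depends on the two before it, then synthesize new samples from that learned relationship.

```python
import math


def fit_ar2(samples):
    """Least-squares fit of x[t] ~ a * x[t-1] + b * x[t-2]."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(2, len(samples)):
        p1, p2, y = samples[t - 1], samples[t - 2], samples[t]
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        r1 += p1 * y
        r2 += p2 * y
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 * s12
    a = (r1 * s22 - r2 * s12) / det
    b = (r2 * s11 - r1 * s12) / det
    return a, b


def generate(a, b, seed, count):
    """Produce `count` new samples, each predicted from the previous two."""
    out = list(seed)
    for _ in range(count):
        out.append(a * out[-1] + b * out[-2])
    return out[len(seed):]


# "Audio source": a 440 Hz sine tone sampled at 16 kHz (illustrative values).
omega = 2 * math.pi * 440 / 16000
source = [math.sin(omega * t) for t in range(400)]

a, b = fit_ar2(source)                      # learn from the source
generated = generate(a, b, source[:2], 50)  # reproduce what was learned
```

For a pure tone the fitted coefficients approach `2 * cos(omega)` and `-1`, so the generated samples continue the tone. Real generative audio models replace the linear predictor with a deep network that can capture far richer structure, such as the timbre of a voice.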

Implications

With this technology, a person's voice can be replicated to speak phrases that they may never have spoken. A synthetic version of a public figure's voice could therefore be used against them.{{Cite book|last1=Zotkin|first1=D. N.|last2=Shamma|first2=S. A.|last3=Ru|first3=P.|last4=Duraiswami|first4=R.|last5=Davis|first5=L. S.|title=2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03) |chapter=Pitch and timbre manipulations using cortical representation of sound |date=April 2003|volume=5|pages=V–517–20|doi=10.1109/ICASSP.2003.1200020|isbn=978-0-7803-7663-2|s2cid=10372569}}

Technology

Modern generative audio systems employ a variety of deep learning architectures. One notable approach uses generative adversarial networks (GANs), in which two machine learning models are trained against each other: a generator produces audio, and a discriminator tries to distinguish it from real recordings, which pushes the generator toward realistic output. Other architectures include WaveNet, which models raw audio waveforms using dilated causal convolutions, and implementations such as 15.ai, which demonstrated in 2020 the ability to clone voices using as little as 15 seconds of training data through specialized neural network architectures.{{cite web |last=Chandraseta |first=Rionaldi |date=January 21, 2021 |title=Generate Your Favourite Characters' Voice Lines using Machine Learning |url=https://towardsdatascience.com/generate-your-favourite-characters-voice-lines-using-machine-learning-c0939270c0c6 |url-status=live |access-date=December 18, 2024 |website=Towards Data Science |archive-date=January 21, 2021 |archive-url=https://web.archive.org/web/20210121132456/https://towardsdatascience.com/generate-your-favourite-characters-voice-lines-using-machine-learning-c0939270c0c6}}{{cite web |last=Temitope |first=Yusuf |date=December 10, 2024 |title=15.ai Creator reveals journey from MIT Project to internet phenomenon|url=https://guardian.ng/technology/15-ai-creator-reveals-journey-from-mit-project-to-internet-phenomenon/ |access-date=December 25, 2024 |website=The Guardian |quote= |archive-url=https://web.archive.org/web/20241228152312/https://guardian.ng/technology/15-ai-creator-reveals-journey-from-mit-project-to-internet-phenomenon/ |archive-date=December 28, 2024|url-status=live}}
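
A dilated causal convolution can be sketched in a few lines of plain Python. The example below is a minimal illustration under stated assumptions, not WaveNet's actual implementation: the function names `causal_dilated_conv` and `receptive_field`, the two-tap filter, and the dilation schedule 1, 2, 4, 8 are all chosen for demonstration, and the gated activations and residual connections used in practice are omitted. It shows that each output sample depends only on current and past input, and that stacking layers with doubling dilations widens the span of past samples the model can see.

```python
def causal_dilated_conv(signal, weights, dilation):
    """Compute output[t] = sum_k weights[k] * signal[t - k * dilation].

    Samples before the start of the signal are treated as zero, so no
    output ever depends on a future input sample.
    """
    output = []
    for t in range(len(signal)):
        total = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                total += w * signal[idx]
        output.append(total)
    return output


def receptive_field(kernel_size, dilations):
    """Number of input samples that can influence one output of the stack."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Feed a single impulse through a stack of layers with doubling dilation.
impulse = [1.0] + [0.0] * 19
dilations = [1, 2, 4, 8]
x = impulse
for d in dilations:
    x = causal_dilated_conv(x, [1.0, 1.0], d)

span = receptive_field(2, dilations)
```

With these illustrative settings the impulse spreads over exactly `span` consecutive output samples, so four layers already cover 16 samples of context. Doubling the dilation at each layer makes the receptive field grow exponentially with depth, which is what makes modelling raw waveforms at thousands of samples per second tractable.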

References

Category:Sound production
