music and artificial intelligence

{{Short description|Usage of artificial intelligence to generate music}}

{{Use dmy dates|date=November 2024}}

{{Use American English|date=May 2025}}

{{Artificial intelligence}}

Music and artificial intelligence (music and AI) is the development of music software programs which use AI to generate music.{{cite journal |author1=D. Herremans |author2=C.-H. Chuan |author3=E. Chew |year=2017 |title=A Functional Taxonomy of Music Generation Systems |journal=ACM Computing Surveys |volume=50 |issue=5 |pages=69:1–30 |arxiv=1812.04186 |doi=10.1145/3108242 |s2cid=3483927}} As with applications in other fields, AI in music also simulates mental tasks. A prominent feature is the capability of an AI algorithm to learn from past data, as in computer accompaniment technology, wherein the AI listens to a human performer and plays accompaniment.{{Cite web |last=Dannenberg |first=Roger |title=Artificial Intelligence, Machine Learning, and Music Understanding |url=https://pdfs.semanticscholar.org/f275/4c359d7ef052ab5997d71dc3e9443404565a.pdf |url-status=dead |archive-url=https://web.archive.org/web/20180823141845/https://pdfs.semanticscholar.org/f275/4c359d7ef052ab5997d71dc3e9443404565a.pdf |archive-date=August 23, 2018 |access-date=August 23, 2018 |website=Semantic Scholar |s2cid=17787070}} Artificial intelligence also drives interactive composition technology, wherein a computer composes music in response to a live performance. Other AI applications in music cover not only composition, production, and performance but also how music is marketed and consumed. Several music player programs have also been developed to use voice recognition and natural language processing technology for music voice control. Current research includes the application of AI in music composition, performance, theory and digital sound processing. Composers and artists such as Jennifer Walshe and Holly Herndon have been exploring aspects of music AI for years in their performances and musical works. Another original approach, that of humans “imitating AI”, can be found in the 43-hour sound installation String Quartet(s) by Georges Lentz.

20th century art historian Erwin Panofsky proposed that in all art, there existed three levels of meaning: primary meaning, or the natural subject; secondary meaning, or the conventional subject; and tertiary meaning, the intrinsic content of the subject.{{Cite web |url=http://tems.umn.edu/pdf/Panofsky_iconology2.pdf |title=Erwin Panofsky, Studies in Iconology: Humanistic Themes in the Art of the Renaissance. Oxford 1939. |access-date=3 March 2024 |archive-date=29 December 2023 |archive-url=https://web.archive.org/web/20231229110530/http://tems.umn.edu/pdf/Panofsky_iconology2.pdf |url-status=live }}{{Citation |last=Dilly |first=Heinrich |title=Panofsky, Erwin: Zum Problem der Beschreibung und Inhaltsdeutung von Werken der bildenden Kunst |date=2020 |work=Kindlers Literatur Lexikon (KLL) |pages=1–2 |editor-last=Arnold |editor-first=Heinz Ludwig |url=https://doi.org/10.1007/978-3-476-05728-0_16027-1 |access-date=2024-03-03 |place=Stuttgart |publisher=J.B. Metzler |language=de |doi=10.1007/978-3-476-05728-0_16027-1 |isbn=978-3-476-05728-0|url-access=subscription }} AI music explores the foremost of these, creating music without the "intention" which is usually behind it, leaving composers who listen to machine-generated pieces feeling unsettled by the lack of apparent meaning.{{Cite journal |date=2021 |title=Handbook of Artificial Intelligence for Music |url=https://books.google.com/books?id=p7I2EAAAQBAJ |journal=SpringerLink |language=en |doi=10.1007/978-3-030-72116-9 |isbn=978-3-030-72115-2 |editor-last1=Miranda |editor-first1=Eduardo Reck |archive-date=10 September 2024 |access-date=10 September 2024 |archive-url=https://web.archive.org/web/20240910010426/https://books.google.com/books?id=p7I2EAAAQBAJ |url-status=live |url-access=subscription }}

History

In the 1950s and 1960s, music made with artificial intelligence was not fully original but was generated from templates that people had defined in advance and given to the AI, an approach known as rule-based systems. As computers became more powerful, machine learning and artificial neural networks could be applied in the music industry by giving AI large amounts of existing music to learn from, rather than predefined templates. In the 2000s and 2010s, further advances in artificial intelligence, including deep learning and generative adversarial networks (GANs), enabled AI to compose more original music that is more complex and varied than was previously possible. Notable AI-driven projects, such as OpenAI’s MuseNet and Google’s Magenta, have demonstrated AI’s ability to generate compositions that mimic various musical styles.{{Cite web |title=AI and the Sound of Music |url=https://www.yalelawjournal.org/forum/ai-and-the-sound-of-music |access-date=2025-02-26 |website=www.yalelawjournal.org}}

=Timeline=

Artificial intelligence finds its beginnings in music with the transcription problem: accurately recording a performance into musical notation as it is played. Père Engramelle's schematic of a "piano roll", a mode of automatically recording note timing and duration in a way which could be easily transcribed to proper musical notation by hand, was first implemented by German engineers J.F. Unger and J. Hohlfield in 1752.{{Cite journal |title=Research in music and artificial intelligence |url=https://dl.acm.org/doi/epdf/10.1145/4468.4469 |access-date=2024-03-06 |journal=ACM Computing Surveys |date=1985 |language=en |doi=10.1145/4468.4469 |last1=Roads |first1=Curtis |volume=17 |issue=2 |pages=163–190 |archive-date=7 April 2024 |archive-url=https://web.archive.org/web/20240407191135/https://dl.acm.org/doi/epdf/10.1145/4468.4469 |url-status=live }}

In 1957, the ILLIAC I (Illinois Automatic Computer) produced the "Illiac Suite for String Quartet", a completely computer-generated piece of music. The computer was programmed to accomplish this by composer Lejaren Hiller and mathematician Leonard Isaacson.{{rp|v–vii}}

In 1960, Russian researcher Rudolf Zaripov published the world's first paper on algorithmic music composition, using the Ural-1 computer.{{cite journal|last=Zaripov|first=Rudolf|title=Об алгоритмическом описании процесса сочинения музыки (On algorithmic description of process of music composition)|journal=Proceedings of the USSR Academy of Sciences|year=1960|volume=132|issue=6}}

In 1965, inventor Ray Kurzweil developed software capable of recognizing musical patterns and synthesizing new compositions from them. The computer first appeared on the quiz show I've Got a Secret that same year.{{Cite web |title=Ray Kurzweil |url=https://nationalmedals.org/laureate/ray-kurzweil/ |access-date=2024-09-10 |website=National Science and Technology Medals Foundation |language=en-US |archive-date=10 September 2024 |archive-url=https://web.archive.org/web/20240910011928/https://nationalmedals.org/laureate/ray-kurzweil/ |url-status=live }}

By 1983, Yamaha Corporation's Kansei Music System had gained momentum, and a paper was published on its development in 1989. The software utilized music information processing and artificial intelligence techniques to essentially solve the transcription problem for simpler melodies, although higher-level melodies and musical complexities are regarded even today as difficult deep-learning tasks, and near-perfect transcription is still a subject of research.{{Cite journal |last1=Katayose |first1=Haruhiro |last2=Inokuchi |first2=Seiji |date=1989 |title=The Kansei Music System |url=https://www.jstor.org/stable/3679555 |journal=Computer Music Journal |volume=13 |issue=4 |pages=72–77 |doi=10.2307/3679555 |jstor=3679555 |issn=0148-9267 |archive-date=26 March 2020 |access-date=6 March 2024 |archive-url=https://web.archive.org/web/20200326032127/https://www.jstor.org/stable/3679555 |url-status=live |url-access=subscription }}

In 1997, an artificial intelligence program named Experiments in Musical Intelligence (EMI) appeared to outperform a human composer at the task of composing a piece of music to imitate the style of Bach.{{cite news |last1=Johnson |first1=George |title=Undiscovered Bach? No, a Computer Wrote It |url=https://www.nytimes.com/1997/11/11/science/undiscovered-bach-no-a-computer-wrote-it.html |access-date=29 April 2020 |work=The New York Times |date=11 November 1997 |quote=Dr. Larson was hurt when the audience concluded that his piece -- a simple, engaging form called a two-part invention -- was written by the computer. But he felt somewhat mollified when the listeners went on to decide that the invention composed by EMI (pronounced Emmy) was genuine Bach. |archive-date=19 March 2024 |archive-url=https://web.archive.org/web/20240319073325/https://www.nytimes.com/1997/11/11/science/undiscovered-bach-no-a-computer-wrote-it.html |url-status=live }} EMI would later become the basis for a more sophisticated algorithm called Emily Howell, named for its creator.

In 2002, the music research team at the Sony Computer Science Laboratory Paris, led by French composer and scientist François Pachet, designed the Continuator, an algorithm uniquely capable of resuming a composition after a live musician stopped.{{Cite journal |last=Pachet |first=François |date=September 2003 |title=The Continuator: Musical Interaction With Style |url=http://www.tandfonline.com/doi/abs/10.1076/jnmr.32.3.333.16861 |journal=Journal of New Music Research |volume=32 |issue=3 |pages=333–341 |doi=10.1076/jnmr.32.3.333.16861 |issn=0929-8215|hdl=2027/spo.bbp2372.2002.044 |hdl-access=free }}

Emily Howell would continue to make advancements in musical artificial intelligence, publishing its first album From Darkness, Light in 2009.{{Cite news |last=Lawson |first=Mark |date=2009-10-22 |title=This artificially intelligent music may speak to our minds, but not our souls |url=https://www.theguardian.com/commentisfree/2009/oct/22/music-computer-compose-copy |access-date=2024-09-10 |work=The Guardian |language=en-GB |issn=0261-3077}} Since then, many more pieces composed with artificial intelligence have been published by various groups.

In 2010, Iamus became the first AI to produce a fragment of original contemporary classical music in its own style: "Iamus' Opus 1". Located at the Universidad de Málaga (University of Málaga) in Spain, the computer can generate a fully original piece in a variety of musical styles.{{Cite news |date=2013-01-02 |title=Iamus: Is this the 21st century's answer to Mozart? |url=https://www.bbc.com/news/technology-20889644 |access-date=2024-09-10 |work=BBC News |language=en-GB |archive-date=10 September 2024 |archive-url=https://web.archive.org/web/20240910010426/https://www.bbc.com/news/technology-20889644 |url-status=live }}{{rp|468–481}} In August 2019, a large dataset consisting of 12,197 MIDI songs, each paired with its lyrics and melody,{{Citation |last=yy1lab |title=yy1lab/Lyrics-Conditioned-Neural-Melody-Generation |date=2024-11-13 |url=https://github.com/yy1lab/Lyrics-Conditioned-Neural-Melody-Generation |access-date=2024-11-19 |archive-date=6 January 2024 |archive-url=https://web.archive.org/web/20240106131106/https://github.com/yy1lab/Lyrics-Conditioned-Neural-Melody-Generation |url-status=live }} was created to investigate the feasibility of neural melody generation from lyrics using a deep conditional LSTM-GAN method.

With progress in generative AI, models capable of creating complete musical compositions (including lyrics) from a simple text description have begun to emerge. Two notable web applications in this field are Suno AI, launched in December 2023, and Udio, which followed in April 2024.{{Cite web |last=Nair |first=Vandana |date=2024-04-11 |title=AI-Music Platform Race Accelerates with Udio |url=https://analyticsindiamag.com/ai-music-platform-race-accelerates-with-udio/ |access-date=2024-04-19 |website=Analytics India Magazine |language=en-US |archive-date=19 April 2024 |archive-url=https://web.archive.org/web/20240419234240/https://analyticsindiamag.com/ai-music-platform-race-accelerates-with-udio/ |url-status=live }}

Software applications

=ChucK=

{{main|ChucK}}

Developed at Princeton University by Ge Wang and Perry Cook, ChucK is a text-based, cross-platform language.[http://chuck.cs.princeton.edu/ ChucK => Strongly-timed, On-the-fly Audio Programming Language] {{Webarchive|url=https://web.archive.org/web/20031118220911/http://chuck.cs.princeton.edu/ |date=18 November 2003 }}. Chuck.cs.princeton.edu. Retrieved on 2010-12-22. By extracting and classifying the theoretical techniques it finds in musical pieces, the software is able to synthesize entirely new pieces from the techniques it has learned.{{Cite web |url=https://ccrma.stanford.edu/~ge/publish/files/2008-icmc-learning.pdf |title=Foundations of On-the-fly Learning in the ChucK Programming Language |access-date=3 March 2024 |archive-date=21 April 2024 |archive-url=https://web.archive.org/web/20240421003133/https://ccrma.stanford.edu/~ge/publish/files/2008-icmc-learning.pdf |url-status=live }} The technology is used by SLOrk (Stanford Laptop Orchestra)Driver, Dustin. (1999-03-26) [https://www.apple.com/pro/profiles/slork/ Pro - Profiles - Stanford Laptop Orchestra (SLOrk), pg. 1] {{Webarchive|url=https://web.archive.org/web/20120118062636/http://www.apple.com/pro/profiles/slork/ |date=18 January 2012 }}. Apple. Retrieved on 2010-12-22. and PLOrk (Princeton Laptop Orchestra).

=Jukedeck=

{{main|Jukedeck}}

Jukedeck was a website that let people use artificial intelligence to generate original, royalty-free music for use in videos.{{Cite news |title=From Jingles to Pop Hits, A.I. Is Music to Some Ears |url=https://www.nytimes.com/2017/01/22/arts/music/jukedeck-artificial-intelligence-songwriting.html |access-date=2023-01-03 |website=The New York Times |date=22 January 2017 |language=en |archive-date=28 February 2023 |archive-url=https://web.archive.org/web/20230228233942/https://www.nytimes.com/2017/01/22/arts/music/jukedeck-artificial-intelligence-songwriting.html |url-status=live }}{{Cite web |title=Need Music For A Video? Jukedeck's AI Composer Makes Cheap, Custom Soundtracks |url=https://techcrunch.com/2015/12/07/jukedeck/?guccounter=1 |access-date=2023-01-03 |website=techcrunch.com |date=7 December 2015 |language=en |archive-date=3 January 2023 |archive-url=https://web.archive.org/web/20230103224334/https://techcrunch.com/2015/12/07/jukedeck/?guccounter=1 |url-status=live }} The team started building the music generation technology in 2010,{{Cite web |title=What Will Happen When Machines Write Songs Just as Well as Your Favorite Musician? |url=https://www.motherjones.com/media/2019/03/what-will-happen-when-machines-write-songs-just-as-well-as-your-favorite-musician |access-date=2023-01-03 |website=motherjones.com |language=en |archive-date=3 February 2023 |archive-url=https://web.archive.org/web/20230203013903/https://www.motherjones.com/media/2019/03/what-will-happen-when-machines-write-songs-just-as-well-as-your-favorite-musician/ |url-status=live }} formed a company around it in 2012,{{Cite news |title=Jukedeck's computer composes music at touch of a button |url=https://www.ft.com/content/16e9f2dc-9c3e-11e5-b45d-4812f209f861 |access-date=2023-01-03 |newspaper=Financial Times |date=7 December 2015 |language=en |last1=Cookson |first1=Robert |archive-date=3 January 2023 |archive-url=https://web.archive.org/web/20230103224342/https://www.ft.com/content/16e9f2dc-9c3e-11e5-b45d-4812f209f861 |url-status=live }} and launched the website publicly in 2015. The technology used was originally a rule-based algorithmic composition system,{{Cite magazine |title=Jukedeck: the software that writes music by itself, note by note |url=https://www.wired.co.uk/article/shuffle-your-tunes |access-date=2023-01-03 |magazine=Wired UK |language=en |archive-date=3 January 2023 |archive-url=https://web.archive.org/web/20230103224334/https://www.wired.co.uk/article/shuffle-your-tunes |url-status=live }} which was later replaced with artificial neural networks. 
The website was used to create over 1 million pieces of music, and brands that used it included Coca-Cola, Google, UKTV, and the Natural History Museum, London.{{Cite web |title=Robot rock: how AI singstars use machine learning to write harmonies |url=https://www.standard.co.uk/tech/jukedeck-maching-learning-ai-startup-music-a3779296.html |access-date=2023-01-03 |website=standard.co.uk |date=March 2018 |language=en |archive-date=3 January 2023 |archive-url=https://web.archive.org/web/20230103224336/https://www.standard.co.uk/tech/jukedeck-maching-learning-ai-startup-music-a3779296.html |url-status=live }} In 2019, the company was acquired by ByteDance.{{Cite web |title=TIKTOK OWNER BYTEDANCE BUYS AI MUSIC COMPANY JUKEDECK |url=https://www.musicbusinessworldwide.com/tiktok-parent-bytedance-buys-ai-music-company-jukedeck/ |access-date=2023-01-03 |website=musicbusinessworldwide.com |date=23 July 2019 |language=en |archive-date=8 February 2023 |archive-url=https://web.archive.org/web/20230208211719/https://www.musicbusinessworldwide.com/tiktok-parent-bytedance-buys-ai-music-company-jukedeck/ |url-status=live }}{{Cite web |title=As TikTok's Music Licensing Reportedly Expires, Owner ByteDance Purchases AI Music Creation Startup JukeDeck |url=https://www.digitalmusicnews.com/2019/07/23/tiktok-bytedance-acquires-jukedeck/ |access-date=2023-01-03 |website=digitalmusicnews.com |date=23 July 2019 |language=en |archive-date=3 January 2023 |archive-url=https://web.archive.org/web/20230103224338/https://www.digitalmusicnews.com/2019/07/23/tiktok-bytedance-acquires-jukedeck/ |url-status=live }}{{Cite web |title=An AI-generated music app is now part of the TikTok group |url=https://sea.mashable.com/entertainment/5205/an-ai-generated-music-app-is-now-part-of-the-tiktok-group |access-date=2023-01-03 |website=sea.mashable.com |date=24 July 2019 |language=en |archive-date=29 January 2023 |archive-url=https://web.archive.org/web/20230129185548/https://sea.mashable.com/entertainment/5205/an-ai-generated-music-app-is-now-part-of-the-tiktok-group |url-status=live }}

=MorpheuS=

MorpheuS{{cite journal |title=MorpheuS: Automatic music generation with recurrent pattern constraints and tension profiles |author1=D. Herremans |author2=E. Chew |year=2016 |journal=IEEE Transactions on Affective Computing |doi=10.1109/TAFFC.2017.2737984 |arxiv=1812.04832 |s2cid=54475410}} is a research project by Dorien Herremans and Elaine Chew at Queen Mary University of London, funded by a Marie Skłodowska-Curie EU project. The system uses an optimization approach based on a variable neighborhood search algorithm to morph existing template pieces into novel pieces with a set level of tonal tension that changes dynamically throughout the piece. This optimization approach allows for the integration of a pattern detection technique in order to enforce long-term structure and recurring themes in the generated music. Pieces composed by MorpheuS have been performed at concerts in both Stanford and London.
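
As an illustration of the general technique, the following is a minimal, generic sketch of a variable neighborhood search in Python. The melody encoding, the neighborhoods, and the toy interval-based "tension" cost are illustrative assumptions, not MorpheuS's actual representation or objective function.

<syntaxhighlight lang="python">
import random

def cost(melody, target_tension):
    """Toy objective: squared difference between successive interval sizes
    and a target 'tension' profile (larger leaps read as more tension)."""
    intervals = [abs(b - a) for a, b in zip(melody, melody[1:])]
    return sum((i - t) ** 2 for i, t in zip(intervals, target_tension))

def shake(melody, k):
    """Neighborhood k: randomly re-pitch k notes (the first note stays fixed)."""
    new = melody[:]
    for idx in random.sample(range(1, len(new)), k):
        new[idx] = random.randint(55, 79)          # roughly G3..G5
    return new

def local_search(melody, target_tension):
    """First-improvement search: nudge single notes by one semitone."""
    best, improved = melody[:], True
    while improved:
        improved = False
        for idx in range(1, len(best)):
            for step in (-1, 1):
                cand = best[:]
                cand[idx] += step
                if cost(cand, target_tension) < cost(best, target_tension):
                    best, improved = cand, True
    return best

def vns(seed_melody, target_tension, k_max=3, iters=100):
    best = local_search(seed_melody, target_tension)
    for _ in range(iters):
        k = 1
        while k <= k_max:
            cand = local_search(shake(best, k), target_tension)
            if cost(cand, target_tension) < cost(best, target_tension):
                best, k = cand, 1   # improvement: restart from the smallest neighborhood
            else:
                k += 1              # no improvement: widen the neighborhood
    return best

random.seed(0)
seed = [60, 62, 64, 65, 67, 69, 71, 72]   # C major scale as the "template"
target = [2, 2, 1, 5, 1, 7, 2]            # desired interval/"tension" contour
print(vns(seed, target))
</syntaxhighlight>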

=AIVA=

{{main|AIVA}}

Created in February 2016, in Luxembourg, AIVA is a program that produces soundtracks for any type of media. The algorithms behind AIVA are based on deep learning architectures.{{Cite web |date=2017-03-09 |title=A New AI Can Write Music as Well as a Human Composer |url=https://futurism.com/a-new-ai-can-write-music-as-well-as-a-human-composer |access-date=2024-04-19 |website=Futurism |archive-date=19 April 2024 |archive-url=https://web.archive.org/web/20240419005852/https://futurism.com/a-new-ai-can-write-music-as-well-as-a-human-composer |url-status=live }} AIVA has also been used to compose a rock track called "On the Edge",{{Cite web |last=Technologies |first=Aiva |date=2018-10-24 |title=The Making of AI-generated Rock Music with AIVA |url=https://medium.com/@aivatech/the-making-of-ai-generated-rock-music-with-aiva-9ae0257e6d5c |access-date=2024-04-19 |website=Medium |language=en |archive-date=25 October 2018 |archive-url=https://web.archive.org/web/20181025190241/https://medium.com/@aivatech/the-making-of-ai-generated-rock-music-with-aiva-9ae0257e6d5c |url-status=live }} as well as a pop tune, "Love Sick",{{Cite AV media |url=https://www.youtube.com/watch?v=gQSPjAYTlx8 |title=Lovesick {{!}} Composed with AIVA Artificial Intelligence - Official Video with Lyrics {{!}} Taryn Southern |date=2 May 2018 |access-date=25 October 2018 |archive-date=19 July 2018 |archive-url=https://web.archive.org/web/20180719220335/https://www.youtube.com/watch?v=gQSPjAYTlx8 |url-status=live }} in collaboration with singer Taryn Southern,{{Cite web |last=Southern |first=Taryn |date=2018-05-10 |title=Algo-Rhythms: The future of album collaboration |url=https://techcrunch.com/2018/05/10/ai-is-the-future-of-rhythm-nation/ |access-date=2024-04-19 |website=TechCrunch |language=en-US |archive-date=25 October 2018 |archive-url=https://web.archive.org/web/20181025190411/https://techcrunch.com/2018/05/10/ai-is-the-future-of-rhythm-nation/ |url-status=live }} for the creation of her 2018 album "I am AI".

=Google Magenta=

File:Hypnotic ambient electronic music by MusicLM.mp3

Google's Magenta team has published several AI music applications and technical papers since the project's launch in 2016.{{Cite web |date=2016-06-01 |title=Welcome to Magenta! |url=https://magenta.tensorflow.org/blog/2016/06/01/welcome-to-magenta/ |access-date=2024-04-19 |website=Magenta |language=en |archive-date=1 February 2023 |archive-url=https://web.archive.org/web/20230201210319/https://magenta.tensorflow.org/blog/2016/06/01/welcome-to-magenta/ |url-status=live }} In 2017 they released the NSynth algorithm and dataset,{{Cite journal |last1=Engel |first1=Jesse |last2=Resnick |first2=Cinjon |last3=Roberts |first3=Adam |last4=Dieleman |first4=Sander |last5=Eck |first5=Douglas |last6=Simonyan |first6=Karen |last7=Norouzi |first7=Mohammad |date=2017 |title=Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders |journal=PMLR |arxiv=1704.01279}} along with an open-source hardware musical instrument designed to help musicians use the algorithm.{{Citation |title=Open NSynth Super |date=2023-02-13 |url=https://github.com/googlecreativelab/open-nsynth-super |publisher=Google Creative Lab |access-date=2023-02-14 |archive-date=19 December 2022 |archive-url=https://web.archive.org/web/20221219223155/https://github.com/googlecreativelab/open-nsynth-super |url-status=live }} The instrument was used by notable artists such as Grimes and YACHT on their albums.{{Cite web |title=Cover Story: Grimes is ready to play the villain |url=https://crackmagazine.net/article/long-reads/grimes-is-ready-to-play-the-villain/ |access-date=2023-02-14 |website=Crack Magazine |archive-date=3 June 2023 |archive-url=https://web.archive.org/web/20230603170859/https://crackmagazine.net/article/long-reads/grimes-is-ready-to-play-the-villain/ |url-status=live }}{{Cite web |date=2019-09-18 |title=What Machine-Learning Taught the Band YACHT About Themselves |url=https://losangeleno.com/people/what-machine-learning-taught-the-band-yacht-about-themselves/ |access-date=2023-02-14 |website=Los Angeleno |language=en-US |archive-date=28 January 2023 |archive-url=https://web.archive.org/web/20230128092944/https://losangeleno.com/people/what-machine-learning-taught-the-band-yacht-about-themselves/ |url-status=dead }} In 2018, they released a piano improvisation app called Piano Genie.
This was later followed by Magenta Studio, a suite of five MIDI plugins that allow music producers to elaborate on existing music in their DAW.{{Cite web |title=Magenta Studio |url=https://magenta.tensorflow.org/studio/ |access-date=2024-04-19 |website=Magenta |language=en |archive-date=13 January 2023 |archive-url=https://web.archive.org/web/20230113145752/https://magenta.tensorflow.org/studio |url-status=live }} In 2023, their machine learning team published a technical paper on GitHub describing MusicLM, a private text-to-music generator they had developed.{{Cite web |date=2023 |title=MusicLM |url=https://google-research.github.io/seanet/musiclm/examples/ |access-date=2024-04-19 |website=google-research.github.io |archive-date=1 February 2023 |archive-url=https://web.archive.org/web/20230201211033/https://google-research.github.io/seanet/musiclm/examples/ |url-status=live }}{{Cite web |last=Sandzer-Bell |first=Ezra |date=2024-02-16 |title=Best Alternatives to Google's AI-Powered MusicLM and MusicFX |url=https://www.audiocipher.com/post/musiclm |access-date=2024-04-19 |website=AudioCipher |language=en |archive-date=1 February 2023 |archive-url=https://web.archive.org/web/20230201204518/https://www.audiocipher.com/post/musiclm |url-status=live }}

=Riffusion=

{{excerpt|Riffusion}}

= Spike AI =

Spike AI is an AI-based audio plug-in, developed by Spike Stent in collaboration with his son Joshua Stent and friend Henry Ramsey, that analyzes tracks and provides suggestions for improving clarity and other aspects of a mix. Communication is handled by a chatbot trained on Spike Stent's personal data. The plug-in integrates into digital audio workstations.{{Cite web |last=Levine |first=Mike |date=2024-10-04 |title=Spike AI — A Mix Product of the Week |url=https://www.mixonline.com/business/spike-ai-a-mix-product-of-the-week |access-date=2024-11-19 |website=Mix |publisher=Future US |language=en-US |archive-date=14 November 2024 |archive-url=https://web.archive.org/web/20241114165303/https://www.mixonline.com/business/spike-ai-a-mix-product-of-the-week |url-status=live }}{{Cite web |title=Spike Stent offers his expertise in Spike AI |url=https://www.soundonsound.com/news/spike-stent-offers-his-expertise-spike-ai |access-date=2024-11-19 |website=Sound on Sound |archive-date=17 December 2024 |archive-url=https://web.archive.org/web/20241217191753/https://www.soundonsound.com/news/spike-stent-offers-his-expertise-spike-ai |url-status=live }}

Musical applications

Artificial intelligence can change how producers create music by generating iterations of a track that follow a prompt given by the creator, allowing the AI to match a particular style the artist is aiming for. AI has also been used in musical analysis, including feature extraction, pattern recognition, and music recommendation.{{Cite journal |last=Zhang |first=Yifei |date=December 2023 |title=Utilizing Computational Music Analysis and AI for Enhanced Music Composition: Exploring Pre- and Post-Analysis |journal=Journal of Advanced Zoology |volume=44 |issue=S-6 |pages=1377–1390 |doi=10.17762/jaz.v44is6.2470 |s2cid=265936281 |doi-access=free }} New AI-powered tools, such as AIVA (Artificial Intelligence Virtual Artist) and Udio, have been developed to help generate original music compositions. This is done by giving an AI model data from existing music and having it analyze that data using deep learning techniques in order to generate music in many different genres, such as classical or electronic music.{{Cite web |last=Mańdziuk |first=Jacek |date=2024-11-16 |title=Artificial intelligence in music: recent trends and challenges |url=https://www.researchgate.net/publication/385881724 |website=ResearchGate}}

Musical deepfakes

A more nascent development of AI in music is the application of audio deepfakes to transfer the lyrics or musical style of a pre-existing song to the voice or style of another artist. This has raised many concerns regarding the legality of the technology, as well as the ethics of employing it, particularly in the context of artistic identity.{{Cite web |url=https://ceur-ws.org/Vol-3528/paper3.pdf |title=DeepDrake ft. BTS-GAN and TayloRVC: An Exploratory Analysis of Musical Deepfakes and Hosting Platforms |access-date=6 March 2024 |archive-date=26 March 2024 |archive-url=https://web.archive.org/web/20240326091102/https://ceur-ws.org/Vol-3528/paper3.pdf |url-status=live }} It has also raised the question of to whom authorship of these works should be attributed. As AI cannot hold authorship of its own, current speculation suggests that there will be no clear answer until further rulings are made regarding machine learning technologies as a whole.{{Cite web |url=https://www.cigionline.org/static/documents/DPH-paper-Josan.pdf |title=AI and Deepfake Voice Cloning: Innovation, Copyright and Artists' Rights |access-date=6 March 2024 |archive-date=3 June 2024 |archive-url=https://web.archive.org/web/20240603150539/https://www.cigionline.org/static/documents/DPH-paper-Josan.pdf |url-status=live }} Early preventative measures have been developed by Google and Universal Music Group, which have negotiated over royalties and credit attribution so that producers may replicate the voices and styles of artists.{{Cite news |title=Google and Universal Music negotiate deal over AI 'deepfakes' |url=https://www.ft.com/content/6f022306-2f83-4da7-8066-51386e8fe63b |access-date=2024-04-03 |newspaper=Financial Times |date=8 August 2023 |last1=Murgia |first1=Madhumita |last2=Nicolaou |first2=Anna |archive-date=28 March 2024 |archive-url=https://web.archive.org/web/20240328192800/https://www.ft.com/content/6f022306-2f83-4da7-8066-51386e8fe63b |url-status=live }}

= "Heart on My Sleeve" =

In 2023, an artist known as ghostwriter977 created a musical deepfake called "Heart on My Sleeve" that cloned the voices of Drake and The Weeknd by feeding an assortment of vocal-only tracks from the respective artists into a deep-learning algorithm, creating an artificial model of each artist's voice; this model could then be mapped onto original reference vocals with original lyrics.{{Cite magazine |last=Robinson |first=Kristin |date=2023-10-11 |title=Ghostwriter, the Mastermind Behind the Viral Drake AI Song, Speaks For the First Time |url=https://www.billboard.com/music/pop/ghostwriter-heart-on-my-sleeve-drake-ai-grammy-exclusive-interview-1235434099/ |access-date=2024-04-03 |magazine=Billboard |language=en-US |archive-date=5 April 2024 |archive-url=https://web.archive.org/web/20240405062325/https://www.billboard.com/music/pop/ghostwriter-heart-on-my-sleeve-drake-ai-grammy-exclusive-interview-1235434099/ |url-status=live }} The track was submitted for Grammy consideration in the best rap song and song of the year categories.{{Cite web |title=Drake/The Weeknd deepfake song "Heart on My Sleeve" submitted to Grammys |url=https://www.thefader.com/2023/09/06/drake-the-weeknd-song-heart-on-my-sleeve-submitted-to-grammys |access-date=2024-04-03 |website=The FADER |language=en |archive-date=3 April 2024 |archive-url=https://web.archive.org/web/20240403073305/https://www.thefader.com/2023/09/06/drake-the-weeknd-song-heart-on-my-sleeve-submitted-to-grammys |url-status=live }} It went viral on TikTok and received a positive response from listeners, leading to its official release on Apple Music, Spotify, and YouTube in April 2023.{{Cite web |title=The AI deepfake of Drake and The Weeknd will not be eligible for a GRAMMY |url=https://mixmag.net/read/ai-deepfake-drake-and-the-weeknd-track-is-not-eligible-for-grammy-award-news |access-date=2024-04-03 |website=Mixmag |archive-date=3 April 2024 |archive-url=https://web.archive.org/web/20240403073305/https://mixmag.net/read/ai-deepfake-drake-and-the-weeknd-track-is-not-eligible-for-grammy-award-news |url-status=live }} Many believed the track was composed entirely by AI software, but the producer claimed the songwriting, production, and original (pre-conversion) vocals were still his own work. The track was later withdrawn from Grammy consideration because it did not follow the guidelines required for a Grammy award, and it was ultimately removed from all music platforms by Universal Music Group. The song was a watershed moment for AI voice cloning, and models have since been created for hundreds, if not thousands, of popular singers and rappers.

= "Where That Came From" =

In 2013, country music singer Randy Travis suffered a stroke which left him unable to sing. In the meantime, vocalist James Dupré toured on his behalf, singing his songs for him. Travis and longtime producer Kyle Lehning released a new song in May 2024 titled "Where That Came From", Travis's first new song since his stroke. The recording uses AI technology to re-create Travis's singing voice, having been composited from over 40 existing vocal recordings alongside those of Dupré.{{cite web|url=https://www.tennessean.com/story/entertainment/music/2024/05/06/randy-travis-now-where-that-came-now-ai-origin/73585407007/|title=Randy Travis' shocks music industry with AI pairing for 'Where That Came From.' How the song came together|author=Marcus K. Dowling|date=May 6, 2024|website=The Tennesseean|accessdate=May 6, 2024}}{{cite web|url=https://apnews.com/article/randy-travis-artificial-intelligence-song-voice-589a8c142f70ed8ccf53af6d32c662dc|title=With help from AI, Randy Travis got his voice back. Here's how his first song post-stroke came to be|author=Maria Sherman|date=May 6, 2024|website=AP News|accessdate=May 6, 2024|archive-date=7 May 2024|archive-url=https://web.archive.org/web/20240507043629/https://apnews.com/article/randy-travis-artificial-intelligence-song-voice-589a8c142f70ed8ccf53af6d32c662dc|url-status=live}}

Technical approaches

Artificial intelligence music encompasses a number of technical approaches used for music composition, analysis, classification, and recommendation. The techniques involved are drawn from deep learning, machine learning, natural language processing, and signal processing. Current systems can compose entire pieces, analyze affective content, accompany human performers in real time, and learn user- and context-dependent preferences.{{Cite journal |last1=Mycka |first1=Jan |last2=Mańdziuk |first2=Jacek |title=Artificial intelligence in music: recent trends and challenges |journal=Neural Computing and Applications |year=2024 |volume=37 |issue=2 |pages=801–839 |doi=10.1007/s00521-024-10555-x |doi-access=free }}{{Cite arXiv |last1=Briot |first1=Jean-Pierre |last2=Hadjeres |first2=Gaëtan |last3=Pachet |first3=François-David |title=Deep learning techniques for music generation – A survey |year=2017 |class=cs.SD |eprint=1709.01620 }}{{Cite journal |last1=Herremans |first1=Dorien |last2=Chuan |first2=Ching-Hua |last3=Chew |first3=Elaine |title=A functional taxonomy of music generation systems |journal=ACM Computing Surveys |volume=50 |issue=5 |pages=69 |year=2017 |doi=10.1145/3108242|arxiv=1812.04186 }}{{Cite journal |last1=Sturm |first1=Bob L. |last2=Søndergaard |first2=Martin |title=Music Information Retrieval and Artificial Intelligence: Best Friends or Worst Enemies? |journal=Transactions of the International Society for Music Information Retrieval |volume=2 |issue=1 |pages=1–19 |year=2019 |doi=10.5334/tismir.28|doi-broken-date=2 April 2025 |doi-access=free }}

= Symbolic music composition =

Symbolic music generation is the generation of music in discrete symbolic forms such as MIDI, where notes and timing are precisely defined. Early systems employed rule-based approaches and Markov models, while modern systems rely largely on deep learning. Recurrent neural networks (RNNs), and in particular long short-term memory (LSTM) networks, have been used to model the temporal dependencies of musical sequences and can generate melodies, harmonies, and counterpoint in various musical genres.{{Cite arXiv|last1=Briot |first1=Jean-Pierre |last2=Hadjeres |first2=Gaëtan |last3=Pachet |first3=François-David |title=Deep learning techniques for music generation – A survey |year=2017 |class=cs.SD |eprint=1709.01620 }}
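
The sketch below illustrates the general idea of such a model: a minimal LSTM over a toy vocabulary of MIDI pitch tokens, sampled autoregressively. The network is untrained here, and the token scheme and layer sizes are simplifying assumptions; real systems use richer event vocabularies and large training corpora.

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class MelodyLSTM(nn.Module):
    """Next-note prediction over a vocabulary of 128 MIDI pitch tokens."""
    def __init__(self, vocab_size=128, embed_dim=64, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, state=None):
        x, state = self.lstm(self.embed(tokens), state)
        return self.head(x), state

def generate(model, seed_tokens, length=16, temperature=1.0):
    """Sample one token at a time, feeding each prediction back into the model."""
    model.eval()
    tokens = list(seed_tokens)
    inp, state = torch.tensor([tokens]), None
    with torch.no_grad():
        for _ in range(length):
            logits, state = model(inp, state)
            probs = torch.softmax(logits[0, -1] / temperature, dim=-1)
            nxt = torch.multinomial(probs, 1).item()
            tokens.append(nxt)
            inp = torch.tensor([[nxt]])
    return tokens

model = MelodyLSTM()   # untrained; training would minimize cross-entropy on
                       # token sequences shifted by one position
print(generate(model, seed_tokens=[60, 64, 67]))
</syntaxhighlight>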

Transformer models such as Music Transformer and MuseNet later became popular for symbolic generation because of their scalability and their ability to model long-range dependencies. These models have been used to generate multi-instrument polyphonic music and stylistic imitations.{{Cite journal |last=Huang |first=Cheng-Zhi Anna |title=Music Transformer: Generating Music with Long-Term Structure |journal=International Conference on Learning Representations |year=2018 |arxiv=1809.04281 }}
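
A minimal decoder-only transformer over event tokens, sketched below with standard PyTorch layers, shows the causal-masking idea these models rely on. It omits the relative positional attention that Music Transformer adds, and the vocabulary and model sizes are illustrative assumptions.

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class TinyMusicTransformer(nn.Module):
    """Decoder-only (causally masked) transformer over symbolic event tokens."""
    def __init__(self, vocab=512, d_model=256, heads=4, layers=4, max_len=1024):
        super().__init__()
        self.tok = nn.Embedding(vocab, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        block = nn.TransformerEncoderLayer(d_model, heads,
                                           dim_feedforward=1024, batch_first=True)
        self.encoder = nn.TransformerEncoder(block, layers)
        self.head = nn.Linear(d_model, vocab)

    def forward(self, tokens):
        t = tokens.size(1)
        positions = torch.arange(t, device=tokens.device)
        x = self.tok(tokens) + self.pos(positions)
        causal = nn.Transformer.generate_square_subsequent_mask(t).to(tokens.device)
        x = self.encoder(x, mask=causal)     # each step attends only to earlier steps
        return self.head(x)                  # next-token logits at every position

model = TinyMusicTransformer()
events = torch.randint(0, 512, (1, 64))      # a dummy sequence of 64 event tokens
print(model(events).shape)                   # torch.Size([1, 64, 512])
</syntaxhighlight>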

= Audio-based music generation =

This method generates music as raw audio waveforms rather than symbolic notation. DeepMind's WaveNet is an early example that uses autoregressive sampling to generate high-fidelity audio. Generative adversarial networks (GANs) and variational autoencoders (VAEs) are increasingly used for audio texture synthesis and for combining the timbres of different instruments.
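
The toy sketch below shows the autoregressive principle behind WaveNet-style models: a stack of left-padded (causal) dilated convolutions predicts a distribution over the next quantized audio sample, which is drawn and fed back in. The network is untrained and far smaller than a real model, and the 8-bit quantization and layer sizes are illustrative assumptions.

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class TinyCausalWaveNet(nn.Module):
    """Toy stack of dilated causal convolutions over 8-bit quantized samples."""
    def __init__(self, classes=256, channels=64, dilations=(1, 2, 4, 8, 16)):
        super().__init__()
        self.embed = nn.Embedding(classes, channels)
        self.convs = nn.ModuleList(
            [nn.Conv1d(channels, channels, kernel_size=2, dilation=d) for d in dilations])
        self.out = nn.Conv1d(channels, classes, kernel_size=1)

    def forward(self, samples):                    # samples: (batch, time) integers
        x = self.embed(samples).transpose(1, 2)    # -> (batch, channels, time)
        for conv in self.convs:
            pad = (conv.dilation[0], 0)            # left-pad so the conv stays causal
            x = torch.relu(conv(nn.functional.pad(x, pad)))
        return self.out(x)                         # (batch, classes, time) logits

def sample(model, length=400, context=64):
    """Autoregressive generation: draw one quantized sample, append, repeat."""
    seq = torch.full((1, 1), 128, dtype=torch.long)    # start from the midpoint ("silence")
    with torch.no_grad():
        for _ in range(length):
            logits = model(seq[:, -context:])[:, :, -1]
            nxt = torch.multinomial(torch.softmax(logits, dim=-1), 1)
            seq = torch.cat([seq, nxt], dim=1)
    return seq                                         # still quantized; decoding omitted

print(sample(TinyCausalWaveNet()).shape)
</syntaxhighlight>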

NSynth (Neural Synthesizer), a Google Magenta project, uses a WaveNet-like autoencoder to learn latent audio representations and thereby generate completely novel instrumental sounds.{{Cite journal |last=Engel |first=Jesse |title=Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders |journal=Proceedings of the 34th International Conference on Machine Learning |year=2017 |arxiv=1704.01279 }}

= Music information retrieval (MIR) =

Music information retrieval (MIR) is the extraction of musically relevant information from audio recordings for use in applications such as genre classification, instrument recognition, mood recognition, beat detection, and similarity estimation. Convolutional neural networks (CNNs) trained on spectrogram features have achieved high accuracy on these tasks. Support vector machines (SVMs) and k-nearest neighbors (k-NN) classifiers are also applied to features such as Mel-frequency cepstral coefficients (MFCCs).
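
A minimal sketch of the classical feature-based pipeline follows, using librosa to compute MFCC statistics and scikit-learn's k-NN classifier. The file names and labels are hypothetical placeholders; a real experiment would use a labelled dataset with many examples per class.

<syntaxhighlight lang="python">
import numpy as np
import librosa
from sklearn.neighbors import KNeighborsClassifier

def mfcc_features(path, n_mfcc=13):
    """Summarize a recording as the mean and variance of its MFCCs."""
    y, sr = librosa.load(path, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # (n_mfcc, frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.var(axis=1)])

# Hypothetical labelled clips; a real experiment would use many files per genre.
train_files = ["blues_01.wav", "blues_02.wav", "metal_01.wav", "metal_02.wav"]
train_labels = ["blues", "blues", "metal", "metal"]

X = np.stack([mfcc_features(f) for f in train_files])
clf = KNeighborsClassifier(n_neighbors=3).fit(X, train_labels)

print(clf.predict([mfcc_features("unknown_clip.wav")]))
</syntaxhighlight>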

= Hybrid and interactive systems =

Hybrid systems combine symbolic and audio-based methods to draw on their respective strengths: they can compose high-level symbolic structures and then synthesize them as natural-sounding audio. Real-time interactive systems allow AI to respond immediately to human input in support of live performance. Reinforcement learning and rule-based agents are often used to enable human–AI co-creation in improvisational contexts.

= Affective computing and emotion-aware music systems =

Affective computing techniques enable AI systems to classify or generate music according to its emotional content. Such models use musical features such as tempo, mode, and timbre to classify or influence listener emotions. Deep learning models have been trained to classify music by affective content and even to create music intended to evoke particular emotional responses.{{Cite journal |last1=Kim |first1=Yejin |last2=Elliott |first2=Mark T. |last3=Kim |first3=Youngmoo E. |title=Modeling Musical Affect Using Deep Neural Networks |journal=Proceedings of the International Society for Music Information Retrieval |year=2020}}
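
As a toy illustration of how such features relate to emotion, the rule-based sketch below estimates tempo and a rough major/minor mode with librosa and maps them onto a valence/arousal quadrant. The thresholds and the input file name are illustrative assumptions; real affective models learn these mappings from emotion-annotated corpora.

<syntaxhighlight lang="python">
import numpy as np
import librosa

def emotion_quadrant(path):
    """Toy rule-based mapping from tempo and mode to a valence/arousal quadrant."""
    y, sr = librosa.load(path, mono=True)
    tempo, _ = librosa.beat.beat_track(y=y, sr=sr)
    tempo = float(np.atleast_1d(tempo)[0])
    chroma = librosa.feature.chroma_cqt(y=y, sr=sr).mean(axis=1)
    root = int(np.argmax(chroma))                       # most prominent pitch class
    # crude mode estimate: is the major third stronger than the minor third?
    is_major = chroma[(root + 4) % 12] >= chroma[(root + 3) % 12]
    arousal = "high" if tempo > 120 else "low"
    valence = "positive" if is_major else "negative"
    return arousal, valence

print(emotion_quadrant("some_clip.wav"))   # hypothetical input file
</syntaxhighlight>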

= AI-based music recommendation systems =

Music recommendation systems employ AI to suggest tracks to users based on their listening history, preferences, and contextual information. Collaborative filtering, content-based filtering, and hybrid approaches are the most widely applied, with deep learning used to refine them. Graph-based and matrix factorization methods are used within commercial systems such as Spotify and YouTube Music to model complex user–item relationships.{{Cite journal |last=Schedl |first=Markus |title=Deep Learning in Music Recommendation Systems: A Survey |journal=Journal of New Music Research |year=2021 |volume=50 |issue=3 |pages=232–247 |doi=10.1080/09298215.2021.1939563|doi-broken-date=2 April 2025 }}
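
The sketch below factorizes a tiny, made-up user-by-track play matrix into latent user and track vectors with stochastic gradient descent, then recommends the unheard track with the highest predicted affinity. The matrix, factor count, and learning rate are illustrative assumptions; production systems factorize vastly larger implicit-feedback matrices.

<syntaxhighlight lang="python">
import numpy as np

# Toy user-by-track play-count matrix (0 = not listened).
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)

rng = np.random.default_rng(0)
k, lr, reg = 2, 0.01, 0.05                       # latent factors, step size, L2 penalty
U = rng.normal(scale=0.1, size=(R.shape[0], k))  # user factors
V = rng.normal(scale=0.1, size=(R.shape[1], k))  # track factors

for _ in range(2000):                            # SGD over observed entries only
    for i, j in zip(*R.nonzero()):
        err = R[i, j] - U[i] @ V[j]
        U[i] += lr * (err * V[j] - reg * U[i])
        V[j] += lr * (err * U[i] - reg * V[j])

scores = U @ V.T                                 # predicted affinity for every pair
user = 0
unheard = np.where(R[user] == 0)[0]
print("recommend track", unheard[np.argmax(scores[user, unheard])])
</syntaxhighlight>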

= AI for automatic mixing and mastering =

AI is also used to automate audio engineering tasks such as mixing and mastering. Such systems adjust levels, equalization, panning, and compression to produce balanced-sounding output. Software such as LANDR and iZotope Ozone uses machine learning to emulate the decisions of professional audio engineers.{{Cite journal |last=Bocko |first=Mark |title=The Role of Artificial Intelligence in Automated Music Production |journal=Audio Engineering Society Conference |year=2019}}
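
The toy sketch below performs one simple "mastering" step, matching a track's RMS loudness to a reference and clipping peaks. The signals are synthetic sine waves and the fixed rules are illustrative assumptions; commercial tools apply learned, frequency-aware processing rather than a single gain rule.

<syntaxhighlight lang="python">
import numpy as np

def rms(x):
    return np.sqrt(np.mean(np.square(x)))

def master(track, reference, ceiling=0.98):
    """Match the track's RMS loudness to a reference, then hard-limit peaks."""
    gain = rms(reference) / max(rms(track), 1e-9)
    return np.clip(track * gain, -ceiling, ceiling)

sr = 44100
t = np.linspace(0, 1.0, sr, endpoint=False)
quiet_mix = 0.1 * np.sin(2 * np.pi * 220 * t)    # stand-in for an unmastered mix
loud_ref = 0.7 * np.sin(2 * np.pi * 220 * t)     # stand-in for a reference master

result = master(quiet_mix, loud_ref)
print(round(rms(quiet_mix), 3), "->", round(rms(result), 3))
</syntaxhighlight>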

= Lyrics generation and songwriting aid =

Natural language generation is also applied to songwriting assistance and lyrics generation. Transformer language models such as GPT-3 can generate stylistically consistent, coherent lyrics from prompts, themes, or moods. There are also AI programs that assist with rhyme schemes, syllable counts, and poetic form.{{Cite journal |last=Vechtomova |first=Olga |title=Lyrics Generation with Neural Networks: Challenges and Opportunities |journal=Transactions of the International Society for Music Information Retrieval |year=2021}}
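
A minimal sketch of prompt-based lyric drafting with the Hugging Face transformers library is shown below. GPT-2 is used only as a small, freely available stand-in, and the prompt and sampling settings are illustrative assumptions rather than those of any production songwriting tool.

<syntaxhighlight lang="python">
from transformers import pipeline, set_seed

# GPT-2 serves here only as a small, freely available stand-in; production
# songwriting tools use larger models, often fine-tuned on lyrics corpora.
generator = pipeline("text-generation", model="gpt2")
set_seed(42)

prompt = "Verse 1 of a melancholy folk song about leaving home:\n"
draft = generator(prompt, max_new_tokens=60, do_sample=True,
                  temperature=0.9, num_return_sequences=1)

print(draft[0]["generated_text"])
</syntaxhighlight>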

= Multimodal and cross-modal systems =

Recent developments include multimodal AI systems that integrate music with other media, e.g., dance, video, and text. These can generate background scores in synchronization with video sequences or generate dance choreography from audio input. Cross-modal retrieval systems allow one to search for music using images, text, or gestures.{{Cite journal |last=Choi |first=Keunwoo |title=Theoretical model for the Seebeck coefficient in superlattice materials with energy relaxation |journal=Journal of Applied Physics |arxiv=1908.01090 |year=2019|volume=126 |issue=5 |doi=10.1063/1.5108607 |bibcode=2019JAP...126e5105V }}

Cultural impact

{{AI-generated|section|date=April 2025|certain=no}}

The advent of AI music has provoked heated cultural debate, particularly over its impact on creativity, ethics, and audiences. While the technology has been praised for democratizing music production, fears have also been raised about its effects on producers, listeners, and society in general.

= Reactions and controversies =

The most contentious application of AI music creation has been its misuse to produce offensive work. AI music platforms have been used in several instances to generate songs with racist, antisemitic, or violent lyrics, testing moderation and accountability on generative AI platforms.{{Cite web |last=Wiggers |first=Kyle |date=2024-06-03 |title=People are using AI music generators to create hateful songs |url=https://techcrunch.com/2024/06/03/people-are-using-ai-music-generators-to-create-hateful-songs/ |access-date=2025-04-04 |website=TechCrunch |language=en-US}} These cases have renewed debate about the responsibility of both users and developers for the outputs of generative models.

In addition, several producers and artists have denounced the use of AI in music, citing threats to originality, craftsmanship, and cultural authenticity. Critics argue that AI-generated music lacks the emotional intelligence and lived experience on which human work relies. These concerns have grown as AI-generated songs appear on streaming platforms in steadily increasing numbers, which some consider a devaluation of human artistry.{{Cite web |last=Segarra |first=Edward |title=Many musicians are speaking out against AI in music. But how do consumers feel? |url=https://www.usatoday.com/story/entertainment/music/2024/05/17/artificial-intelligence-impact-music-listeners/73509270007/ |access-date=2025-04-04 |website=USA TODAY |language=en-US}}

= Musicians vs. consumers =

While professional musicians have generally been more dismissive of using AI in music production, consumers and listeners have tended to be receptive or neutral toward the idea. Surveys have found that, in a commercial context, the average consumer often cannot tell, or does not care, whether they are hearing music made by humans or by AI, and a high percentage report that it does not affect their enjoyment. The contrast between artist sentiment and consumer sentiment may have far-reaching consequences for the future economics of the music industry and the value assigned to human creativity.

= Public perception =

The cultural value placed on AI music is closely related to broader public perceptions of generative AI. How generative AI-produced work, whether music or writing, is received in human terms has been found to depend on factors such as emotional meaning and authenticity.{{Cite journal |last1=Chu |first1=Haoran |last2=Liu |first2=Sixiao |date=2024-10-01 |title=Can AI tell good stories? Narrative transportation and persuasion with ChatGPT |url=https://academic.oup.com/joc/article/74/5/347/7756907 |journal=Journal of Communication |language=en |volume=74 |issue=5 |pages=347–358 |doi=10.1093/joc/jqae029 |issn=0021-9916}} As long as AI output proves persuasive and engaging, audiences may in some cases be willing to accept music whose author is not a human being, potentially reshaping conventions regarding creators and creativity.

Future directions

{{AI generated|section|date=March 2025}}

The field of music and artificial intelligence is still evolving. Key directions for future development include improvements in generative models, changes in how humans and AI collaborate musically, and the creation of legal and ethical frameworks to address the technology's impact.

= Advancements in generation models =

Future research and development is expected to move beyond established techniques such as generative adversarial networks (GANs) and variational autoencoders (VAEs). More recent architectures such as diffusion models and transformer-based networks{{cite arXiv | eprint=2409.03715 | last1=Chen | first1=Yanxu | last2=Huang | first2=Linshu | last3=Gou | first3=Tian | title=Applications and Advances of Artificial Intelligence in Music Generation:A Review | date=2024 | class=cs.SD }} are showing promise for generating more complex, nuanced, and stylistically coherent music. These models may lead to higher-quality audio generation and better long-term structure in music compositions.

= Human-AI collaboration =

Besides the act of generation itself, a significant future direction involves deepening the collaboration between human musicians and AI. Developments increasingly focus on understanding how these collaborations can occur and how they can be facilitated in ethically sound ways.{{Cite journal |last1=Newman |first1=Michele |last2=Morris |first2=Lidia |last3=Lee |first3=Jin Ha |date=7 December 2023 |title=Human-AI Music Creation: Understanding the Perceptions and Experiences of Music Creators for Ethical and Productive Collaboration |publisher=Zenodo |doi=10.5281/zenodo.1026522}} This involves studying musicians' perceptions of and experiences with AI tools to inform the design of future systems.

Research actively explores these collaborative models in different domains. For instance, studies investigate how AI can be co-designed with professionals such as music therapists to act as supportive partners in complex creative and therapeutic processes,{{cite book | doi=10.1145/3613904.3642764 | chapter=Understanding Human-AI Collaboration in Music Therapy Through Co-Design with Therapists | title=Proceedings of the CHI Conference on Human Factors in Computing Systems | date=2024 | last1=Sun | first1=Jingjing | last2=Yang | first2=Jingyi | last3=Zhou | first3=Guyue | last4=Jin | first4=Yucheng | last5=Gong | first5=Jiangtao | pages=1–21 | arxiv=2402.14503 | isbn=979-8-4007-0330-0 }} showing a trend towards developing AI not just as an output tool, but as an integrated component designed to augment human skills.

= Regulatory changes and ethical considerations =

As AI-generated music becomes more capable and widespread, legal and ethical frameworks worldwide are expected to continue adapting. Current policy discussions focus on copyright ownership, the use of AI to mimic artists (deepfakes), and fair compensation for artists.{{Cite web |title=Innovation and Artists' Rights in the Age of Generative AI |website=Georgetown Journal of International Affairs |date=10 July 2024 |url=https://gjia.georgetown.edu/2024/07/10/innovation-and-artists-rights-in-the-age-of-generative-ai/ |access-date=30 March 2025}} Recent legislative efforts and debates, such as those concerning AI safety and regulation in places like California, show the challenges involved in balancing innovation with potential risks and societal impacts.{{Cite web |last1=Mehmood |first1=Irfan |last2=Mahroof |first2=Kamran |date=3 October 2024 |title=California's governor blocked landmark AI safety laws. Here's why it's such a key ruling for the future of AI worldwide |website=The Conversation |url=https://theconversation.com/californias-governor-blocked-landmark-ai-safety-laws-heres-why-its-such-a-key-ruling-for-the-future-of-ai-worldwide-240182 |access-date=30 March 2025}} Tracking these developments is crucial for understanding the future of AI in the music industry.{{Cite web |last=Hight |first=Jewly |date=25 April 2024 |title=AI music isn't going away. Here are 4 big questions about what's next |website=NPR |url=https://www.npr.org/2024/04/25/1246928162/generative-ai-music-law-technology |access-date=30 March 2025}}

See also

References

{{Reflist|30em}}

Further reading

  • [http://www.aaai.org/Press/Books/balaban.php Understanding Music with AI: Perspectives on Music Cognition] {{Webarchive|url=https://web.archive.org/web/20210110041609/https://www.aaai.org/Press/Books/balaban.php |date=2021-01-10 }}. Edited by Mira Balaban, Kemal Ebcioglu, and Otto Laske. AAAI Press.
  • [http://portal.acm.org/citation.cfm?id=647303&coll=GUIDE&dl=GUIDE Proceedings of a Workshop held as part of AI-ED 93], World Conference on Artificial Intelligence in Education on Music Education: An Artificial Intelligence Approach
  • {{Cite book |last=Tanguiane (Tangian) |first=Andranick |date=1993 |title=Artificial Perception and Music Recognition |series=Lecture Notes in Artificial Intelligence |volume=746 |publisher=Springer |location=Berlin-Heidelberg |isbn=978-3-540-57394-4}}

  • [https://www.transcript-verlag.de/media/pdf/6f/85/fd/oa9783839469224SEoXTXMjO0gxZ.pdf Artificial Intelligence - Intelligent Art? Human-Machine Interaction and Creative Practice.] (Digital Society - Digitale Gesellschaft). Edited by Eckart Voigts, Robin Auer, Dietmar Elflein, Sebastian Kunas, Jan Röhnert, and Christoph Seelinger. Bielefeld: transcript, 2024.