AI tool characterizes a song's genre and provides insights regarding the perception of music
Date: August 12, 2019
Source: University of Southern California
Summary: An artificial intelligence tool can characterize a song's genre and provide a better understanding of how we perceive and process music. Applications include how music content is marketed, consumed and tagged; neuropsychology and the mechanisms of human thought; and affective computing systems that impact human emotions.
The debate can finally be put to rest: Lil Nas X's record-setting, chart-topping hit "Old Town Road" is indeed country. But it's also a little rock 'n' roll. And when you analyze the lyrics and chords together, it's straight-up pop.
At least, that's according to an artificial intelligence tool developed by USC computer science PhD student Timothy Greer. Greer's method automatically predicts music genres by analyzing how lyrics and chords interact with one another throughout the song.
The method classified "Old Town Road" as country according to the lyrics; rock according to the chords (based on a Nine Inch Nails music sample); and pop according to the chords and lyrics combined.
The paper, titled "Using Shared Vector Representations of Words and Chords in Music for Genre Classification," will be presented at the Speech, Music and Mind 2019 conference on Sept. 14.
A Very Human Experience
"'Old Town Road' is an interesting song," said Greer, a lifelong musician who currently plays saxophone and keyboard in an LA-based band (music genre: indie rock).
"The lyrics are steeped in the country genre, but the chords and the instrumentation don't sound like country at all. The algorithm highlights the complexity of music, both in terms of how the music is constructed and how it is perceived, in other words, how people process it."
This effort in music research -- to computationally understand the stories we tell with music, and how people experience and are influenced by it -- is part of a larger research program in Computational Media Intelligence at USC's Signal Analysis and Interpretation Laboratory (SAIL).
"Music construction and perception are related, but they are not one and the same," said Greer's supervisor and paper co-author Shrikanth Narayanan.
Narayanan, SAIL director and the Niki and Max Nikias Chair and Professor of Electrical and Computer Engineering, has previously analyzed vocal patterns of beatboxers and opera singers using MRI scans, predicted violence ratings using movie scripts and developed technology that uses voice to assess speaker emotions. He said he is excited about this new research because it's a new way of analyzing music computationally and could reveal unexpected patterns.
"We always say there is no hard-set rule for human experiences of music," said Narayanan, a classical music enthusiast who plays the violin and the veena, an Indian stringed instrument. "AI and machine learning can provide a lens from which to look at this very human experience."
A New Sound
"Old Town Road," which has now been at the top of the charts for 18 weeks, is notable for the way it blends genres. It has been one of the most hotly debated topics in the pop world this summer, and everyone seems to have a different opinion -- is it country, pop or rock? Or something else altogether?
In April 2019, the song was removed from the Billboard Hot Country chart because it did "not embrace enough elements of today's country music to chart in its current version," according to a Billboard statement.
Greer put the song to the test with three models he had developed to predict genre: one using only chord embeddings, one using only lyric embeddings and one using chord and lyric embeddings combined. He trained the system on a dataset of 190,165 musical segments from 5,304 pop songs with lyrics and corresponding chords.
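To make the three-model setup concrete, here is a minimal sketch in Python. It is not Greer's method: the paper uses learned vector embeddings, while this sketch swaps in a much simpler bag-of-tokens, nearest-centroid classifier to show how the same song segment can be scored on chords only, lyrics only, or both together. The toy segments, genre labels and function names (`features`, `train`, `predict`, `classify_all`) are all hypothetical and are not drawn from the study's dataset.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy segments: (chords, lyrics, genre). Illustrative only,
# not taken from the study's dataset.
SEGMENTS = [
    (["G", "C", "D"], "truck road horse boots", "country"),
    (["G", "D", "C"], "horse farm road whiskey", "country"),
    (["E5", "A5", "B5"], "fire night loud scream", "rock"),
    (["A5", "E5", "B5"], "loud guitar night burn", "rock"),
    (["C", "Am", "F"], "love dance tonight baby", "pop"),
    (["F", "C", "Am"], "baby heart dance love", "pop"),
]

MODES = ("chords", "lyrics", "both")


def features(chords, lyrics, mode):
    """Count tokens for one segment; `mode` selects which modality is used."""
    tokens = []
    if mode in ("chords", "both"):
        tokens += ["chord:" + c for c in chords]
    if mode in ("lyrics", "both"):
        tokens += ["word:" + w for w in lyrics.split()]
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def train(segments, mode):
    """Sum the feature counts of each genre's segments into one centroid."""
    centroids = {}
    for chords, lyrics, genre in segments:
        centroids.setdefault(genre, Counter()).update(features(chords, lyrics, mode))
    return centroids


def predict(centroids, chords, lyrics, mode):
    """Return the genre whose centroid is most similar to the segment."""
    vec = features(chords, lyrics, mode)
    return max(centroids, key=lambda genre: cosine(vec, centroids[genre]))


# One model per modality, mirroring the three models described above.
models = {mode: train(SEGMENTS, mode) for mode in MODES}


def classify_all(chords, lyrics):
    """Run all three models on one segment and collect their predictions."""
    return {mode: predict(models[mode], chords, lyrics, mode) for mode in MODES}
```

Calling `classify_all(["E5", "A5", "B5"], "horse road truck boots")` runs a segment with rock-style chords and country-style words through all three models, so the chord-only and lyric-only predictions can disagree, much as they did for "Old Town Road."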
While most genre prediction tools use a song's entire audio file, which means retrieving and processing a high-quality recording, Greer's method can classify genre using only chords and lyrics, which are usually available online with a quick Google search.
"This interplay between chord sequences and lyric sequences may give us a better glimpse into how we perceive genre than using either alone, although both of these modalities contain useful information alone, as well," said Greer.
The study gives a better understanding of how we perceive and process music, specifically how human perception -- and categorization -- of music genre differs depending on the "looking glass" used.
Applications include how music content is marketed, consumed and tagged; neuropsychology and the mechanisms of human thought; and affective computing systems that impact human emotions.