Researchers at French music streaming service Deezer have published a paper which describes attempts to classify musical moods using deep learning algorithms.
The authors sought an alternative to the conventional approaches to music mood analysis of the last several decades. Most traditional approaches rely on feature engineering, in which researchers hand-design the audio and text features that a machine learning system then uses to estimate the mood of a song or musical piece.
The researchers worked with a dataset of over 18,000 songs, populated with metadata from the Million Song Dataset (MSD), which the authors claim is one of the largest music mood detection datasets ever proposed. They first analyzed audio and lyrics separately, using datasets that associate words and musical features with emotional arousal characteristics. The group then combined audio and lyrics into a fusion model.
The researchers used about 60% of the dataset to train the model, then tested it on the remaining 40%. The group said its model improved on conventional methods that use feature engineering.
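To make the 60/40 split and the idea of fusing two models concrete, here is a minimal Python sketch. It is not the researchers' code: the track count of exactly 18,000, the function names `split_60_40` and `late_fusion`, the example prediction values, and the choice of a simple weighted average as the fusion step are all assumptions made for illustration. The paper's actual fusion model may combine the two inputs differently.

```python
import random

def split_60_40(items, seed=0):
    """Shuffle a copy of items and return (train, test) in a 60/40 ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = len(shuffled) * 3 // 5           # integer arithmetic for exactly 60%
    return shuffled[:cut], shuffled[cut:]

def late_fusion(audio_pred, lyrics_pred, audio_weight=0.5):
    """Hypothetical fusion step: weighted average of two mood predictions.

    Each prediction is a tuple of scores, one per mood dimension.
    """
    w = audio_weight
    return tuple(w * a + (1 - w) * l for a, l in zip(audio_pred, lyrics_pred))

# Hypothetical track identifiers standing in for the dataset of 18,000+ songs.
track_ids = list(range(18000))
train_ids, test_ids = split_60_40(track_ids)

# Made-up outputs from an audio-only model and a lyrics-only model.
fused = late_fusion((0.8, 0.2), (0.4, 0.6))
print(len(train_ids), len(test_ids), fused)
```

Keeping the test portion separate from the training portion is what lets the reported improvement be measured on songs the model has never seen.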
Multimodal music mood prediction, which draws on both audio and lyrics, is an integral part of music information retrieval (MIR), a field that is becoming increasingly important with the growth of streaming services that must automatically process massive music collections.
