Here at musicmetric we've been researching the intriguing puzzle of inferring similarity between artists and attaching a quantitative value to the match. Many methods already exist: tag-based metrics, crowd-sourced recommendations, manual annotation and waveform analysis, each of which performs well under certain conditions. Several of these are used by successful music recommendation websites and work well for more popular artists, but they can give odd suggestions for less well-known acts. The exception is waveform analysis, which instead suffers from a lack of sources for the music and is computationally expensive.
Our business is based on analysing all artists, including those in the long tail of popularity: up-and-coming acts of every genre that may not be as well tagged as high-profile ones. Since knowing the similarity between artists is also essential for some of our analytics tools, we decided it was worth developing a custom method for inferring it.
At present we are experimenting with a modified version of an iterative network ranking algorithm, of the kind more usually employed by search engines to rank the relevance of results. We have combined it with machine learning algorithms trained to spot relevant features in the text gathered by our web crawlers and to correctly identify the subject of that text. This allows us to accurately classify artists into parent and child classes; an artist can partially belong to multiple classes, each membership weighted by its relevance. This data is then fed into a clustering algorithm that gauges how similar two artists are.
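To make the last step concrete, here is a minimal sketch of how weighted class memberships can be compared. The class names, weights and the choice of cosine similarity are illustrative assumptions for this post, not the actual musicmetric implementation:

```python
from math import sqrt

# Hypothetical weighted class memberships for two artists
# (class names and weights are made up for illustration).
artist_a = {"indie rock": 0.7, "folk": 0.2, "electronic": 0.1}
artist_b = {"indie rock": 0.5, "folk": 0.4, "shoegaze": 0.1}

def cosine_similarity(a, b):
    """Cosine similarity between two sparse weight vectors,
    represented as dicts mapping class name -> weight."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b)

print(round(cosine_similarity(artist_a, artist_b), 3))  # prints 0.903
```

A score near 1 means two artists occupy much the same region of class space; scores near 0 mean they share almost no weighted classes, which is one simple way a clustering step could gauge similarity.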