Posts Tagged ‘music analytics’


Lily Allen and the worthless BRIT…

APR. 28
2010

Remember earlier this year when Lily Allen won the BRIT Award for British Female Solo Artist?

You don’t?

Well whatever…she did and at the time she was pleased as punch.

But now it appears she’s not…. apparently the BRIT Awards are one big record label conspiracy…

So was the BRIT Award worthless? Well just look what it did to her Social Network Buzz…

The BRIT Effect

I put that big arrow there so you can see the effect… or lack of it.

Similarly there was little change in her MySpace Views & Plays. In fact the BRIT Awards coincide with a drop in plays & views but I would put that down to a regular fluctuation rather than something caused by the Award show.

MySpace Ain't Moving

There is one site where the award had an effect… or should I say Tweffect?

No. I don’t think I should…

Twimeline of Twollowers

As you can see Lily had a steady increase of followers from the day of the Award Show onwards; although to be fair this was probably caused by her performance at the Awards rather than the Award itself.

Alternatively it could have been her drunken acceptance speech which made people think “Hey, Lily’s pretty outspoken and entertaining. I wonder if she’s on Twitter?” In fact the second arrow shows the start of a peak which is more than likely caused by Lily’s spat with Courtney Love at the NME Awards.

What do you think? Are award shows relevant anymore? Or are they just a place for celebrities to get drunk and embarrass themselves (and then reap the publicity benefits…)?

Answers on a postcard please!

Analysing trends over time with musicmetric

DEC. 13
2009

In this blog post we’re going to look at an example of some of the data mining and large scale analysis which we do at musicmetric, detecting patterns and similarities in time series data.

One use of this analysis is that given an artist, we can find another artist with the closest trend in some variable over time – for example MySpace plays per hour. Alternatively we could generate a list of artists who are increasing in popularity in a certain way, or show which artists have had a brief surge in activity – maybe caused by an album release or gig.

Because we store all the data indefinitely and in such a way that we can access it very rapidly, we can run regular batch analysis on the contents of our data warehouse to unlock interesting information.

In this example, we will compare the play count time series data for the top 20,000 artists by total plays on MySpace. It is important to consider that some trends may follow each other with a time lag, so we compare the 20K time series at multiple time lags from 0 to 30 days in the past, in 1 day increments. This means our analysis servers must do approximately 6 billion time series comparisons for this particular problem (about 200 million artist pairs at each of 31 lags), each one comparing hourly resolution data over a period of 4 months.
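The comparison count quoted above can be checked with a little arithmetic, and the lagged comparison itself can be sketched in a few lines. This is only an illustration: it assumes Pearson correlation as the similarity measure, which the post does not specify, and the function names (`comparison_count`, `pearson`, `best_lag`) and the toy series are made up for the example.

```python
from math import comb, sqrt


def comparison_count(n_artists, max_lag_days):
    """Unordered artist pairs, each compared at lags 0..max_lag_days."""
    return comb(n_artists, 2) * (max_lag_days + 1)


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_lag(leader, follower, max_lag):
    """Return (lag, correlation) for the lag at which `follower` best matches
    `leader` shifted `lag` steps into the past."""
    best = (0, -1.0)
    for lag in range(max_lag + 1):
        # Line up follower[t] with leader[t - lag].
        a = leader[: len(leader) - lag] if lag else leader
        b = follower[lag:]
        if len(a) < 2:
            break
        r = pearson(a, b)
        if r > best[1]:
            best = (lag, r)
    return best


# Toy data: the follower repeats the leader's pattern three steps later.
leader = [1, 2, 5, 9, 4, 2, 1, 1, 3, 8, 6, 2, 1, 1]
follower = [1, 1, 1] + leader[:-3]

total = comparison_count(20_000, 30)    # 20,000 artists, lags of 0 to 30 days
lag, r = best_lag(leader, follower, 5)  # recovers the three-step lag
```

With hourly data, a lag of one day corresponds to a shift of 24 steps, so a real run would step the lag in multiples of 24 rather than one sample at a time.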

Let’s take a look at which artist has a similar trend to Kings of Leon:

Kings of Leon and The Fray - MySpace Plays Per Hour


We can see the plays per hour for The Fray seem to be following a similar long term trend to that of Kings of Leon, but offset by the difference in their popularity on MySpace – although they are converging as time goes on. The peaks and troughs also line up, so clearly the fine resolution hourly variation in the data has something to do with the overall use of MySpace at any period in time, not just the popularity of the artist. This is something that can be seen over most MySpace data.

Now let’s look at two artists who have even more similar plays per hour to each other:

Dido and The Clash - MySpace Plays Per Hour


The Clash and Dido show very high similarity for plays per hour on MySpace over the time frame shown in the chart above. A lot of this will have to do with the overall use of MySpace at any period of time, and the fact that the two artists have not had a lot of activity during that period to make their play counts diverge from each other.

Finally, we’ll search for artists that show similar short term peaks to one another. In this case Muse was flagged as a high match for 50 Cent in September 2009, as is clear in the chart below:

Muse and 50 Cent - MySpace Plays Per Hour


If we look at their discographies, we discover that both Muse and 50 Cent made a release on the same day in September.

We’ll investigate the different reasons why two artists might have similar trends to each other in another blog post, so check back soon!

Twitter Filtering

DEC. 4
2009

In this blog post we’re going to show you an important feature that helps distinguish the quality of data supplied by musicmetric: the ability to work out whether mentions of an artist whose name is a common word are in fact referring to the artist, and likewise to distinguish between two artists that have the same name.

These methods are applicable to any text based data, but for this example we’ll take a look at Twitter.

Musicmetric collects all mentions of an artist on Twitter. Taking the example of the rock band Oasis, we collect tweets in the following 3 categories:

  • name mentions: “Oasis”
  • replies: “@Oasis”
  • retweets: “RT @Oasis”

If the artist does not have a twitter ID, we still track their name mentions – and we are currently tracking over 500,000 artists.

It is obvious that all replies and retweets are definitely relevant to the band but some name mentions are probably not. When people post a tweet which includes the word “Oasis”, they might mean the rock band Oasis, an isolated area of vegetation and water in a desert, or just the name of a random bar or restaurant. It would therefore be naive to collect tweets without filtering them, because the trend data would not reflect the real popularity of the band Oasis on Twitter.

These name mentions are important since a lot of the time people will not cite the @username of the artist when referring to them on twitter (as can be seen in the examples below) and of course, not all bands even have a twitter ID.

At musicmetric, we have developed proprietary algorithms to deal with irrelevant tweets effectively. We analyse all tweets and successfully filter out irrelevant messages by assigning a probability that the tweet is relevant to that particular artist.
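Our actual algorithms are proprietary, so the sketch below is not what runs on our servers. It is a generic textbook technique (naive Bayes with add-one smoothing) that shows what "assigning a probability that the tweet is relevant" can look like. The class name `RelevanceModel`, its methods and the hand-labelled training tweets are all invented for illustration.

```python
from collections import Counter
from math import exp, log


def tokenize(text):
    """Lowercase, split on whitespace and strip common punctuation."""
    words = (w.strip(".,!?\"'()#:;") for w in text.lower().split())
    return [w for w in words if w]


class RelevanceModel:
    """Toy naive Bayes: P(tweet is about the artist | words in the tweet)."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, text, relevant):
        self.word_counts[relevant].update(tokenize(text))
        self.doc_counts[relevant] += 1

    def probability_relevant(self, text):
        # Assumes at least one training tweet of each kind has been seen.
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total_docs = sum(self.doc_counts.values())
        log_score = {}
        for label in (True, False):
            counts = self.word_counts[label]
            n_words = sum(counts.values())
            # Class prior, then add-one (Laplace) smoothed word likelihoods.
            score = log(self.doc_counts[label] / total_docs)
            for word in tokenize(text):
                if word not in vocab:
                    continue  # words never seen in training carry no evidence
                score += log((counts[word] + 1) / (n_words + len(vocab)))
            log_score[label] = score
        # Turn the two log scores into a probability for "relevant".
        return 1 / (1 + exp(log_score[False] - log_score[True]))


model = RelevanceModel()
model.train("Oasis were the best band of the 90s", True)
model.train("listening to the new Oasis album on repeat", True)
model.train("Liam and Noel should get Oasis back on tour", True)
model.train("found an oasis in the desert after hours of sand", False)
model.train("meeting friends for drinks at the Oasis bar tonight", False)
model.train("this spa is an oasis of calm", False)

p_band = model.probability_relevant("Oasis album and tour rumours")
p_other = model.probability_relevant("a desert oasis at the spa bar")
```

Tweets scoring below a chosen threshold would be dropped from the trend data. Where to set that threshold is a trade-off between losing genuine mentions and letting noise through.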

The table below shows a good example of our algorithm’s effectiveness:

Filtering tweets about the band "Oasis"

Even though there are still a few irrelevant tweets (highlighted red) and some vague tweets whose relevance we cannot determine (highlighted blue), the accuracy has improved a lot in comparison to the raw data. Currently, for bands or artists who have very common names like Oasis, our model can filter out up to 70%-80% of irrelevant tweets. For bands or artists who have distinct names like Lady Gaga or Robbie Williams, the model can filter out up to 95%-100% of irrelevant tweets.

The chart below shows the number of tweets mentioning Oasis per hour before and after being filtered. You can see a big difference and that is why the filter is very important.

Filtered and unfiltered tweets mentioning "Oasis"

We are still collecting more data and adding more valuable information to our model. Therefore it is expected to work more and more accurately – it learns as it goes, and it can read 96 Million tweets per day, so it learns very quickly.

Why not check some live stats for your bands by registering for a musicmetric Essentials trial?

Trung

Interesting facts: Happiness on twitter by day

DEC. 4
2009

Not that relevant to music, but this graph is pretty cool. We ran a really basic text extraction on 11 Million tweets logged by our servers during the past week, and plotted the proportion of messages each day that contain ‘:)’

It’s been corrected for varying popularity of twitter on different days.
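For the curious, the calculation is roughly the following. This is a minimal sketch rather than our production code: the function name `smiley_share_by_weekday` and the six sample tweets are made up, and a real run would stream tweets from storage rather than hold them in a list.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def smiley_share_by_weekday(tweets):
    """Map weekday name -> fraction of that day's tweets containing ':)'.

    `tweets` is an iterable of (date, text) pairs. Dividing by each day's
    total is what corrects for some days simply having more tweets.
    """
    totals = defaultdict(int)
    smiles = defaultdict(int)
    for day, text in tweets:
        name = WEEKDAYS[day.weekday()]
        totals[name] += 1
        if ":)" in text:
            smiles[name] += 1
    return {name: smiles[name] / totals[name] for name in totals}


# Invented sample: two tweets on a Friday, four on a Saturday.
sample = [
    (date(2009, 11, 27), "long week :("),
    (date(2009, 11, 27), "nearly the weekend :)"),
    (date(2009, 11, 28), "gig tonight :)"),
    (date(2009, 11, 28), "lie in :)"),
    (date(2009, 11, 28), "raining again"),
    (date(2009, 11, 28), "new album out :)"),
]
shares = smiley_share_by_weekday(sample)
```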

Saturday is a happy day, and it’s tomorrow – so cheer up!

I should mention, our sentiment analysis algorithms at musicmetric are rather more advanced than this :-)

A brief look at musicmetric

DEC. 1
2009

In this post we’re going to give a quick fire tour of some charts you can see in our app, demonstrating some of the main functions and how they can be used.

Let’s start off with the big picture. Online Buzz gives an indicator of how many people are talking about an artist on the web. We use clever machines that learn how to cut through the noise and only detect the artist in question.

The chart below shows how the Online Buzz for the band Muse changed since 2006. It shows the number of comments per day about Muse, compared to the overall number of comments about bands.

Muse - Online Buzz since 2006

If we zoom in to the last 6 months as is shown below, we can see the online buzz for Muse has been pretty constant, with a slight increase overall:

Muse - Online Buzz since June 2009
If you need a more granular view than Online Buzz, you can check what’s happening on some music social networks in the Social Networks section.

So, below are the MySpace Views and Plays per hour for Muse; the big spike in September shows when they released their single “Uprising”. The peak immediately after that one was the album release:

Muse - MySpace Plays and Views
These charts show a 24 hour moving average for Plays and Views per hour.
That means we take the average number of plays or views for the last 24 hours and plot that on the graph.
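The moving average described above can be sketched as below. The function name `trailing_moving_average` is made up for the example, and averaging over a shorter window at the very start of the series (where fewer than 24 hours of data exist) is an assumption about how the first few points are handled.

```python
from collections import deque


def trailing_moving_average(values, window=24):
    """Average of the current value and the preceding `window - 1` values.

    For the first few points, where fewer than `window` values exist,
    the average is taken over whatever is available so far.
    """
    recent = deque(maxlen=window)
    running_total = 0.0
    smoothed = []
    for value in values:
        if len(recent) == window:
            running_total -= recent[0]  # value about to drop out of the window
        recent.append(value)
        running_total += value
        smoothed.append(running_total / len(recent))
    return smoothed


# Hourly play counts in, one smoothed value per hour out.
hourly_plays = [120, 80, 60, 90, 150, 300, 280, 200]
smoothed_plays = trailing_moving_average(hourly_plays, window=24)
```

Keeping a running total means each new hour costs one subtraction and one addition, rather than re-summing the whole window.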

This gives a better visualisation of the trend as the raw data can be confusing. Below (in red) we can see what the raw data looks like without the moving average overlaid:

Muse - MySpace Plays raw data
Remember, musicmetric isn’t just limited to superstar bands like Muse. Let’s take a look at some stats for Master Shortie – an up and coming London rapper.

Here is a view of where people follow Master Shortie online:

Master Shortie - Social Network Fan Locations
Looking at some data about those fans, we can see Master Shortie is pretty popular with the ladies:

Master Shortie - Gender Breakdown
And their age profile fits a distribution around the 18 year old mark:

Master Shortie - Age Breakdown
Now let’s drill down a bit to see where their MySpace fans live.

The chart below shows that fans of Master Shortie on MySpace are located mainly in the USA and UK:

Master Shortie - Top Cities for MySpace Fans
The overall user demographic of MySpace is pretty biased towards these two countries, so let’s check out the top cities for fans of Master Shortie on Twitter:

Master Shortie - Top Cities for Twitter Followers
Nine of the top 10 cities for locations of fans of Master Shortie on Twitter are in the UK, with only New York showing up for the USA.
Now let’s look at where Master Shortie’s Twitter fans live on a map of the world:

Master Shortie - Twitter Fan Locations Map

Each one of those circles represents one or more downloads. When you hover over a circle in the musicmetric application with your mouse, you can see an instant pop-up of where and how many downloads the circle represents. It even tells you the exact time a download was made.

The darker and more solid the colour, the more downloads are being overlaid onto the same area, giving a really good indication of popularity by region.

Here is the same map for the location of Master Shortie’s fans, this time on MySpace:

Master Shortie - MySpace Fan Location Map
Now let’s look at the most influential people relevant to Master Shortie on Twitter.
This will tell you the most relevant people on Twitter to target with marketing material, because they actually care about the artist in question, and are very influential in those circles.

We don’t just calculate this based on the number of followers each person gets, but the number of followers their followers get, and so on.

If that doesn’t make sense, imagine it works a bit like the Google PageRank algorithm, because it does. Someone with a million spam bots following them will have a lower rank than another person who’s only being followed by a few very influential people (like a music magazine or a record label).
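To make the idea concrete, here is a small sketch of textbook PageRank applied to a follower graph. It is not our actual ranking code: the function name `influence_rank`, the damping factor of 0.85 (the value conventionally used with PageRank) and the tiny invented graph are all assumptions for illustration.

```python
def influence_rank(follows, damping=0.85, iterations=50):
    """PageRank-style scores for a follower graph.

    `follows` maps each user to the set of users they follow. A user's
    score is shared equally among the accounts they follow, so a follow
    from a high-scoring account is worth more than one from a low scorer.
    """
    users = set(follows)
    for followed in follows.values():
        users |= set(followed)
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Users who follow nobody spread their score evenly over everyone.
        dangling = sum(rank[u] for u in users if not follows.get(u))
        new_rank = {u: (1 - damping) / n + damping * dangling / n for u in users}
        for user, followed in follows.items():
            for target in followed:
                new_rank[target] += damping * rank[user] / len(followed)
        rank = new_rank
    return rank


# Invented graph: ten fans follow a magazine and a label, which both follow
# "tastemaker"; five bots (followed by nobody) follow "spam_magnet".
follows = {f"fan{i}": {"magazine", "label"} for i in range(10)}
follows.update({f"bot{i}": {"spam_magnet"} for i in range(5)})
follows["magazine"] = {"tastemaker"}
follows["label"] = {"tastemaker"}

scores = influence_rank(follows)
```

In this toy graph "tastemaker" has only two followers against five for "spam_magnet", but ends up with the higher score because those two followers are themselves widely followed.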

Master Shortie - Top Twitter Influencers

Let’s move on to BitTorrent data now, and take a look at some charts for Robbie Williams.

The chart below shows the number of peers per hour connected to the torrents for the single Bodies and the new album Reality Killed the Video Star. Just so you know, our BitTorrent data is anonymous and aggregated to the city level. Tracking individuals isn’t our game.

Robbie Williams - BitTorrent Peers Over Time
And here is the map of locations of people downloading the torrents at 7:00pm yesterday (30th November 2009):

Robbie Williams - BitTorrent Peers Map Snapshot
Now prepare yourself for the all time cumulative map for BitTorrent downloads of Robbie Williams – Reality Killed the Video Star:

Robbie Williams - BitTorrent Peers Map All Time
Clearly Robbie is very popular worldwide, so let’s get a closer look below at the largest solid coloured area in the UK and Europe:

Robbie Williams - BitTorrent Peers Map All Time Zoomed Into UK
To clearly see the top cities, a table is more suitable. Below are the top cities for Robbie Williams – Bodies on BitTorrent:

Robbie Williams - BitTorrent Top Cities
So there you have it!

These were just some of the top functions currently launched in our beta version of musicmetric.

Get ready for our full launch over the next few weeks as we’ll be unveiling a rocking host of extra functions, including twitter activity, results from wider ranging web crawls, sentiment analysis for tracks and artists, more social networks, authority ranking for all sources of data, and individual song tracking.

Plus, we’ll be revealing our advanced analytics functions which allow the whole collection of data to be probed in more detail, picking out patterns, similarities, trends and more.

Our development cycle has been insane and it’s really ramping up now! We’ve hired more full time developers, upgraded our data centre, bought dozens more servers, hundreds of TB of storage… We’re just about ready to explode with data, and we love it.

Keep checking back because the updates will keep coming, and if you just can’t wait then register now to begin tracking everything in real time with a free demo of musicmetric essentials.

Scope of musicmetric analytics

OCT. 19
2009

An update from the development team…

Our aim at musicmetric is quite simple: We will collect and analyse all the data on the web (and some that isn’t) related to trends in music and present it to our users in an easily accessible and actionable format. Over the next few months we will have downloaded and analysed a large proportion of all relevant published articles, and will continue to do so as they are written to keep right up to date with opinions, trends and buzz.

Our aims are simple, but the challenges we’ve faced over the last year and a half approaching our launch have been far from trivial, and hopefully this post will give some insight into the technical side of what we’re doing.

Gathering the data, although the easy part, needs an extensive hardware infrastructure to download, extract and archive text from millions of pages a month. Accurately analysing, scaling and detecting patterns in the data locked up in these terabytes of text is the real challenge and the most interesting part of working on musicmetric. It would be naive to simply present raw data as trends in the global music landscape (although we do supply raw data). The trend tracking methods we have developed would be useless if not scaled by accurate influence ranking for the sources of these trends, and simply calculating these scores is a huge task in itself.

Likewise, following activity on just one or two social media websites and presenting this as trends would give a massively biased view of where an artist is actually popular. For example, the social media website Orkut is hugely popular in Brazil, so all data originating from this website would be biased towards that country. Likewise with Twitter, trends would lean towards the UK / USA and not necessarily reflect a global view. We are rolling out tracking for multiple social networks over the next month.

Another challenge has been developing our methods for text mining and sentiment analysis (and not just because we need to analyse over a million documents per day). An example would be the band Pavement. How does a machine know if a piece of text is referring to the band, or a pavement alongside a road? What about two artists with the same name? There are three artists that go by the name Nirvana, and seven called Justice. Which one does our customer care about? Perhaps all of them? Disambiguation is key for these applications to work correctly. The methods we use for sentiment analysis also have to cope with changing vocabulary, or even different languages, so adaptive methods are key. For this reason we employ a machine learning approach to this problem, which again has taken a long time in development.

Because we know our customers are using this data to make important decisions in how they run their business or manage their artists, we are making absolutely sure that the data is reliable, trustworthy and complete. Traceability of data sources is paramount to reliability. Our infrastructure allows full audit of any piece of data at any time, from how it was scaled or normalised, right back to which one of our servers originally collected the raw version. This is important for a variety of reasons, particularly the ability to show exactly why trends are occurring, and improves trust in our analytics. It is one thing displaying a line chart or an index showing success for an artist; it is quite another presenting a full breakdown of each source of data and how it was included in the analysis, giving clear perspective on how that line chart or index was calculated.

musicmetric is a well funded team of 6 full-time staff (and growing) with extensive backgrounds and deep knowledge in the field. We are using cutting edge technology and working closely with our partners to solve difficult problems, and have spent the last year and a half working these out. We are extremely excited to be coming towards the end of our development / alpha stage and into our official beta, then preparing for our full launch in November.

Lady Gaga huge jump in online views

OCT. 15
2009

Lady Gaga’s recent involvement in the Gay Rights march on Washington DC resulted in this huge increase in views and plays per hour.




It’s not just superstar artists like Lady Gaga who we track – why not check out a demo of musicmetric Essentials and see the stats for the 500,000+ artists we’re currently tracking!