Merry Christmas everyone!
Check out this chart showing when people are wishing Merry Christmas to the world on Twitter.
The chart shows the number of Merry Christmas tweets per 10 minute time interval since the 22nd December.
Musicmetric tracks what is happening to music online. We do this by data mining the web: we crawl and analyse tens of thousands of pages per day, and monitor thousands of live data sources and p2p networks to deliver a fully featured music analytics platform.
The other day we made a blog post about Twitter activity per hour for Rage Against The Machine – here is an update, and a comparison to Twitter activity for Joe McElderry.
Who’s going to get the No.1 single this Christmas?
We’re very proud to announce that musicmetric will be supplying top 10 charts for online activity to the UK music industry publication MusicWeek in the New Year.
The partnership was announced this week on page 2 in the magazine, along with a great article outlining what we do at musicmetric – so go out and buy your copy now, check out their website: www.musicweek.com or follow them on Twitter at @MusicWeekNews
Check back in the New Year to see the charts in print and online.
Our development team are proud to announce that the musicmetric hardware infrastructure and virtualisation set-up they designed has been immortalised in the book Practical Virtualization Solutions!
It is possibly one of the geekier books to be featured in: one chapter describes how we utilise Sun hardware and virtualisation to maximise the power of our infrastructure.
If you want to read all about it, then you can purchase a copy here: http://bit.ly/8i4Xzl :-)

If you’ve been reading the news recently, you’ll know that Rage Against The Machine are heading to beat X Factor’s Joe McElderry for the Christmas number one.
Check out the recent activity on Twitter for people tweeting about Rage Against The Machine. The chart below shows number of mentions per hour:
Notice the daily variation due to time zones causing regular peaks when people are awake and tweeting.
Will tomorrow’s peak be even bigger than today’s? Sign up for a free musicmetric Essentials demo and find out :-)
In this blog post we’re going to look at an example of some of the data mining and large scale analysis which we do at musicmetric, detecting patterns and similarities in time series data.
One use of this analysis is that given an artist, we can find another artist with the closest trend in some variable over time – for example MySpace plays per hour. Alternatively we could generate a list of artists who are increasing in popularity in a certain way, or show which artists have had a brief surge in activity – maybe caused by an album release or gig.
Because we store all the data indefinitely and in such a way that we can access it very rapidly, we can run regular batch analysis on the contents of our data warehouse to unlock interesting information.
In this example, we will compare the play count time series data for the top 20,000 artists by total plays on MySpace. It is important to consider that some trends may follow each other with a time lag, so we compare the 20K time series at multiple time lags from 0 to 30 days in the past, in 1 day increments. This means the approximate number of time series comparisons our analysis servers must do for this particular problem is 6 Billion, each one comparing hourly resolution data over a period of 4 months.
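A minimal Python sketch of this kind of lagged comparison, not the musicmetric implementation. It assumes Pearson correlation as the similarity measure and two equal-length hourly series; the function names are made up for illustration. The last helper shows one way to arrive at roughly 6 billion comparisons: every unordered pair of 20,000 series, at 31 lags (0 to 30 days).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_lag(a, b, max_lag_days=30, samples_per_day=24):
    """Find the lag (in whole days) at which series a best follows series b.

    At each lag, a[t] is compared with b[t - lag], i.e. b shifted into the past.
    Returns (lag_days, correlation) for the highest correlation found.
    """
    best = (0, -1.0)
    for lag in range(max_lag_days + 1):
        shift = lag * samples_per_day
        if shift >= len(a) - 1:
            break  # not enough overlapping samples left to compare
        r = pearson(a[shift:], b[:len(b) - shift])
        if r > best[1]:
            best = (lag, r)
    return best


def comparison_count(n_series, n_lags):
    """Number of comparisons for every unordered pair of series at every lag."""
    return n_series * (n_series - 1) // 2 * n_lags


print(comparison_count(20_000, 31))  # 6199690000, i.e. roughly 6 billion
```

If the direction of the lag matters (A following B is treated separately from B following A), the count of pairs would double, so the figure above is only one reading of the estimate.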
Let’s take a look at which artist has a similar trend to Kings of Leon:
We can see the plays per hour for The Fray seem to be following a similar long term trend to that of Kings of Leon, but offset by the difference in their popularity on MySpace – although they are converging as time goes on. The peaks and troughs also line up, so clearly the fine resolution hourly variation in the data has something to do with the overall use of MySpace at any period in time, not just the popularity of the artist. This is something that can be seen over most MySpace data.
Now let’s look at two artists who have even more similar plays per hour to each other:
The Clash and Dido show very high similarity for plays per hour on MySpace over the time frame shown in the chart above. A lot of this will have to do with the overall use of MySpace at any period of time, and the fact that the two artists have not had a lot of activity during that period to make their play counts diverge from each other.
Finally, we’ll search for artists that show similar short term peaks to one another. In this case Muse was flagged as a high match for 50 Cent in September 2009, as is clear in the chart below:
If we look at their discographies – we discover that both Muse and 50 Cent made a release on the same day in September.
We’ll investigate the different reasons why two artists might have similar trends to each other in another blog post, so check back soon!
In this blog post we’re going to show you an important feature that helps distinguish the quality of data supplied by musicmetric: the ability to disambiguate whether mentions of an artist with a common word as their name are in fact referring to the artist, and likewise to distinguish between two artists that have the same name.
These methods are applicable to any text based data, but for this example we’ll take a look at Twitter.
Musicmetric collects all mentions of an artist on Twitter. Taking the example of the rock band Oasis, we collect tweets in the following 3 categories:
If the artist does not have a twitter ID, we still track their name mentions – and we are currently tracking over 500,000 artists.
It is obvious that all replies and retweets are definitely relevant to the band, but some name mentions are probably not. When people post a tweet which includes the word “Oasis”, they might mean the rock band Oasis, an isolated area of vegetation and water in a desert, or just the name of a random bar or restaurant. Therefore it would be naive to collect tweets without filtering them, because the resulting trend data would not reflect the real popularity of the band Oasis on Twitter.
These name mentions are important since a lot of the time people will not cite the @username of the artist when referring to them on twitter (as can be seen in the examples below) and of course, not all bands even have a twitter ID.
At musicmetric, we have developed proprietary algorithms to deal with irrelevant tweets effectively. We analyse all tweets and successfully filter out irrelevant messages by assigning a probability that the tweet is relevant to that particular artist.
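The algorithms themselves are proprietary and not described here. Purely as an illustration of what "assigning a probability that the tweet is relevant" can look like, here is a small naive Bayes sketch in Python. The technique, the function names and the idea of training on hand-labelled tweets are all assumptions made for the example, not a description of the musicmetric model.

```python
from collections import Counter
from math import exp, log


def train(labelled_tweets):
    """Count words per class from (text, is_relevant) pairs."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in labelled_tweets:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(text.lower().split())
    return word_counts, doc_counts


def relevance_probability(text, model):
    """Estimate P(tweet is about the artist) with Laplace-smoothed naive Bayes."""
    word_counts, doc_counts = model
    vocab_size = len(set(word_counts[True]) | set(word_counts[False]))
    total_docs = sum(doc_counts.values())
    log_score = {}
    for label in (True, False):
        total_words = sum(word_counts[label].values())
        # Smoothed class prior, then one smoothed likelihood term per word.
        score = log((doc_counts[label] + 1) / (total_docs + 2))
        for word in text.lower().split():
            score += log((word_counts[label][word] + 1)
                         / (total_words + vocab_size + 1))
        log_score[label] = score
    # Convert the two log scores into a probability without underflow.
    top = max(log_score.values())
    p_true = exp(log_score[True] - top)
    p_false = exp(log_score[False] - top)
    return p_true / (p_true + p_false)
```

A tweet would then be kept only if its probability is above some chosen threshold; where to set that threshold is a trade-off between dropping genuine mentions and letting noise through.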
The table below shows a good example of our algorithm’s efficiency:

Even though there are still a few irrelevant tweets (highlighted red) and some vague tweets whose relevance we cannot determine (highlighted blue), the accuracy has improved a lot in comparison to the raw data. Currently, for bands or artists who have very common names like Oasis, our model can filter out up to 70%-80% of irrelevant tweets. For bands or artists who have distinct names like Lady Gaga or Robbie Williams, the model can filter out up to 95%-100% of irrelevant tweets.
The chart below shows the number of tweets mentioning Oasis per hour before and after being filtered. You can see a big difference and that is why the filter is very important.
We are still collecting more data and adding more valuable information to our model. Therefore it is expected to work more and more accurately – it learns as it goes, and it can read 96 Million tweets per day, so it learns very quickly.
Why not check some live stats for your bands by registering for a musicmetric Essentials trial?
Trung
Not that relevant to music, but this graph is pretty cool. We ran a really basic text extraction on 11 Million tweets logged by our servers during the past week, and plotted the proportion of messages each day that contain “:)”
It’s been corrected for varying popularity of twitter on different days.
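One plausible reading of that correction is simply dividing each day's smiley count by that day's total number of tweets. A rough Python sketch under that assumption follows; the input format, a list of (timestamp, text) pairs, is invented for the example.

```python
from collections import defaultdict
from datetime import datetime


def smiley_share_by_day(tweets, token=":)"):
    """Return {date: fraction of that day's tweets containing token}.

    tweets is an iterable of (datetime, text) pairs. Dividing by the daily
    total means a busy day does not look happier just because it is busier.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for timestamp, text in tweets:
        day = timestamp.date()
        totals[day] += 1
        if token in text:  # plain substring match, so ":-)" is not counted
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in totals}
```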
Saturday is a happy day, and it’s tomorrow – so cheer up!

I should mention, our sentiment analysis algorithms at musicmetric are rather more advanced than this :-)
In this post we’re going to give a quick fire tour of some charts you can see in our app, demonstrating some of the main functions and how they can be used.
Let’s start off with the big picture. Online Buzz gives an indicator of how many people are talking about an artist on the web. We use clever machines that learn how to cut through the noise and only detect the artist in question.
The chart below shows how the Online Buzz for the band Muse has changed since 2006. It shows the number of comments per day about Muse, compared to the overall number of comments about bands.

If we zoom in to the last 6 months as is shown below, we can see the online buzz for Muse has been pretty constant, with a slight increase overall:

If you need a more granular view than Online Buzz, you can check what’s happening on some music social networks in the Social Networks section.
So, below are the MySpace Views and Plays per hour for Muse; the big spike in September shows when they released their single “Uprising”. The peak immediately after that one was the album release:

These charts show a 24 hour moving average for Plays and Views per hour.
That means we take the average number of plays or views for the last 24 hours and plot that on the graph.
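For anyone who wants to try this on their own numbers, here is a minimal Python sketch of a trailing moving average. The function name is made up, and averaging over fewer points at the very start of the series (before a full 24 hours is available) is an assumption of the sketch.

```python
def trailing_moving_average(values, window=24):
    """Average of the last `window` values at each point in the series.

    For the first few points, where fewer than `window` values exist,
    the average is taken over whatever is available so far.
    """
    averages = []
    running_total = 0.0
    for i, value in enumerate(values):
        running_total += value
        if i >= window:
            running_total -= values[i - window]  # drop the value that left the window
        averages.append(running_total / min(i + 1, window))
    return averages


print(trailing_moving_average([1, 2, 3, 4, 5], window=2))
# [1.0, 1.5, 2.5, 3.5, 4.5]
```

With hourly data and the default window of 24, each plotted point is the mean of the preceding day, which smooths out the regular day and night cycle.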
This gives a better visualisation of the trend as the raw data can be confusing. Below (in red) we can see what the raw data looks like without the moving average overlaid:

Remember, musicmetric isn’t just limited to superstar bands like Muse. Let’s take a look at some stats for Master Shortie – an up and coming London rapper.
Here is a view of where people follow Master Shortie online:

Looking at some data about those fans, we can see Master Shortie is pretty popular with the ladies:

And their age profile fits a distribution around the 18 year old mark:

Now let’s drill down a bit to see where their MySpace fans live.
The chart below shows that fans of Master Shortie on MySpace are located mainly in the USA and UK:

The overall user demographic of MySpace is pretty biased towards these two countries, so let’s check out the top cities for fans of Master Shortie on Twitter:

Nine of the top 10 cities for locations of fans of Master Shortie on Twitter are in the UK, with only New York showing up for the USA.
Now let’s look at where Master Shortie’s Twitter fans live on a map of the world:

Each one of those circles represents one or more downloads. When you hover over a circle in the musicmetric application with your mouse, you can see an instant pop-up of where and how many downloads the circle represents. It even tells you the exact time a download was made.
The darker and more solid the colour, the more downloads are being overlaid onto the same area, giving a really good indication of popularity by region.
Here is the same map for the location of Master Shortie’s fans, this time on MySpace:

Now let’s look at the most influential people relevant to Master Shortie on Twitter.
This will tell you the most relevant people on Twitter to target with marketing material, because they actually care about the artist in question, and are very influential in those circles.
We don’t just calculate this based on the number of followers each person gets, but the number of followers their followers get, and so on.
If that doesn’t make sense, imagine it works a bit like the Google PageRank algorithm, because it does. Someone with a million spam bots following them will have a lower rank than another person who’s only being followed by a few very influential people (like a music magazine or a record label).
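To make the PageRank comparison concrete, here is a simplified PageRank-style sketch in Python. It is a textbook power iteration, not the ranking musicmetric actually runs; the function name, the damping factor of 0.85 and the input format (a dict mapping each user to the set of users they follow) are assumptions for the example.

```python
def influence_rank(follows, damping=0.85, iterations=50):
    """PageRank-style score where following someone passes rank to them.

    follows maps each user to the set of users they follow. A user's rank
    is split evenly among the people they follow, so a follower only
    counts for a lot if that follower is highly ranked themselves.
    """
    users = set(follows)
    for followed in follows.values():
        users.update(followed)
    n = len(users)
    rank = {user: 1.0 / n for user in users}
    for _ in range(iterations):
        new_rank = {user: (1 - damping) / n for user in users}
        for user in users:
            followed = follows.get(user, ())
            if followed:
                share = damping * rank[user] / len(followed)
                for target in followed:
                    new_rank[target] += share
            else:
                # Users who follow nobody spread their rank evenly over everyone.
                share = damping * rank[user] / n
                for target in users:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

In a toy network where one account is followed by five accounts that nobody follows, and another is followed only by a magazine that itself has twenty followers, the second account ends up with the higher score despite having fewer followers.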

Let’s move on to BitTorrent data now, and take a look at some charts for Robbie Williams.
The chart below shows the number of peers per hour connected to the torrents for the single Bodies and the new album Reality Killed the Video Star. Just so you know, our BitTorrent data is anonymous and aggregated to the city level. Tracking individuals isn’t our game.

And here is the map of locations of people downloading the torrents at 7:00pm yesterday (30th November 2009):

Now prepare yourself for the all time cumulative map for BitTorrent downloads of Robbie Williams – Reality Killed the Video Star:

Clearly Robbie is very popular worldwide, so let’s get a closer look below at the largest solid coloured area in the UK and Europe:

To clearly see the top cities, a table is more suitable. Below are the top cities for Robbie Williams – Bodies on BitTorrent:

So there you have it!
These were just some of the top functions currently launched in our beta version of musicmetric.
Get ready for our full launch over the next few weeks as we’ll be unveiling a rocking host of extra functions, including twitter activity, results from wider ranging web crawls, sentiment analysis for tracks and artists, more social networks, authority ranking for all sources of data, and individual song tracking.
Plus, we’ll be revealing our advanced analytics functions which allow the whole collection of data to be probed in more detail, picking out patterns, similarities, trends and more.
Our development cycle has been insane and it’s really ramping up now! We’ve hired more full time developers, upgraded our data centre, bought dozens more servers, hundreds of TB of storage… We’re just about ready to explode with data, and we love it.
Keep checking back because the updates will keep coming, and if you just can’t wait then register now to begin tracking everything in real time with a free demo of musicmetric Essentials.