Help and frequently asked questions
What's all this about then?
After grumbling about the way last.fm calculates artist rankings, I realised that in the age of open data it shouldn't be too much hassle to knock together a little app to apply the normalisation calculation I discussed. The resulting application does the following:
- Takes your last.fm username and grabs the XML list of your top 50 artists/albums;
- Goes through those artists and grabs album and track data for them from the MusicBrainz web service;
- Calculates the median track duration for each artist, using it to estimate how much time you have spent listening to the artist;
- Sorts the artist list by estimated time.
The resulting table shows your top artist ranking based on the estimated time you have spent listening to them. I think this is a more realistic summary of the underlying attention data, but it's also quite interesting to see how much the ranking changes between the two algorithms.
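As a rough illustration of the last two steps, here is a minimal Python sketch of the re-ranking. It is not the application's actual code: the function name, the tuple layout and the figures are all made up for the example, and it assumes the play counts and median track lengths have already been fetched.

```python
def rerank_by_estimated_time(artists):
    """Return (name, estimated_seconds) pairs, longest listening time first.

    `artists` is a list of (name, play_count, median_track_seconds) tuples.
    This layout is an assumption made for the example.
    """
    ranked = []
    for name, play_count, median_seconds in artists:
        # Estimated time = number of plays x typical (median) track length.
        ranked.append((name, play_count * median_seconds))
    # Sort so the artist with the most estimated listening time comes first.
    ranked.sort(key=lambda row: row[1], reverse=True)
    return ranked


# Made-up figures: the punk band has more plays, the prog band longer tracks.
chart = [
    ("Punk Band", 300, 120),  # 300 plays of roughly 2-minute tracks
    ("Prog Band", 100, 540),  # 100 plays of roughly 9-minute tracks
]
reranked = rerank_by_estimated_time(chart)
```

In this made-up chart the punk band wins on play count, but the prog band comes out on top once track length is taken into account, which is exactly the kind of shift the table is meant to show.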
Why does it take ages to work out my new rankings?
Processing a list of 50 artists or albums often involves hundreds of web service calls, especially if you like golden oldies who have been cranking stuff out for donkeys' years, and calculating the average track lengths takes time. To try to speed things up, we cache the average album track time in our local database, but please be patient if the thing is chugging a bit. We're currently caching 70,426 artists and 315,874 albums.
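The caching idea can be sketched in a few lines of Python. This is only an illustration under assumptions: a plain dictionary stands in for the local database, and `TrackTimeCache`, `fake_fetch` and the album ID are hypothetical names invented for the example rather than anything from the real application.

```python
class TrackTimeCache:
    """Remember track times so each album is only looked up once."""

    def __init__(self, fetch):
        self._fetch = fetch  # slow lookup, e.g. a web service call
        self._store = {}     # stands in for the local database table

    def get(self, album_id):
        # Only go to the slow lookup if we have not seen this album before.
        if album_id not in self._store:
            self._store[album_id] = self._fetch(album_id)
        return self._store[album_id]


calls = []

def fake_fetch(album_id):
    """Pretend web service call; records every request it receives."""
    calls.append(album_id)
    return 215  # made-up track time in seconds

cache = TrackTimeCache(fake_fetch)
first = cache.get("album-123")   # goes to the (fake) web service
second = cache.get("album-123")  # answered from the cache
```

After both lookups, `calls` holds a single entry, so the second request never touched the slow service.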
Why is it suddenly going much more quickly?
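You can view your list at any time, but it is only regenerated once every 24 hours. If you come back within that window, you get the list we have already worked out rather than a fresh calculation.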
Why is it missing some of my favourite artists/albums?
There are many reasons why we might not be able to access track listing data for a certain album or artist, and if we can't find the data, we don't display it. It may be that the metadata attached to the music files you play has a typo. It may be that last.fm's scarily large database cannot match an album with a MusicBrainz ID (and we therefore can't track it). It may be that the even more scarily large MusicBrainz database does not contain the cutting-edge tunes that a forward-thinking connoisseur like yourself obviously enjoys. It may even (probably) be that something's gone awry with our data import routines, database cache, web server or carrier pigeons.
Version 2 of the application allows you to manually find artists or albums that last.fm has not associated with MusicBrainz IDs. If you click the "Try to find on Musicbrainz" link, you will be able to manually select the best match and update your charts.
The figures don't look right at all
We use the median value of track length to estimate the total time spent listening to an artist/album. We hope this gives a reasonable approximation of the listening time, but any algorithm of this nature will only ever be an approximation of the actual figures. Any tracks that are less than 30 seconds long are ignored in the calculations.
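For the curious, the median step might look something like the Python sketch below. It is an illustration rather than the application's real code: the function name, the constant and the sample durations are assumptions, and it treats a track of exactly 30 seconds as long enough to count.

```python
from statistics import median

MIN_TRACK_SECONDS = 30  # tracks shorter than this are ignored


def median_track_seconds(durations):
    """Return the median length of the usable tracks, or None if there are none."""
    usable = [d for d in durations if d >= MIN_TRACK_SECONDS]
    if not usable:
        return None  # nothing long enough to base an estimate on
    return median(usable)


# Made-up album: a 12-second intro, three ordinary songs and a 25-minute epic.
durations = [12, 180, 200, 240, 1500]
typical = median_track_seconds(durations)  # 220.0
```

The intro is dropped, and the 25-minute epic barely moves the result, whereas a plain mean of the same tracks would be dragged well above the length of a typical song. That resistance to outliers is the usual argument for a median here.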
Can I link directly to my ranking list?
When you generate a ranking list, there are a few links at the bottom of the page that you can bookmark.
Can I publish the ranking list?
Please feel free to use your data how you wish. Use the little "XML" buttons at the bottom of your ranking list to get directly to the raw figures, hopefully in a format you can do something with. If you come up with any code snippets for standard languages/applications/frameworks etc., please send them through and we'll be happy to share them with others on your behalf.
Why won't it do individual track charts?
Put simply, there's bleedin' millions of 'em. We just don't have the resources to store all those track lengths.
It's also another potential source of errors - last.fm does not include a MusicBrainz ID for each track you listen to in its data feeds, and probably for good reason. A lot of tracks exist in multiple formats (album tracks, singles, remixes, live versions, etc.) and it's a hell of a job trying to work out which version you have listened to without having a unique ID for it. We'll keep an eye on it, but don't anticipate a solution anytime soon. Sorry.
Technical details
This application is an extravagant blend of the following key ingredients:
- Django
- Python
- MySQL
- jQuery
- Google Chart API & google-chartwrapper
- An incredible selection of gig photos found on Flickr
- Mark James' invaluable Silk Icons
- last.fm and MusicBrainz, obviously