I love last.fm. I’m a bit of a quantified-self nerd, so if there’s data to be gleaned about my habits I’m all for it.
With that in mind, I set out to download all my historical Last.fm data, but it turns out there’s no way to do so. Foiled! Except their API doesn’t seem to care so much, so let’s use that.
The full code for this is available on GitHub, and I suggest cloning it first.
A word of warning: I have 55,646 scrobbles at the time of writing and the download took 11m13s, so expect it to take some time.
Once you have it cloned, cd into the repo and install the requirements with:
pip3 install -r requirements.txt
What follows is just an explanation of what is happening.
Forgive my terrible Python, it’s not always this bad, I promise.
We need just a few dependencies for this project. Here’s my requirements.txt:
pandas~=0.25.2
requests~=2.22.0
matplotlib~=3.3.2
requests_toolbelt
- requests for making REST requests
- pandas for creating a dataframe and outputting to CSV at the end
- matplotlib for creating our plot
- requests_toolbelt for making our API requests in a thread pool
We’ll need an API key, which you can generate at https://www.last.fm/api/account/create. If you don’t provide any credentials the script will exit with an error message.
Snippets from the script
Here is the signature of the method I use to download the scrobbles:
Make a request to get the number of pages of scrobbles we’ll be receiving
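As a rough sketch of this step: a user.getrecenttracks response carries its paging details under recenttracks -> @attr, and the values arrive as strings. The helper name total_pages and the sample payload below are mine, for illustration only; the numbers in the sample are made up to match a library of about 55,646 scrobbles at 200 per page.

```python
def total_pages(payload):
    """Read the page count from a decoded user.getrecenttracks response."""
    # Paging values are strings in the JSON, so convert before doing arithmetic.
    return int(payload["recenttracks"]["@attr"]["totalPages"])


# Illustrative payload, trimmed to the keys this helper reads.
sample = {
    "recenttracks": {
        "track": [],
        "@attr": {"page": "1", "perPage": "200", "totalPages": "279", "total": "55646"},
    }
}

print(total_pages(sample))  # 279

# With network access and a real key, the payload would come from something like:
#   requests.get(url_for_page_1).json()
```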
Estimating the total time
This can only ever be a rough estimate because of network speed, last.fm’s API performance, computer performance etc., but I try to calculate the estimated time and in practice it’s usually fairly close.
and the definition of
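One simple way to produce such an estimate is to multiply the number of pages by an assumed time per page. This is only a sketch: the names estimate_seconds and format_duration are hypothetical, and the 2.4 seconds per page is my own back-of-the-envelope figure (673 seconds spread over roughly 279 pages, assuming 200 scrobbles per page), not something measured by the script.

```python
def estimate_seconds(total_pages, seconds_per_page):
    """Rough total download time: pages multiplied by an assumed per-page cost."""
    return total_pages * seconds_per_page


def format_duration(seconds):
    """Format a number of seconds as e.g. '11m13s'."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}m{secs:02d}s"


print(format_duration(673))  # 11m13s

# Assumed per-page cost; real values depend on the network and the API.
print(format_duration(estimate_seconds(279, 2.4)))
```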
Some objects to store the data in
urls is going to be a list of URLs which we will construct and pass to a thread pool to be processed in parallel.
Add request URLs to urls list
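A minimal sketch of building that list, assuming one user.getrecenttracks request per page and 200 scrobbles per page (the maximum the API allows). The function name build_urls and the placeholder username and key are illustrative.

```python
from urllib.parse import urlencode

API_ROOT = "https://ws.audioscrobbler.com/2.0/"


def build_urls(username, api_key, total_pages, limit=200):
    """Return one request URL per page of scrobbles, pages numbered from 1."""
    urls = []
    for page in range(1, total_pages + 1):
        params = {
            "method": "user.getrecenttracks",
            "user": username,
            "api_key": api_key,
            "limit": limit,
            "page": page,
            "format": "json",
        }
        urls.append(f"{API_ROOT}?{urlencode(params)}")
    return urls


urls = build_urls("some_user", "YOUR_API_KEY", total_pages=3)
print(len(urls))  # 3
```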
Starting the thread pool and making all requests
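To show the general idea without extra dependencies, here is a sketch that uses the standard library's concurrent.futures rather than requests_toolbelt. The fetch_all helper and its fetch parameter are my own names; fetch is whatever function turns a URL into a response, so it can be swapped for a fake when testing.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(urls, fetch, max_workers=8):
    """Call fetch(url) for every URL on a thread pool.

    Results are collected as requests finish, so they are not
    guaranteed to come back in the same order as urls.
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(fetch, url) for url in urls]
        for future in as_completed(futures):
            # result() re-raises any exception the worker hit.
            results.append(future.result())
    return results


# With requests installed, a real run might look like:
#   pages = fetch_all(urls, lambda url: requests.get(url).json())
```

Because results arrive in completion order, anything that depends on ordering has to sort afterwards.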
Parsing the date from last.fm
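As a sketch of the conversion, assuming each track carries a Unix timestamp in seconds as a string (the API exposes it as date -> uts): interpreting it as UTC reproduces the datetime column in the CSV excerpt further down. The function name parse_uts is illustrative.

```python
from datetime import datetime, timezone


def parse_uts(uts):
    """Convert a Unix timestamp (string or int, in seconds) to a UTC datetime string."""
    moment = datetime.fromtimestamp(int(uts), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


print(parse_uts("1603055259"))  # 2020-10-18 21:07:39
```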
Now that we have a p object containing all the responses from the API, we can do some processing on it and grab what we want out of it.
Entries with an @attr key are songs that are currently being scrobbled. They come back with a different JSON structure, so I just skip them.
All we’re doing here is looping through the API responses and putting them in the appropriate list. Note that they will be out of order, so we also take the UTC timestamp to eventually sort our DataFrame by.
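The loop described above can be sketched like this, assuming each decoded response has its tracks under recenttracks -> track, with the artist and album names under a #text key and the timestamp under date -> uts. The function name collect_tracks and the sample data are illustrative.

```python
def collect_tracks(pages):
    """Flatten decoded API responses into parallel lists, skipping now-playing entries."""
    artists, albums, tracks, timestamps = [], [], [], []
    for payload in pages:
        for item in payload["recenttracks"]["track"]:
            # A currently scrobbling track has "@attr" and no "date", so skip it.
            if "@attr" in item:
                continue
            artists.append(item["artist"]["#text"])
            albums.append(item["album"]["#text"])
            tracks.append(item["name"])
            timestamps.append(int(item["date"]["uts"]))
    return artists, albums, tracks, timestamps


sample_pages = [
    {"recenttracks": {"track": [
        {"artist": {"#text": "Rent Strike"}, "album": {"#text": "IX"},
         "name": "I: Snowdrop", "@attr": {"nowplaying": "true"}},
        {"artist": {"#text": "Elton John"}, "album": {"#text": "Madman Across The Water"},
         "name": "Tiny Dancer", "date": {"uts": "1603050610"}},
    ]}},
]

artists, albums, tracks, timestamps = collect_tracks(sample_pages)
print(tracks)  # ['Tiny Dancer']
```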
Pandas to create a CSV
And the final step is to create a Pandas dataframe in order to output our data to CSV:
Notice that we use the timestamps to sort the dataframe so the most recent tracks are at the top.
Finally, we write that data to a CSV and save it to disk:
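A minimal sketch of these two steps, assuming the data has been gathered into four parallel lists. The column names follow the CSV header shown in the excerpt; the function name to_sorted_frame is hypothetical.

```python
import pandas as pd


def to_sorted_frame(artists, albums, tracks, timestamps):
    """Build a DataFrame of scrobbles with the most recent first."""
    df = pd.DataFrame({
        "artist": artists,
        "album": albums,
        "track": tracks,
        "timestamps": timestamps,
    })
    # Unix seconds -> datetime, interpreted as UTC.
    df["datetime"] = pd.to_datetime(df["timestamps"], unit="s")
    return df.sort_values("timestamps", ascending=False).reset_index(drop=True)


df = to_sorted_frame(
    ["Rent Strike", "Wingnut Dishwashers Union"],
    ["IX", "Burn the Earth! Leave It Behind!"],
    ["I: Snowdrop", "Proudhon in Manhattan"],
    [1603054871, 1603055259],
)

# Writing to disk (the data/ directory must already exist):
#   df.to_csv("data/lastfm_scrobbles.csv", index=False, encoding="utf-8")
```

Passing index=False keeps pandas from adding its own row-number column to the file.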
You will now have a populated data/lastfm_scrobbles.csv in the repo where you cloned the code.
Now you have a CSV containing all of your last.fm scrobbles from now back to when you first started scrobbling.
Here’s an excerpt of the CSV I get out:
artist,album,track,timestamps,datetime
Wingnut Dishwashers Union,Burn the Earth! Leave It Behind!,Proudhon in Manhattan,1603055259,2020-10-18 21:07:39
Ramshackle Glory,Live the Dream,Your Heart Is a Muscle the Size of Your Fist - Live,1603055038,2020-10-18 21:03:58
Rent Strike,IX,I: Snowdrop,1603054871,2020-10-18 21:01:11
Rent Strike,Burn It All!,Burn it all!,1603054676,2020-10-18 20:57:56
Rent Strike,Burn It All!,They Live!!,1603054446,2020-10-18 20:54:06
Rent Strike,Burn It All!,Ether Rag,1603054286,2020-10-18 20:51:26
Rent Strike,IX,IX: To the West!!,1603053963,2020-10-18 20:46:03
Rent Strike,IX,VIII: Shadow&Gloom,1603053735,2020-10-18 20:42:15
Rent Strike,IX,VII:,1603053518,2020-10-18 20:38:38
Rent Strike,IX,VI: Don't Let Love Bog You Down,1603053297,2020-10-18 20:34:57
Rent Strike,IX,V: Fair Trade Death March,1603052966,2020-10-18 20:29:26
Rent Strike,IX,"IV: Me, Myself & The Eye",1603052652,2020-10-18 20:24:12
Rent Strike,IX,III: Family Graveyard,1603052352,2020-10-18 20:19:12
Rent Strike,IX,II: The Road Giveth...,1603052111,2020-10-18 20:15:11
Rent Strike,IX,I: Snowdrop,1603051944,2020-10-18 20:12:24
Elton John,Madman Across The Water,Tiny Dancer,1603050610,2020-10-18 19:50:10
Wingnut Dishwashers Union,Burn the Earth! Leave It Behind!,My Idea of Fun!,1603028042,2020-10-18 13:34:02
Wingnut Dishwashers Union,Burn the Earth! Leave It Behind!,Urine Speaks Louder Than Words,1603027921,2020-10-18 13:32:01
Generating a bar chart from your results
Now that you have all your data, you can use plot.py, which is included in the repository, to generate a bar chart.
Finding how many times you’ve listened to a song
rg -i "little wing" data/lastfm_scrobbles.csv | wc -l
33
Finding how many times you’ve listened to an artist
rg -i "rent strike" data/lastfm_scrobbles.csv | wc -l
89
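If you would rather stay in Python, the same kind of count can be sketched with the standard library. Note one difference: a text search matches the phrase anywhere on a line (artist, album or track), whereas this sketch counts exact, case-insensitive matches in a single column. The helper name count_plays is mine, and the sample rows are taken from the excerpt above.

```python
import csv
import io
from collections import Counter


def count_plays(rows, column):
    """Count case-insensitive occurrences of each value in one CSV column."""
    return Counter(row[column].lower() for row in rows)


SAMPLE = """artist,album,track,timestamps,datetime
Rent Strike,IX,I: Snowdrop,1603054871,2020-10-18 21:01:11
Rent Strike,Burn It All!,Ether Rag,1603054286,2020-10-18 20:51:26
Elton John,Madman Across The Water,Tiny Dancer,1603050610,2020-10-18 19:50:10
"""

plays = count_plays(csv.DictReader(io.StringIO(SAMPLE)), "artist")
print(plays["rent strike"])  # 2

# Against the real file:
#   with open("data/lastfm_scrobbles.csv", newline="", encoding="utf-8") as f:
#       plays = count_plays(csv.DictReader(f), "artist")
```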
If you want to see the full Python script, it’s here: https://github.com/mathieuhendey/lastfm_downloader/blob/master/downloader.py