Grooveshark: Behind the scenes
Posted by Chris
Since its revamp in 2010, most of the magic in Grooveshark happens on the client side. I will not be talking about these technologies, as they were already covered in Jerod Santo's post: The Tech Behind the New Grooveshark.
What I'd like to address are the data exchanges between the client and the server when you hit the play button and the search button.
I do not work for Grooveshark. Some of this information may be incorrect.
When you first load Grooveshark, a session id is set by PHP:
Set-Cookie: PHPSESSID=1c3b5c7d906f60cab128b1a2b4c30201; expires=Wed, 06-Apr-2011 23:11:20 GMT; path=/; domain=.grooveshark.com
This session id is required throughout the API and is used to compute a communication token.
The SWF proxy
Most of the communication between the client-side and the server is handled by the Flash object JSQueue.swf.
For those of you who are interested, Grooveshark logs all traffic between the SWF proxy and the server to the browser's console.

Most calls are POST requests to https://listen.grooveshark.com/more.php?<method name> with a JSON formatted string as the data:
{
  "parameters": <params>,
  "method": <method name>,
  "header": {
    "uuid": <uuid>,
    "clientRevision": <"20101222.5" | "20101222">,
    "country": <localization hash>,
    "privacy": 0,
    "session": <session id>,
    "client": <"jsqueue" | "htmlshark">,
    "token": <hashed communication token>
  }
}
Depending on the method called, client and clientRevision will differ. For instance, searching for songs must be done with htmlshark, while downloading songs must be done with jsqueue.
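To make the envelope concrete, here is a minimal sketch that assembles the URL and JSON body for one call. It only builds the request, it does not send it. The function name, the placeholder header values and the pairing of clientRevision with client are assumptions for illustration; only the field names and the endpoint come from the description above.

```javascript
// Endpoint as described above; the method name is appended as the query string.
const API_ENDPOINT = "https://listen.grooveshark.com/more.php";

// Build the URL and JSON body for one API call (hypothetical helper).
// `header` is expected to carry uuid, clientRevision, country, session, client and token.
function buildApiRequest(method, parameters, header) {
  const body = JSON.stringify({
    parameters: parameters,
    method: method,
    // privacy is always 0 in the observed traffic, so it is used as the default.
    header: Object.assign({ privacy: 0 }, header),
  });
  return { url: API_ENDPOINT + "?" + method, body: body };
}

// Placeholder values only. Which clientRevision goes with which client is a guess.
const searchRequest = buildApiRequest("getSearchResultsEx", { query: "rick astley" }, {
  uuid: "<uuid>",
  clientRevision: "20101222",
  country: "<localization hash>",
  session: "1c3b5c7d906f60cab128b1a2b4c30201",
  client: "htmlshark",
  token: "<hashed communication token>",
});
console.log(searchRequest.url);
console.log(searchRequest.body);
```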
The API responds with a JSON formatted string with the following structure:
{
  "header": {
    "session": <session id>,
    "serviceVersion": "20100903",
    "prefetchEnabled": true
  },
  "result": <result>
}
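On the receiving side, unpacking that structure could look roughly like the sketch below. The helper name and the sample body are made up for illustration, and the checks only cover the two top-level fields described above. How the real API reports errors is not covered here.

```javascript
// Parse a raw response string and return the session id and result (hypothetical helper).
function parseApiResponse(rawJson) {
  const response = JSON.parse(rawJson);
  if (typeof response.header !== "object" || response.header === null) {
    throw new Error("response has no header object");
  }
  if (!("result" in response)) {
    throw new Error("response has no result field");
  }
  return { session: response.header.session, result: response.result };
}

// Made-up response body that follows the structure shown above.
const sampleResponse = JSON.stringify({
  header: {
    session: "1c3b5c7d906f60cab128b1a2b4c30201",
    serviceVersion: "20100903",
    prefetchEnabled: true,
  },
  result: [],
});
console.log(parseApiResponse(sampleResponse));
```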
The communication token
When the app first loads, an API call is made to getCommunicationToken with the following parameters:
{ "secretKey": "<secret key>" }
The secret key is actually just an MD5 hash of your session id:
var n = hex_md5(GS.service.sessionID);
req = new a("getCommunicationToken", {secretKey: n}, t, w, {}, true);
The result of this call is a 13-character hex string, which they call the communication token.
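Outside the browser, the same secret key can be reproduced with any MD5 implementation. The sketch below uses Node's built-in crypto module instead of the site's hex_md5 helper. The function name is hypothetical, and the session id is the one from the Set-Cookie example earlier in this post.

```javascript
const crypto = require("crypto");

// MD5 of the session id, hex encoded: the same value the client sends as secretKey.
function secretKeyFor(sessionId) {
  return crypto.createHash("md5").update(sessionId, "utf8").digest("hex");
}

// Parameters for the getCommunicationToken call.
const tokenParams = { secretKey: secretKeyFor("1c3b5c7d906f60cab128b1a2b4c30201") };
console.log(tokenParams.secretKey.length); // 32 hex characters
```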
Hashing the communication token
The communication token is then used by the Flash proxy to compute another token, which the server uses to validate your request.
This final token is composed of two parts: the randomizer and the hash.
The randomizer consists of 6 random hex characters that are regenerated by the Flash proxy before each API request.
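Generating a value of that shape is simple. The sketch below is one way to do it in Node, and only covers the randomizer. Whether the Flash proxy uses lowercase digits or a cryptographic random source is not known, so both are assumptions here, and the function name is made up.

```javascript
const { randomBytes } = require("crypto");

// 3 random bytes encode to exactly 6 lowercase hex characters.
function newRandomizer() {
  return randomBytes(3).toString("hex");
}

// A fresh value would be generated before every API request.
console.log(newRandomizer());
```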
The hash is computed as follows:
<Removed as requested by Grooveshark>
Searching for songs

When you search for a song using the main search box, multiple requests are made to the API:
- A search request for Songs
- A search request for Playlists
- A search request for Users
- A search request for Artists
- A search request for Albums
The data returned by the API is used to build the various elements of the results view.

What is interesting about these requests is that the pagination is done on the client-side. The API appears to always return a maximum of 200 results.
All the queries are sent to the same method getSearchResultsEx with the following parameters:
{
"query": <search query>,
"type": <"Songs" | "Playlists" | "Users" | "Artists" | "Albums">,
"guts": 0,
"ppOverride": false
}
From my tests, it appears that this method requires the client value of the JSON header to be set to htmlshark.
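As a rough illustration of the two points above, the sketch below builds one parameter object per result type and then pages through a full result list on the client. The helper names and the page size of 60 are assumptions, since the page size the real UI uses is not something I checked.

```javascript
const SEARCH_TYPES = ["Songs", "Playlists", "Users", "Artists", "Albums"];

// One getSearchResultsEx parameter object per result type.
function buildSearchParams(query) {
  return SEARCH_TYPES.map((type) => ({
    query: query,
    type: type,
    guts: 0,
    ppOverride: false,
  }));
}

// Client-side pagination: cut one page (1-based) out of the full result list.
function pageOf(results, pageNumber, pageSize) {
  const start = (pageNumber - 1) * pageSize;
  return results.slice(start, start + pageSize);
}

// Stand-in for a full 200-row response; the rows are fake.
const fakeResults = Array.from({ length: 200 }, (_, i) => ({ SongID: String(i) }));
console.log(buildSearchParams("cry for help").length); // 5
console.log(pageOf(fakeResults, 4, 60).length); // 20, the rows left after 3 pages of 60
```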
Grooveshark has a lot of interesting data. Searching with "type": "Songs" returns many interesting fields, such as:
- AlbumID
- ArtistID
- SongName
- AlbumName
- ArtistName
- CoverArtFilename
- EstimateDuration (in seconds)
- IsLowBitrateAvailable
- Popularity
- ArtistPopularity
- SongPlays
- ArtistPlays
- SphinxWeight
- Score
- Rank (used to sort the results, I think)
Sample song entry:
{
SongID: '7507736',
AlbumID: '1123311',
ArtistID: '490',
GenreID: '16',
Name: 'Cry for Help',
SongName: 'Cry for Help',
AlbumName: 'Greatest Hits',
ArtistName: 'Rick Astley',
Year: '',
TrackNum: '16',
CoverArtFilename: '1123311.jpg',
TSAdded: '1209773471',
AvgRating: 0,
AvgDuration: 246,
EstimateDuration: 244,
Flags: 0,
IsLowBitrateAvailable: '1',
IsSponsored: '0',
IsVerified: '1',
SongVerified: '1',
AlbumVerified: 1,
ArtistVerified: 1,
Popularity: 1100400047,
AlbumPopularity: 0,
ArtistPopularity: 1100400084,
SongPlays: 87,
ArtistPlays: 2882,
SphinxWeight: 350700,
Score: 41414.48648202,
Rank: 0.99154757716654
}
Downloading cover art
Cover art is available in 3 sizes:
- Small: 100x100
- Medium: 170x170
- Large: 240x240
They can be accessed through: http://beta.grooveshark.com/static/amazonart/<first letter of size><cover art filename>.
For example, the small cover for the sample song above is at http://beta.grooveshark.com/static/amazonart/s1123311.jpg.
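The URL scheme is easy to wrap in a small helper. In this sketch the function and constant names are made up, and the lowercase m and l prefixes are inferred from the s example rather than checked individually.

```javascript
const COVER_ART_BASE = "http://beta.grooveshark.com/static/amazonart/";
const COVER_SIZES = { small: 100, medium: 170, large: 240 }; // pixels per side

// Build the cover URL from a size name and a CoverArtFilename value.
function coverArtUrl(size, coverArtFilename) {
  if (!Object.prototype.hasOwnProperty.call(COVER_SIZES, size)) {
    throw new Error("unknown cover size: " + size);
  }
  // The prefix is the first letter of the size name: s, m or l.
  return COVER_ART_BASE + size[0] + coverArtFilename;
}

console.log(coverArtUrl("small", "1123311.jpg"));
// http://beta.grooveshark.com/static/amazonart/s1123311.jpg
```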

Obtaining a song
About a year ago I was looking into how Grooveshark worked. At the time, when you played a song, the API would return a link to an MP3 file hosted on an Akamai server. The link would have an expiry time of about 7 minutes.
Since then, they have rolled out a new way of serving the media. They seem to be running a bunch of lighttpd servers that stream the media.
The first step is to obtain the stream server's ip and a stream key for a given song.
This is done by calling getStreamKeyFromSongIDEx with the following parameters:
{
"prefetch": <true | false>,
"mobile": <true | false>,
"country": <same localization hash as header>,
"songID": <song id>
}
The API response contains:
- uSecs: The length of the song in microseconds.
- ip: Hostname of the stream server.
- streamKey: Key used to obtain the song from the stream server.
The next step is quite straightforward.
Perform a POST request to http://<ip>/stream.php with the following URL-encoded parameter and the MP3 will be returned:
streamKey=<stream key>
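Put together, the two steps could be modelled as in the sketch below. It only builds the parameter object and a description of the POST request, and nothing is sent. The helper names, the Content-Type header, the prefetch value and the sample host and key are all assumptions for illustration.

```javascript
// Parameters for getStreamKeyFromSongIDEx. prefetch is fixed to false in this sketch.
function buildStreamKeyParams(songId, country, mobile) {
  return { prefetch: false, mobile: mobile, country: country, songID: songId };
}

// Describe the follow-up POST to the stream server without sending it.
function buildStreamRequest(streamInfo) {
  return {
    url: "http://" + streamInfo.ip + "/stream.php",
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    // URLSearchParams takes care of the URL encoding of the key.
    body: new URLSearchParams({ streamKey: streamInfo.streamKey }).toString(),
  };
}

// Made-up API result: the host and the key are not real values.
const streamInfo = { uSecs: 244000000, ip: "stream.example.invalid", streamKey: "abc123" };
const streamRequest = buildStreamRequest(streamInfo);
console.log(streamRequest.url);  // http://stream.example.invalid/stream.php
console.log(streamRequest.body); // streamKey=abc123
console.log(streamInfo.uSecs / 1e6); // 244 (song length in seconds)
```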
Interesting facts about the media files
When mobile=false, songs are sent as they were uploaded. This includes the original ID3 tags and the original bitrate.
When mobile=true, songs are converted to mono at a bitrate of 64 kbps. In addition, all ID3 tags are stripped.
Closing remarks
If you want to use Grooveshark's API, you might be better off (legally) using their HTTP API: ApiShark.com.
Although this explains how to download songs from Grooveshark, it's not an excuse to do so. Grooveshark's terms of service explicitly disallow any storage of data from their service, among other things. You should read them very carefully and take notice of them.
If someone from Grooveshark happens to stumble across this, let me first say that I am a great fan of your service. If you are unhappy with this blog post for some reason, feel free to contact me.
-Christian Joudrey