Big Data Strikes The Right Chord With Music Listeners

Mat Young explains the big data magic behind companies like Spotify and Shazam

While most people think big data is strictly the preserve of the enterprise, its uses reach much further. If you have listened to a station on a platform such as Pandora or Spotify and discovered a new favourite song, you will have experienced the power of big data first hand. In fact, music discovery applications are one of the simplest ways to explain the power of the analytics found in big data algorithms and applications.

Big data applications are as diverse as the many different music genres. Just as there is a record for every occasion, there is a big data application for just about every business analytics need.

Finding Tunes Through Fine-Tuned Data

The US online radio service Pandora provides tailored cloud-based radio stations that learn from user feedback. Most people don’t think much about the system behind Pandora’s amazing powers of suggestion. What many don’t realise is that this system, which Pandora calls the Music Genome Project, is big data analytics at its finest.

The original goal of the Music Genome Project was to “capture the essence of music at the fundamental level.” To do this, Pandora musicologists analyse every song in the Music Genome Project database according to 450 different musical characteristics. An algorithm developed by big data scientists then uses those characteristics to provide suggestions and tailored playlists on the fly, giving birth to Pandora as we know it today.

The science behind big data brings insight to unorganised, or unstructured, data. That data can include terabytes of different database tables, billions of photos, or millions of songs. In the case of Pandora, the attributes assigned to each song create a database that can be used to categorise music by various characteristics. Vocals, tone, and lyrics are examples of a few general categories, but each one of those could have dozens of specific sub-categories, making classification by genre seem archaic in comparison.

By taking thousands, or even millions, of random, unorganised songs and turning them into structured data with specific categories in columns and tables, Pandora can quickly and accurately stream a song to listen to at the click of a button, based on the previous choices a listener has made or the station they created.
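To make the idea of songs as structured data concrete, here is a minimal sketch in Python. It is not Pandora's algorithm, which the article does not describe in detail. The song titles, the three attribute names and their scores are all invented for illustration, and cosine similarity is simply one common way to compare rows of numeric attributes.

```python
import math

# Hypothetical attributes; the article mentions 450 characteristics,
# three made-up ones are used here to keep the example readable.
ATTRIBUTES = ("vocal_intensity", "tempo", "acoustic_tone")

# Hypothetical catalogue: each song becomes one row of scores from 0.0 to 1.0.
CATALOGUE = {
    "Song A": (0.9, 0.8, 0.1),
    "Song B": (0.8, 0.9, 0.2),
    "Song C": (0.1, 0.2, 0.9),
    "Song D": (0.2, 0.1, 0.8),
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two attribute vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recommend(liked_title, catalogue=CATALOGUE, count=1):
    """Return the `count` songs whose attributes are closest to a liked song."""
    liked = catalogue[liked_title]
    scored = [
        (title, cosine_similarity(liked, vector))
        for title, vector in catalogue.items()
        if title != liked_title
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [title for title, _ in scored[:count]]


if __name__ == "__main__":
    print(recommend("Song A"))
```

Once every song is a row of numbers, "find something like this" becomes a comparison between rows, which is why structuring the catalogue matters more than the particular similarity measure chosen here.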

If this big data achievement doesn’t quite impress, remember that Pandora’s datacentre does not only need to make a suggestion for you within microseconds; it also needs to deliver a unique suggestion to each of its other 70.1 million active users simultaneously.

Big Data As A Tool For Personalisation

These algorithms are not just running in cloud services powered by massive datacentres. Big data analysis is also happening on our personal computers connected to the cloud. Many of today’s music libraries have features that will suggest songs to purchase or play based on listening habits and other characteristics. These applications can also build playlists of songs that mesh well together, based on feedback and criteria like those used in Pandora’s Music Genome Project.

Spotify recently released a similar feature called Discover, which includes instant, personalised recommendations based on users’ preferences, new releases from artists they follow, and shared playlists created by friends.

People listening to music playing live on the radio, at the bar, or anywhere else are also using big data to find their next favourite tune. Shazam and other mobile music identification services help their customers instantly find out who’s responsible for a particular piece of music.

By breaking down tones into 1s and 0s, Shazam creates a fingerprint for each song in its database. Just as every thumbprint is unique, each song’s fingerprint is different, including alternate renditions of the same song. After taking a snippet of an unknown song and running it through this same algorithm, Shazam can find a match in its database. The application then relays the name and artist of the song to the listener.
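The fingerprint-and-match idea can be sketched as a toy example in Python. This is not Shazam's actual algorithm, which works on real audio. Here each recording is assumed to be already reduced to a list of integer tone values, and the track names, tone lists and the `window` parameter are all hypothetical.

```python
def fingerprint(tones, window=3):
    """Return the set of overlapping tone windows found in a recording."""
    return {
        tuple(tones[i:i + window])
        for i in range(len(tones) - window + 1)
    }


# Hypothetical database: each track is a made-up sequence of tone values.
DATABASE = {
    "Track One": [60, 62, 64, 65, 67, 65, 64, 62, 60, 67],
    "Track Two": [55, 55, 57, 59, 60, 59, 57, 55, 52, 50],
}

# Fingerprint every known track once, ahead of time.
INDEX = {title: fingerprint(tones) for title, tones in DATABASE.items()}


def identify(snippet):
    """Return the track sharing the most windows with the snippet, or None."""
    snippet_print = fingerprint(snippet)
    scores = {
        title: len(snippet_print & track_print)
        for title, track_print in INDEX.items()
    }
    best = max(scores, key=scores.get)
    # No shared windows at all means the snippet is unknown.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(identify([64, 65, 67, 65, 64]))
```

The snippet is run through the same `fingerprint` function as the stored tracks, so a short excerpt from anywhere in a song can still be matched against the whole catalogue.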

Big data is about much more than just organising data – it is the solution that can find the proverbial needle in a haystack. There is no benefit to counting the grains of sand on a beach, or songs in a database. The value of big data is unearthed when it helps find a watch somebody lost on the beach, or the listener’s new favourite track.

And while big data is capable of some amazing things, without the brilliant minds behind big data applications, like those at Pandora, Spotify and Shazam, it would be useless.

Who knows what big data applications will spring up next? Perhaps applications that play music according to our moods, based on our biomedical information. While that may seem like a crazy idea today, with the right data inputs and the powerful platforms to run the application, anything is possible with big data.

Mat Young is the Senior Director of Data Propulsion Lab at Fusion-io, where he oversees the testing and implementation of applications that incorporate Fusion-io memory.
