What is Audio Fingerprinting? When, Where and Why You Should Use It

by Intrasonics Team

Have you ever heard a piece of music somewhere, perhaps a new track by a band you didn’t know, and wondered what it was? In the past, you would have had to ask someone in the hope that they knew.

Nowadays though, you can just record a short sample of the music, run it through any one of several music identification apps, and find out that it was Taylor Swift or Ed Sheeran all along.

What does that have to do with the question posed in the title of this post? Everything, because those services rely on audio fingerprinting to identify the music.

Before we dig a little deeper and look at how the technology involved can produce an audio fingerprint, we’ll look at where the word fingerprinting comes into all this.

Fingerprinting History

Over the last couple of decades, fingerprinting technology has improved to the point where it is used for far more than forensic identification of individuals by criminal investigation teams.

Fingerprints are now used for protecting our phones, computers and other devices.

This is no surprise, as human fingerprints are unique, which makes them a great basis for identification by data reduction. A fingerprint can be identified from just a few key features.

Understanding Audio Fingerprinting

Audio fingerprinting follows similar principles. An audio fingerprint is essentially a condensed digital summary of an audio signal that can be used to identify a recording quickly, or to locate similar items at speed in a database of sounds or music.

The Problems

In the past, it was very difficult for software to identify recorded sounds, because of the nature of digital audio itself.

Regardless of the equipment you are using and the environment you are in, the human ear listening to the same piece of audio will essentially hear the same thing.

For example, Blue Monday by New Order sounds very similar whether it is played from a CD or from an MP3 file encoded at a high bit rate.

A computer tasked with opening, storing and cataloguing those same recordings, however, will treat them as different, because the underlying data in each file is not the same.
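To make the problem concrete, here is a minimal sketch of why a naive file comparison fails. It renders the same tone at two bit depths and compares checksums. The helper name tone_pcm, the 440 Hz tone and the sample rate are all assumptions made for illustration, and the 8-bit version is only a stand-in for a lower-fidelity encoding, not real MP3 compression.

```python
import hashlib
import math
import struct


def tone_pcm(bits: int, freq_hz: float = 440.0, rate: int = 8000, n: int = 800) -> bytes:
    """Render the same sine tone as little-endian signed PCM at 8 or 16 bits."""
    peak = 2 ** (bits - 1) - 1
    fmt = "<b" if bits == 8 else "<h"
    return b"".join(
        struct.pack(fmt, round(peak * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n)
    )


cd_like = tone_pcm(16)     # higher-fidelity rendering
lossy_like = tone_pcm(8)   # hypothetical lower-fidelity rendering of the same tone

# A byte-level checksum sees two unrelated files, even though a listener
# would recognise both as the same tone.
same_bytes = hashlib.sha256(cd_like).digest() == hashlib.sha256(lossy_like).digest()
print(same_bytes)  # False
```

A checksum only answers "are these files byte-for-byte identical?", which is why identification needs a summary of what the audio sounds like rather than how it is stored.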

How Audio Fingerprinting Solves These Problems

How does modern audio fingerprinting get around this issue? It relies on a software algorithm to extract a section or small audio sample of the music and generate a digital summary of the recording’s main attributes.

Parameters such as time, intensity and frequency are used to create a virtual overview of the anchor points and peaks for these attributes.

For example, if a recording has a notable spike at 823 hertz one minute in, the software would mark that spot.

For the whole audio sample, all of the marks that are generated are connected by a single line. Lots more lines are produced and together these form the fingerprint of the audio.
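The marking step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the sample rate, frame size, threshold and function names are invented for the example, and it uses a slow, dependency-free Fourier transform rather than anything a production system would run. It builds a mostly silent signal with a loud 823 Hz burst and reports where and at what frequency the spike occurs.

```python
import cmath
import math

RATE = 8000   # samples per second (assumed for this example)
FRAME = 256   # samples per analysis frame (assumed for this example)


def magnitudes(frame):
    """Magnitudes of a naive DFT for the lower half of the frequency bins."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def strongest_marks(samples, threshold):
    """Return a (time_s, freq_hz) mark for each frame whose loudest bin exceeds threshold."""
    marks = []
    for start in range(0, len(samples) - FRAME + 1, FRAME):
        mags = magnitudes(samples[start:start + FRAME])
        k = max(range(len(mags)), key=mags.__getitem__)
        if mags[k] > threshold:
            marks.append((start / RATE, k * RATE / FRAME))
    return marks


# Four frames of silence, with a loud 823 Hz burst filling the third frame.
signal = [0.0] * (4 * FRAME)
for i in range(2 * FRAME, 3 * FRAME):
    signal[i] = math.sin(2 * math.pi * 823 * i / RATE)

marks = strongest_marks(signal, threshold=10.0)
print(marks)
```

Each frequency bin here is RATE / FRAME = 31.25 Hz wide, so the mark lands on the bin nearest to 823 Hz rather than on 823 Hz exactly. A longer frame gives finer frequency resolution at the cost of coarser timing.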

You can see the similarities between audio fingerprints and physical fingerprints, can’t you? All audio fingerprints are unique, and all recorded music and other audio have them.

This fingerprint is taken and stored in a special database.

Rewinding to the scenario we set out at the start, when you take a recording of the audio content you want to identify on your smartphone or device, the same algorithm is run to produce another fingerprint.

The service then uses that fingerprint to find a matching song or piece of audio in the database.
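The lookup described above can be pictured as a simple voting scheme. The sketch below assumes a fingerprint has already been reduced to a set of integer hash values. The hash values, the second track title and the function names are made up for illustration, and real services use far larger indexes and more careful scoring.

```python
from collections import Counter, defaultdict

# Hypothetical reference fingerprints: title -> set of hash values.
reference = {
    "Blue Monday": {101, 202, 303, 404},
    "Other Track": {202, 505, 606},
}


def build_index(fingerprints):
    """Invert the reference data so each hash points at the titles containing it."""
    index = defaultdict(list)
    for title, hashes in fingerprints.items():
        for h in hashes:
            index[h].append(title)
    return index


def best_match(query_hashes, index):
    """Count how many query hashes each title shares and return the top (title, votes)."""
    votes = Counter()
    for h in query_hashes:
        for title in index.get(h, ()):
            votes[title] += 1
    return votes.most_common(1)[0] if votes else None


index = build_index(reference)

# A noisy phone recording: three hashes survive, one (999) is spurious.
query = {101, 303, 404, 999}
print(best_match(query, index))  # ('Blue Monday', 3)
```

Because the match is decided by votes rather than an exact comparison, a recording can still be identified when background noise corrupts part of its fingerprint.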

What Audio Fingerprinting Can Be Used For

Now that we know how audio fingerprints work, let’s explore what they are used for in practical terms. They have many uses which include, but aren’t limited to:

  • Identifying a song
  • Identifying advertisements through the tunes or melodies they contain
  • Managing music or sound libraries, e.g. a library of sound effects
  • Video file identification
  • Other media identification

Looking at the above, it would seem that an audio fingerprint is only used to identify files, but its other main use is monitoring and tracking useful metrics on the following media platforms:

  • Radio
  • TV
  • Records
  • Movies
  • CDs
  • Streaming platforms
  • Peer-to-peer networks

This type of monitoring has been used to track compensation owed to artists, ensure compliance with copyright, track interaction and engagement data (number of listens, duration, etc.) and manage licensing, among other things.

Who Uses Audio Fingerprinting?

As noted earlier, a variety of companies offer music identification services that use audio fingerprinting. Some of the most popular include:

Shazam

Shazam was one of the first companies to produce an audio fingerprinting algorithm. Its approach involves taking a spectrogram of the audio, identifying the strongest peaks, and storing the corresponding signatures of those peaks.

It then forms the fingerprint by connecting peaks that are close to one another, forming something akin to a spiderweb.

This approach is robust to distortions such as white noise, as these do not have a huge impact on particularly strong peaks.

The downside to this method is that it is not completely clear how many peaks and how many connections are necessary.

The more complex the fingerprint, the bigger the dataset and the trickier it is to compare against references. On the other hand, making it simpler could mean a greater chance of false positives.

Although the majority of distortions and noises won’t impact the big peaks, they can modify and even shift them.
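The peak-connecting idea described in this section can be sketched as follows. This is not Shazam’s actual implementation: the function name and the fan_out and max_dt parameters are assumptions for illustration, and they are exactly the kind of "how many connections" choices discussed above. Each peak is linked to a few later peaks, and each link becomes a key made of the two frequencies and the time gap between them.

```python
def pair_peaks(peaks, fan_out=2, max_dt=2.0):
    """Link each (time_s, freq_hz) peak to up to fan_out later peaks within max_dt seconds.

    Returns a list of ((freq1, freq2, time_gap), anchor_time) entries.
    """
    peaks = sorted(peaks)
    pairs = []
    for i, (t1, f1) in enumerate(peaks):
        linked = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt or linked == fan_out:
                break
            pairs.append(((f1, f2, round(dt, 2)), t1))
            linked += 1
    return pairs


# Hypothetical peaks picked from a spectrogram.
peaks = [(0.0, 800), (0.5, 1200), (1.25, 400), (3.0, 823)]

# The same peaks as heard 10 seconds into a longer recording.
shifted = [(t + 10.0, f) for t, f in peaks]

keys = [key for key, _ in pair_peaks(peaks)]
shifted_keys = [key for key, _ in pair_peaks(shifted)]
print(keys == shifted_keys)  # True
```

Because each key stores only the gap between two peaks, not their absolute position, a sample taken from the middle of a song produces the same keys as the reference recording. Raising fan_out makes the fingerprint richer but larger, which is the trade-off between dataset size and false positives noted above.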

Philips

Around the same period, Philips was trialling another kind of fingerprinting. Unlike Shazam’s method, the Philips algorithm was designed to compress the full spectrogram of the file by looking for changes across frequency and time.
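One way to picture "changes in frequency and time" is to reduce each spectrogram frame to a handful of bits, where each bit records whether the energy difference between two neighbouring frequency bands went up or down since the previous frame. The sketch below is written in that spirit and is not Philips’s actual implementation. The function names and the band energies are invented for illustration.

```python
def frame_bits(prev_energies, cur_energies):
    """One bit per neighbouring band pair: 1 if the band-to-band energy
    difference grew since the previous frame, otherwise 0."""
    bits = []
    for m in range(len(cur_energies) - 1):
        cur_diff = cur_energies[m] - cur_energies[m + 1]
        prev_diff = prev_energies[m] - prev_energies[m + 1]
        bits.append(1 if cur_diff - prev_diff > 0 else 0)
    return bits


def hamming(a, b):
    """Number of positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical energies in four frequency bands for two consecutive frames.
prev_frame = [4.0, 2.0, 1.0, 3.0]
cur_frame = [1.0, 5.0, 2.0, 2.0]

bits = frame_bits(prev_frame, cur_frame)

# The same audio played twice as loud yields the same bits.
louder = frame_bits([2 * e for e in prev_frame], [2 * e for e in cur_frame])

print(bits, hamming(bits, louder))  # [0, 1, 1] 0
```

Since only the sign of each change is kept, the whole spectrogram contributes to the fingerprint while the result stays compact, and two fingerprints can be compared simply by counting the bits that differ.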

Intrasonics

When Intrasonics was looking into audio fingerprinting, they were trying to find something that fell between Shazam and Philips.

Rather than focusing on the parts of a spectrogram that are thought to be the most important, the Intrasonics approach utilises cutting-edge machine learning to pick out the best features of the spectrogram for any given audio clip.

The algorithm begins with a training phase, in which the spectrogram is passed through a variety of filters set at different frequencies.

Over 10,000 filters are used and their results are collected together. With a characteristic filter, segments that differ will look different, while segments that are alike will look similar.

The software looks for the features that separate these data points as much as it can and picks the best filters.

Once it has established the most effective filters, they are used to take features from the spectrogram that will make it easy to identify audio.

One of the major advantages this has over other systems is that it does not rely on what are thought to be the most important features of spectrograms, but on the ones that have been proven as such.
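The filter-selection idea in this section can be illustrated with a toy example. This is a deliberately simplified sketch and not the Intrasonics training procedure: the three candidate filters, the scoring rule and the tiny three-band "spectrogram patches" are all assumptions made for illustration. Each filter is scored by how far apart it pushes different audio while keeping distorted copies of the same audio close together, and the best-scoring filter is kept.

```python
def response(weights, patch):
    """Apply a filter to a spectrogram patch as a weighted sum."""
    return sum(w * x for w, x in zip(weights, patch))


def separation(weights, same_pairs, diff_pairs):
    """Average response gap for different audio minus the gap for matching audio."""
    def mean_gap(pairs):
        return sum(abs(response(weights, a) - response(weights, b)) for a, b in pairs) / len(pairs)
    return mean_gap(diff_pairs) - mean_gap(same_pairs)


def pick_filters(candidates, same_pairs, diff_pairs, keep):
    """Rank candidate filters by separation score and keep the best ones."""
    ranked = sorted(
        candidates,
        key=lambda name: separation(candidates[name], same_pairs, diff_pairs),
        reverse=True,
    )
    return ranked[:keep]


# Hypothetical candidate filters over three frequency bands.
candidates = {
    "low_vs_high": [1, 0, -1],
    "mid_only": [0, 1, 0],
    "sum_all": [1, 1, 1],
}

# Pairs of patches from the same audio (clean vs distorted in the middle band).
same_pairs = [([5, 1, 1], [5, 2, 1]), ([1, 1, 5], [1, 3, 5])]
# Pairs of patches from different audio.
diff_pairs = [([5, 1, 1], [1, 1, 5]), ([5, 2, 1], [1, 3, 5])]

print(pick_filters(candidates, same_pairs, diff_pairs, keep=1))  # ['low_vs_high']
```

In this toy data the distortion only affects the middle band, so the filter that ignores that band wins. The point is that the choice comes from measured results on example audio rather than from a prior belief about which part of the spectrogram matters.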

Get in Touch For More Information

We’d love to show you how our audio fingerprint technology works to protect your song or any other sound and audio. Contact us for more information and we’ll take you through the process and show you how it works.


Intrasonics Ltd

Bateman House
82-88 Hills Road
Cambridge
United Kingdom
CB2 1LQ

Get In Touch

+44 (0)1223 927 070