Music Fingerprinting and Recognition

This is a project for NWAPW that performs music recognition by building a database of audio fingerprints and matching new audio against it.

The task of music recognition consists of two essential parts: audio fingerprinting and similarity measurement. Fingerprinting can be thought of as audio feature extraction and labelling. We are likely to apply common audio feature extraction techniques such as the fast Fourier transform (FFT) and the short-time Fourier transform (STFT) to process raw WAV files and turn them into high-dimensional feature vectors. Because of the high dimensionality, we use a hashing algorithm to build the database: storing compact hash values rather than full feature vectors makes it realistic to compare against a large corpus. This can be considered the training phase.

In the testing phase (the actual music "recognition" part), we process the test audio and label it with hash values in the same manner as the training data, then match it against the pieces/songs in the database whose fingerprints are most similar to it (e.g. by offset alignment). The program will be refined by adjusting parameters and hyperparameters, and by trying different hash functions and audio processing techniques.
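The two phases described above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the project's actual implementation: the frame size, hop size, fan-out, the one-peak-per-frame rule, the use of SHA-1, and all names (`spectrogram`, `fingerprint`, `FingerprintDB`, `synth`) are assumptions chosen for brevity. It computes a magnitude STFT, hashes pairs of nearby spectral peaks, stores the hashes per song, and recognizes a query by voting on the song and time offset that the most hashes agree on.

```python
import hashlib
from collections import Counter, defaultdict

import numpy as np

FRAME = 1024   # samples per STFT frame (assumed value)
HOP = 512      # samples between frame starts (assumed value)
FAN_OUT = 5    # later peaks paired with each peak (assumed value)


def spectrogram(samples):
    """Magnitude STFT: one row per frame, one column per frequency bin."""
    window = np.hanning(FRAME)
    starts = range(0, len(samples) - FRAME + 1, HOP)
    frames = np.stack([samples[s:s + FRAME] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1))


def fingerprint(samples):
    """Return (hash, frame index) pairs built from nearby spectral peaks."""
    # Simplification: keep only the strongest frequency bin of each frame.
    peaks = [(t, int(np.argmax(row))) for t, row in enumerate(spectrogram(samples))]
    prints = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:i + 1 + FAN_OUT]:
            key = f"{f1}|{f2}|{t2 - t1}".encode()
            prints.append((hashlib.sha1(key).hexdigest()[:16], t1))
    return prints


class FingerprintDB:
    """In-memory stand-in for the hash database (hypothetical helper)."""

    def __init__(self):
        self.index = defaultdict(list)  # hash -> [(song name, frame index)]

    def add(self, name, samples):
        """Training phase: store every hash of a known song."""
        for h, t in fingerprint(samples):
            self.index[h].append((name, t))

    def match(self, samples):
        """Testing phase: vote on (song, offset); None if nothing matches."""
        votes = Counter()
        for h, t_query in fingerprint(samples):
            for name, t_song in self.index.get(h, []):
                votes[(name, t_song - t_query)] += 1
        if not votes:
            return None
        (name, offset), _count = votes.most_common(1)[0]
        return name, offset


def synth(freqs, rate=8000, note_len=4096):
    """Synthetic test signal: a sequence of pure tones (demo only)."""
    t = np.arange(note_len) / rate
    return np.concatenate([np.sin(2 * np.pi * f * t) for f in freqs])


db = FingerprintDB()
song_a = synth([440, 660, 550, 880, 330, 770])
song_b = synth([500, 300, 900, 700, 400, 600])
db.add("song_a", song_a)
db.add("song_b", song_b)

query = song_a[8192:16384]  # excerpt starting 16 hops into song_a
result = db.match(query)    # (song name, offset in frames) or None
```

The offset in the result is measured in STFT frames, so multiplying it by `HOP` gives the approximate sample position of the excerpt within the matched song. A real system would keep several peaks per frame and tolerate noise and misaligned frames, which this sketch does not attempt.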

The Team

Team members:

Demo Video

Technical Demo Video

Project Links