This project uses the GTZAN Music Genre Dataset to classify songs into different genres by experimenting with multiple deep learning models — AutoEncoder, CNN, LSTM, and Transformer — trained and evaluated separately. Each model learns distinct representations of the audio features:
- AutoEncoder compresses and reconstructs features for unsupervised learning.
- CNN captures spatial and frequency patterns in spectrograms.
- LSTM models the temporal dynamics in musical sequences.
- TCN
The performance of each architecture is compared to identify the most effective approach for accurate and robust music genre classification.
