Classifying Song Genres from Audio Data. Applying machine learning methods in Python to classify songs into genres.
Using a dataset of songs from two music genres (Hip-Hop and Rock), I trained a classifier to distinguish between the two genres based only on track information derived from Echonest (now part of Spotify). First, I used the pandas and seaborn packages in Python to subset the data, aggregate information, and create plots while exploring the data for obvious trends or factors. Next, I used the scikit-learn package to test whether a song's genre can be correctly classified from features such as danceability, energy, acousticness, and tempo. Along the way, I went over implementations of common algorithms such as PCA, logistic regression, and decision trees.
Tasks:
- Preparing our dataset
- Pairwise relationships between continuous variables
- Normalizing the feature data
- Principal Component Analysis on our scaled data
- Further visualization of PCA
- Train a decision tree to classify genre
- Compare our decision tree to a logistic regression
- Balance our data for greater performance
- Does balancing our dataset improve model bias?
- Using cross-validation to evaluate our models
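
The tasks above can be sketched end to end roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual notebook code: the data is a synthetic stand-in generated with `make_classification` (not the real track features), the column names `feature_0` ... `feature_7` and `genre` are hypothetical, balancing is done by downsampling the larger class, and the choices of 6 principal components and 10 folds are arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the track features (assumption:
# the real project loads Echonest-derived features instead).
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=5,
    weights=[0.8, 0.2], random_state=10,
)
feature_names = [f"feature_{i}" for i in range(8)]  # hypothetical names
tracks = pd.DataFrame(X, columns=feature_names)
tracks["genre"] = np.where(y == 1, "Hip-Hop", "Rock")


def balance_by_downsampling(df, label_col, random_state=10):
    """Sample every class down to the size of the smallest class."""
    n_min = df[label_col].value_counts().min()
    parts = [
        group.sample(n=n_min, random_state=random_state)
        for _, group in df.groupby(label_col)
    ]
    # Shuffle so the classes are not stored in contiguous blocks.
    return pd.concat(parts).sample(frac=1, random_state=random_state)


def evaluate(df, feature_cols, label_col, n_components=6):
    """Mean cross-validated accuracy for a decision tree and a logistic regression."""
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=10),
        "logistic_regression": LogisticRegression(random_state=10, max_iter=1000),
    }
    kf = KFold(n_splits=10, shuffle=True, random_state=10)
    scores = {}
    for name, model in models.items():
        # Scaling and PCA live inside the pipeline so they are refit on
        # each training fold and never see the held-out fold.
        pipe = Pipeline([
            ("scaler", StandardScaler()),
            ("pca", PCA(n_components=n_components)),
            (name, model),
        ])
        fold_scores = cross_val_score(pipe, df[feature_cols], df[label_col], cv=kf)
        scores[name] = fold_scores.mean()
    return scores


balanced = balance_by_downsampling(tracks, "genre")
scores = evaluate(balanced, feature_names, "genre")
print(scores)
```

Wrapping the scaler and PCA in a `Pipeline` is a design choice made for this sketch: it keeps the preprocessing from leaking information across cross-validation folds. Accuracy alone can hide poor performance on the smaller class, so a per-class report (for example `sklearn.metrics.classification_report`) is a useful check when comparing the balanced and unbalanced data.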