aakashsbhatia2/Visualisation-MiniProject-2

Visualisation-MiniProject-2

Overview: Perform PCA and MDS on the University Rankings dataset obtained from Kaggle (https://www.kaggle.com/joeshamen/world-university-rankings-2020). Visualize the data using scatter plots and a scatter-plot matrix.

Assignment Tasks:

Task 1: data clustering and decimation (30 points)

  • implement random sampling and stratified sampling (remove 75% of data)
  • the latter requires k-means clustering (optimize k using the elbow method)

Task 2: dimension reduction on both original and 2 types of reduced data (30 points)

  • find the intrinsic dimensionality of the data using PCA
  • produce scree plot visualization and mark the intrinsic dimensionality
  • show the scree plots before/after sampling to assess the bias introduced
  • obtain the three attributes with highest PCA loadings

Task 3: visualization of both original and 2 types of reduced data (40 points)

  • visualize the data projected onto the top two PCA vectors via 2D scatterplot
  • visualize the data via MDS (Euclidean & correlation distance) in 2D scatterplots
  • visualize the scatterplot matrix of the three highest PCA loaded attributes
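The sampling step in Task 1 can be sketched as follows. This is a minimal illustration that assumes NumPy and a numeric feature matrix, not the repository's actual code: the synthetic data, the function names (`kmeans`, `random_sample`, `stratified_sample`), the fixed seeds, and the choice of k = 3 are all hypothetical. The k-means here is a plain Lloyd's iteration, and the elbow is meant to be read off the `inertias` curve by eye.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (labels, inertia)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each center to its cluster mean; keep it if the cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    inertia = d2[np.arange(len(X)), labels].sum()
    return labels, inertia

def random_sample(n_rows, keep_frac=0.25, seed=0):
    """Row indices of a simple random sample that keeps keep_frac of the rows."""
    rng = np.random.default_rng(seed)
    n_keep = max(1, round(n_rows * keep_frac))
    return np.sort(rng.choice(n_rows, size=n_keep, replace=False))

def stratified_sample(labels, keep_frac=0.25, seed=0):
    """Row indices that keep keep_frac of the rows from every cluster."""
    rng = np.random.default_rng(seed)
    kept = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        n_keep = max(1, round(len(idx) * keep_frac))
        kept.append(rng.choice(idx, size=n_keep, replace=False))
    return np.sort(np.concatenate(kept))

# Synthetic stand-in for the dataset: three blobs of 40 points each (assumption).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(40, 2)) for c in (0.0, 5.0, 10.0)])

# Inertia for k = 1..6; plotting this curve gives the elbow plot.
inertias = [kmeans(X, k)[1] for k in range(1, 7)]

# k = 3 is assumed here; in practice it is chosen from the elbow plot.
labels, _ = kmeans(X, k=3)
rand_idx = random_sample(len(X))        # removes 75% of rows uniformly
strat_idx = stratified_sample(labels)   # removes 75% of rows within each cluster
```

Stratifying by cluster keeps each cluster's share of the data roughly unchanged after decimation, whereas a simple random sample can over- or under-represent small clusters.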

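Tasks 2 and 3 can be sketched in the same spirit. The block below is a hedged illustration using only NumPy, and several choices in it are assumptions rather than the repository's method: the 75% cumulative-variance threshold for intrinsic dimensionality, the ranking of attributes by sum of squared loadings over the retained components, the column names, and the synthetic data. Classical (Torgerson) MDS is swapped in as a dependency-free stand-in; the project may well use an iterative MDS implementation instead.

```python
import numpy as np

def pca(X):
    """PCA on standardized data via the correlation matrix."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    order = np.argsort(eigvals)[::-1]   # largest variance first
    return Z, eigvals[order], eigvecs[:, order]

def intrinsic_dim(eigvals, threshold=0.75):
    """Smallest number of components whose cumulative variance reaches threshold."""
    explained = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(explained, threshold)) + 1

def top_attributes(eigvals, eigvecs, d, names, n_top=3):
    """Attributes with the largest sum of squared loadings on the first d components."""
    loadings = eigvecs[:, :d] * np.sqrt(np.maximum(eigvals[:d], 0.0))
    score = (loadings ** 2).sum(axis=1)
    return [names[i] for i in np.argsort(score)[::-1][:n_top]]

def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS from a square distance matrix D."""
    n = len(D)
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered squared distances
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

# Synthetic stand-in: five attributes driven by two latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = np.column_stack([
    latent[:, 0],
    latent[:, 0] + 0.1 * rng.normal(size=200),
    latent[:, 1],
    latent[:, 1] + 0.1 * rng.normal(size=200),
    rng.normal(size=200),
])
names = ["attr_a", "attr_b", "attr_c", "attr_d", "attr_e"]  # hypothetical names

Z, eigvals, eigvecs = pca(X)
explained = eigvals / eigvals.sum()     # heights of the scree plot bars
d = intrinsic_dim(eigvals)              # component count to mark on the scree plot
top3 = top_attributes(eigvals, eigvecs, d, names)
pca_2d = Z @ eigvecs[:, :2]             # projection onto the top two PCA vectors

# Distance matrices between observations for the two MDS variants.
D_euclid = np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2))
D_corr = 1.0 - np.corrcoef(Z)           # correlation distance between rows
mds_euclid = classical_mds(D_euclid)
mds_corr = classical_mds(D_corr)
```

Running the same functions on the full data and on each sampled subset gives the before/after scree plots that Task 2 asks for; `pca_2d`, `mds_euclid`, and `mds_corr` are the 2D coordinates to pass to a scatterplot.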