This project focuses on detecting the user's mode of transport (Car, Still) using smartphone sensor data.
It involves two key stages:
- Pre-processing and feature extraction from raw sensor data.
- Training Machine Learning models for transport mode classification.
The project includes Python scripts for data preprocessing and a Jupyter Notebook for model training and evaluation.
Make sure you have the following installed:
- Python 3.7+
- NumPy
- Pandas
- SciPy
- Scikit-learn
- Matplotlib (for evaluation plots)
Install required libraries using:
pip install numpy pandas scipy scikit-learn matplotlib
The table below lists the functions implemented in the preprocessing script, along with their parameters.
Function | Parameter | Description |
---|---|---|
load_data(data_dir) | data_dir (str): path to directory containing raw .csv files | Loads all ride data into a single pandas DataFrame. |
clean_data(df) | df (DataFrame): raw ride data | Removes missing values and outliers using Z-score method. |
compute_orientation(df) | df (DataFrame): cleaned ride data | Calculates roll, pitch, game rotation vector, and orientation using a simplified complementary filter. |
calculate_statistics(df) | df (DataFrame): ride data with orientation | Generates statistical features per ride (mean, min, max, std) for each sensor. |
main() | None | Main execution flow: loads the data, cleans it, engineers features, and saves the result to a standardized CSV. |
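
The cleaning and orientation steps in the table can be illustrated with a minimal sketch. This is not the code in preprocessing.py: the function names `clean_data_sketch` and `complementary_filter_sketch`, the Z-score threshold of 3, the filter weight `alpha=0.98`, the axis conventions, and the input layout are all assumptions made for illustration. The sketch covers roll and pitch only, not the game rotation vector.

```python
import numpy as np
import pandas as pd


def clean_data_sketch(df: pd.DataFrame, z_thresh: float = 3.0) -> pd.DataFrame:
    """Drop rows with missing values, then rows where any numeric
    column has |z-score| above z_thresh (threshold is an assumption)."""
    df = df.dropna()
    numeric = df.select_dtypes(include="number")
    # Constant columns have zero std; map it to NaN so they never flag a row.
    std = numeric.std(ddof=0).replace(0, np.nan)
    z = (numeric - numeric.mean()) / std
    keep = (z.abs().fillna(0) <= z_thresh).all(axis=1)
    return df[keep].reset_index(drop=True)


def complementary_filter_sketch(acc, gyro, dt, alpha=0.98):
    """Fuse accelerometer and gyroscope samples into roll and pitch (radians).

    acc  : (N, 3) accelerometer x, y, z (any consistent unit)
    gyro : (N, 3) angular rate around x, y, z in rad/s
    dt   : sampling interval in seconds (assumed constant)
    """
    acc = np.asarray(acc, dtype=float)
    gyro = np.asarray(gyro, dtype=float)
    # Tilt implied by gravity alone: noisy, but free of drift.
    acc_roll = np.arctan2(acc[:, 1], acc[:, 2])
    acc_pitch = np.arctan2(-acc[:, 0], np.hypot(acc[:, 1], acc[:, 2]))
    roll = np.empty(len(acc))
    pitch = np.empty(len(acc))
    roll[0], pitch[0] = acc_roll[0], acc_pitch[0]
    for i in range(1, len(acc)):
        # Integrated gyro rate (smooth, drifts) blended with the accelerometer angle.
        roll[i] = alpha * (roll[i - 1] + gyro[i, 0] * dt) + (1 - alpha) * acc_roll[i]
        pitch[i] = alpha * (pitch[i - 1] + gyro[i, 1] * dt) + (1 - alpha) * acc_pitch[i]
    return roll, pitch
```

A larger `alpha` trusts the gyroscope more and reacts more slowly to accelerometer noise; the right value depends on the sampling rate and the device.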
The preprocessing script generates:
- standardized_ride_data.csv: a clean, feature-engineered dataset ready for model training.
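
As a rough illustration of how per-ride statistical features (mean, min, max, std) might be assembled into such a dataset, here is a minimal sketch. The column names `ride_id` and `label` and the function name `ride_features_sketch` are hypothetical; the actual schema of the raw files is not specified here.

```python
import pandas as pd


def ride_features_sketch(df, ride_col="ride_id", label_col="label"):
    """Aggregate every numeric sensor column into one feature row per ride."""
    sensor_cols = [
        c for c in df.select_dtypes(include="number").columns
        if c not in (ride_col, label_col)
    ]
    stats = df.groupby(ride_col)[sensor_cols].agg(["mean", "min", "max", "std"])
    # Flatten the (column, statistic) MultiIndex into names like "acc_x_mean".
    stats.columns = [f"{col}_{stat}" for col, stat in stats.columns]
    # Sample std is NaN for rides with a single row; treat it as zero spread.
    stats = stats.fillna(0.0)
    # Assumes each ride carries a single transport-mode label.
    stats[label_col] = df.groupby(ride_col)[label_col].first()
    return stats.reset_index()
```

The result could then be written out with `features.to_csv("standardized_ride_data.csv", index=False)`.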
The Jupyter Notebook covers the complete machine learning pipeline:
- Loading the standardized dataset (standardized_ride_data.csv)
- Splitting the dataset into training and testing sets
- Model Training using:
  - Decision Tree Classifier
  - Random Forest Classifier
  - Support Vector Machine (SVM)
- Model Evaluation using:
  - Accuracy Score
  - Confusion Matrix
  - Classification Report (Precision, Recall, F1-Score)
The notebook demonstrates end-to-end training and testing for transport mode classification based on smartphone sensor data.
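
The pipeline described above can be sketched with scikit-learn as follows. This is an illustrative outline, not the notebook's code: the helper name `evaluate_models`, the 75/25 split, the default hyperparameters, and the synthetic two-class data (a stand-in for the features and Car/Still labels in standardized_ride_data.csv) are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def evaluate_models(X, y, test_size=0.25, seed=0):
    """Train the three classifiers on one split and collect their metrics."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(random_state=seed),
        "svm": SVC(),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "confusion_matrix": confusion_matrix(y_test, y_pred),
            "report": classification_report(y_test, y_pred),
        }
    return results


# Synthetic placeholder data; replace with the real feature matrix and labels.
X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=0)
demo_results = evaluate_models(X_demo, y_demo)
for name, res in demo_results.items():
    print(name, round(res["accuracy"], 3))
```

Using the same split for all three models keeps their scores directly comparable.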
Place your ride data .csv files in a folder (e.g., ride_data/).
Then execute the preprocessing script:
python preprocessing.py
This will generate the standardized_ride_data.csv file.
Open the Jupyter Notebook:
jupyter notebook TransportModeDetection.ipynb
Follow the steps inside the notebook to:
- Load the dataset
- Train machine learning models
- Evaluate model performance
### Project Structure
.
├── ride_data/ # Folder containing raw ride data (.csv files)
├── standardized_ride_data.csv # Output dataset after preprocessing
├── preprocessing.py # Python script for data preprocessing
├── TransportModeDetection.ipynb # Jupyter Notebook for model training and evaluation
└── README.md # Project documentation
This work is licensed under the MIT License.
Lakshman Navaneetha Krishnan and Roshan