This project aims to develop a product review system powered by NLP models that aggregate customer feedback from different sources. The key tasks include classifying reviews, clustering product categories, and using generative AI to summarize reviews into recommendation articles.
With thousands of reviews available across multiple platforms, manually analyzing them is inefficient. This project seeks to automate the process using NLP models to extract insights and provide users with valuable product recommendations.
- Objective: Classify customer reviews into positive, negative, or neutral categories to help the company improve its products and services.
- Task: Create a model for classifying the textual content of reviews into these three categories.
Since the dataset contains star ratings (1 to 5), you should map them to three sentiment classes as follows:
| Star Rating | Sentiment Class |
|---|---|
| 1 - 2 | Negative |
| 3 | Neutral |
| 4 - 5 | Positive |
This is a simple approach, but you are encouraged to experiment with different mappings!
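The table above can be expressed as a small helper (a minimal sketch; the function name is illustrative):

```python
def rating_to_sentiment(stars: int) -> str:
    """Map a 1-5 star rating to one of three sentiment classes."""
    if stars <= 2:
        return "negative"   # 1-2 stars
    if stars == 3:
        return "neutral"    # 3 stars
    return "positive"       # 4-5 stars

# Example: label a column of ratings
ratings = [5, 3, 1, 4]
labels = [rating_to_sentiment(r) for r in ratings]
# labels == ["positive", "neutral", "negative", "positive"]
```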
Model Building
For classifying customer reviews into positive, negative, or neutral, use pretrained transformer-based models to leverage powerful language representations without training from scratch.
- `distilbert-base-uncased` – Lightweight and fast, ideal for limited resources.
- `bert-base-uncased` – A strong general-purpose model for sentiment analysis.
- `roberta-base` – More robust to nuanced sentiment variations.
- `nlptown/bert-base-multilingual-uncased-sentiment` – Handles multiple languages, useful for diverse datasets.
- `cardiffnlp/twitter-roberta-base-sentiment` – Optimized for short texts like social media reviews.
Explore models on Hugging Face and experiment with fine-tuning to improve accuracy.
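As a starting point, a checkpoint can be loaded through the `transformers` pipeline API. The sketch below assumes the `transformers` package is installed; the three-class label mapping is our own convention from the star-rating table, not part of the checkpoint:

```python
# Our convention for the three sentiment classes (id order is arbitrary).
LABEL2ID = {"negative": 0, "neutral": 1, "positive": 2}
ID2LABEL = {v: k for k, v in LABEL2ID.items()}

def classify_reviews(texts, model_name="distilbert-base-uncased"):
    """Run a checkpoint over a batch of review texts.

    Note: the base checkpoint above ships with a randomly initialized
    classification head -- fine-tune it on the mapped star ratings
    before trusting its predictions.
    """
    from transformers import pipeline  # imported lazily; heavy dependency
    clf = pipeline("text-classification", model=model_name)
    return clf(texts, truncation=True)
```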
- Evaluate the model's performance on a separate test dataset using these metrics:
  - Accuracy: Percentage of correctly classified instances.
  - Precision: Proportion of true positive predictions among all positive predictions.
  - Recall: Proportion of true positive predictions among all actual positive instances.
  - F1-score: Harmonic mean of precision and recall.
- Calculate a confusion matrix to analyze the model's performance across the three classes.
- Model achieved an accuracy of X% on the test dataset.
- Precision, recall, and F1-score for each class:
  - Class 1: Precision=X%, Recall=X%, F1-score=X%
  - Class 2: Precision=X%, Recall=X%, F1-score=X%
  - ...
- Confusion matrix, presented in both tabular and graphical form.
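All of these metrics are available from scikit-learn. A minimal sketch with toy labels (replace them with real test-set output; assumes scikit-learn is installed):

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

CLASSES = ["negative", "neutral", "positive"]

# Toy ground truth vs. model predictions -- replace with real test-set output.
y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "negative", "positive", "positive"]

acc = accuracy_score(y_true, y_pred)
report = classification_report(y_true, y_pred, labels=CLASSES, zero_division=0)
cm = confusion_matrix(y_true, y_pred, labels=CLASSES)  # rows = true class

print(f"Accuracy: {acc:.2%}")
print(report)   # per-class precision, recall, F1
print(cm)
```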
- Objective: Simplify the dataset by clustering product categories into 4-6 meta-categories.
- Task: Create a model to group all reviews into 4-6 broader categories. Example suggestions:
  - Ebook readers
  - Batteries
  - Accessories (keyboards, laptop stands, etc.)
  - Non-electronics (Nespresso pods, pet carriers, etc.)
- Note: Analyze the dataset in depth to determine the most appropriate categories.
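One lightweight baseline is TF-IDF vectors plus k-means over the product text. This is a sketch assuming scikit-learn; sentence-embedding models would likely give better clusters, and the toy documents below are illustrative:

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy product descriptions -- in practice, use the review/product text.
docs = [
    "kindle paperwhite ebook reader backlit screen",
    "ereader with e-ink display and long battery",
    "aa alkaline batteries 48 pack long lasting",
    "rechargeable aaa battery charger set",
    "bluetooth keyboard for laptop and tablet",
    "nespresso coffee pods variety pack",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(docs)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(vectors)
print(kmeans.labels_)  # cluster id per document
```

Inspect the top TF-IDF terms per cluster to decide what each meta-category actually is before naming it.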
- Objective: Summarize reviews into articles that recommend the top products for each category.
- Task: Create a model that generates a short article (like a blog post) for each product category. The output should include:
  - Top 3 products and key differences between them.
  - Top complaints for each of those products.
  - Worst product in the category and why it should be avoided.
Consider using pretrained generative models such as T5, GPT-3, or BART to produce coherent, well-structured summaries. These models excel at summarization and text generation, and they can be fine-tuned to produce high-quality output from the insights extracted from the reviews. You are also encouraged to explore other transformer-based models on platforms like Hugging Face; fine-tuning any of these pretrained models on your specific dataset can further improve the relevance and quality of the generated summaries.
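One possible shape for the generation step is sketched below. The helper names and the "brief" format are our own assumptions, and the `transformers` summarization call is only one option (it requires the package and a checkpoint download):

```python
def build_category_brief(category, insights):
    """Assemble extracted review insights into a source text for the
    summarizer. `insights` is a list of short findings, e.g. output of
    the classification and clustering steps."""
    lines = [f"Product category: {category}"] + [f"- {i}" for i in insights]
    return "\n".join(lines)

def generate_article(brief, model_name="facebook/bart-large-cnn"):
    """Summarize the brief into a short blog-style article."""
    from transformers import pipeline  # imported lazily; heavy dependency
    summarizer = pipeline("summarization", model=model_name)
    return summarizer(brief, max_length=200, min_length=60)[0]["summary_text"]

brief = build_category_brief(
    "Ebook readers",
    ["Top pick: Model A -- praised for battery life",
     "Common complaint about Model B: slow page turns"],
)
```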
- Primary Dataset: Amazon Product Reviews
- Larger Dataset: Amazon Reviews Dataset
- Additional Datasets: You are free to use other datasets from sources like HuggingFace, Kaggle, or any other platform.
- All three components (classification, clustering, and summarization) should be visible or available to interact with on the page in some form.
- You are free to host the models on your laptop or any cloud platform (e.g., Gradio, AWS, etc.).
- Source Code:
  - Well-organized and linted code (use tools like `pylint`).
  - Notebooks should be structured with clear headers/sections.
  - Alternatively, provide plain Python files with a `main()` function.
- README:
  - A detailed README file explaining how to run the code and reproduce the results.
- Final Output:
  - Generated blog posts with product recommendations.
  - A website, text file, or Word document containing the final results.
- PPT Presentation:
  - A presentation (no more than 15 minutes) tailored for both technical and non-technical audiences.
- Bonus | Deployed Model:
  - A deployed website/app using the framework of your choice.
  - Host the app so it can be queried by anyone.
| Task | Points |
|---|---|
| Data Preprocessing | 15 |
| Model for Review Classification | 20 |
| Clustering Model | 20 |
| Summarization Model | 30 |
| PDF Report (Approach, Results, Analysis) | 5 |
| PPT Presentation | 10 |
| Bonus: Deployment & Hosting the App Publicly | 10 |
Passing Score: 70 points.
- Teamwork: Work individually or in groups of no more than 2 people.
- Presentation: Tailor your presentation for both technical and non-technical audiences.
- Data Collection: Gather and preprocess the dataset(s).
- Model Development:
- Build and evaluate the review classification model.
- Develop and test the clustering model.
- Create the summarization model using Generative AI.
- OPTIONAL | Deployment: Deploy the models using your chosen framework.
- Documentation: Prepare the README, PDF report, and PPT presentation.
- Final Delivery: Submit all deliverables.
