🛒 Walmart Sales Analysis & Forecast Dashboard

🖼️ Final Streamlit Result

A full machine learning pipeline and interactive dashboard to analyze and forecast Walmart sales. Combines model experimentation with an intuitive Streamlit app for exploring weekly sales trends and predictions.

📋 Table of Contents

📌 Project Overview
📊 Dataset
🔄 Pipeline & Workflow
📈 Insights
💻 Usage Examples
🤝 Contributing
📬 Contact

📌 Project Overview

This project analyzes Walmart’s sales data and forecasts future weekly sales using machine learning. It includes:

A Jupyter Notebook that trains multiple models (Random Forest, XGBoost, LightGBM, ...) using a custom time-series pipeline.
A Streamlit dashboard for exploring sales data and comparing predictions interactively. Users can filter by store, department, and time range, and rank results from best to worst by total sales, $ growth, or % growth.

📊 Dataset

Sources:

Model Training: Historical sales and market data from https://www.kaggle.com/competitions/walmart-recruiting-store-sales-forecasting/overview, yfinance, pandas-datareader, and akshare. Additional columns for holidays, special events, and tax returns use a ramp up and down to represent their importance.
Dashboard: Pre-processed CSV file df_wm_store_sales_predictions.csv, which contains weekly sales and predictions.

Data Type	Source	Description
Stock Data	`yfinance` / `pandas-datareader`	Weekly Walmart stock prices and economic indicators
Processed Dataset	Local CSV (`df_wm_store_sales_predictions.csv`)	Sales & predictions used in the dashboard
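
The ramp-up and ramp-down columns mentioned above could be built roughly as follows. This is a minimal sketch, not the notebook's actual code: the function name `event_ramp`, the linear shape, and the window lengths are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def event_ramp(dates, event_date, weeks_before=3, weeks_after=1):
    """Return a 0..1 weight that rises linearly before an event and falls after it.

    Hypothetical helper: the linear shape and window sizes are illustrative only.
    """
    event = pd.Timestamp(event_date)
    # Signed distance from each week to the event, in weeks (negative = before).
    delta_weeks = np.asarray((dates - event).days, dtype=float) / 7.0
    up = 1.0 + delta_weeks / (weeks_before + 1)   # reaches 1.0 on the event week
    down = 1.0 - delta_weeks / (weeks_after + 1)  # decays back to 0 afterwards
    weight = np.where(delta_weeks <= 0, up, down)
    return pd.Series(np.clip(weight, 0.0, 1.0), index=dates, name="event_ramp")


# Six Fridays around Thanksgiving week 2011.
weeks = pd.date_range("2011-11-04", periods=6, freq="W-FRI")
ramp = event_ramp(weeks, "2011-11-25")
print(ramp.round(2))
```

Compared with a 0/1 holiday flag, a ramp lets a model see that the weeks just before an event already matter, with the weight peaking on the event week itself.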

🔄 Pipeline & Workflow

🧠 Model Training (Jupyter Notebook)

Data Fetching – Collect WMT stock and macroeconomic indicators
Feature Engineering – Create time-aware features and lags
Time-Series CV – TimeSeriesSplit with performance tracking
Modeling – Train Random Forest, XGBoost, LightGBM, ...
Interpretation – SHAP values and permutation importance (not completed because of limited compute power and time constraints)
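
The feature engineering and time-series cross-validation steps above can be sketched as below. This is an illustration under assumptions, not the notebook's pipeline: the data is synthetic, the lag choices (1, 2, and 52 weeks) are hypothetical, and the column names `Date` and `Weekly_Sales` follow the Kaggle dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import TimeSeriesSplit

# Synthetic weekly series standing in for one store/department (not real data).
rng = np.random.default_rng(0)
dates = pd.date_range("2010-02-05", periods=120, freq="W-FRI")
sales = 20_000 + 3_000 * np.sin(np.arange(120) * 2 * np.pi / 52) + rng.normal(0, 500, 120)
df = pd.DataFrame({"Date": dates, "Weekly_Sales": sales})

# Time-aware features: a calendar field plus lags that only look backwards.
df["week_of_year"] = df["Date"].dt.isocalendar().week.astype(int)
for lag in (1, 2, 52):
    df[f"lag_{lag}"] = df["Weekly_Sales"].shift(lag)
df = df.dropna().reset_index(drop=True)

features = ["week_of_year", "lag_1", "lag_2", "lag_52"]
X, y = df[features], df["Weekly_Sales"]

# Each fold trains on the past and validates on the weeks that follow it.
fold_mae = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=4).split(X):
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(X.iloc[train_idx], y.iloc[train_idx])
    pred = model.predict(X.iloc[test_idx])
    fold_mae.append(mean_absolute_error(y.iloc[test_idx], pred))

print([round(m, 1) for m in fold_mae])
```

Unlike a shuffled K-fold split, `TimeSeriesSplit` never validates on weeks that precede the training data, which avoids leaking future sales into the model.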

📊 Streamlit Dashboard

app.py: Main controller for layout, interaction, routing
data_loader.py: Loads cached data using @st.cache_data
filters.py: Applies store/department/date filters
metrics.py: Calculates KPIs (sales totals, growth, date ranges)
ui_components.py: Charts, grids, headers, KPIs, footers
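
As a rough idea of what the filtering and KPI modules do, the sketch below filters a sales table and computes totals and growth. The function names `apply_filters` and `sales_kpis`, the column names (`Store`, `Dept`, `Date`, `Weekly_Sales`), and the definition of growth as first week versus last week are assumptions; the real filters.py and metrics.py may differ.

```python
import pandas as pd


def apply_filters(df, stores=None, depts=None, start=None, end=None):
    """Keep rows matching the selected stores, departments, and date range."""
    mask = pd.Series(True, index=df.index)
    if stores:
        mask &= df["Store"].isin(stores)
    if depts:
        mask &= df["Dept"].isin(depts)
    if start is not None:
        mask &= df["Date"] >= pd.Timestamp(start)
    if end is not None:
        mask &= df["Date"] <= pd.Timestamp(end)
    return df[mask]


def sales_kpis(df):
    """Total sales plus $ and % growth between the first and last week present."""
    weekly = df.groupby("Date")["Weekly_Sales"].sum().sort_index()
    first, last = weekly.iloc[0], weekly.iloc[-1]
    return {
        "total_sales": float(weekly.sum()),
        "growth_usd": float(last - first),
        "growth_pct": float((last - first) / first * 100) if first else float("nan"),
        "date_range": (weekly.index.min(), weekly.index.max()),
    }


# Tiny made-up table for illustration only.
demo = pd.DataFrame({
    "Store": [1, 1, 2, 2],
    "Dept": [1, 1, 1, 1],
    "Date": pd.to_datetime(["2012-01-06", "2012-01-13", "2012-01-06", "2012-01-13"]),
    "Weekly_Sales": [100.0, 120.0, 200.0, 190.0],
})
kpis = sales_kpis(apply_filters(demo, stores=[1]))
print(kpis)
```

Keeping filtering and KPI calculation as plain functions over a DataFrame makes them easy to test without starting the Streamlit app.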

📈 Insights

The Random Forest model accurately captures general sales trends.
SHAP analysis, once completed, would highlight the features with the most impact on predictions.
Dynamic visual tools make it easy to identify underperforming stores or departments.
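
Spotting underperforming stores comes down to ranking them on a chosen metric. The sketch below shows one way to do that with pandas; `rank_stores`, the column names, and the first-week-versus-last-week growth definition are hypothetical and not taken from the dashboard code.

```python
import pandas as pd


def rank_stores(df, by="total_sales"):
    """Rank stores from best to worst on total sales, $ growth, or % growth."""
    # One row per store, one column per week, weeks in chronological order.
    weekly = (
        df.groupby(["Store", "Date"])["Weekly_Sales"].sum()
        .unstack("Date")
        .sort_index(axis=1)
    )
    first, last = weekly.iloc[:, 0], weekly.iloc[:, -1]
    table = pd.DataFrame({
        "total_sales": weekly.sum(axis=1),
        "growth_usd": last - first,
        "growth_pct": (last - first) / first * 100,
    })
    return table.sort_values(by, ascending=False)


# Made-up numbers for three stores over two weeks.
demo = pd.DataFrame({
    "Store": [1, 1, 2, 2, 3, 3],
    "Date": pd.to_datetime(["2012-01-06", "2012-01-13"] * 3),
    "Weekly_Sales": [100.0, 120.0, 200.0, 190.0, 50.0, 80.0],
})
print(rank_stores(demo, by="growth_pct"))
```

The same store can rank very differently by total sales and by % growth, which is why the dashboard offers several ranking criteria.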

Next Steps

🔧 Hyperparameter Tuning: Optimize the Random Forest for better accuracy (and, if possible, obtain more department-level data with daily or hourly sales instead of weekly)
📊 More Visuals: Add SHAP force plots
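
One possible shape for the tuning step is a randomized search that reuses a time-series split, sketched below on synthetic data. The parameter grid and `n_iter` are placeholder values chosen for illustration, not recommendations derived from this project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, TimeSeriesSplit

# Synthetic features and target standing in for the engineered dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
y = X[:, 0] * 2.0 + rng.normal(scale=0.1, size=150)

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions={
        "n_estimators": [50, 100, 200],
        "max_depth": [None, 5, 10],
        "min_samples_leaf": [1, 2, 5],
    },
    n_iter=5,                         # number of sampled configurations
    cv=TimeSeriesSplit(n_splits=3),   # keep validation folds after training folds
    scoring="neg_mean_absolute_error",
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

Sampling a handful of configurations instead of a full grid keeps the cost manageable when compute is limited.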

💻 Usage Examples

Install the requirements listed in requirements.txt (for example, with pip install -r requirements.txt).

For notebook usage, just run all cells. To run on Kaggle, delete the """ in the first cell. Note that the notebook uses a cell magic called %%skip: when running all cells, any cell that starts with it is skipped.
To launch the Streamlit app, run streamlit.bat in the main directory.

🤝 Contributing

Fork the repo
Create a feature branch (git checkout -b feature/YourFeature)
Commit your changes (git commit -m "Add new analysis")
Push (git push origin feature/YourFeature)
Open a Pull Request

📬 Contact

Authors

Jorge M. M. L. Rodrigues