BigMart Sales Prediction

Predicting item-level outlet sales with interpretable linear regression

Overview

We set out to predict how much each product would sell at each outlet so that the retailer can plan for demand.

Methodology

flowchart LR
  A[Raw Data] --> B[Clean & Encode]
  B --> C[EDA]
  C --> D[Train/Test Split]
  D --> E["Linear Regression"]
  E --> F["Tune (Cross-Validation)"]
  F --> G["Evaluate: R2 / RMSE"]
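
The last two steps of the flowchart, cross-validation and R2 / RMSE evaluation, can be sketched in plain Python. This is an illustration only, not the project's code: the function names `rmse`, `r2_score`, and `k_fold_indices` are hypothetical, and the contiguous, unshuffled folds are a simplifying assumption.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: the typical size of a prediction miss."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r2_score(actual, predicted):
    """Share of the variance in `actual` that `predicted` explains."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def k_fold_indices(n_rows, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Assumption: rows are not shuffled; shuffle beforehand if row order matters.
    """
    # Spread any remainder rows over the first folds.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size
```

Each fold trains on `train_idx`, scores on `test_idx`, and the per-fold RMSE and R2 values are then averaged to compare model variants.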

The Data

We worked with two store-sales tables, standardizing inconsistent category labels and dealing with missing entries before modeling.
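
A minimal sketch of that kind of cleanup, using only the standard library. The alias map and its labels are hypothetical examples, and median imputation is an assumption: the write-up does not say how missing entries were actually handled.

```python
from statistics import median

# Hypothetical alias map: spelling variants (keys) collapse to one label (values).
LABEL_ALIASES = {
    "lf": "Low Fat",
    "low fat": "Low Fat",
    "reg": "Regular",
    "regular": "Regular",
}

def normalize_label(raw):
    """Trim whitespace, ignore case, and map known variants to one spelling."""
    cleaned = raw.strip()
    return LABEL_ALIASES.get(cleaned.lower(), cleaned)

def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values.

    Assumption: median imputation is used here only as an illustration.
    """
    present = [v for v in values if v is not None]
    fill = median(present)
    return [fill if v is None else v for v in values]
```

Labels that are not in the alias map pass through unchanged (apart from trimming), so an unexpected category stays visible instead of being silently merged.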

Exploratory Analysis

We charted each variable on its own and against the others; price showed the strongest association with sales.
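
One way to put a number on how strongly two variables move together is the Pearson correlation coefficient. The sketch below is an illustration, not the analysis behind the charts; `pearson_r` is a hypothetical helper and the inputs in the usage note are made up.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: +1 for a perfect rising line, -1 for a falling one."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of the two variables around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Calling `pearson_r(prices, sales)` on two equal-length lists returns a value between -1 and 1; values near 0 mean there is little linear relationship.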

Key Drivers of Sales

Item price and store type were the strongest factors pushing sales up or down.

Modeling & Results

A log-transformed linear regression met all statistical assumptions and predicted sales reliably.
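
To show what a log-transformed linear regression involves, here is a single-feature sketch in plain Python. It assumes the log transform is applied to the sales target (the write-up does not say which variable was transformed) and uses one predictor for brevity; `fit_log_linear` and `predict_sales` are hypothetical names.

```python
import math

def fit_log_linear(xs, ys):
    """Fit log1p(y) = intercept + slope * x by ordinary least squares."""
    log_ys = [math.log1p(y) for y in ys]  # log1p tolerates zero sales
    n = len(xs)
    mean_x = sum(xs) / n
    mean_ly = sum(log_ys) / n
    # Closed-form OLS slope for one predictor: covariance over variance.
    slope = (
        sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(xs, log_ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_ly - slope * mean_x
    return intercept, slope

def predict_sales(intercept, slope, x):
    """Predict on the log scale, then undo the transform with expm1."""
    return math.expm1(intercept + slope * x)
```

Fitting on the log scale compresses large sales values, which tends to help with skewed targets; predictions must be transformed back before RMSE is reported in sales units.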

Key Takeaways

To raise sales, place high-priced items in prominent spots and expand large supermarket-type stores.

Attribution

This project was completed as part of the MIT Applied Data Science Program (MIT IDSS / Great Learning). The program provided the case-study scaffolding; the analysis, code, and results are my own. Published with permission, for portfolio use only.