Yelp Restaurant Recommendation System

Overview

Help Yelp users discover restaurants they will love by learning from the reviews and ratings other diners have already left.

Yelp (founded 2004) publishes crowd-sourced reviews of local businesses; the goal is to surface restaurants a user is likely to enjoy.
Frame the task two ways: predict a user's rating for unseen restaurants, and find restaurants similar to one a user already likes.
Address the cold-start problem (new users with no history) with a rank-based popularity fallback.
Optimize both precision and recall so recommendations are relevant and few good restaurants are missed.
Compare multiple algorithm families to find the strongest recommender for this data.

Methodology

flowchart LR
  A["User-Item Ratings"] --> B[EDA & Filtering]
  B --> C["Approaches: Popularity / Collaborative Filtering / SVD"]
  C --> D["Evaluate: RMSE / Precision@K"]
  D --> E[Top-N Recommendations]

The Data (Yelp Reviews)

The dataset is hundreds of thousands of Yelp restaurant ratings, where each row is one diner rating one restaurant.

229,907 rating observations, each a single user-restaurant interaction with no duplicate visits.
45,981 unique users and 11,537 unique restaurants, giving an extremely sparse user-item matrix.
Only ~229K of a possible ~53 x 10^7 ratings exist, so the vast majority of pairs are unrated.
Part 1 uses a 3-column ratings table; Part 2 adds review text (5 columns) for content-based modeling.
Data filtered to active users and frequently-rated restaurants to keep modeling computationally efficient.

Exploratory Analysis

Looking at the ratings shows most diners only post when they love a place, and a handful of restaurants dominate the reviews.

Ratings are heavily skewed toward 4 and 5 stars; very few users give 1-3, so people mostly rate places they like.
Most-reviewed restaurant (businessid hW0NeHTHEAgGF1rAdmR-g) was rated 844 times, yet 45,137 users still haven't seen it.
Most-active user (fczQCSmaWF78toLEmb0Zsw) rated 588 restaurants, leaving 10,949 unvisited to recommend.
High-interaction restaurants skew toward 3-4 stars, showing popularity does not guarantee high ratings.
Rank-based model recommends top restaurants by average rating with a minimum-interaction threshold for cold start.

Collaborative Filtering

These models predict how a user would rate a restaurant by learning patterns from similar users and similar restaurants.

Built with the scikit-surprise library: KNNBasic user-user and item-item similarity models plus SVD matrix factorization.
Baseline user-user KNN reached ~42% recall and ~80% precision; evaluated with precision@10/recall@10 at a 3.5 threshold.
GridSearchCV tuned k, similarity metric, and SVD's n_epochs, lr_all, and reg_all to lift performance and lower RMSE.
Tuned item-item and SVD models clearly beat their baselines; SVD gave the best result at ~51% F1 score.
Corrected ratings (subtracting 1/sqrt(n)) re-rank top picks so confidence in popular restaurants is rewarded.

Content-Based (TF-IDF / NLP)

This approach reads the actual review text to recommend restaurants that are described in similar ways.

Uses review text as the feature; recommends businesses whose reviews read most similarly to a chosen restaurant.
Text cleaned with nltk: lowercasing, removing stopwords, punctuation, and non-ASCII characters to reduce noise.
TF-IDF vectorization extracted 9,729 features from the review corpus.
Cosine similarity over the TF-IDF vectors ranks the most similar restaurants for a given business.
For top-rated Asian Cafe Express, most recommendations were 4-5 star restaurants, confirming the system works.

Key Takeaways

Combining what diners rate with what they write about produces stronger, more flexible restaurant recommendations.

Built five recommenders: rank-based, user-user KNN, item-item KNN, SVD matrix factorization, and co-clustering.
SVD matrix factorization was the best collaborative model (~51% F1); tuning consistently improved item-item and SVD.
Content-based TF-IDF handles cold-start items and adds recommendations that ratings-only models cannot.
Collaborative and content-based approaches are complementary and can be blended into a hybrid recommender.
Built with: pandas, numpy, matplotlib, seaborn, scikit-learn, scikit-surprise, nltk.

Tech Stack

pandas — data wrangling and tabular manipulation
numpy — fast numerical arrays
scikit-learn — modeling, pipelines, and evaluation
seaborn — statistical visualization
matplotlib — plotting
scikit-surprise — collaborative-filtering recommenders
nltk — text tokenization & stopwords

Attribution

This project was completed as part of the MIT Applied Data Science Program (MIT IDSS / Great Learning). The program provided the case-study scaffolding; the analysis, code, and results are my own. Published with permission, for portfolio use only.