Overview
We wanted to predict whether a graduate-school applicant is likely to be accepted based on their academic profile.
- Applicants and admissions teams need an early signal of how strong a candidate's chances really are.
- Each student's chance of admission was reframed into a yes/no call using an 80% threshold (above 80% = Admit).
- Goal: train a feed-forward neural network to classify applicants as admitted or not from their scores and ratings.
- Success means accurately separating likely admits from unlikely ones on applicants the model has never seen.
Methodology
flowchart LR A[Raw Data] --> B[Clean & Encode] B --> C[EDA] C --> D[Train/Test Split] D --> E["CNN"] E --> F["Tune (Cross-Validation)"] F --> G["Evaluate: R2 / RMSE"]
The Data
We used a clean dataset of 500 past applicants, each described by their test scores and academic ratings.
- 500 applicant records with 8 columns, all numeric, and no missing values to clean up.
- Features include GRE, TOEFL, University Rating, SOP, LOR, CGPA, and research experience.
- Average GRE was ~316 of 340 and average TOEFL ~107 of 120, with some students scoring full marks.
- Serial number was dropped, and the continuous Chance of Admit was converted into the binary Admit target.
- Data was split 80:20 into training and test sets, with numeric features scaled and ratings one-hot encoded.
Exploratory Analysis
We charted the data to see how test scores and ratings relate to each other and to admission.
- GRE and TOEFL scores move together: students who score high on one tend to score high on the other.
- As GRE and TOEFL rise, the strength of the statement of purpose (SOP) also tends to increase.
- Scatter plots show a visible separation between students who were admitted and those who were not.
- Higher university ratings line up with higher CGPA and stronger overall admission chances.
- Boxplots confirm admitted students carry higher CGPA than those who were not admitted.


Key Drivers of Admission
A few academic measures stood out as the strongest signals of who gets in.
- CGPA is the clearest differentiator: admitted students consistently show higher grades.
- Strong, correlated GRE and TOEFL scores mark the most competitive applicants.
- University rating, SOP strength, and CGPA all rise together, reinforcing one another.
- Correlation heatmap confirms CGPA, GRE, and TOEFL are the most tightly linked to admission.
- Research experience and recommendation strength add supporting, secondary signal.

Modeling & Results
We built and tuned a neural network until it reliably predicted admission about 95% of the time.
- Started with a feed-forward network of 2 hidden layers plus an output layer, training 9,857 parameters.
- Compiled with binary cross-entropy loss and tuned optimizers, layer count, neurons, and learning rate.
- Accuracy-vs-epoch curves showed training accuracy rising and stabilizing well after 150 epochs.
- The final tuned model reached a generalized 95% accuracy on both training and validation data.
- On unseen test data it held 95% accuracy, with most precision and recall metrics above 90%.


Key Takeaways
The model gives a dependable, automated read on which applicants are likely to be admitted.
- A tuned neural network classifies admission likelihood with 95% accuracy on completely unseen applicants.
- CGPA, GRE, and TOEFL scores are the dominant drivers behind a strong admission profile.
- Reframing a continuous chance into a clear yes/no makes the output easy for stakeholders to act on.
- Careful scaling, encoding, and hyper-parameter tuning were key to reaching generalized performance.
- Built with: Python, pandas, NumPy, Matplotlib, Seaborn, scikit-learn, TensorFlow, Keras
More Visualizations

Tech Stack
- pandas — data wrangling and tabular manipulation
- numpy — fast numerical arrays
- scikit-learn — modeling, pipelines, and evaluation
- seaborn — statistical visualization
- matplotlib — plotting
- tensorflow — deep-learning framework
- keras — high-level neural-network API
Attribution
This project was completed as part of the MIT Applied Data Science Program (MIT IDSS / Great Learning). The program provided the case-study scaffolding; the analysis, code, and results are my own. Published with permission, for portfolio use only.