Celestial Object Detection

Classifying stars, galaxies, and quasars from Sloan Digital Sky Survey photometry

Overview

We taught a computer to look at telescope measurements and tell whether each object in the sky is a star, a galaxy, or a quasar.

Methodology

flowchart LR
  A[Raw Data] --> B[Clean & Encode]
  B --> C[EDA]
  C --> D[Train/Test Split]
  D --> E["Random Forest / Decision Tree / KNN / PCA"]
  E --> F["Tune (Cross-Validation)"]
  F --> G["Evaluate: Recall / F1 / ROC"]
  G --> H[Interpret]
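
The evaluation step in the flow above names recall and F1. As a rough sketch of what those numbers mean per class, the snippet below computes them from true and predicted labels using only the Python standard library. The function name `per_class_scores` and the eight example labels are hypothetical, invented purely for illustration; they are not the project's code or results.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = Counter()  # predicted as the label and actually the label
    fp = Counter()  # predicted as the label, actually something else
    fn = Counter()  # actually the label, predicted as something else
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1
            fn[truth] += 1

    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical labels, made up to exercise the function (not project data).
y_true = ["STAR", "STAR", "GALAXY", "GALAXY", "GALAXY", "QSO", "QSO", "QSO"]
y_pred = ["STAR", "STAR", "GALAXY", "GALAXY", "QSO", "QSO", "QSO", "GALAXY"]

scores = per_class_scores(y_true, y_pred)
for label, (precision, recall, f1) in scores.items():
    print(f"{label:7s} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Recall answers "of the real quasars, how many did the model find?", while F1 balances that against how many of its quasar calls were right. Reporting both per class keeps a rare class from being hidden behind overall accuracy.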

The Data

We used roughly 250,000 real sky observations from the Sloan Digital Sky Survey (SDSS), each described by 17 measured features.

Exploratory Analysis

We charted each measurement to see how the three object types differ and which features actually carry useful signal.
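
One simple way to see how the object types differ on a single measurement, before drawing any chart, is to summarize it per class. The sketch below does that with the standard library. The helper `summarize_by_class`, the field names, and every value in `rows` are assumptions made up for illustration; they are not drawn from the survey.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical rows; the values are invented and are NOT survey data.
rows = [
    {"class": "STAR", "redshift": 0.0001},
    {"class": "STAR", "redshift": -0.0002},
    {"class": "STAR", "redshift": 0.0003},
    {"class": "GALAXY", "redshift": 0.05},
    {"class": "GALAXY", "redshift": 0.12},
    {"class": "GALAXY", "redshift": 0.4},
    {"class": "QSO", "redshift": 1.2},
    {"class": "QSO", "redshift": 2.1},
    {"class": "QSO", "redshift": 0.9},
]


def summarize_by_class(rows, feature, label_key="class"):
    """Group one feature by class label and return basic summary statistics."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[label_key]].append(row[feature])
    return {
        label: {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
            "min": min(values),
            "max": max(values),
        }
        for label, values in groups.items()
    }


summary = summarize_by_class(rows, "redshift")
for label, stats in summary.items():
    print(label, stats)
```

If the per-class ranges barely overlap, the feature is likely to carry useful signal; if they sit on top of each other, it probably does not help much on its own.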

Key Drivers / Features

One measurement, redshift, turned out to matter far more than any other for telling the objects apart.
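
To make "matters more" concrete, here is a minimal sketch of how a decision tree scores a feature: it looks for the single threshold that most reduces Gini impurity. This is only one common way to measure a feature's usefulness; the source does not say which importance measure the project used. The function names, the second feature `r_magnitude`, and all values are hypothetical and invented for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 when every label is the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest impurity decrease achievable with one threshold on this feature."""
    parent = gini(labels)
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_gain, best_threshold = 0.0, None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # equal values cannot be separated by a threshold
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent - weighted
        if gain > best_gain:
            best_gain = gain
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_gain, best_threshold


# Hypothetical toy values, invented for illustration (not survey data).
labels = ["STAR"] * 3 + ["GALAXY"] * 3 + ["QSO"] * 3
redshift = [0.0001, -0.0002, 0.0003, 0.05, 0.12, 0.4, 1.2, 2.1, 0.9]
r_magnitude = [18.2, 20.1, 19.0, 18.5, 19.8, 20.4, 18.9, 19.5, 20.0]

redshift_gain, redshift_threshold = best_split_gain(redshift, labels)
magnitude_gain, magnitude_threshold = best_split_gain(r_magnitude, labels)
print(f"redshift    gain={redshift_gain:.3f} threshold={redshift_threshold}")
print(f"r_magnitude gain={magnitude_gain:.3f} threshold={magnitude_threshold}")
```

In this toy setup one cut on redshift cleanly peels off a whole class, while no cut on the second feature does, so redshift gets the larger gain. A full tree repeats this search at every node and adds up the gains per feature.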

Modeling & Results

A simple decision tree read the data almost perfectly, beating the alternative method by a wide margin.

Key Takeaways

A fast, easy-to-explain model can sort sky objects with near-perfect accuracy, making it a practical second opinion for astronomers.

Attribution

This project was completed as part of the MIT Applied Data Science Program (MIT IDSS / Great Learning). The program provided the case-study scaffolding; the analysis, code, and results are my own. Published with permission, for portfolio use only.