Evaluation of Final Project (Unit 11) vs. Initial Proposal (Unit 6)

Unit 6 – Group Project

During Unit 6, we formed a team to work collaboratively on a real-world dataset of Airbnb listings in New York City. The aim was to explore price determinants and patterns in listing behavior through Exploratory Data Analysis (EDA), linear regression, and K-Means clustering.

The group jointly conducted:

Preprocessing and cleaning of the dataset (removal of zero-price entries and outliers).
EDA with violin plots, histograms, correlation matrices, and boxplots.
Regression analysis to test whether variables such as reviews, location, and host activity affected price.
K-Means clustering to identify distinct market segments (Luxury, Mid-range, Budget).
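
The cleaning step listed above (removal of zero-price entries and outliers) can be sketched as follows. This is an illustrative sketch, not the group's actual notebook code: the report does not state which outlier rule was used, so the 1.5 x IQR fence, the function name clean_prices, and the sample prices are all assumptions.

```python
from statistics import quantiles


def clean_prices(prices, k=1.5):
    """Drop non-positive prices, then drop values outside the IQR fences.

    The k=1.5 multiplier is the conventional Tukey fence and is an
    assumption here, not a value taken from the project.
    """
    positive = [p for p in prices if p > 0]
    # Quartiles of the remaining prices (inclusive method treats the
    # data as the full population rather than a sample).
    q1, _, q3 = quantiles(positive, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [p for p in positive if low <= p <= high]


# Hypothetical nightly prices: one zero-price entry and one extreme value.
sample = [0, 50, 60, 75, 80, 90, 100, 120, 150, 5000]
cleaned = clean_prices(sample)
```

In this toy input the zero-price entry is removed by the first filter and the 5000 value falls above the upper fence, so only the eight ordinary prices remain.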

I contributed specifically to the K-Means clustering section, whose results later became a key insight in the group's final report. The visual outputs and cluster profiling I produced were integrated into our submission.
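
To illustrate the clustering and profiling idea, here is a minimal pure-Python sketch of K-Means on standardized features. It is not the code from the group notebook: the two features (price, number of reviews), the nine sample listings, the deterministic farthest-point initialization, and all function names are assumptions chosen to keep the example small and reproducible.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Deterministic farthest-point seeding (an assumption, not k-means++)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, iters=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels, centroids


def profile(rows, labels):
    """Name the three clusters by ascending mean price (assumes k == 3)."""
    groups = {}
    for row, lab in zip(rows, labels):
        groups.setdefault(lab, []).append(row[0])
    ranked = sorted(groups, key=lambda lab: mean(groups[lab]))
    names = ["Budget", "Mid-range", "Luxury"]
    return {names[i]: mean(groups[lab]) for i, lab in enumerate(ranked)}


# Hypothetical listings as (price, number_of_reviews).
listings = [
    (40, 120), (45, 100), (50, 110),
    (120, 40), (130, 50), (125, 45),
    (400, 5), (420, 8), (410, 3),
]
labels, _ = kmeans(standardize(listings), k=3)
segments = profile(listings, labels)
```

Standardizing first matters because K-Means relies on Euclidean distance, so an unscaled price column would otherwise dominate the review counts. The segment names are attached only after clustering, by ranking the clusters on mean price.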

Unit 11 – Individual Project

For Unit 11, I worked independently on a different problem: object recognition using Convolutional Neural Networks (CNNs) on the CIFAR-10 dataset. The focus shifted from regression and clustering on tabular business data to deep learning for image classification.

Key individual tasks included:

Designing two CNN architectures (baseline and improved).
Implementing regularization techniques such as dropout, batch normalization, data augmentation, and L2 weight regularization.
Tracking learning curves, validating performance, and analyzing results through confusion matrices and F1-scores.
Delivering a full pipeline including preprocessing, model training, tuning, and result interpretation.
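
The evaluation step mentioned above (confusion matrices and F1-scores) can be sketched in plain Python as follows. This is a hedged illustration, not the project's evaluation code: the three-class toy labels stand in for the ten CIFAR-10 classes, and the function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_f1(matrix):
    """F1 per class, computed as 2*TP / (2*TP + FP + FN)."""
    n = len(matrix)
    scores = []
    for c in range(n):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(n)) - tp  # predicted c, was not c
        fn = sum(matrix[c]) - tp                       # was c, predicted other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


# Toy labels for three hypothetical classes (CIFAR-10 itself has ten).
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
f1 = per_class_f1(cm)
macro_f1 = sum(f1) / len(f1)  # unweighted mean over classes
```

Reading the matrix row by row shows which true classes are confused with which predictions, and the per-class F1 makes weak classes visible even when overall accuracy looks acceptable.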

Comparison and Evaluation

Aspect	Unit 6 (Group)	Unit 11 (Individual)
Topic	Airbnb pricing and market segmentation	Image classification with CNNs
Dataset	Tabular data (Airbnb NYC)	Image data (CIFAR-10)
Model Type	Linear Regression, K-Means Clustering	Convolutional Neural Networks (CNNs)
Collaboration	Team-based (shared EDA, distributed model work)	Independent work
My Role	Clustering analysis and visuals using K-Means	Full design, training, and evaluation of CNN pipeline
Key Learning	Clustering logic, feature scaling, teamwork in a shared Jupyter notebook	Deep learning concepts, architecture tuning, regularization
Output Format	Word report + shared Jupyter notebook	Presentation (PPTX) + annotated Jupyter notebook

Reflection

This comparison highlights my growth in both collaborative and independent machine learning contexts. While Unit 6 developed my ability to work across roles, share tasks, and merge findings into one cohesive story, Unit 11 pushed me to take end-to-end ownership of a complex model pipeline.

I learned to adapt my thinking from business analysis to technical computer vision challenges. Although the two projects were very different, both reinforced the importance of data understanding, iterative testing, and clear communication, whether with teammates or within my own workflow.