Data Science Projects

In May 2022, I worked on several data science projects as part of my lectures in data science and machine learning. Here is a collection of the algorithms and approaches I experimented with:
Naive Bayes Classification
I used the MNIST dataset of handwritten digits to experiment with Naive Bayes classification and to compare it with other supervised classification algorithms on an image recognition task. I looked at how accurately each method identified the handwritten digits and what trade-offs existed between the approaches.
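
A comparison like this can be sketched in a few lines of scikit-learn. This is only an illustration, not the notebook code: it assumes the small 8x8 digits set bundled with scikit-learn as a stand-in for MNIST (so nothing has to be downloaded), and the choice of k-nearest neighbours as the second classifier is my assumption for the example.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the bundled 8x8 digits data stands in for MNIST (28x28 images).
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Two supervised classifiers to compare; the second one is an illustrative pick.
models = {
    "naive_bayes": GaussianNB(),
    "knn": KNeighborsClassifier(n_neighbors=3),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in scores.items():
    print(f"{name}: test accuracy {acc:.3f}")
```

Naive Bayes treats every pixel as independent given the digit class, which keeps training fast but usually costs some accuracy on images compared with methods that use relationships between pixels.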

Random Forest Regression
This project predicted New York taxi fares with Random Forest Regression, using trip data such as distance, time of day, and passenger count. I tested different combinations of model parameters to find the settings that predicted fares most accurately.
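
One common way to test parameter combinations is a cross-validated grid search. The sketch below shows the idea under loud assumptions: the trip data is synthetic, the fare rule is made up for the example, and the parameter grid is arbitrary. None of it is taken from the actual project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Hypothetical synthetic trips; the real project used New York taxi data.
rng = np.random.default_rng(0)
n_trips = 400
distance_km = rng.uniform(0.5, 20.0, n_trips)
hour = rng.integers(0, 24, n_trips)
passengers = rng.integers(1, 7, n_trips)

# Made-up fare rule: base fee + per-km rate + rush-hour surcharge + noise.
rush_hour = (hour >= 16) & (hour < 20)
fare = 2.5 + 1.8 * distance_km + 1.0 * rush_hour + rng.normal(0.0, 1.0, n_trips)

X = np.column_stack([distance_km, hour, passengers])
X_train, X_test, y_train, y_test = train_test_split(
    X, fare, test_size=0.25, random_state=0
)

# Illustrative grid; every combination is scored with 3-fold cross-validation.
param_grid = {"n_estimators": [25, 50], "max_depth": [4, None]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_mean_absolute_error",
)
search.fit(X_train, y_train)

best_params = search.best_params_
test_mae = mean_absolute_error(y_test, search.predict(X_test))
print("best parameters:", best_params)
print(f"mean absolute error on held-out trips: {test_mae:.2f}")
```

Keeping a held-out test set separate from the cross-validation folds gives an error estimate that the parameter search has not already been tuned against.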

Movie Recommender System
For this project, I built a movie recommender system using the MovieLens database. I tried different approaches to collaborative filtering, mainly focusing on how to recommend movies from the ratings of similar users. The system measures how similar movies are to each other based on how people rated them and suggests new movies you might like.
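
The movie-to-movie similarity idea can be sketched with plain numpy. This is a minimal item-based example using cosine similarity, which is one of several possible similarity measures. The rating matrix is a hypothetical toy example rather than MovieLens data, and `item_similarity` and `recommend` are names invented for this sketch.

```python
import numpy as np

# Hypothetical toy ratings: rows are users, columns are movies, 0 means unrated.
ratings = np.array([
    [5, 4, 0, 2],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
    [5, 0, 0, 0],
], dtype=float)


def item_similarity(r):
    """Cosine similarity between movie columns of the rating matrix."""
    norms = np.linalg.norm(r, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for unrated movies
    normalized = r / norms
    return normalized.T @ normalized


def recommend(r, user, top_n=2):
    """Rank unrated movies by similarity-weighted sum of the user's ratings."""
    sim = item_similarity(r)
    rated = r[user] > 0
    scores = sim[:, rated] @ r[user, rated]
    scores[rated] = -np.inf  # never recommend something already rated
    order = np.argsort(scores)[::-1]
    return [int(i) for i in order[:top_n] if np.isfinite(scores[i])]


# User 4 has only rated movie 0, so the ranking follows similarity to movie 0.
print(recommend(ratings, user=4))
```

Treating unrated entries as zeros is a simplification. On real rating data it is common to mean-center the ratings or to compute similarity only over users who rated both movies.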

Image Classification with CNNs
I experimented with building and training the AlexNet CNN architecture for image classification using the CIFAR-10 dataset. The project involved setting up the layers of the neural network and tuning the learning parameters to improve accuracy. I varied these settings to see how they affected the model's ability to correctly identify the different image categories.
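
One practical detail when setting up the layers is how the spatial size of the feature maps shrinks from layer to layer. The sketch below traces those sizes through the convolution and pooling stages of the standard AlexNet layout, using the usual output-size formula. The helper names are invented for this example, and this write-up does not record how the 32x32 CIFAR-10 images were fed to the network, so the two input sizes are only illustrations.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) for the convolutional part of AlexNet.
ALEXNET_LAYERS = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
]


def trace_sizes(input_size):
    """Return (layer name, output size) pairs, stopping if a map collapses."""
    sizes = []
    size = input_size
    for name, kernel, stride, padding in ALEXNET_LAYERS:
        size = conv_out(size, kernel, stride, padding)
        sizes.append((name, size))
        if size < 1:
            break  # nothing left to convolve over
    return sizes


for input_size in (227, 32):
    print(input_size, trace_sizes(input_size))
```

With a 227x227 input the final pooling layer produces 6x6 maps, while a raw 32x32 input collapses to size 0 at the second pooling layer. This is why AlexNet on CIFAR-10 is typically run either on upscaled images or with smaller kernels and strides.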

All of these projects were part of my data science learning process. The full code and notebooks are available in my DataScience-Lectures GitHub repository if you want to check them out.
Technologies Used
- Python
- pandas
- numpy
- matplotlib
- sklearn
- TensorFlow
- Google Colab