Projects completed as part of Springboard's Data Science Intensive curriculum.
An attempt to model the highly unpredictable English Premier League and predict the results of each match.
Do home teams really have an advantage in football? Is this advantage shrinking in the English Premier League? How predictable are football leagues anyway? Data to the rescue!
Practice cleaning up messy data with pandas: XML, JSON, raw text, and databases.
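
As a rough illustration of the kind of cleanup involved, the sketch below flattens a small JSON string and a small XML string into one list of uniform records. It uses only the standard library so that it runs anywhere; the inputs, field names, and helper functions are all made up for this example and are not taken from the project notebooks. A list of dicts like this is something `pandas.DataFrame(records)` accepts directly.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical raw inputs, invented for illustration only.
raw_json = '[{"name": "Ana", "address": {"city": "Lima"}}, {"name": "Bo", "address": {}}]'
raw_xml = "<people><person><name>Cy</name><city>Oslo</city></person></people>"


def records_from_json(text):
    """Flatten one level of nesting; a missing city becomes None."""
    return [
        {"name": item["name"], "city": item.get("address", {}).get("city")}
        for item in json.loads(text)
    ]


def records_from_xml(text):
    """Read each <person> element into the same flat shape."""
    root = ET.fromstring(text)
    return [
        {"name": person.findtext("name"), "city": person.findtext("city")}
        for person in root.iter("person")
    ]


# Both sources now share one schema and can be combined.
records = records_from_json(raw_json) + records_from_xml(raw_xml)
print(records)
```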
Useful inferential statistics for drawing conclusions and predicting outcomes. Contains three miniprojects:
- Human Body Temperature - hypothesis testing, confidence intervals, and statistical significance
- Examining Racial Discrimination - does race have a significant impact on the rate of callbacks?
- Reducing Hospital Readmissions - statistical analysis to reduce readmissions to hospitals.
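
The hypothesis testing and confidence intervals mentioned in the first miniproject can be sketched with the standard library alone. The sample values below are made up, and the normal (z) approximation is an assumption chosen for brevity; with a sample this small a t-distribution would usually be the better choice. The function names are hypothetical and not taken from the notebooks.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    std_err = stdev(sample) / math.sqrt(len(sample))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    center = mean(sample)
    return center - z * std_err, center + z * std_err


def z_test_p_value(sample, null_mean):
    """Two-sided p-value for H0: population mean equals null_mean."""
    std_err = stdev(sample) / math.sqrt(len(sample))
    z = (mean(sample) - null_mean) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Made-up body temperatures in degrees Fahrenheit, for illustration only.
temps = [98.2, 98.4, 97.9, 98.6, 98.1, 98.3, 98.0, 98.5, 98.2, 98.4]

low, high = mean_confidence_interval(temps)
p_value = z_test_p_value(temps, null_mean=98.6)
print(f"95% CI: ({low:.2f}, {high:.2f}), p-value: {p_value:.4g}")
```

If the null value falls outside the interval and the p-value is below the chosen significance level, the sample is evidence against the null hypothesis at that level.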
An introduction to various machine learning models, their advantages, and their limitations. Contains the following miniprojects:
- Boston House Pricing - predicting housing prices in Boston using linear regression
- Heights and Weights - using logistic regression to classify gender
- Predicting Movie Ratings - using the naive Bayes algorithm to predict movie ratings from their reviews
- Customer Segmentation - applying k-means clustering and associated evaluation metrics to partitioning problems
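
To give a feel for the last item, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm) together with inertia, the within-cluster sum of squared distances. The points are made up, seeding the centroids with the first `k` points is a simplification, and inertia is only one possible evaluation metric; none of this is taken from the miniproject itself, which may well rely on a library implementation instead.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances (lower means tighter clusters)."""
    return sum(
        math.dist(p, centroids[label]) ** 2 for p, label in zip(points, labels)
    )


# Made-up 2-D points forming two obvious groups, for illustration only.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (0.8, 1.1), (8.2, 7.9), (7.9, 8.3)]
labels, centroids = kmeans(points, k=2)
print(labels, inertia(points, labels, centroids))
```

Comparing inertia across several values of `k` is one common way to pick the number of clusters.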