huahuang95/Supervised-Machine-Learning

Supervised-Machine-Learning

Melbourne Housing Price Prediction

Team project for Supervised-Machine-Learning (BA810)

Data Source

Project Objectives

  • Build several supervised machine learning models and determine which one predicts Melbourne housing prices best.
  • Identify the variables that influence prices the most and provide advice to buyers and sellers.

Methods

  • Language: R
  • Linear Regression
  • Ridge Regression
  • Lasso Regression
  • Bagging (Bootstrap aggregating)
  • Random Forest
  • Boosting
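
To illustrate one of the methods listed above, here is a minimal sketch of bagging written by hand with `rpart` regression trees. It is not the project's code: the data are synthetic, and the column names (`Distance`, `Rooms`, `Price`) and the number of trees are assumptions chosen only for illustration.

```r
library(rpart)  # recommended package shipped with standard R installs

set.seed(1)

# Synthetic stand-in data; column names are hypothetical, not the project's dataset.
n <- 300
housing <- data.frame(
  Distance = runif(n, 1, 40),
  Rooms    = sample(1:5, n, replace = TRUE)
)
housing$Price <- 900 - 12 * housing$Distance + 150 * housing$Rooms +
  rnorm(n, sd = 50)

# Hold out 30% of the rows as a test set.
test_idx <- sample(n, size = 0.3 * n)
train <- housing[-test_idx, ]
test  <- housing[test_idx, ]

# Bagging: fit one tree per bootstrap resample of the training set,
# then average the trees' predictions on the test set.
n_trees <- 50
preds <- sapply(seq_len(n_trees), function(b) {
  boot <- train[sample(nrow(train), replace = TRUE), ]
  fit  <- rpart(Price ~ Distance + Rooms, data = boot)
  predict(fit, newdata = test)
})
bagged_pred <- rowMeans(preds)

# Test MSE of the bagged ensemble and of a single tree, for comparison.
single_tree <- rpart(Price ~ Distance + Rooms, data = train)
mse_single  <- mean((test$Price - predict(single_tree, newdata = test))^2)
mse_bagged  <- mean((test$Price - bagged_pred)^2)
```

A random forest follows the same recipe but also restricts each split to a random subset of the predictors; bagging is the special case in which every predictor is a candidate at every split.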

Project Summary

  • The exploratory data analysis showed relationships between location features, such as Regionname, and house prices; houses farther from the CBD tend to have lower prices. After fitting several models, we found that the variable CouncilArea strongly affects house prices.
  • Across all the models we compared, CouncilArea has a stronger effect on Melbourne housing prices than the other features.
  • Ranked by test MSE, the decision tree has the highest error, followed by ridge, lasso, linear regression, and bagging. Random forest and boosting have much lower test MSE, indicating better predictive performance.
  • The two best models, random forest and boosting, excel in different settings. Random forest tends to perform better on data with a lot of statistical noise, while boosting tends to perform well on unbalanced data, an imbalance that our scaling reduced.
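
The model ranking above rests on test MSE, the mean squared difference between held-out prices and predictions. The sketch below shows how such a comparison might be computed in base R. The helper `test_mse`, the synthetic data, and the three placeholder `lm` candidates are assumptions for illustration, not the project's models or results.

```r
# Mean squared error between held-out values and predictions.
test_mse <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  mean((actual - predicted)^2)
}

set.seed(42)

# Synthetic data; variable names are hypothetical.
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 3 + 2 * dat$x1 - dat$x2 + rnorm(n)

# Hold out 60 rows for testing.
test_idx <- sample(n, 60)
train <- dat[-test_idx, ]
test  <- dat[test_idx, ]

# Placeholder candidates; any model with a predict() method could be added.
models <- list(
  intercept_only = lm(y ~ 1, data = train),
  one_predictor  = lm(y ~ x1, data = train),
  two_predictors = lm(y ~ x1 + x2, data = train)
)

# One test MSE per model, sorted from best (lowest) to worst.
mse_by_model <- sapply(models, function(m) {
  test_mse(test$y, predict(m, newdata = test))
})
sort(mse_by_model)
```

Computing every model's error on the same held-out rows keeps the comparison fair, since no model is scored on data it was fitted to.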

By Chiebuka Onwuzurike, Tzu-Hua Huang, Yangyang Zhou, Yichi Zhang