Optimized Housing Price Prediction Based on XGBoost

doi:10.19907/j.0490-6756.2022.037001

Volume 59, Issue 3, 2022, 037001

Author: TAO Ran
Affiliation: University of Auckland
CLC Number: F299

Abstract:

House prices are shaped by many interacting factors, which makes their prediction a classical yet challenging problem in data analysis. House price data also contain many redundant features, so the important ones are hard to identify in practical scenarios. To address this, this paper proposes an innovative approach to data pre-processing and prediction based on the iterative fitting of two models. The raw data are first pre-processed with respect to data meaning, data form and data relevance, and suitable models are then selected for training. Random Forest (RF) and XGBoost (XGB) are two commonly used traditional machine learning methods. The RF model can accurately identify "redundant" features through its Bagging process. The XGB model improves prediction, but its reduced generalisation ability means that it cannot stably reflect feature importance. This paper therefore uses the RF model to process the redundant data and the XGB model to fit the resulting data set, improving the prediction results. Experiments on the Kaggle competition dataset "House Prices - Advanced Regression Techniques" show that the final XGB regression model reaches a regression accuracy (R²) of 87%, whereas a single RF model and a single XGB model reach an R² of 79.2% and 78.7%, respectively. The experiments show that the proposed method can significantly improve house price prediction. To reflect the model's fitting and predictive ability more fully, the house price is also converted into a discrete variable with two categories, "high" and "low", which yields a confusion matrix with a precision of 93% and a recall of 93%.
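The two-stage idea in the abstract (screen features with RF, refit with XGB, then score the result both as a regression and as a high/low classification) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the synthetic data from `make_regression` stands in for the Kaggle dataset, the helper names `two_stage_fit` and `high_low_report` are hypothetical, the 50% `keep_fraction` and the median price threshold are arbitrary choices, and scikit-learn's `GradientBoostingRegressor` is swapped in when the `xgboost` package is not installed.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import confusion_matrix, precision_score, r2_score, recall_score
from sklearn.model_selection import train_test_split

try:  # use XGBoost when it is installed
    from xgboost import XGBRegressor

    def make_booster():
        return XGBRegressor(n_estimators=300, learning_rate=0.05,
                            max_depth=4, random_state=0)
except ImportError:  # assumption: fall back to scikit-learn's gradient boosting
    def make_booster():
        return GradientBoostingRegressor(n_estimators=300, learning_rate=0.05,
                                         max_depth=4, random_state=0)


def two_stage_fit(X_train, y_train, X_test, keep_fraction=0.5):
    """Rank features by RF importance, keep the top share, fit a booster on them."""
    rf = RandomForestRegressor(n_estimators=200, random_state=0)
    rf.fit(X_train, y_train)
    n_keep = max(1, int(round(keep_fraction * X_train.shape[1])))
    # Column indices of the n_keep most important features, in ascending order.
    kept = np.sort(np.argsort(rf.feature_importances_)[::-1][:n_keep])
    booster = make_booster()
    booster.fit(X_train[:, kept], y_train)
    return kept, booster.predict(X_test[:, kept])


def high_low_report(y_true, y_pred, threshold):
    """Binarise prices at `threshold`; return confusion matrix, precision, recall."""
    true_high = (y_true > threshold).astype(int)
    pred_high = (y_pred > threshold).astype(int)
    cm = confusion_matrix(true_high, pred_high, labels=[0, 1])
    precision = precision_score(true_high, pred_high, zero_division=0)
    recall = recall_score(true_high, pred_high, zero_division=0)
    return cm, precision, recall


# Synthetic stand-in data: 40 features, of which only 8 carry signal.
X, y = make_regression(n_samples=600, n_features=40, n_informative=8,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

kept, y_pred = two_stage_fit(X_train, y_train, X_test)
r2 = r2_score(y_test, y_pred)
cm, precision, recall = high_low_report(
    y_test, y_pred, threshold=np.median(y_train))

print(f"kept {len(kept)} of {X.shape[1]} features, R2 = {r2:.3f}")
print(cm)
print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```

The scores printed by this sketch come from synthetic data and are not comparable to the figures reported in the abstract. The threshold is taken from the training prices only, so the high/low labels are defined without looking at the test set.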


Cite this article as: TAO Ran. Optimized Housing Price Prediction Based on XGBoost [J]. J Sichuan Univ: Nat Sci Ed, 2022, 59: 037001.


History

Received: November 4, 2021
Revised: January 5, 2022
Adopted: January 12, 2022
Online: June 1, 2022