Dataset for: The Spatially-Conscious Machine Learning Model
datasetposted on 06.11.2019 by Timothy J. Kiely, Nathaniel D. Bastian
Datasets usually provide raw data for analysis. This raw data often comes in spreadsheet form, but can be any collection of data, on which analysis can be performed.
Successfully predicting gentrification could have many social and commercial applications; however, real estate sales are difficult to predict because they belong to a chaotic system comprised of intrinsic and extrinsic characteristics, perceived value, and market speculation. Using New York City real estate as our subject, we combine modern techniques of data science and machine learning with traditional spatial analysis to create robust real estate prediction models for both classification and regression tasks. We compare several cutting edge machine learning algorithms across spatial, semi-spatial and non-spatial feature engineering techniques, and we empirically show that spatially-conscious machine learning models outperform non-spatial models when married with advanced prediction techniques such as random forests, generalized linear models, gradient boosting machines, and artificial neural networks.