Dataset for: The Spatially-Conscious Machine Learning Model

2019-11-06T16:25:18Z (GMT) by Timothy J. Kiely Nathaniel D. Bastian
Successfully predicting gentrification could have many social and commercial applications; however, real estate sales are difficult to predict because they belong to a chaotic system comprised of intrinsic and extrinsic characteristics, perceived value, and market speculation. Using New York City real estate as our subject, we combine modern techniques of data science and machine learning with traditional spatial analysis to create robust real estate prediction models for both classification and regression tasks. We compare several cutting edge machine learning algorithms across spatial, semi-spatial and non-spatial feature engineering techniques, and we empirically show that spatially-conscious machine learning models outperform non-spatial models when married with advanced prediction techniques such as random forests, generalized linear models, gradient boosting machines, and artificial neural networks.