Dataset for: A Data-driven Approach to Detecting Change Points in Linear Regression Models
Dataset posted on 01.08.2019 by Vyacheslav Lyubchich, Tatiana V. Lebedeva, Jeremy M. Testa
Change points appear in various types of environmental data---from univariate time series to multivariate data structures---and need to be accounted for in proper analysis and inference. The analysis of change points is challenging when no exact information about their number and locations is available, and statistical tests developed under such conditions often have low power in identifying the change points. This paper provides a powerful, data-driven procedure for detecting at-most-$m$ change points in linear regression models by adapting a sieve bootstrap approach for a modified cumulative sum statistic. The new procedure assumes neither a particular dependence structure nor a particular distribution of the regression residuals. It employs a data-driven selection of the order of an autoregressive model and a robust estimation of the model coefficients. Our simulation studies show competitive performance of the new bootstrap-based procedure compared with its asymptotic counterpart. We apply the new testing procedure to address an important environmental problem in Chesapeake Bay---severe oxygen depletion---and detect two change points in the relationship between the volume of low-oxygen waters and nutrient inputs to the bay during 1985--2017.
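
The general idea described above (a cumulative sum statistic on regression residuals, calibrated by a sieve bootstrap with a data-driven autoregressive order) can be illustrated with a minimal Python sketch. This is not the authors' procedure: it assumes a plain maximum-absolute CUSUM statistic rather than the modified at-most-$m$ statistic, ordinary least squares rather than robust estimation of the autoregressive coefficients, and BIC for order selection. All function names and parameters (`cusum_stat`, `fit_ar`, `select_ar_order`, `sieve_bootstrap_test`, `max_p`, `n_boot`, `burn`) are hypothetical and introduced only for illustration.

```python
import numpy as np


def cusum_stat(resid):
    """Max absolute cumulative sum of centered residuals, scaled by sd * sqrt(n)."""
    e = resid - resid.mean()
    return np.max(np.abs(np.cumsum(e))) / (e.std(ddof=1) * np.sqrt(len(e)))


def fit_ar(e, p):
    """OLS fit of an AR(p) model (a simplification: the abstract mentions
    robust estimation). Returns (coefficients, innovations)."""
    n = len(e)
    if p == 0:
        return np.zeros(0), e.copy()
    # Column k holds the series lagged by k + 1 steps.
    lags = np.column_stack([e[p - k - 1 : n - k - 1] for k in range(p)])
    target = e[p:]
    coef, *_ = np.linalg.lstsq(lags, target, rcond=None)
    return coef, target - lags @ coef


def select_ar_order(e, max_p):
    """Pick the AR order by BIC (an assumed criterion), comparing all
    orders on the same last n - max_p observations."""
    best_p, best_bic = 0, np.inf
    for p in range(max_p + 1):
        _, innov = fit_ar(e, p)
        innov = innov[max_p - p :]
        m = len(innov)
        bic = m * np.log(np.mean(innov**2)) + p * np.log(m)
        if bic < best_bic:
            best_p, best_bic = p, bic
    return best_p


def sieve_bootstrap_test(y, X, n_boot=500, max_p=5, burn=50, seed=0):
    """Bootstrap p-value for H0: no change point in the regression of y on X.
    Returns (observed statistic, p-value, selected AR order)."""
    rng = np.random.default_rng(seed)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    observed = cusum_stat(resid)

    # Approximate the residual dependence by an AR(p) "sieve".
    e = resid - resid.mean()
    p = select_ar_order(e, max_p)
    coef, innov = fit_ar(e, p)
    innov = innov - innov.mean()

    n = len(y)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        # Resample innovations and rebuild a dependent error series.
        eps = rng.choice(innov, size=n + burn, replace=True)
        sim = np.zeros(n + burn)
        for t in range(n + burn):
            past = sim[max(t - p, 0) : t][::-1]  # most recent value first
            sim[t] = eps[t] + coef[: len(past)] @ past
        # Regenerate the response under H0 and recompute the statistic.
        y_star = X @ beta + sim[burn:]
        b_star, *_ = np.linalg.lstsq(X, y_star, rcond=None)
        boot[b] = cusum_stat(y_star - X @ b_star)
    return observed, float(np.mean(boot >= observed)), p
```

Here `X` is expected to be a design matrix that already includes an intercept column, and a small returned p-value would suggest at least one change point. Because the bootstrap errors inherit the autocorrelation estimated from the residuals, this simplified version can lose power when a large shift inflates that autocorrelation.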