Dataset for: Non-stationary spatiotemporal Bayesian data fusion for pollutants in the near-road environment

posted on 01.08.2019 by Owais Gilani, Veronica J. Berrocal, Stuart A. Batterman
Concentrations of near-road air pollutants (NRAPs) have increased to very high levels in many urban centers around the world, particularly in developing countries. The adverse health effects of exposure to NRAPs are greater when the exposure occurs in the near-road environment than when it occurs at background concentration levels. Therefore, there is increasing interest in monitoring pollutant concentrations in the near-road environment. However, due to various practical limitations, monitoring pollutant concentrations near roadways and traffic sources is generally difficult and expensive. As an alternative, various deterministic computer models that predict pollutant concentrations in the near-road environment, such as the Research Line-source dispersion model (RLINE), have been developed. A common feature of these models is that their outputs typically display systematic biases and need to be calibrated in space and time using observed pollutant data. In this paper, we present a non-stationary Bayesian data fusion model that combines a novel dataset of monitored pollutant concentrations (NOx and PM2.5) in the near-road environment with the RLINE model output to provide predictions at unsampled locations. The model can also be used to evaluate whether including the RLINE model output improves pollutant concentration predictions, and whether the RLINE model output captures the spatial dependence structure of NRAP concentrations in the near-road environment. A defining characteristic of our model is that it captures the non-stationarity in pollutant concentrations using a recently developed approach that incorporates covariates postulated to be the driving force behind the non-stationary behavior.