Posted on 2017-05-22, 02:17. Authored by Kerrie Mengersen, Erin Peterson, Samuel Clifford, Nan Ye, June Kim, Tomasz Bednarz, Ross Brown, Allan James, Julie Vercelloni, Alan R Pearse, Jacqueline Davis, Vanessa Hunter.
There is growing awareness about the potential benefit of harnessing
citizen science for research, particularly in the biological and
environmental sciences. Data quality is a major constraint on the
use of citizen-science data, in particular the imperfect nature of the
observations. In this paper we fit species distribution models
(SDMs) to presence-only data (presences and counts, with no absences
observed) by exploiting the uncertainty in reported presences, instead
of generating pseudo-absences as is common in previous presence-only
studies. This approach allows us to extend the suite of models to
include those commonly fit to presence/absence and abundance
data. We fit several models to a case study dataset of jaguar encounters
reported by citizens in the Peruvian Amazon. The true species
distribution for the case study data is unknown, and so we also
undertake an extensive simulation study to evaluate model
performance. We analyze the sources of error by studying the bias
and variance of the models, and also discuss the predictive
performance of each model and its ability to
recover the true species distribution. The simulation study shows that
although several approaches are capable of recovering the species
distribution, the choice of a modelling approach is a
complex one, and depends on factors such as inferential aim,
model complexity, sample size and computational resources. This
study also addresses some issues in dealing with compound-imperfect
observations arising from citizen-science data, and we discuss
further steps needed in this research area.