Supplement 1, Revision 1. Matlab code for testing the hypothesis of time homogeneity by parametric bootstrap. Submitted 12 December 2005; Published 6 January 2006.
Supplement 1, original
All files at once
Supplement 1, Revision 1
Matlab code for the bootstrap analysis in Spencer and Susko, Continuous-time Markov models for species interactions. Tested under Matlab release 14. Requires the optimization toolbox.
The main function, ML_Q_est5.m, tests the hypothesis that the data can be explained by a homogeneous continuous-time Markov model against the more general alternative that rates are not homogeneous. It first estimates parameters for a homogeneous continuous-time Markov model, then calculates the likelihood ratio between this model and the maximum likelihood discrete-time model. It then generates parametric bootstrap samples from the estimated homogeneous continuous-time model to generate a distribution of the likelihood ratio under the hypothesis of time homogeneity. For detailed instructions on usage, type help ML_Q_est5 in Matlab. Typical usage:
The inputs are:
P is the observed transition matrix for time step of 1 unit.
obs_csum is the number of observed transitions from each state.
reps is number of parametric bootstrap reps to run.
ntimes is number of time periods sampled in original data.
nsites is number of sites sampled in original data.
The outputs are:
Q is the estimated homogeneous instantaneous rate matrix, with all off-diagonal elements constrained to be positive.
negl is partial negative log likelihood for this Q matrix.
nobs is an array of number of observations of each transition (with some rounding if the P matrix wasn't reported exactly).
neglobs is partial negative log likelihood if we use observed transition frequencies (which are the ML estimates in a discrete-time model, not requiring homogeneity in continuous time).
neglz is partial negative log likelihood if we set any tiny elements of Q (<2*eps, where eps is machine precision) to zero. This is usually almost identical to negl.
twodelta is twice difference in partial log likelihoods between the Q matrix and the empirical P matrix: first row is observed, remaining reps rows are from parametric bootstrap. First col is for the Q matrix as estimated, second is with tiny elements of Q set to zero.
p is proportion of reps with twodelta >= observed (first col Q as estimated, second col with tiny elements of Q set to zero).
timetaken is clock time used.
The other functions (mlq_ufun3.m, get_Q_like.m, Qmatrix_bootstrap.m, stationary_probt.m, sorted_eigs.m, Qfromest.m, plike.m, multinomial.m) are called by ML_Q_est5.m.
example_data.mat contains the data described in the paper (Matlab format, compatible with version 6 and later). The original source for these data is Tanner, J., T. Hughes, and J. Connell. 1994. Species coexistence, keystone species, and succession: a sensitivity analysis. Ecology 75:2204–2219 (Exposed Crest site, their Table 2, with additional information from J. Tanner, personal communication)