Appendix C. Control analyses for robustness against missing trait values.
Allowing missing trait values when calculating a plot mean might affect the selection of sites for analysis and the trait-trait and trait-environment relationships. In this appendix it is shown that (i) the selected sites are not a biased selection from all sites available, (ii) the average percentage of species used to calculate a plot mean is in fact much higher than the lower bounds set (50%, or 20% for RGR and LPC), (iii) the slopes of the regression lines are not significantly affected by introducing missing trait values, and the increase in the uncertainty of the slopes is smaller than the increase in the percentage of missing trait data, and finally, (iv) the structure and significance of the SEM are not affected by the RGR values used. As a result, we conclude that our main conclusions, that nutrient availability and disturbance partly affect the same suite of traits and that trait-trait constraints play an important role, still hold. Steps 1 and 2 were done for both the simplified model (Fig. 3) and the complex model of Appendix F. Step 3 was done for the model in Appendix F.
1. A biased subset of the total available number of sites?
The six available data sources represented 19 different vegetation types. The selected set of sites for which trait information was available was not biased compared to all available sites, because selected sites covered all 19 vegetation types. Additionally, the sites not included in the analysis were randomly spread over these 19 vegetation types: The correlation between the number of sites per vegetation type for the selected sites and the total number of available sites was 0.94 (on 10log transformed data to fulfill homogeneity of variance). Therefore the selection criteria have not led to an unbalanced data set.
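The representativeness check above can be sketched as a Pearson correlation on log10-transformed site counts. This is a minimal illustration, not the authors' script: the function name `pearson_log10` and its two arguments (site counts per vegetation type for the selected and for all available sites) are hypothetical, and counts are assumed to be strictly positive.

```python
import math

def pearson_log10(selected_counts, total_counts):
    """Pearson correlation between log10 site counts per vegetation type.

    Both arguments are hypothetical inputs: one count per vegetation type,
    in the same order, all strictly positive.
    """
    xs = [math.log10(c) for c in selected_counts]
    ys = [math.log10(c) for c in total_counts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

A value close to 1 indicates that the selected sites are spread over the vegetation types in roughly the same proportions as all available sites.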
2. A biased estimate of trait plot means?
Within our data set, plots had been chosen that had sufficient information for at least 50% of the species (or 20% in the case of RGR and LPC). However, might this selection and incomplete information have led to biased results? If missing data are non-randomly distributed, then this can lead to a biased estimate of the plot mean and to biased environment-trait and trait-trait relationships, and thus to a different conclusion about the relative contribution of environmental drivers and trait-trait constraints. We performed a three-step analysis to test this crucial issue. First, we calculated the actual percentage of species with trait information that were used to calculate plot means. Next, we tested whether the slopes of the paths of our SEM are significantly affected when allowing trait plot means to be based on incomplete data. Finally, we incorporated modelled RGR values in a SEM (of Appendix F) and tested whether the structure still holds.
Step 1: Was the percentage of species with trait information to calculate plot means really that low?
Our selection criterion set a minimum to the availability of trait information in order to include the majority of the species per plot. This threshold was 50% of the species, assuming that these species give a good estimate of the true plot trait mean; this minimum was lowered to > 20% for LPC and RGR, as these traits were less well covered in the database but were core traits and essential to the analysis. In fact the average percentage of species used to calculate a plot mean was much higher than this minimum percentage (Fig. C1). In reality, the chances of a potential bias (which would additionally require the species selection to be selective rather than random) are therefore much smaller than might have been concluded based on only the threshold stated in the manuscript. The only exception is RGR, for which on average only 50% of the species had trait information. Given that we had already combined all available trait databases, we tested as a second step whether correlations (and thus paths in our SEM) could have been affected by the incomplete trait information.
FIG. C1. The percentage of species used to calculate the plot mean for different traits (abbreviations of traits: leaf nitrogen content (LNC), leaf phosphorus content (LPC), specific leaf area (SLA), 10log seed mass of the germinule (SM_g), 10log seed mass of the dispergule (SM_d), 10log maximum canopy height (maxCH), growth form (GF), seedling relative growth rate (RGR), 10log germination onset (GO), flowering onset (FO)).
Step 2: The comparison of the slopes of a complete set and the available data set and estimation of the uncertainty in the slopes
Our claim that ‘regenerative and establishment traits are linked’ was based both on the overall fit of the SEM and on the significance of those path coefficients linking the two groups of traits. We deal with the overall fit in the next section. Here we consider how missing data may affect the significance of the relevant path coefficients in our models. Since the path coefficients in a SEM are conceptually similar to the slopes of the equivalent regressions, the SEM model should be robust against missing trait values if the slopes of the relevant single regressions obtained from our data set (as used in this paper) and a smaller subset of the data set for which trait information is available for all species are not significantly different. Additionally, if the slopes of these regression lines are not significantly different for these two data sets, then the relative contribution of environmental drivers to trait selection and the significance of traits to trait selection will by definition remain unchanged.
To test for this, we used two data sets. The subset with which we compared our data set was defined by selecting those sites that had more than 90% of the species trait data available. Setting this criterion at 90% (and 70% for RGR analyses) ensured that at least 10 sites (average 42 sites) were available for the regression analysis. Note that this 90% means that on average only 1.4 species per plot were missing (a plot contained on average 18 species). The full set was defined as all 156 sites used in the SEM.
A significance test between the slopes of the two regression lines was performed as follows: a dummy variable (0 and 1 for the two data sets, respectively) was included in the regression: Y = a × X + b + c × group + d × group × X. If the slope of the subset is significantly different from that of the full set, then the parameter d will be significantly different from zero. Running these regressions for all environment-trait and trait-trait relationships of the SEM model presented in Fig. 3 of the manuscript and the model presented in Appendix F (Fig. F1) showed that none of the regressions of the full set differed significantly from the subset (P > 0.05); in other words, the slopes of the regressions were not significantly affected by allowing missing trait values up to a maximum of 80% for LPC and RGR and 50% for the other traits (see Step 1, Fig. C1). This implies that the SEM would have had the same slopes and the same cause-effect relationships had it been based on the subset (but with much less power, given the fewer degrees of freedom). Our claim that ‘regenerative and establishment traits are linked’ thus holds. In Table C1 we present the P values and estimates of the slopes.
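The dummy-variable regression described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis script: the function names are hypothetical, the two data sets are treated as two stacked groups of rows, and the P value uses a normal approximation to the t distribution (the standard library has no t distribution), so it is only indicative for small samples. The sketch relies on the fact that a model with a free intercept and slope per group gives the same estimates as two separate regressions, so that d is the difference between the two slopes and its standard error follows from the pooled residual variance.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sxx, rss

def slope_difference_test(x_full, y_full, x_sub, y_sub):
    """Estimate Y = a*X + b + c*group + d*group*X (group 0 = full, 1 = subset)."""
    b0, a0, sxx0, rss0 = fit_line(x_full, y_full)
    b1, a1, sxx1, rss1 = fit_line(x_sub, y_sub)
    df = len(x_full) + len(x_sub) - 4           # four fitted parameters
    s2 = (rss0 + rss1) / df                     # pooled residual variance
    d = a1 - a0                                 # interaction term: slope difference
    se_d = math.sqrt(s2 * (1 / sxx0 + 1 / sxx1))
    t = d / se_d
    # Assumption: two-sided P from a normal approximation to the t distribution.
    p_approx = math.erfc(abs(t) / math.sqrt(2))
    return {"a": a0, "b": b0, "c": b1 - b0, "d": d,
            "se_d": se_d, "t": t, "df": df, "p_approx": p_approx}
```

For an exact P value, the statistic `t` would be compared with a t distribution on `df` degrees of freedom in a statistics package.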
TABLE C1. Comparison of slopes between subset and full set for all relationships used in the SEM (including the number of sites used (N), estimates of the parameters and P values). Non-significant parameters are indicated in bold.
Independent | Dependent | N | Model | Estimate | P |
log10 Soil C/N | LNC | 13 | Intercept | 28.619 | 0.0000 |
log10 Soil CN | -4.967 | 0.0000 | |||
group (0 = full set, 1 = subset) | 5.531 | 0.5020 | |||
group × log10 Soil CN | -3.950 | 0.5230 | |||
log10 Soil C/N | SLA | 50 | Intercept | 24.695 | 0.0000 |
log10 Soil CN | -2.549 | 0.0749 | |||
group (0 = full set, 1 = subset) | 1.104 | 0.7698 | |||
group × log10 Soil CN | -1.629 | 0.5815 | |||
log10 Soil C/N | LPC | 13 | Intercept | 2.6463 | 0.0000 |
log10 Soil CN | -0.6484 | 0.0007 | |||
group (0 = full set, 1 = subset) | -0.0324 | 0.9810 | |||
group × log10 Soil CN | -0.0474 | 0.9627 | |||
log10 Soil C/N | SM_g | 20 | Intercept | -0.4638 | 0.0208 |
log10 Soil CN | -0.5799 | 0.0000 | |||
group (0 = full set, 1 = subset) | 0.0017 | 0.9979 | |||
group × log10 Soil CN | -0.1915 | 0.7088 | |||
log10 Soil C/P | LPC | 70 | Intercept | 3.493 | 0.0000 |
log10 Soil CP | -0.751 | 0.0000 | |||
group (0 = full set, 1 = subset) | -1.352 | 0.0559 | |||
group × log10 Soil CP | 0.546 | 0.0803 | |||
log10 Soil C/P | LNC | 13 | Intercept | 32.782 | 0.0000 |
log10 Soil CP | -4.696 | 0.0000 | |||
group (0 = full set, 1 = subset) | -3.873 | 0.3790 | |||
group × log10 Soil CP | 1.706 | 0.3860 | |||
log10 Soil C/P | SLA | 50 | Intercept | 29.025 | 0.0000 |
log10 Soil CP | -3.408 | 0.0000 | |||
group (0 = full set, 1 = subset) | -1.170 | 0.7210 | |||
group × log10 Soil CP | 0.093 | 0.9490 | |||
log10 Soil C/P | GO | 30 | Intercept | 0.263 | 0.0278 |
log10 Soil CP | 0.124 | 0.0203 | |||
group (0 = full set, 1 = subset) | -0.123 | 0.6444 | |||
group × log10 Soil CP | 0.010 | 0.9311 | |||
log10 Soil C/P | FO | 139 | Intercept | 26.505 | 0.0000 |
log10 Soil CP | 0.542 | 0.0203 | |||
group (0 = full set, 1 = subset) | 0.544 | 0.4795 | |||
group × log10 Soil CP | -0.218 | 0.5280 | |||
log10 Soil C/P | SM_g | 20 | Intercept | 0.431 | 0.0316 |
log10 Soil CP | -0.313 | 0.0006 | |||
group (0 = full set, 1 = subset) | -1.216 | 0.0457 | |||
group × log10 Soil CP | 0.449 | 0.0943 | |||
log10 Soil C/P | RGR | 11 | Intercept | 0.1599 | 0.0000 |
log10 Soil CP | -0.0041 | 0.5790 | |||
group (0 = full set, 1 = subset) | 0.0316 | 0.6530 | |||
group × log10 Soil CP | -0.0352 | 0.3020 | |||
TSD | SLA | 50 | Intercept | 20.936 | 0.0000 |
TSD | 0.037 | 0.0015 | |||
group (0 = full set, 1 = subset) | -0.419 | 0.5361 | |||
group × TSD | -0.038 | 0.2293 | |||
TSD | maxCH | 42 | Intercept | -0.172 | 0.0000 |
TSD | 0.012 | 0.0000 | |||
group (0 = full set, 1 = subset) | -0.035 | 0.4850 | |||
group × TSD | 0.002 | 0.3250 | |||
TSD | SM_d | 44 | Intercept | -0.438 | 0.0000 |
TSD | 0.018 | 0.0000 | |||
group (0 = full set, 1 = subset) | -0.045 | 0.6020 | |||
group × TSD | 0.007 | 0.1290 | |||
TSD | FO | 139 | Intercept | 28.003 | 0.0000 |
TSD | -0.019 | 0.0000 | |||
group (0 = full set, 1 = subset) | 0.052 | 0.6940 | |||
group × TSD | -0.001 | 0.8470 | |||
TSD | RGR | 11 | Intercept | 0.1634 | 0.0000 |
TSD | -0.0008 | 0.0000 | |||
group (0 = full set, 1 = subset) | -0.0305 | 0.0680 | |||
group × TSD | 0.0002 | 0.6210 | |||
maxCH | SLA | 27 | Intercept | 21.509 | 0.0000 |
maxCH | 1.302 | 0.0932 | |||
group (0 = full set, 1 = subset) | -2.148 | 0.0054 | |||
group × maxCH | -1.950 | 0.3016 | |||
maxCH | LNC | 13 | Intercept | 22.4026 | 0.0000 |
maxCH | 3.059 | 0.0000 | |||
group (0 = full set, 1 = subset) | -1.139 | 0.2730 | |||
group × maxCH | -0.456 | 0.7680 | |||
maxCH | LPC | 12 | Intercept | 1.837 | 0.0000 |
maxCH | 0.246 | 0.0211 | |||
group (0 = full set, 1 = subset) | -0.339 | 0.2561 | |||
group × maxCH | 0.034 | 0.9276 | |||
maxCH | SM_g | 14 | Intercept | -0.270 | 0.0000 |
maxCH | 0.819 | 0.0000 | |||
group (0 = full set, 1 = subset) | 0.275 | 0.2143 | |||
group × maxCH | 1.277 | 0.1947 | |||
maxCH | FO | 39 | Intercept | 27.721 | 0.0000 |
maxCH | -1.451 | 0.0000 | |||
group (0 = full set, 1 = subset) | 0.035 | 0.8360 | |||
group × maxCH | -0.022 | 0.9616 | |||
maxCH | GO | 16 | Intercept | 0.184 | 0.0000 |
maxCH | -0.153 | 0.0000 | |||
group (0 = full set, 1 = subset) | 0.000 | 0.9782 | |||
group × maxCH | -0.004 | 0.8721 | |||
maxCH | GF | 42 | (Intercept) | 0.1649 | 0.0000 |
maxCH | 0.6159 | 0.0000 | |||
group (0 = full set, 1 = subset) | 0.0083 | 0.5051 | |||
group × maxCH | -0.0328 | 0.3329 | |||
GF | FO | 139 | (Intercept) | 28.1282 | 0.0000 |
GF | -2.4638 | 0.0000 | |||
group (0 = full set, 1 = subset) | 0.0416 | 0.7442 | |||
group × GF | 0.0128 | 0.9763 | |||
GF | GO | 30 | (Intercept) | 0.6737 | 0.0000 |
GF | -0.7824 | 0.0000 | |||
group (0 = full set, 1 = subset) | 0.0042 | 0.9176 | |||
group × GF | -0.0095 | 0.9249 | |||
GF | SM_d | 44 | (Intercept) | -0.5324 | 0.0000 |
GF | 2.2220 | 0.0000 | |||
group (0 = full set, 1 = subset) | -0.0261 | 0.6891 | |||
group × GF | 0.3868 | 0.1680 | |||
GF | RGR | 11 | (Intercept) | 0.1670 | 0.0000 |
GF | -0.0919 | 0.0000 | |||
group (0 = full set, 1 = subset) | 0.0098 | 0.6116 | |||
group × GF | -0.0179 | 0.5623 | |||
SM_g | SM_d | 15 | Intercept | 0.191 | 0.0000 |
SM_g | 1.302 | 0.0000 | |||
group (0 = full set, 1 = subset) | -0.165 | 0.4473 | |||
group × SM_g | -0.242 | 0.5420 | |||
SM_g | FO | 19 | Intercept | 27.362 | 0.0000 |
SM_g | -1.305 | 0.0000 | |||
group (0 = full set, 1 = subset) | 0.831 | 0.0589 | |||
group × SM_g | 0.734 | 0.3793 | |||
LNC | RGR | 11 | (Intercept) | 1.1357 | 0.0000 |
LNC | 0.0007 | 0.4140 | |||
group (0 = full set, 1 = subset) | -0.2021 | 0.0175 | |||
group × LNC | 0.0065 | 0.0566 | |||
SLA | RGR | 11 | (Intercept) | 1.1280 | 0.0000 |
SLA | 0.0011 | 0.1392 | |||
group (0 = full set, 1 = subset) | -0.1301 | 0.0195 | |||
group × SLA | 0.0042 | 0.0954 | |||
RGR | maxCH | 11 | (Intercept) | 7.5344 | 0.0000 |
RGR | -6.5316 | 0.0000 | |||
group (0 = full set, 1 = subset) | 3.5674 | 0.2805 | |||
group × RGR | -2.8471 | 0.3369 |
Although for a SEM it is much more important to test to what extent the slopes of the relationships are significantly affected by missing trait data, we additionally investigated the effect of missing trait data on the uncertainty of the slope estimates. To estimate the effect of missing trait data on the standard error of the slope, we ran a rarefying method which makes the trait data increasingly sparse. However, running the rarefying method and putting the newly calculated trait averages in the SEM 500 or 1000 times would be a huge effort. Therefore, by analogy with the robustness test above, we ran the rarefying method for the bivariate trait-trait relationships which occur in the SEM. The proportion of missing trait data in the dependent variable was increased in steps of 5% up to 35% relative to the currently available data for that trait. Then new trait means were calculated for the plots and a regression was run on all plots to determine the slope and its standard error. Next, the standard error of the slope was expressed relative to the standard error of the slope of the bivariate relationship with the currently available trait data. This allows us to compare the increase in standard error among the bivariate relationships. This procedure was repeated 500 times to get a robust estimate of the standard error. The results are shown in Table C2. In all cases the standard error of the slope increases with an increasing amount of missing trait data. On average the standard error increases by 7% if 10% of the trait data are deleted. RGR is particularly sensitive to missing trait data, but this is probably due to the already relatively low availability of this trait. Germination onset (GO) is also sensitive to omissions of trait data. Since only 22% of the trait data are missing for GO, we think that this sensitivity is due to the ordinal three-point scale of this trait.
Based on these results, we think that the slope estimates are relatively robust against missing trait data, as the relative increase in the standard error of the slope is for most traits much smaller than the relative increase in missing trait data. Although the relative increase in the SE of GO is larger than 10% with an increase of 10% in missing trait data, we expect that this does not really affect the SEM, because GO is only affected by traits and is not a parent of any other trait, and because the number of trait data available for GO is among the highest of the traits (see Table 1 of the manuscript), so the actual bias is relatively small. The increase in SE of the slope for RGR is also larger than 10% with 10% more trait data missing. In the next section the effect of missing trait data on RGR is analyzed in more detail.
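The rarefying procedure described above can be sketched as follows. This is a minimal illustration and not the original implementation: the function names and the data layout (`plots` as one list of species trait values per plot, `x` as the plot-level predictor) are hypothetical, and deleting the same fraction of species within every plot, while always keeping at least one species, is an assumption about how the deletion was distributed.

```python
import math
import random

def slope_se(xs, ys):
    """Standard error of the OLS slope of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    rss = sum((y - my - slope * (x - mx)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(rss / (n - 2) / sxx)

def rarefied_means(plots, fraction, rng):
    """Plot means after randomly deleting `fraction` of the species per plot."""
    means = []
    for values in plots:
        keep = max(1, round(len(values) * (1 - fraction)))  # assumption: keep >= 1
        kept = rng.sample(values, keep)
        means.append(sum(kept) / keep)
    return means

def relative_se(x, plots, fractions, n_rep=500, seed=0):
    """Mean slope SE after rarefying, relative to the SE with all available data."""
    rng = random.Random(seed)
    base = slope_se(x, [sum(v) / len(v) for v in plots])
    result = {}
    for f in fractions:
        ratios = [slope_se(x, rarefied_means(plots, f, rng)) / base
                  for _ in range(n_rep)]
        result[f] = sum(ratios) / n_rep
    return result
```

Calling `relative_se(x, plots, [0.05 * k for k in range(8)])` would cover the 0% to 35% range in 5% steps; regressing the returned ratios on the percentage deleted gives an intercept and slope of the kind reported in Table C2.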
TABLE C2. Effect of the percentage of missing trait data on the standard error of the slope for bivariate relationships. Slope indicates the increase of the relative standard error with an increasing percentage of missing species. The last column indicates the increase in standard error given a 10% loss of species trait data.
X | Y | Intercept | Slope | % increase in st.error for 10% |
maxCH | SLA | 0.99 | 0.0016 | 0.05 |
maxCH | LPC | 1.00 | 0.0012 | 0.04 |
maxCH | LNC | 1.00 | 0.0011 | 0.02 |
maxCH | SM_g | 1.00 | 0.0013 | 0.05 |
maxCH | FO | 0.99 | 0.0013 | 0.05 |
maxCH | GO | 0.99 | 0.0036 | 0.12 |
maxCH | GF | 1.00 | 0.0017 | 0.06 |
GF | FO | 0.99 | 0.0014 | 0.05 |
GF | GO | 0.99 | 0.0035 | 0.12 |
GF | SM_d | 0.99 | 0.0014 | 0.05 |
GF | RGR | 1.00 | 0.0042 | 0.14 |
SM_g | SM_d | 0.98 | 0.0019 | 0.07 |
SM_g | FO | 0.99 | 0.0016 | 0.05 |
LNC | RGR | 1.00 | 0.0024 | 0.08 |
SLA | RGR | 1.00 | 0.0024 | 0.08 |
RGR | maxCH | 1.00 | 0.0005 | 0.02 |
Step 3: Test of SEM with modelled RGR values
In contrast to other relationships from the full data set vs. the subset, the relationships of the leaf traits vs. RGR were close to being significantly different for the two data sets. Also the standard error of the slopes was relatively large compared to the other traits. This probably means that the plot means of RGR deviated to some extent from the ‘real’ plot mean.
To test whether the structure and significance of the SEM were affected by the deviating estimates of RGR, we ran an additional SEM (tested on the extended model of Appendix F only) that included better estimates of the RGR plot means. We did not run a SEM for only those plots for which we had sufficient trait information for RGR, as this would have left too few degrees of freedom to run this SEM model. Instead, we fitted a multiple regression model in which RGR was predicted from growth form, LNC and SLA for the subset with known unbiased estimates of plot means for RGR (at least 70% of the species cover available). The parameter estimates of the multiple regression were used to predict the RGR values for the remaining sites with insufficient trait information. To avoid over-fitting, a random number was added to the predicted values (drawn from a normal distribution with a mean of zero and a standard deviation equal to the standard deviation of the residuals of the multiple regression). This procedure ensured that relations between RGR and growth form, LNC and SLA were not made stronger than in the default model. These predicted RGR values replaced the original RGR values and were used in the SEM (everything else kept equal – Fig. F1). This procedure was repeated multiple times, because the numbers are randomly drawn from a normal distribution and can thus lead to an over- or underestimation of the fit. It showed that neither the validity of the full model (P values remained equal), nor the structure of the full model, nor the significance of any individual path was different from the original model. Additionally, for all traits, the dominant drivers and the dominant trait-trait constraints remained unchanged. Furthermore, the relative contribution of the traits and the environmental drivers remained equal.
There was only a slight increase in the role of the leaf traits in determining RGR and the explained variance of RGR (from 0.12 to 0.14 and from 0.49 to 0.56 respectively – compare Table C3 below and Table F8 in manuscript) and the explained variance of SLA and maxCH increased slightly. Therefore, the plot mean RGR values as calculated in the paper did not change the interpretation of the results and the conclusions about the contribution of the environmental drivers and the role of trait-trait constraints in trait assembly (See Table C3).
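The replacement of RGR plot means by modelled values, as described in this step, can be sketched as a multiple regression followed by prediction with added residual noise. This is an illustration under stated assumptions, not the original analysis: all function names are hypothetical, predictors are passed as tuples in the order (GF, LNC, SLA), and the regression is solved through the normal equations with a small Gaussian elimination routine to keep the example self-contained.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ols(rows, y):
    """Fit y on the predictors in `rows`; return (coefficients, residual SD).

    The first coefficient is the intercept.
    """
    design = [[1.0, *r] for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, d)) for d, yi in zip(design, y)]
    resid_sd = math.sqrt(sum(e * e for e in resid) / (len(y) - p))
    return beta, resid_sd

def impute_with_noise(beta, resid_sd, rows, rng):
    """Predict RGR for `rows` and add N(0, resid_sd) noise to each prediction."""
    return [sum(b * v for b, v in zip(beta, [1.0, *r])) + rng.gauss(0.0, resid_sd)
            for r in rows]
```

Because each call to `impute_with_noise` draws new random numbers, the SEM would be refitted for several draws, which corresponds to the repetition mentioned in the text.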
TABLE C3. The effect of environmental constraints (cause; columns) on the selection of individual traits (effect; rows) relative to the effect of trait-trait constraints with the modeled RGR values. The rightmost column gives the explained variance of the SEM with the plot mean RGR values as used in the manuscript.
Causes (columns): environmental constraints (Nutrient availability, TSD) and trait–trait constraints (Leaf traits, Allometric traits, Seed traits, Relative growth rate). Effects are in rows.
Effect | Nutrient availability | TSD | DE > IE | Leaf traits | Allometric traits | Seed traits | Relative growth rate | Dominant driver | Dominant trait | R2: final model RGR modeled* | R2: final model* |
LNC | 1.00 | 0.00 | yes | 0.00 | 0.00 | 0.00 | 0.00 | Nutrients | | 0.82 | 0.82 |
SLA | 0.31 | 0.02 | | 0.06 | 0.50 | 0.00 | 0.11 | Nutrients | Allometric traits | 0.74 | 0.71 |
LPC | 0.67 | 0.13 | yes | 0.02 | 0.15 | 0.00 | 0.03 | Nutrients | Allometric traits | 0.97 | 0.97 |
RGR | 0.12 | 0.23 | | 0.14 | 0.44 | 0.00 | 0.07 | Disturbance | Allometric traits | 0.56 | 0.49 |
maxCH | 0.06 | 0.51 | yes | 0.07 | 0.23 | 0.00 | 0.13 | Disturbance | Allometric traits | 0.73 | 0.69 |
GF | 0.04 | 0.47 | | 0.05 | 0.37 | 0.00 | 0.08 | Disturbance | Allometric traits | 0.94 | 0.94 |
SM_g | 0.23 | 0.30 | | 0.04 | 0.35 | 0.00 | 0.08 | Disturbance | Allometric traits | 0.63 | 0.63 |
SM_d | 0.10 | 0.33 | | 0.03 | 0.25 | 0.24 | 0.05 | Disturbance | Seed traits | 0.92 | 0.93 |
GO | 0.08 | 0.35 | | 0.04 | 0.45 | 0.00 | 0.07 | Disturbance | Allometric traits | 0.69 | 0.69 |
FO | 0.16 | 0.10 | yes | 0.01 | 0.61 | 0.09 | 0.02 | Nutrients | Allometric traits | 0.57 | 0.56 |
Appendix F. Results of the PCA, RDA and RDA with latents with 10 traits, the extended SEM, path coefficients and justification.
Here, we determine whether the conclusion that disturbance and nutrient availability do not act on separate suites of traits still holds when the model is extended with three more traits (Table F1).
To account for the fact that an increase in canopy height often leads to shifts in other traits that are caused by a shift in growth form and not by height per se, we recorded whether a species had a woody stem (woody / non-woody). This simple division mainly distinguishes investments in structural biomass. In this paper we refer to a shift towards woodiness as GF approaching one. In addition we included seed mass of the dispergule (SM_d in mg; including the mass of the germinule) and a phenology trait, flowering onset (FO in months).
TABLE F1. Extra traits used for the analysis of the extended model, the number of species involved and their literature sources. In total 346 species were present in the plots.
Trait category | Trait (acronym) | Scale and units | No. species | Source |
Allometric traits | Growth form (GF) | Ordinal (0: non-woody, 1: woody) | 346 | 1 |
Seed traits | Seed mass with dispergule included (SM_d) | Log 10 Continuous (mg) | 276 | 2 |
Phenology traits | Flowering onset (FO) | Ordinal (Months, 1: Jan. – 12: Dec.) | 342 | 1 |
Sources. 1. BioBase 2003, Centraal Bureau voor de Statistiek, Voorburg/Heerlen.
2. see Douma et al. (in press)
The covariance among trait averages of species assemblages was analyzed first without explicitly defining possible underlying causes of common axes of variability between plots by submitting 156 plots × 10 traits to a principal component analysis. The results are shown in Table F2. Subsequently, we explicitly constrained the multivariate structure in traits by environmental data, but still without imposing any causal hypotheses, using a redundancy analysis (RDA; ter Braak 1987) based on 3 environmental variables (Soil C/P ratio, Soil C/N ratio, ‘time since disturbance’). The results are shown in Table F3.
TABLE F2. Results of Principal Component Analysis (PCA) of the site-trait matrix (156 sites × 10 traits). The explained variance is shown as well as the trait scores. Abbreviations of traits: Leaf nitrogen content (LNC), leaf phosphorus content (LPC), specific leaf area (SLA), seed mass of the germinule (SM_g), seed mass of the dispergule (SM_d), maximum canopy height (maxCH), Growth form (GF), relative growth rate (RGR), germination onset (GO), flowering onset (FO).
PCA 1 | PCA 2 | |
Cumulative explained variance | 0.55 | 0.79 |
Scores | ||
SLA | -0.46 | 1.55 |
LNC | -0.86 | 1.39 |
LPC | -0.64 | 1.69 |
RGR | 0.74 | 1.25 |
maxCH | -1.20 | -0.60 |
GF | -1.20 | -0.73 |
SM_g | -1.17 | 0.14 |
SM_d | -1.27 | -0.23 |
GO | 1.11 | 0.45 |
FO | 0.97 | -0.18 |
TABLE F3. Results of the RDA with the relevé-traits (156 sites × 10 traits) constrained by 3 environmental factors (measured TSD, measured Soil C/P and measured Soil C/N). The cumulative explained variance of the constrained axes and the scores of the environmental constraints are shown.
RDA1 | RDA2 | RDA3 | |
Cumulative explained variance | 0.35 | 0.41 | 0.42 |
LogC/N | 0.176 | -0.5874 | -0.78997 |
LogC/P | 0.358 | -0.9011 | 0.24455 |
TSD | -0.9564 | -0.2889 | -0.04322 |
The model developed in this appendix differs in the way the environmental drivers affect the traits. Exact measurements of disturbance frequency and nutrient availability at temporal and spatial scales relevant to plants are rarely directly available, i.e. they are “latent” variables, so one usually has only indirect and imperfect measurements. This is particularly true for nutrient availability, which is highly dynamic in space and time. Therefore soil nutrient concentrations do not necessarily reflect nutrient availability as experienced by plants in the long run (Ordoñez et al. 2010). SEM allows us to explicitly incorporate this uncertainty by introducing a latent variable (Shipley 2002), which is estimated as the common variance of the soil nutrient parameters and the traits associated with the latent variable. Based on two latents a new SEM was developed, shown in Fig. F1. Note that the latents specified in this model take the common variance of the leaf traits and the soil C/P ratio (a model including soil C/N did not fit) for nutrient availability, and of measured time since disturbance and maxCH, GF, SM_d and GO for disturbance. It would be better to obtain multiple and independent estimates of the error associated with soil fertility and disturbance, but this is not yet possible. Alternatively, one could build more explicit measurement models of environmental drivers that include more or better indicators and test them with independent data. This is not possible with our data.
The model as presented in Fig. F1 fit the data (χ2 = 47.68, df = 35, P = 0.07). The covariance matrix and the modeled covariance matrix are given in Tables F4 and F5. Parameter estimates and standard errors of the model are given in Table F6. The fit improved significantly if a free covariance between LNC and SM_g was added (χ2 = 31.89, df = 34, P = 0.57). Because RGR is central to the model and we lowered the minimum number of species needed to calculate a plot mean, we ran a control analysis which revealed that the structure of the SEM and the significance and sign of the paths remained unchanged for modeled values of RGR (see Appendix C for details).
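The reported fit statistics can be checked with a short calculation of the chi-square upper-tail probability. The sketch below uses the closed-form series for integer degrees of freedom; the function name `chi2_sf` is hypothetical. Comparing the two nested models through the difference in χ2 (47.68 − 31.89 = 15.79 on 1 degree of freedom) is one conventional way to judge the improvement in fit; whether this particular test was the one used here is an assumption.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(chi/sqrt(2)) + 2*phi(chi) * sum_r chi^(2r-1) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(chi / math.sqrt(2)) + 2 * density * total

p_model = chi2_sf(47.68, 35)               # model of Fig. F1
p_extended = chi2_sf(31.89, 34)            # with the free LNC - SM_g covariance
p_difference = chi2_sf(47.68 - 31.89, 1)   # chi-square difference on 1 df
```

The first two probabilities should be close to the reported P values of 0.07 and 0.57, and a small `p_difference` is consistent with the statement that the added covariance improves the fit.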
FIG. F1. Standardized path coefficients, explained variance (in squares) and significance (between parentheses) of the final model of nutrient availability, disturbance and their related traits (χ2 = 47.68, df = 35, P = 0.07, only significant paths shown). Note that the relationship between nutrient availability and Soil C/P ratio is negative.
TABLE F4. Covariance matrix of the variables used in the model of Fig. F1.
10Log C/P | TSD | maxCH | LNC | SLA | SM_G | SM_D | LPC | RGR | GO | FO | GF | |
10Log C/P | 0.135 | |||||||||||
TSD | -0.855 | 630.625 | ||||||||||
maxCH | -0.024 | 7.365 | 0.14 | |||||||||
LNC | -0.638 | 20.15 | 0.465 | 10.379 | ||||||||
SLA | -0.451 | 20.216 | 0.096 | 6.814 | 12.042 | |||||||
SM_G | -0.042 | 6.069 | 0.12 | 0.854 | 0.413 | 0.191 | ||||||
SM_D | -0.048 | 11.959 | 0.209 | 1.063 | 0.567 | 0.249 | 0.413 | |||||
LPC | -0.102 | 1.676 | 0.037 | 1.417 | 1.257 | 0.095 | 0.108 | 0.26 | ||||
RGR | -0.001 | -0.489 | -0.007 | 0.006 | 0.018 | -0.006 | -0.012 | 0.003 | 0.001 | |||
GO | 0.005 | -1.152 | -0.021 | -0.086 | -0.032 | -0.018 | -0.033 | -0.007 | 0.001 | 0.005 | ||
FO | 0.036 | -4.624 | -0.078 | -0.593 | -0.349 | -0.104 | -0.158 | -0.087 | 0.005 | 0.014 | 0.163 | |
GF | -0.012 | 4.992 | 0.087 | 0.249 | 0.081 | 0.072 | 0.135 | 0.015 | -0.005 | -0.014 | -0.056 | 0.059 |
TABLE F5. Modeled covariance matrix of the model Fig. F1.
10Log C/P | TSD (measured) | maxCH | LNC | SLA | SM_G | SM_D | LPC | RGR | GO | FO | GF | Latent Nutrient Availability | Latent TSD | |
10Log C/P | 0.135 | |||||||||||||
TSD (measured) | -1.627 | 630.625 | ||||||||||||
maxCH | -0.032 | 7.320 | 0.140 | |||||||||||
LNC | -0.596 | 23.206 | 0.461 | 10.379 | ||||||||||
SLA | -0.484 | 21.227 | 0.102 | 6.798 | 11.955 | |||||||||
SM_G | -0.051 | 6.415 | 0.120 | 0.742 | 0.439 | 0.191 | ||||||||
SM_D | -0.065 | 12.251 | 0.209 | 0.930 | 0.573 | 0.250 | 0.414 | |||||||
LPC | -0.100 | 1.919 | 0.038 | 1.421 | 1.242 | 0.097 | 0.102 | 0.259 | ||||||
RGR | 0.000 | -0.484 | -0.007 | 0.009 | 0.017 | -0.006 | -0.012 | 0.004 | 0.001 | |||||
GO | 0.006 | -1.172 | -0.021 | -0.080 | -0.034 | -0.019 | -0.033 | -0.008 | 0.001 | 0.005 | ||||
FO | 0.042 | -4.509 | -0.078 | -0.600 | -0.443 | -0.103 | -0.154 | -0.085 | 0.004 | 0.014 | 0.162 | |||
GF | -0.016 | 4.976 | 0.086 | 0.238 | 0.087 | 0.072 | 0.135 | 0.015 | -0.005 | -0.014 | -0.056 | 0.058 | ||
Latent Nutrient Availability | 0.204 | -7.960 | -0.155 | -2.915 | -2.367 | -0.252 | -0.316 | -0.488 | -0.002 | 0.027 | 0.205 | -0.080 | 1.000 | |
Latent TSD | -1.627 | 497.212 | 7.320 | 23.206 | 21.227 | 6.415 | 12.251 | 1.919 | -0.484 | -1.172 | -4.509 | 4.976 | -7.960 | 497.212 |
Mathematical specification of the model (Fig. F1):
10log Soil C/P = 0.204 × latent_NA† + 1.000 e_10log Soil C/P
TSD = 1.000 latent_TSD‡ + 1.000 e_TSD
maxCH = 3.735 × RGR + 0.018 × latent_TSD + 1.000 e_maxCH
LNC = -2.915 × latent_NA + 1.000 e_LNC
SLA = -10.585 × maxCH - 2.778 × latent_NA + 0.154 × latent_TSD + 1.000 e_SLA
SM_g = 0.724 × maxCH - 0.140 × latent_NA + 1.000 e_SM_g
SM_d = 0.801 × SM_g + 0.650 × GF + 0.008 × latent_TSD + 1.000 e_SM_d
LPC = -0.323 × maxCH - 0.538 × latent_NA + 1.000 e_LPC
RGR = 0.002 × LNC + 0.003 × SLA - 0.117 × GF + 1.000 e_RGR
GO = -0.059 × maxCH - 0.138 × GF + 0.007 × latent_NA + 1.000 e_GO
FO = 1.159 × maxCH - 0.260 × SM_g - 2.809 × GF + 0.156 × latent_NA + 0.008 × latent_TSD + 1.000 e_FO
† latent_NA is latent Nutrient Availability, ‡ latent_TSD is latent Time since disturbance
Free covariances:
latent_NA - latent_TSD: -7.96
latent_NA - e_GF: 0.017
SM_g - RGR: -0.001
TABLE F6. Unstandardized path coefficients of full model (Fig. 3c main text). Standard error given in brackets, error variances with standard error in diagonal (calculated with robust estimates). Traits in rows are cause and traits in columns are effects. Correlational relationships in italics. Abbreviations of variables: Leaf nitrogen content (LNC), leaf phosphorus content (LPC), specific leaf area (SLA), seed mass of the germinule (SM_g), seed mass of the dispergule (SM_d), maximum canopy height (maxCH), Growth form (GF), relative growth rate (RGR), germination onset (GO), flowering onset (FO). Time since disturbance latent (TSD_l), Soil CP ratio (Soil CP), Time since disturbance measured (TSD_m).
Nutrient availability | TSD_l | SoilCP | TSD_m | LNC | SLA | LPC | RGR | maxCH | GF | SM_g | SM_d | GO | FO | |
Nutrient availability | 1 | 7.960 (2.021) | -0.204 (0.027) | 2.915 (0.202) | 2.778 (0.272) | 0.538 (0.034) | -0.017 (0.005) | 0.140 (0.025) | -0.007 (0.004) | -0.156 (0.030) | ||||
TSD_l | 0.026 | 497.212 (72.414) | 1 | 0.154 (0.028) | 0.018 (0.002) | 0.003 (0.001) | 0.008 (0.003) | 0.008 (0.004) | ||||||
SoilCP | 0.094 (0.011) | |||||||||||||
TSD_m | 133.413 (23.910) | |||||||||||||
LNC | 1.880 (0.286) | 0.002 (0.001) | ||||||||||||
SLA | 3.466 (0.601) | 0.003 (0.001) | ||||||||||||
LPC | 0.009 (0.006) | |||||||||||||
RGR | 0.001 (0.000) | 3.735 (1.480) | -0.001 (0.001) | |||||||||||
maxCH | -10.581 (1.665) | -0.323 (0.062) | 0.043 (0.014) | 0.462 (0.048) | 0.724 (0.066) | -0.059 (0.030) | 1.159 (0.223) | |||||||
GF | -0.117 (0.010) | 0.003 (0.001) | 0.649 (0.237) | -0.138 (0.044) | -2.809 (0.472) | |||||||||
SM_g | 0.072 (0.008) | 0.801 (0.047) | -0.260 (0.082) | |||||||||||
SM_d | 0.031 (0.004) | |||||||||||||
GO | 0.001 (0.000) | |||||||||||||
FO | 0.071 (0.009) |
An alternative SEM that was consistent with the data (χ² = 49.93, df = 36, P = 0.06, CFI = 0.99) is nested in the model of Fig. F1 and differed in a few respects: the causal direction between maxCH and GF was reversed, the path from nutrient availability to GF was removed, a free covariance between nutrient availability and maxCH was added (a path from nutrient availability to maxCH is also possible), and the path from TSD to maxCH was removed. The path from SLA to RGR was no longer significant. Reversing the path from maxCH to SLA led to a model that was not consistent with the data (P = 0.0003). Although the alternative model is consistent with the data, it is ecologically less likely because maxCH is not driven at all by ‘time since disturbance’ and RGR is not significantly affected by SLA. Therefore, based on ecological considerations, we prefer the model presented in Fig. F1.
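The reported probability for the alternative model can be checked from its χ² statistic and degrees of freedom. The sketch below is not the software used for the analysis; it relies on the standard closed form of the χ² upper-tail probability for an even number of degrees of freedom, and it assumes the conventional reading that a model is "consistent with the data" when P > 0.05. The function name `chi2_sf_even_df` is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square statistic for even df.

    For df = 2m the survival function equals
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0   # k = 0 term of the sum
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k      # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


# Values reported for the alternative SEM.
p_alt = chi2_sf_even_df(49.93, 36)
consistent = p_alt > 0.05  # assumed threshold for "consistent with the data"
```

In SEM the test is read in the opposite direction from a usual significance test: a large P means that the covariance structure implied by the model does not differ detectably from the observed one.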
Replacing the measured estimates of nutrient availability and disturbance with the latent estimates (Fig. F1) in an RDA increased the variance explained by the environmental drivers to 70%, i.e., 89% of the maximally explained variation (Table F7; Fig. F2).
TABLE F7. Results of the RDA with the relevé-traits (156 sites × 10 traits) constrained by 2 environmental factors (latent TSD and latent Nutrient availability (Nu.availability)). The cumulative explained variance of the constrained axes and the scores of the environmental constraints are shown.
 | RDA1 | RDA2 |
Cumulative explained variance | 0.50 | 0.70 |
Nutrient availability | -0.72 | 0.70 |
TSD | -0.90 | -0.44 |
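The two percentages quoted for the RDA can be related by simple arithmetic. The worked lines below assume that "89% of the maximally explained variation" is the ratio of the constrained explained variance to that maximum; the symbols V_max, V_constrained and V_RDA2 are introduced here for illustration only.

```latex
\[
V_{\max} \approx \frac{V_{\text{constrained}}}{0.89} = \frac{0.70}{0.89} \approx 0.79,
\qquad
V_{\text{RDA2}} = 0.70 - 0.50 = 0.20 .
\]
```

Under this reading, the first constrained axis accounts for 0.50 of the trait variance and the second adds a further 0.20.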
FIG. F2. RDA with the relevé-traits (156 sites × 10 traits) constrained by 2 environmental factors (latent TSD and latent Nutrient availability (abbreviated by Nu.avail)). Abbreviations of traits and environmental drivers: Leaf nitrogen content (LNC), leaf phosphorus content (LPC), specific leaf area (SLA), seed mass of the germinule (SM_g), seed mass of the dispergule (SM_d), maximum canopy height (maxCH), Growth form (GF), relative growth rate (RGR), germination onset (GO), flowering onset (FO).
The relative effects of environmental drivers on traits were calculated in the same way as for the model in Fig. 3 of the main text. These calculations showed that nutrient availability predominantly constrained leaf traits (SLA, LNC and LPC), whereas ‘time since disturbance’ predominantly affected allometric traits, seed traits and relative growth rate, constraining maxCH, GF, RGR, SM_g, SM_d and GO (Table F8). However, neither driver was restricted to one suite of traits; both affected both suites simultaneously. For example, SM_g and FO were almost equally affected by nutrient availability and ‘time since disturbance’. The constraining effects of environmental drivers were stronger than trait-trait constraints in only 5 out of 10 traits. Allometric traits predominantly constrained other traits, in particular seed and phenology traits.
TABLE F8. The effect of environmental constraints (cause; columns) on the selection of individual traits (effect; rows) relative to the effect of trait-trait constraints. The total effects of the two environmental drivers and the trait-trait constraints add to one. The effect of the environmental drivers on traits is decomposed into direct effects (DE) and indirect effects (IE; effects transmitted via other traits; Fig. F1). Trait-trait constraints were grouped into four categories: leaf traits (LNC, LPC and SLA), allometric traits (maxCH and GF), seed traits (SM_g and SM_d) and relative growth rate (RGR). Additionally, the dominant environmental driver, the dominant trait-trait constraint and the explained variance of the traits are given.
Cause | Environmental constraint | | | Trait–trait constraint | | | | Dominant driver | Dominant trait | R² sub-model 2: disturbance* |
Effect | Nutrient availability | Time since disturbance | DE > IE | Leaf traits | Allometric traits | Seed traits | Relative growth rate | | | |
LNC | 1.00 | 0.00 | yes | 0.00 | 0.00 | 0.00 | 0.00 | Nutrients | | |
SLA | 0.30 | 0.01 | | 0.06 | 0.49 | 0.00 | 0.13 | Nutrients | Allometric traits | |
LPC | 0.66 | 0.13 | yes | 0.02 | 0.15 | 0.00 | 0.04 | Nutrients | Allometric traits | |
RGR | 0.10 | 0.25 | | 0.12 | 0.45 | 0.00 | 0.08 | Disturbance | Allometric traits | |
maxCH | 0.05 | 0.48 | yes | 0.07 | 0.26 | 0.00 | 0.15 | Disturbance | Allometric traits | 0.83 |
GF | 0.04 | 0.46 | | 0.04 | 0.36 | 0.00 | 0.10 | Disturbance | Allometric traits | 0.96 |
SM_g | 0.22 | 0.30 | | 0.04 | 0.35 | 0.00 | 0.09 | Disturbance | Allometric traits | 0.53 |
SM_d | 0.10 | 0.34 | | 0.03 | 0.23 | 0.24 | 0.06 | Disturbance | Seed traits | 0.93 |
GO | 0.08 | 0.34 | | 0.04 | 0.45 | 0.00 | 0.09 | Disturbance | Allometric traits | 0.69 |
FO | 0.16 | 0.10 | yes | 0.01 | 0.60 | 0.09 | 0.03 | Nutrients | Allometric traits | 0.47 |
* see Fig. F1
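The summary columns of Table F8 follow from its numeric columns, which the sketch below illustrates. It is not the authors' calculation: it only re-reads the printed values, stored as hundredths so that comparisons are exact, and the names `TABLE_F8`, `GROUPS` and `summarise` are hypothetical.

```python
# Printed relative effects in hundredths. Column order: nutrient availability,
# time since disturbance, leaf traits, allometric traits, seed traits, RGR.
TABLE_F8 = {
    "LNC":   (100, 0, 0, 0, 0, 0),
    "SLA":   (30, 1, 6, 49, 0, 13),
    "LPC":   (66, 13, 2, 15, 0, 4),
    "RGR":   (10, 25, 12, 45, 0, 8),
    "maxCH": (5, 48, 7, 26, 0, 15),
    "GF":    (4, 46, 4, 36, 0, 10),
    "SM_g":  (22, 30, 4, 35, 0, 9),
    "SM_d":  (10, 34, 3, 23, 24, 6),
    "GO":    (8, 34, 4, 45, 0, 9),
    "FO":    (16, 10, 1, 60, 9, 3),
}
GROUPS = ("Leaf traits", "Allometric traits", "Seed traits", "Relative growth rate")


def summarise(row):
    """Derive row total, dominant driver and dominant trait group for one trait."""
    na, tsd, *traits = row
    top = max(range(len(traits)), key=lambda i: traits[i])
    return {
        "total": sum(row),                     # about 100, up to rounding
        "driver": "Nutrients" if na > tsd else "Disturbance",
        "dominant_trait": GROUPS[top] if traits[top] > 0 else None,
        "env": na + tsd,                       # summed environmental effect
        "trait": sum(traits),                  # summed trait-trait effect
    }


summary = {name: summarise(row) for name, row in TABLE_F8.items()}
env_stronger = [n for n, s in summary.items() if s["env"] > s["trait"]]
env_tied = [n for n, s in summary.items() if s["env"] == s["trait"]]
```

At the printed two-decimal precision the row totals lie between 0.99 and 1.01. The environmental share is strictly larger for LNC, LPC, maxCH and SM_g, and GF is tied at 0.50 versus 0.50, so the count of traits dominated by environmental drivers is 4 or 5 depending on how that tie resolves in the unrounded values.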
LITERATURE CITED
Douma, J. C., R. Aerts, J. P. M. Witte, R. M. Bekker, D. Kunzmann, K. Metselaar, and P. M. van Bodegom. 2011. A combination of functionally different plant traits provides a means to quantitatively predict a broad range of species assemblages in NW Europe. Ecography. DOI: 10.1111/j.1600-0587.2011.07068.x (in press).
Ordoñez, J. C., P. M. van Bodegom, J. P. M. Witte, R. P. Bartholomeus, J. R. van Hal, and R. Aerts. 2010. Plant strategies in relation to resource supply in mesic to wet environments: does theory mirror nature? American Naturalist 175:225–239.
Shipley, B. 2002. Cause and correlation in biology: a user's guide to path analysis, structural equations and causal inference. Cambridge University Press, Cambridge.
ter Braak, C. J. F. 1987. Ordination in R. H. G. Jongman, and O. F. R. van Tongeren, eds. Data analysis in community and landscape ecology. Wageningen, Pudoc.