Appendix C. Control analyses for robustness against missing trait values.
Allowing missing trait values when calculating a plot mean might affect the selection of sites for analysis and the trait-trait and trait-environment relationships. This appendix shows that (i) the selected sites are not a biased selection from all available sites, (ii) the average percentage of species used to calculate a plot mean is in fact much higher than the lower bounds set (50% for most traits and 20% for RGR and LPC), (iii) the slopes of the regression lines are not significantly affected by introducing missing trait values, and the increase in the uncertainty of the slopes is smaller than the increase in the percentage of missing trait data, and finally, (iv) the structure and significance of the SEM are not affected by the RGR values used. We therefore conclude that our main conclusions still hold: nutrient availability and disturbance partly affect the same suite of traits, and trait-trait constraints play an important role. Steps 1 and 2 are done for both the simplified model (Fig. 3) and the complex model of Appendix F. Step 3 is done for the model in Appendix F.
1. A biased subset of the total available number of sites?
The six available data sources represented 19 different vegetation types. The selected set of sites for which trait information was available was not biased compared to all available sites, because selected sites covered all 19 vegetation types. Additionally, the sites not included in the analysis were randomly spread over these 19 vegetation types: The correlation between the number of sites per vegetation type for the selected sites and the total number of available sites was 0.94 (on ^{10}log transformed data to fulfill homogeneity of variance). Therefore the selection criteria have not led to an unbalanced data set.
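As an illustration only, the correlation check described above can be sketched in Python. The site counts below are hypothetical and are not the study's data; the function names are ours, and we assume a plain Pearson correlation on log10-transformed counts.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def log10_count_correlation(selected, available):
    """Correlate log10 site counts per vegetation type (all counts must be > 0)."""
    return pearson_r([math.log10(c) for c in selected],
                     [math.log10(c) for c in available])

# Hypothetical site counts for five vegetation types (not the study's data).
available_sites = [40, 12, 25, 8, 60]
selected_sites = [22, 5, 14, 4, 31]
r = log10_count_correlation(selected_sites, available_sites)
print(f"r = {r:.2f}")
```

A correlation close to one indicates that the selected sites are spread over the vegetation types roughly in proportion to the available sites.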
2. A biased estimate of trait plot means?
Within our data set, plots were chosen that had sufficient information for at least 50% of the species (or 20% in the case of RGR and LPC). However, might this selection and the incomplete information have led to biased results? If missing data are non-randomly distributed, this can lead to a biased estimate of the plot mean and to biased environment-trait and trait-trait relationships, and thus to a different conclusion about the relative contribution of environmental drivers and trait-trait constraints. We performed a three-step analysis to test this crucial issue. First, we calculated the actual percentage of species with trait information that were used to calculate plot means. Next, we tested whether the slopes of the paths of our SEM are significantly affected when trait plot means are allowed to be based on incomplete data. Finally, we incorporated modelled RGR values in a SEM (of Appendix F) and tested whether the structure still holds.
Step 1: Was the percentage of species with trait information to calculate plot means really that low?
Our selection criterion set a minimum to the availability of trait information in order to include the majority of the species per plot. This threshold was 50% of the species, assuming that these species give a good estimate of the true plot trait mean; the minimum was lowered to > 20% for LPC and RGR, as these traits were less well covered in the database but were core traits essential to the analysis. In fact, the average percentage of species used to calculate a plot mean was much higher than this minimum percentage (Fig. C1). The chance of a bias (which would in addition require the missing species to be a selective rather than a random subset) is therefore much smaller than might be concluded from the threshold stated in the manuscript alone. The only exception is RGR, for which on average indeed only 50% of the species had trait information. Given that we had already combined all available trait databases, we tested as a second step whether correlations (and thus paths in our SEM) could have been affected by the incomplete trait information.
FIG. C1. The percentage of species used to calculate the plot mean for different traits. Abbreviations of traits: leaf nitrogen content (LNC), leaf phosphorus content (LPC), specific leaf area (SLA), log10 seed mass of the germinule (SM_g), log10 seed mass of the dispergule (SM_d), log10 maximum canopy height (maxCH), growth form (GF), seedling relative growth rate (RGR), log10 germination onset (GO), flowering onset (FO).
Step 2: The comparison of the slopes of a complete set and the available data set and estimation of the uncertainty in the slopes
Our claim that 'regenerative and establishment traits are linked' was based both on the overall fit of the SEM and on the significance of those path coefficients linking the two groups of traits. We deal with the overall fit in the next section. Here we consider how missing data may affect the significance of the relevant path coefficients in our models. Since the path coefficients in a SEM are conceptually similar to the slopes of the equivalent regressions, the SEM should be robust against missing trait values if the slopes of the relevant single regressions obtained from our data set (as used in this paper) and from a smaller subset of the data set for which trait information is available for all species are not significantly different. Additionally, if the slopes of these regression lines are not significantly different for these two data sets, then the relative contribution of environmental drivers to trait selection and the significance of traits to trait selection will by definition remain unchanged.
To test for this, we used two data sets. The subset with which we compared our data set was defined by selecting those sites that had trait data available for more than 90% of the species. Setting this criterion at 90% (and 70% for RGR analyses) ensured that at least 10 sites (on average 42 sites) were available for the regression analysis. Note that this 90% means that on average only 1.4 species per plot were missing (a plot contained on average 18 species). The full set was defined as all 156 sites used in the SEM.
The difference between the slopes of the two regression lines was tested as follows: a dummy variable (group; 0 for the full set and 1 for the subset) was included in the regression: Y = a × X + b + c × group + d × group × X. If the slope of the subset differs significantly from that of the full set, the parameter d will be significantly different from zero. Running these regressions for all environment-trait and trait-trait relationships of the SEM presented in Fig. 3 of the manuscript and the model presented in Appendix F (Fig. F1) showed that for none of the relationships the slope of the full set differed significantly from that of the subset (P > 0.05 for all interaction terms); in other words, the slopes of the regressions were not significantly affected by allowing missing trait values up to a maximum of 80% for LPC and RGR and 50% for the other traits (see step 1, Fig. C1). This implies that the SEM would have had the same slopes and the same cause-effect relationships had it been based on the subset (but with much less power, given the fewer degrees of freedom). Our claim that 'regenerative and establishment traits are linked' thus holds. Table C1 presents the estimates of the slopes and the P values.
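The slope comparison can be sketched as follows. This is a minimal, dependency-free Python illustration and not the code used for the analysis: the data are simulated, the function names are ours, and we assume that the two data sets are simply stacked with a 0/1 group indicator. The sketch returns the t statistic of the interaction parameter d; converting it to a P value requires a t distribution with n - 4 degrees of freedom, which is left out here.

```python
import random

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    a = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a]

def slope_difference_test(x, y, group):
    """Fit Y = a*X + b + c*group + d*group*X by ordinary least squares.

    Returns ([b, a, c, d], t statistic of d).
    """
    rows = [[1.0, xi, gi, gi * xi] for xi, gi in zip(x, group)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (len(y) - k)  # residual variance
    se_d = (sigma2 * inv[3][3]) ** 0.5                 # standard error of d
    return beta, beta[3] / se_d

# Simulated example (hypothetical values): the subset follows the same line as the full set.
rng = random.Random(0)
x_full = [rng.uniform(0.5, 2.5) for _ in range(60)]
y_full = [3.0 - 0.7 * v + rng.gauss(0.0, 0.3) for v in x_full]
x_sub, y_sub = x_full[:15], y_full[:15]
x = x_full + x_sub
y = y_full + y_sub
group = [0] * len(x_full) + [1] * len(x_sub)
coef, t_d = slope_difference_test(x, y, group)
print(f"d = {coef[3]:.3f}, t = {t_d:.2f}")
```

A value of d close to zero relative to its standard error corresponds to the non-significant interaction terms reported in Table C1.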
TABLE C1. Comparison of slopes between subset and full set for all relationships used in the SEM (including the number of sites used (N), estimates of the parameters and P values). Nonsignificant parameters are indicated in bold.
Independent  Dependent  N  Model  Estimate  P 
log10 Soil C/N  LNC  13  Intercept  28.619  0.0000 
log10 Soil CN  4.967  0.0000  
group (0 = full set, 1 = subset)  5.531  0.5020  
group × log10 Soil CN  3.950  0.5230
log10 Soil C/N  SLA  50  Intercept  24.695  0.0000 
log10 Soil CN  2.549  0.0749  
group (0 = full set, 1 = subset)  1.104  0.7698  
group × log10 Soil CN  1.629  0.5815
log10 Soil C/N  LPC  13  Intercept  2.6463  0.0000 
log10 Soil CN  0.6484  0.0007  
group (0 = full set, 1 = subset)  0.0324  0.9810  
group × log10 Soil CN  0.0474  0.9627
log10 Soil C/N  SM_g  20  Intercept  0.4638  0.0208 
log10 Soil CN  0.5799  0.0000  
group (0 = full set, 1 = subset)  0.0017  0.9979  
group × log10 Soil CN  0.1915  0.7088
log10 Soil C/P  LPC  70  Intercept  3.493  0.0000 
log10 Soil CP  0.751  0.0000  
group (0 = full set, 1 = subset)  1.352  0.0559  
group × log10 Soil CP  0.546  0.0803
log10 Soil C/P  LNC  13  Intercept  32.782  0.0000 
log10 Soil CP  4.696  0.0000  
group (0 = full set, 1 = subset)  3.873  0.3790  
group × log10 Soil CP  1.706  0.3860
log10 Soil C/P  SLA  50  Intercept  29.025  0.0000 
log10 Soil CP  3.408  0.0000  
group (0 = full set, 1 = subset)  1.170  0.7210  
group × log10 Soil CP  0.093  0.9490
log10 Soil C/P  GO  30  Intercept  0.263  0.0278 
log10 Soil CP  0.124  0.0203  
group (0 = full set, 1 = subset)  0.123  0.6444  
group × log10 Soil CP  0.010  0.9311
log10 Soil C/P  FO  139  Intercept  26.505  0.0000 
log10 Soil CP  0.542  0.0203  
group (0 = full set, 1 = subset)  0.544  0.4795  
group × log10 Soil CP  0.218  0.5280
log10 Soil C/P  SM_g  20  Intercept  0.431  0.0316 
log10 Soil CP  0.313  0.0006  
group (0 = full set, 1 = subset)  1.216  0.0457  
group × log10 Soil CP  0.449  0.0943
log10 Soil C/P  RGR  11  Intercept  0.1599  0.0000 
log10 Soil CP  0.0041  0.5790  
group (0 = full set, 1 = subset)  0.0316  0.6530  
group × log10 Soil CP  0.0352  0.3020
TSD  SLA  50  Intercept  20.936  0.0000 
TSD  0.037  0.0015  
group (0 = full set, 1 = subset)  0.419  0.5361  
group × TSD  0.038  0.2293
TSD  maxCH  42  Intercept  0.172  0.0000 
TSD  0.012  0.0000  
group (0 = full set, 1 = subset)  0.035  0.4850  
group × TSD  0.002  0.3250
TSD  SM_d  44  Intercept  0.438  0.0000 
TSD  0.018  0.0000  
group (0 = full set, 1 = subset)  0.045  0.6020  
group × TSD  0.007  0.1290
TSD  FO  139  Intercept  28.003  0.0000 
TSD  0.019  0.0000  
group (0 = full set, 1 = subset)  0.052  0.6940  
group × TSD  0.001  0.8470
TSD  RGR  11  Intercept  0.1634  0.0000 
TSD  0.0008  0.0000  
group (0 = full set, 1 = subset)  0.0305  0.0680  
group × TSD  0.0002  0.6210
maxCH  SLA  27  Intercept  21.509  0.0000 
maxCH  1.302  0.0932  
group (0 = full set, 1 = subset)  2.148  0.0054  
group × maxCH  1.950  0.3016
maxCH  LNC  13  Intercept  22.4026  0.0000 
maxCH  3.059  0.0000  
group (0 = full set, 1 = subset)  1.139  0.2730  
group × maxCH  0.456  0.7680
maxCH  LPC  12  Intercept  1.837  0.0000 
maxCH  0.246  0.0211  
group (0 = full set, 1 = subset)  0.339  0.2561  
group × maxCH  0.034  0.9276
maxCH  SM_g  14  Intercept  0.270  0.0000 
maxCH  0.819  0.0000  
group (0 = full set, 1 = subset)  0.275  0.2143  
group × maxCH  1.277  0.1947
maxCH  FO  39  Intercept  27.721  0.0000 
maxCH  1.451  0.0000  
group (0 = full set, 1 = subset)  0.035  0.8360  
group × maxCH  0.022  0.9616
maxCH  GO  16  Intercept  0.184  0.0000 
maxCH  0.153  0.0000  
group (0 = full set, 1 = subset)  0.000  0.9782  
group × maxCH  0.004  0.8721
maxCH  GF  42  (Intercept)  0.1649  0.0000 
maxCH  0.6159  0.0000  
group (0 = full set, 1 = subset)  0.0083  0.5051  
group × maxCH  0.0328  0.3329
GF  FO  139  (Intercept)  28.1282  0.0000 
GF  2.4638  0.0000  
group (0 = full set, 1 = subset)  0.0416  0.7442  
group × GF  0.0128  0.9763
GF  GO  30  (Intercept)  0.6737  0.0000 
GF  0.7824  0.0000  
group (0 = full set, 1 = subset)  0.0042  0.9176  
group × GF  0.0095  0.9249
GF  SM_d  44  (Intercept)  0.5324  0.0000 
GF  2.2220  0.0000  
group (0 = full set, 1 = subset)  0.0261  0.6891  
group × GF  0.3868  0.1680
GF  RGR  11  (Intercept)  0.1670  0.0000 
GF  0.0919  0.0000  
group (0 = full set, 1 = subset)  0.0098  0.6116  
group × GF  0.0179  0.5623
SM_g  SM_d  15  Intercept  0.191  0.0000 
SM_g  1.302  0.0000  
group (0 = full set, 1 = subset)  0.165  0.4473  
group × SM_g  0.242  0.5420
SM_g  FO  19  Intercept  27.362  0.0000 
SM_g  1.305  0.0000  
group (0 = full set, 1 = subset)  0.831  0.0589  
group × SM_g  0.734  0.3793
LNC  RGR  11  (Intercept)  1.1357  0.0000 
LNC  0.0007  0.4140  
group (0 = full set, 1 = subset)  0.2021  0.0175  
group × LNC  0.0065  0.0566
SLA  RGR  11  (Intercept)  1.1280  0.0000 
SLA  0.0011  0.1392  
group (0 = full set, 1 = subset)  0.1301  0.0195  
group × SLA  0.0042  0.0954
RGR  maxCH  11  (Intercept)  7.5344  0.0000 
RGR  6.5316  0.0000  
group (0 = full set, 1 = subset)  3.5674  0.2805  
group × RGR  2.8471  0.3369
Although for a SEM it is more important to test whether the slopes of the relationships are significantly affected by missing trait data, we additionally investigated the effect of missing trait data on the uncertainty of the slope estimates. To estimate this effect on the standard error of the slope, we applied a rarefaction procedure that makes the trait data increasingly sparse. Rerunning the full SEM with the newly calculated trait averages 500 or 1000 times would be a very large effort. Therefore, analogous to the robustness test above, we applied the rarefaction to the bivariate trait-trait relationships that occur in the SEM. The proportion of missing trait data in the dependent variable was increased in steps of 5% up to 35%, relative to the currently available data for that trait. At each step, new trait means were calculated for the plots and a regression was run on all plots to determine the slope and its standard error. The standard error of the slope was then expressed relative to the standard error of the slope of the same bivariate relationship with the currently available trait data, which allows the increase in standard error to be compared among the bivariate relationships. This procedure was repeated 500 times to obtain a robust estimate of the standard error. The results are shown in Table C2. In all cases the standard error of the slope increases with an increasing amount of missing trait data. On average, the standard error increases by 7% if 10% of the trait data are deleted. RGR is particularly sensitive to missing trait data, probably because of the already relatively low availability of this trait. Germination onset (GO) is also sensitive to omissions of trait data, although only 22% of its trait data are missing; we think that this is caused by the ordinal three-point scale of this trait.
Based on these results, we consider the slope estimates relatively robust against missing trait data, as the relative increase in the standard error of the slope is for most traits much smaller than the relative increase in missing trait data. Although the SE for GO increases by more than 10% when 10% more trait data are missing, we do not expect this to affect the SEM substantially, because GO is only affected by other traits and is not a parent of any other trait, and because the number of trait data available for GO is among the highest of the traits (see Table 1 of the manuscript), so the actual bias is relatively small. The SE of the slope for RGR also increases by more than 10% when 10% more trait data are missing. The effect of missing trait data on RGR is analyzed in more detail in the next section.
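The rarefaction procedure can be sketched as follows. This is a minimal Python illustration under our own assumptions, not the original code: trait values are deleted at random across all plot-by-species entries, plots are hypothetical (30 plots of 18 species with simulated trait values), and all function and variable names are ours.

```python
import random
import statistics

def slope_se(x, y):
    """Standard error of the ordinary least squares slope of y on x."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return (sse / (n - 2) / sxx) ** 0.5

def relative_se(x, plots, fraction, repeats=500, seed=1):
    """Mean SE of the slope after deleting `fraction` of the trait values,
    relative to the SE obtained with all currently available values."""
    rng = random.Random(seed)
    baseline = slope_se(x, [statistics.fmean(v) for v in plots])
    cells = [(i, j) for i, v in enumerate(plots) for j in range(len(v))]
    n_drop = round(fraction * len(cells))
    ratios = []
    for _ in range(repeats):
        dropped = set(rng.sample(cells, n_drop))
        xs, means = [], []
        for i, values in enumerate(plots):
            kept = [v for j, v in enumerate(values) if (i, j) not in dropped]
            if kept:  # a plot is skipped only if none of its values survive
                xs.append(x[i])
                means.append(statistics.fmean(kept))
        ratios.append(slope_se(xs, means) / baseline)
    return statistics.fmean(ratios)

# Hypothetical data: 30 plots, 18 species each, trait value depends linearly on x.
data_rng = random.Random(42)
x_plots = [data_rng.uniform(0.0, 2.0) for _ in range(30)]
trait_values = [[5.0 + 1.5 * xi + data_rng.gauss(0.0, 1.0) for _ in range(18)]
                for xi in x_plots]
for pct in range(5, 40, 5):  # 5%, 10%, ..., 35% extra missing data
    print(pct, round(relative_se(x_plots, trait_values, pct / 100, repeats=200), 3))
```

Regressing the printed relative standard errors on the percentage of deleted data gives an intercept and slope comparable in kind to those reported per relationship in Table C2.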
TABLE C2. Relationship between the % missing trait data and the standard error of the slope for bivariate relationships. Slope indicates the increase of the standard error with increasing number of missing species. The last column indicates the % increase in standard error given a 10% loss of species trait data.
X  Y  Intercept  Slope  % increase in st. error for 10%
maxCH  SLA  0.99  0.0016  0.05 
maxCH  LPC  1.00  0.0012  0.04 
maxCH  LNC  1.00  0.0011  0.02 
maxCH  SM_g  1.00  0.0013  0.05 
maxCH  FO  0.99  0.0013  0.05 
maxCH  GO  0.99  0.0036  0.12 
maxCH  GF  1.00  0.0017  0.06 
GF  FO  0.99  0.0014  0.05 
GF  GO  0.99  0.0035  0.12 
GF  SM_d  0.99  0.0014  0.05 
GF  RGR  1.00  0.0042  0.14 
SM_g  SM_d  0.98  0.0019  0.07 
SM_g  FO  0.99  0.0016  0.05 
LNC  RGR  1.00  0.0024  0.08 
SLA  RGR  1.00  0.0024  0.08 
RGR  maxCH  1.00  0.0005  0.02 
Step 3: Test of SEM with modelled RGR values
In contrast to other relationships from the full data set vs. the subset, the relationships of the leaf traits vs. RGR were close to being significantly different for the two data sets. Also the standard error of the slopes was relatively large compared to the other traits. This probably means that the plot means of RGR deviated to some extent from the 'real' plot mean.
To test whether the structure and significance of the SEM were affected by the deviating estimates of RGR, we ran an additional SEM (tested on the extended model of Appendix F only) that included better estimates of the RGR plot means. We did not run a SEM for only those plots with sufficient trait information for RGR, as this would have left too few degrees of freedom. Instead, we fitted a multiple regression model in which RGR was predicted from growth form, LNC and SLA for the subset with known unbiased estimates of plot means for RGR (RGR available for at least 70% of the species cover). The parameter estimates of this regression were used to predict the RGR values for the remaining sites with insufficient trait information. To avoid overfitting, a random number was added to each predicted value, drawn from a normal distribution with a mean of zero and a standard deviation equal to the standard deviation of the residuals of the multiple regression. This ensured that the relations between RGR and growth form, LNC and SLA were not made stronger than in the default model. The predicted RGR values replaced the original RGR values and were used in the SEM (everything else kept equal; Fig. F1). Because the random numbers can lead to an over- or underestimation of the fit, the procedure was repeated multiple times. It showed that neither the validity of the full model (P values remained equal), nor its structure, nor the significance of any individual path differed from the original model. Additionally, for all traits, the dominant drivers and the dominant trait-trait constraints remained unchanged. Furthermore, the relative contribution of the traits and the environmental drivers remained equal.
There was only a slight increase in the role of the leaf traits in determining RGR and in the explained variance of RGR (from 0.12 to 0.14 and from 0.49 to 0.56, respectively; compare Table C3 below and Table F8 in the manuscript), and the explained variance of SLA and maxCH increased slightly. Therefore, the plot mean RGR values as calculated in the paper did not change the interpretation of the results or the conclusions about the contribution of the environmental drivers and the role of trait-trait constraints in trait assembly (see Table C3).
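The replacement of RGR plot means by modelled values (regression prediction plus random noise) can be sketched as follows. This is a minimal Python illustration, not the original code: the site values and coefficients used to simulate them are hypothetical, the function names are ours, and we assume the residual standard deviation is computed with a degrees-of-freedom correction.

```python
import random

def solve(a, b):
    """Solve the linear system a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    beta = [0.0] * n
    for i in reversed(range(n)):
        beta[i] = (m[i][n] - sum(m[i][j] * beta[j] for j in range(i + 1, n))) / m[i][i]
    return beta

def fit_ols(predictors, y):
    """Return (coefficients with the intercept first, residual standard deviation)."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    sd = (sum(e * e for e in resid) / (len(y) - k)) ** 0.5
    return beta, sd

def impute_with_noise(beta, sd, predictors, rng):
    """Predict values and add N(0, sd) noise so that relations are not artificially tightened."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], p)) + rng.gauss(0.0, sd)
            for p in predictors]

# Hypothetical sites described by (GF, LNC, SLA) plot means; not the study's data.
rng = random.Random(7)

def make_site():
    return (rng.random(), rng.uniform(15.0, 35.0), rng.uniform(10.0, 30.0))

known = [make_site() for _ in range(40)]  # sites with reliable RGR plot means
rgr_known = [0.10 - 0.10 * g + 0.002 * l + 0.003 * s + rng.gauss(0.0, 0.02)
             for g, l, s in known]
beta, sd = fit_ols(known, rgr_known)
unknown = [make_site() for _ in range(10)]  # sites with insufficient RGR data
rgr_imputed = impute_with_noise(beta, sd, unknown, rng)
print([round(v, 3) for v in rgr_imputed])
```

Because the added noise differs between runs, the imputation and the subsequent SEM fit would be repeated several times, as described above.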
TABLE C3. The effect of environmental constraints (cause; columns) on the selection of individual traits (effect; rows) relative to the effect of trait-trait constraints with the modeled RGR values. The rightmost column gives the explained variance of the SEM with the plot mean RGR values as used in the manuscript.
Cause: Environmental constraint (Nutrient availability, TSD, DE > IE); Trait-trait constraints (Leaf traits, Allometric traits, Seed traits, Relative growth rate)
Effect  Nutrient availability  TSD  DE > IE  Leaf traits  Allometric traits  Seed traits  Relative growth rate  Dominant driver  Dominant trait  R^{2}: final model RGR modeled*  R^{2}: final model*
LNC  1.00  0.00  yes  0.00  0.00  0.00  0.00  Nutrients  -  0.82  0.82
SLA  0.31  0.02  -  0.06  0.50  0.00  0.11  Nutrients  Allometric traits  0.74  0.71
LPC  0.67  0.13  yes  0.02  0.15  0.00  0.03  Nutrients  Allometric traits  0.97  0.97
RGR  0.12  0.23  -  0.14  0.44  0.00  0.07  Disturbance  Allometric traits  0.56  0.49
maxCH  0.06  0.51  yes  0.07  0.23  0.00  0.13  Disturbance  Allometric traits  0.73  0.69
GF  0.04  0.47  -  0.05  0.37  0.00  0.08  Disturbance  Allometric traits  0.94  0.94
SM_g  0.23  0.30  -  0.04  0.35  0.00  0.08  Disturbance  Allometric traits  0.63  0.63
SM_d  0.10  0.33  -  0.03  0.25  0.24  0.05  Disturbance  Seed traits  0.92  0.93
GO  0.08  0.35  -  0.04  0.45  0.00  0.07  Disturbance  Allometric traits  0.69  0.69
FO  0.16  0.10  yes  0.01  0.61  0.09  0.02  Nutrients  Allometric traits  0.57  0.56
Appendix F. Results of the PCA, RDA and RDA with latents with 10 traits, the extended SEM, path coefficients and justification.
Here, we determine whether the conclusion that disturbance and nutrient availability do not act on separate suites of traits still holds when the model is extended with three more traits (Table F1).
To account for the fact that an increase in canopy height often leads to shifts in other traits that are caused by a shift in growth form and not by height per se, we recorded whether a species had a woody stem (woody / nonwoody). This simple division mainly distinguishes investments in structural biomass. In this paper we refer to a shift towards woodiness as GF approaching one. In addition, we included seed mass of the dispergule (SM_d in mg; including the mass of the germinule) and a phenology trait, flowering onset (FO in months).
TABLE F1. Extra traits used for the analysis of the extended model, the number of species involved and their literature sources. In total 346 species were present in the plots.
Trait category  Trait (acronym)  Scale and units  No. species  Source
Allometric traits  Growth form (GF)  Ordinal (0: nonwoody, 1: woody)  346  1
Seed traits  Seed mass with dispergule included (SM_d)  Log 10 Continuous (mg)  276  2
Phenology traits  Flowering onset (FO)  Ordinal (Months, 1: Jan. to 12: Dec.)  342  1
Sources. 1. BioBase 2003, Centraal Bureau voor de Statistiek, Voorburg/Heerlen.
2. see Douma et al. (in press)
The covariance among trait averages of species assemblages was analyzed first without explicitly defining possible underlying causes of common axes of variability between plots, by submitting the matrix of 156 plots × 10 traits to a principal component analysis. The results are shown in Table F2. Subsequently, we explicitly constrained the multivariate structure in traits by environmental data, but still without imposing any causal hypotheses, using a redundancy analysis (RDA; ter Braak 1987) based on 3 environmental variables (Soil C/P ratio, Soil C/N ratio, 'time since disturbance'). The results are shown in Table F3.
TABLE F2. Results of Principal Component Analysis (PCA) of the site-trait matrix (156 sites × 10 traits). The explained variance is shown as well as the trait scores. Abbreviations of traits: Leaf nitrogen content (LNC), leaf phosphorus content (LPC), specific leaf area (SLA), seed mass of the germinule (SM_g), seed mass of the dispergule (SM_d), maximum canopy height (maxCH), Growth form (GF), relative growth rate (RGR), germination onset (GO), flowering onset (FO).
PCA 1  PCA 2
Cumulative explained variance  0.55  0.79
Scores
SLA  0.46  1.55 
LNC  0.86  1.39 
LPC  0.64  1.69 
RGR  0.74  1.25 
maxCH  1.20  0.60 
GF  1.20  0.73 
SM_g  1.17  0.14 
SM_d  1.27  0.23 
GO  1.11  0.45 
FO  0.97  0.18 
TABLE F3. Results of the RDA with the relevé-trait matrix (156 sites × 10 traits) constrained by 3 environmental factors (measured TSD, measured Soil C/P and measured Soil C/N). The cumulative explained variance of the constrained axes and the scores of the environmental constraints are shown.
RDA1  RDA2  RDA3
Cumulative explained variance  0.35  0.41  0.42
LogC/N  0.176  0.5874  0.78997 
LogC/P  0.358  0.9011  0.24455 
TSD  0.9564  0.2889  0.04322 
The model developed in this appendix differs from the model of Fig. 3 in the way the environmental drivers affect the traits. Exact measurements of disturbance frequency and nutrient availability at temporal and spatial scales relevant to plants are rarely directly available, i.e., they are "latent" variables, and one usually has only indirect and imperfect measurements. This is particularly true for nutrient availability, which is highly dynamic in space and time. Soil nutrient concentrations therefore do not necessarily reflect nutrient availability as experienced by plants in the long run (Ordoñez et al. 2010). SEM allows us to incorporate this uncertainty explicitly by introducing a latent variable (Shipley 2002), which is estimated as the common variance of the soil nutrient parameters and the traits associated with the latent variable. Based on two latent variables a new SEM was developed, shown in Fig. F1. Note that the latent variables specified in this model take the common variance of the leaf traits and the soil C/P ratio (a model including soil C/N did not fit) and of measured time since disturbance and maxCH, GF, SM_d and GO. It would be better to obtain multiple and independent estimates of the error associated with soil fertility and disturbance, but this is not yet possible. Alternatively, one could build more explicit measurement models of environmental drivers that include more or better indicators and test them with independent data. This is not possible with our data.
The model as presented in Fig. F1 fit the data (χ^{2} = 47.68, df = 35, P = 0.07). The covariance matrix and the modeled covariance matrix are given in Tables F4 and F5. Parameter estimates and standard errors of the model are given in Table F6. The fit increased significantly if a free covariance between LNC and SM_g was added (χ^{2} = 31.89, df = 34, P = 0.57). Because RGR is central to the model and we lowered the minimum number of species needed to calculate a plot mean, we ran a control analysis, which revealed that the structure of the SEM and the significance and sign of the paths remained unchanged for modeled values of RGR (see Appendix C for details).
FIG. F1. Standardized path coefficients, explained variance (in squares) and significance (between parentheses) of the final model of nutrient availability, disturbance and their related traits (χ^{2} = 47.68, df = 35, P = 0.07, only significant paths shown). Note that the relationship between nutrient availability and Soil C/P ratio is negative.
TABLE F4. Covariance matrix of the variables used in the model of Fig. F1.
10Log C/P  TSD  maxCH  LNC  SLA  SM_G  SM_D  LPC  RGR  GO  FO  GF  
10Log C/P  0.135  
TSD  0.855  630.625  
maxCH  0.024  7.365  0.14  
LNC  0.638  20.15  0.465  10.379  
SLA  0.451  20.216  0.096  6.814  12.042  
SM_G  0.042  6.069  0.12  0.854  0.413  0.191  
SM_D  0.048  11.959  0.209  1.063  0.567  0.249  0.413  
LPC  0.102  1.676  0.037  1.417  1.257  0.095  0.108  0.26  
RGR  0.001  0.489  0.007  0.006  0.018  0.006  0.012  0.003  0.001  
GO  0.005  1.152  0.021  0.086  0.032  0.018  0.033  0.007  0.001  0.005  
FO  0.036  4.624  0.078  0.593  0.349  0.104  0.158  0.087  0.005  0.014  0.163  
GF  0.012  4.992  0.087  0.249  0.081  0.072  0.135  0.015  0.005  0.014  0.056  0.059 
TABLE F5. Modeled covariance matrix of the model Fig. F1.
10Log C/P  TSD (measured)  maxCH  LNC  SLA  SM_G  SM_D  LPC  RGR  GO  FO  GF  Latent Nutrient Availability  Latent TSD
10Log C/P  0.135
TSD (measured)  1.627  630.625
maxCH  0.032  7.320  0.140
LNC  0.596  23.206  0.461  10.379
SLA  0.484  21.227  0.102  6.798  11.955
SM_G  0.051  6.415  0.120  0.742  0.439  0.191
SM_D  0.065  12.251  0.209  0.930  0.573  0.250  0.414
LPC  0.100  1.919  0.038  1.421  1.242  0.097  0.102  0.259
RGR  0.000  0.484  0.007  0.009  0.017  0.006  0.012  0.004  0.001
GO  0.006  1.172  0.021  0.080  0.034  0.019  0.033  0.008  0.001  0.005
FO  0.042  4.509  0.078  0.600  0.443  0.103  0.154  0.085  0.004  0.014  0.162
GF  0.016  4.976  0.086  0.238  0.087  0.072  0.135  0.015  0.005  0.014  0.056  0.058
Latent Nutrient Availability  0.204  7.960  0.155  2.915  2.367  0.252  0.316  0.488  0.002  0.027  0.205  0.080  1.000
Latent TSD  1.627  497.212  7.320  23.206  21.227  6.415  12.251  1.919  0.484  1.172  4.509  4.976  7.960  497.212
Mathematical specification of the model (Fig. F1):
10log Soil C/P = 0.204 × latent_NA† + 1.000 e_10log Soil C/P
TSD = 1.000 latent_TSD‡ + 1.000 e_TSD
maxCH = 3.735 × RGR + 0.018 × latent_TSD + 1.000 e_maxCH
LNC = 2.915 × latent_NA + 1.000 e_LNC
SLA = 10.585 × maxCH 2.778 × latent_NA + 0.154 × latent_TSD + 1.000 e_SLA
SM_g = 0.724 × maxCH  0.140 × latent_NA + 1.000 e_SM_g
SM_d = 0.801 × SM_g + 0.650 × GF + 0.008 × latent_TSD + 1.000 e_SM_d
LPC = 0.323 × maxCH  0.538 × latent_NA + 1.000 e_LPC
RGR = 0.002 × LNC + 0.003 × SLA  0.117 × GF + 1.000 e_RGR
GO = 0.059 × maxCH  0.138 × GF + 0.007 × latent_NA + 1.000 e_GO
FO = 1.159 × maxCH  0.260 × SM_g  2.809 × GF + 0.156 × latent_NA + 0.008 × latent_TSD + 1.000 e_FO
† latent_NA is latent nutrient availability; ‡ latent_TSD is latent time since disturbance.
Free covariances:
latent_NA  latent_TSD: 7.96
latent_NA  e_GF: 0.017
SM_g  RGR: 0.001
TABLE F6. Unstandardized path coefficients of full model (Fig. 3c main text). Standard error given in brackets, error variances with standard error in diagonal (calculated with robust estimates). Traits in rows are cause and traits in columns are effects. Correlational relationships in italics. Abbreviations of variables: Leaf nitrogen content (LNC), leaf phosphorus content (LPC), specific leaf area (SLA), seed mass of the germinule (SM_g), seed mass of the dispergule (SM_d), maximum canopy height (maxCH), Growth form (GF), relative growth rate (RGR), germination onset (GO), flowering onset (FO). Time since disturbance latent (TSD_l), Soil CP ratio (Soil CP), Time since disturbance measured (TSD_m).
Nutrient availability  TSD_l  SoilCP  TSD_m  LNC  SLA  LPC  RGR  maxCH  GF  SM_g  SM_d  GO  FO
Nutrient availability  1  7.960 (2.021)  0.204 (0.027)  2.915 (0.202)  2.778 (0.272)  0.538 (0.034)  0.017 (0.005)  0.140 (0.025)  0.007 (0.004)  0.156 (0.030)
TSD_l  0.026  497.212 (72.414)  1  0.154 (0.028)  0.018 (0.002)  0.003 (0.001)  0.008 (0.003)  0.008 (0.004)  
SoilCP  0.094 (0.011)  
TSD_m  133.413 (23.910)  
LNC  1.880 (0.286)  0.002 (0.001)  
SLA  3.466 (0.601)  0.003 (0.001)  
LPC  0.009 (0.006)  
RGR  0.001 (0.000)  3.735 (1.480)  0.001 (0.001)  
maxCH  10.581 (1.665)  0.323 (0.062)  0.043 (0.014)  0.462 (0.048)  0.724 (0.066)  0.059 (0.030)  1.159 (0.223)  
GF  0.117 (0.010)  0.003 (0.001)  0.649 (0.237)  0.138 (0.044)  2.809 (0.472)  
SM_g  0.072 (0.008)  0.801 (0.047)  0.260 (0.082)  
SM_d  0.031 (0.004)  
GO  0.001 (0.000)  
FO  0.071 (0.009) 
An alternative SEM that was consistent with the data (χ^{2} = 49.93, df = 36, P = 0.06, CFI = 0.99) is nested in the model of Fig. F1 and differed in a few aspects: the causal direction between maxCH and GF was reversed, the path from nutrient availability to GF was removed, a free covariance between nutrient availability and maxCH was added (a path from nutrient availability to maxCH is also possible), and the path from TSD to maxCH was removed. The path from SLA to RGR was no longer significant. Reversing the path from maxCH to SLA led to a model that was not consistent with the data (P = 0.0003). Although the alternative model is consistent with the data, it is ecologically less likely, as maxCH is not driven at all by 'time since disturbance'. Additionally, RGR is not significantly affected by SLA. Therefore, based on ecological considerations, we prefer the model presented in Fig. F1.
Replacing the measured estimates of nutrient availability and disturbance by the latent estimates (Fig. F1) in the RDA increased the variance explained by the environmental drivers to 70%, i.e., 89% of the maximally explained variation (Table F7, Fig. F2).
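The 89% can be reproduced from the tables under the assumption (not stated explicitly in the text) that the maximally explained variation is the 0.79 captured by the first two PCA axes in Table F2:

```latex
\frac{\text{cumulative variance explained by the two constrained RDA axes (Table F7)}}
     {\text{cumulative variance explained by the first two PCA axes (Table F2)}}
= \frac{0.70}{0.79} \approx 0.89
```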
TABLE F7. Results of the RDA with the relevé-trait matrix (156 sites × 10 traits) constrained by 2 environmental factors (latent TSD and latent nutrient availability (Nu.availability)). The cumulative explained variance of the constrained axes and the scores of the environmental constraints are shown.
RDA1  RDA2
Cumulative explained variance  0.50  0.70
Nutrient availability  0.72  0.70
TSD  0.90  0.44
FIG. F2. RDA with the relevé-traits (156 sites × 10 traits) constrained by 2 environmental factors (latent TSD and latent nutrient availability (abbreviated as Nu.avail)). Abbreviations of traits and environmental drivers: leaf nitrogen content (LNC), leaf phosphorus content (LPC), specific leaf area (SLA), seed mass of the germinule (SM_g), seed mass of the dispergule (SM_d), maximum canopy height (maxCH), growth form (GF), relative growth rate (RGR), germination onset (GO), flowering onset (FO).
The relative effects of environmental drivers on traits were calculated in the same way as for the model of Fig. 3 (of the manuscript). These calculations showed that nutrient availability predominantly constrained leaf traits (SLA, LNC and LPC), and that ‘time since disturbance’ predominantly affected allometric traits, seed traits and relative growth rate, constraining maxCH, GF, RGR, SM_g, SM_d and GO (Table F8). However, the effect of each driver was not restricted to one suite of traits; both drivers affected both suites of traits simultaneously. For example, SM_g and FO were almost equally affected by nutrient availability and ‘time since disturbance’. The constraining effects of environmental drivers were stronger than trait-trait constraints in only 5 out of 10 traits. Allometric traits predominantly constrained other traits, in particular seed and phenology traits.
TABLE F8. The effect of environmental constraints (cause; columns) on the selection of individual traits (effect; rows) relative to the effect of trait-trait constraints. The total effects of the two environmental drivers and the trait-trait constraints add to one. The effect of the environmental drivers on traits is decomposed into direct effects (DE) and indirect effects (IE; effects transmitted via other traits; Fig. F1). Trait-trait constraints were grouped into four categories: leaf traits (LNC, LPC and SLA), allometric traits (maxCH and GF), seed traits (SM_g and SM_d) and relative growth rate (RGR). Additionally, the dominant environmental driver, the dominant trait-trait constraint and the explained variance of the traits are shown.
Cause   Environmental constraint                                Trait-trait constraint                                             Dominant driver  Dominant trait     R^{2} submodel 2: disturbance*
Effect  Nutrient availability  Time since disturbance  DE > IE  Leaf traits  Allometric traits  Seed traits  Relative growth rate
LNC     1.00                   0.00                    yes      0.00         0.00               0.00         0.00                  Nutrients
SLA     0.30                   0.01                             0.06         0.49               0.00         0.13                  Nutrients        Allometric traits
LPC     0.66                   0.13                    yes      0.02         0.15               0.00         0.04                  Nutrients        Allometric traits
RGR     0.10                   0.25                             0.12         0.45               0.00         0.08                  Disturbance      Allometric traits
maxCH   0.05                   0.48                    yes      0.07         0.26               0.00         0.15                  Disturbance      Allometric traits  0.83
GF      0.04                   0.46                             0.04         0.36               0.00         0.10                  Disturbance      Allometric traits  0.96
SM_g    0.22                   0.30                             0.04         0.35               0.00         0.09                  Disturbance      Allometric traits  0.53
SM_d    0.10                   0.34                             0.03         0.23               0.24         0.06                  Disturbance      Seed traits        0.93
GO      0.08                   0.34                             0.04         0.45               0.00         0.09                  Disturbance      Allometric traits  0.69
FO      0.16                   0.10                    yes      0.01         0.60               0.09         0.03                  Nutrients        Allometric traits  0.47
* see Fig. F1
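The summary columns of Table F8 can be re-derived from its rounded shares. The sketch below is not the original calculation: the names `TABLE_F8` and `summarize` are hypothetical, and the values are transcribed from the table in hundredths so that comparisons are exact. It recomputes the dominant driver, the dominant trait-trait category, and whether the combined environmental share exceeds the combined trait-trait share.

```python
# Shares from Table F8 in hundredths. Column order: nutrient availability,
# time since disturbance, then the four trait-trait categories.
TABLE_F8 = {
    "LNC":   (100, 0, 0, 0, 0, 0),
    "SLA":   (30, 1, 6, 49, 0, 13),
    "LPC":   (66, 13, 2, 15, 0, 4),
    "RGR":   (10, 25, 12, 45, 0, 8),
    "maxCH": (5, 48, 7, 26, 0, 15),
    "GF":    (4, 46, 4, 36, 0, 10),
    "SM_g":  (22, 30, 4, 35, 0, 9),
    "SM_d":  (10, 34, 3, 23, 24, 6),
    "GO":    (8, 34, 4, 45, 0, 9),
    "FO":    (16, 10, 1, 60, 9, 3),
}
DRIVERS = ("Nutrients", "Disturbance")
CATEGORIES = ("Leaf traits", "Allometric traits", "Seed traits",
              "Relative growth rate")


def summarize(shares):
    """Return (environment total, trait-trait total, dominant driver,
    dominant trait-trait category or None if all categories are zero)."""
    env, trait = shares[:2], shares[2:]
    driver = DRIVERS[env.index(max(env))]
    category = CATEGORIES[trait.index(max(trait))] if max(trait) > 0 else None
    return sum(env), sum(trait), driver, category


summary = {name: summarize(row) for name, row in TABLE_F8.items()}

# Traits where the environmental share is strictly larger, or exactly tied.
env_stronger = [t for t, (e, c, _, _) in summary.items() if e > c]
env_tied = [t for t, (e, c, _, _) in summary.items() if e == c]

for name, (e, c, driver, category) in summary.items():
    print(f"{name:6s} env={e / 100:.2f} trait={c / 100:.2f} {driver} {category}")
print("environment > trait-trait:", env_stronger, "tied:", env_tied)
```

With the rounded values, each row sums to one within ±0.01, and the recomputed dominant driver and dominant category agree with the table. The environmental share is strictly larger for four traits and tied at 0.50 for GF, which is compatible with the count of 5 out of 10 given in the text if that count was based on unrounded values.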
LITERATURE CITED
Douma, J. C., R. Aerts, J. P. M. Witte, R. M. Bekker, D. Kunzmann, K. Metselaar, and P. M. van Bodegom. 2011. A combination of functionally different plant traits provides a means to quantitatively predict a broad range of species assemblages in NW Europe. Ecography. DOI: 10.1111/j.1600-0587.2011.07068.x (in press).
Ordoñez, J. C., P. M. van Bodegom, J. P. M. Witte, R. P. Bartholomeus, J. R. van Hal, and R. Aerts. 2010. Plant strategies in relation to resource supply in mesic to wet environments: does theory mirror nature? American Naturalist 175:225–239.
Shipley, B. 2002. Cause and correlation in biology: a user's guide to path analysis, structural equations and causal inference. Cambridge, Cambridge University Press.
ter Braak, C. J. F. 1987. Ordination in R. H. G. Jongman, and O. F. R. van Tongeren, eds. Data analysis in community and landscape ecology. Wageningen, Pudoc.