Ecological Archives M074-011-A1

Mark W. Denny, Brian Helmuth, George H. Leonard, Christopher D. G. Harley, Luke J. H. Hunt, and Elizabeth K. Nelson. 2004. Quantifying scale in ecology: lessons from a wave-swept shore. Ecological Monographs 74:513–532.

Appendix A. A brief primer on spectral analysis.

Consider the hypothetical variable y(x) depicted in Fig. 1B. The location, x, along a particular spatial path is shown on the abscissa and the magnitude of the variable is shown on the ordinate. For example, if we moved a thermometer along a horizontal path through the intertidal zone and continuously measured both x, the total distance traveled, and y, the temperature of the rock surface, a graph similar to Fig. 1B would likely result.

The fluctuation in y is traditionally measured by its variance. In practice, y is sampled at a discrete set of locations, and the sample variance is calculated as

(A.1)

Here, y(xj) is the magnitude of the function at the jth location, n is the total number of samples, and my is the mean of the sampled values. 

The variance describes the overall amount of fluctuation in a variable, but it does not describe how this fluctuation is divided among measurement scales. It is evident from Fig. 1B that in this particular case y tends to vary periodically along its path of measurement. This periodicity is not perfect (e.g., sometimes there is a valley in y where a peak would be expected), but a dominant spatial pattern can be discerned. What is the scale of this pattern, and how can we calculate it?

A classic theorem (due to J. B. J. Fourier, 1768–1830) tells us that the spatial variation in y as a function of x can be quantified through the following procedure. The total length of our measured path, xmax, defines the largest scale about which we have any data (the extent of the data). Note that if all n samples are spaced Δx apart, the total length over which we have measured y is xmax:

$$x_{\max} = (n-1)\,\Delta x \tag{A.2}$$

Traditionally this extent is expressed not as xmax itself, but rather as the fundamental spatial frequency, ff:

$$f_f = \frac{1}{x_{\max}} \tag{A.3}$$

A low spatial frequency corresponds to a large spatial extent; a high spatial frequency corresponds to a small spatial extent. Spatial frequency has units of m⁻¹.

Given the fundamental spatial frequency (which is set by the total path length), we describe the variation in y at a series of additional frequencies, fi, each a harmonic of the fundamental. In other words, we examine spatial frequencies

$$f_i = i\,f_f \tag{A.4}$$

where i is an integer greater than 0. 
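As a quick numerical illustration of Eqs. A.2–A.4, the sketch below computes the extent, the fundamental frequency, and the first few harmonics for a hypothetical transect. The helper name and the sample values (101 samples taken every 0.5 m) are assumptions chosen for illustration, not values from this study.

```python
def harmonic_frequencies(n, dx, count):
    """Extent, fundamental frequency, and first `count` harmonics
    for n samples spaced dx apart (hypothetical helper)."""
    x_max = (n - 1) * dx                                 # extent of the record
    f_f = 1.0 / x_max                                    # fundamental frequency
    harmonics = [i * f_f for i in range(1, count + 1)]   # f_i = i * f_f
    return x_max, f_f, harmonics


# Hypothetical transect: 101 samples taken every 0.5 m.
x_max, f_f, harmonics = harmonic_frequencies(n=101, dx=0.5, count=3)
print(x_max, f_f, harmonics)
```

With these values the extent is 50 m, so the fundamental frequency is one cycle per 50 m and each harmonic is an integer multiple of it.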

At any given point x, the magnitude of y is:

(A.5)

This is one form of the classic Fourier series (Bendat and Piersol 1986). In essence, we sum a series of harmonic sinusoidal waves (each with its own particular amplitude, bi, and phase, φi) to approximate the overall pattern of variation in y. This process is shown in Fig. 1C–L. The ten waves shown there (each a harmonic of the fundamental) sum to the overall waveform plotted in Fig. 1B. Note that each harmonic has an integral number of wavelengths in the interval xmax. For simplicity in this example we set xmax to 1. Note that the third, seventh, and tenth harmonics of the fundamental have the largest amplitudes.
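The summation of harmonics can be sketched in a few lines. This is a minimal illustration, assuming a sine form with an additive phase and an optional mean term (the exact form of Eq. A.5 may differ); the amplitudes and phases below are hypothetical and are not the values used to draw Fig. 1.

```python
import math


def synthesize(x, amplitudes, phases, f_f=1.0, mean=0.0):
    """Sum harmonic sine waves at position x.
    amplitudes[i-1] and phases[i-1] belong to harmonic i of f_f."""
    total = mean
    for i, (b, phi) in enumerate(zip(amplitudes, phases), start=1):
        total += b * math.sin(2.0 * math.pi * i * f_f * x + phi)
    return total


# Ten hypothetical harmonics; the 3rd, 7th, and 10th are the largest.
amplitudes = [0.1, 0.1, 1.0, 0.1, 0.1, 0.1, 0.8, 0.1, 0.1, 0.6]
phases = [0.0] * 10

# Evaluate the summed waveform at 201 points along a path with x_max = 1.
y = [synthesize(j / 200.0, amplitudes, phases) for j in range(201)]
```

Because every harmonic completes a whole number of cycles in xmax, the summed waveform returns to its starting value at the end of the path.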

Having expressed the function y(x) as the sum of a series of sine waves, we can now use the amplitude of each of these waves to calculate what fraction of the overall variance in y is associated with each of the spatial scales described by the harmonics of the fundamental spatial frequency. The details of the calculation need not concern us here (see Bendat and Piersol 1986); it is the end result that matters. The scale-specific (= frequency-specific) contribution to the overall variance in y is quantified by the autospectral density function, S(f). If a large fraction of the overall variance in y occurs with a periodicity that falls in a particular small range in spatial frequencies, the S for that frequency range is large. Conversely, if little of the overall variation in y corresponds to a particular spatial frequency, the S for that frequency is small. The pattern of variation among frequencies is the autospectrum shown in Fig. 2A, in this case for the process shown in Fig. 1B. Note that there are peaks in this autospectrum at the third, seventh, and tenth harmonics of the fundamental frequency, corresponding to those harmonics that have large amplitudes (Fig. 1C–L).

It is important to note several properties of the autospectrum. First, there is an upper limit, i = g, to the harmonic of ff at which we can discern any variation in the process y, and this limit is set by the spacing, Δx, between our samples. If y varies at a scale smaller than 2Δx, this variation cannot be reliably measured directly by our technique. The spatial frequency corresponding to a scale of 2Δx (known as the Nyquist frequency, fg) is equal to 1/(2Δx). Setting g ff = fg, with ff = 1/[(n-1)Δx], we find that g = (n-1)/2. The Nyquist frequency is equivalent to the lower limit of the spatial detail we can discern (the grain of our measurements). Note that variation at frequencies above the Nyquist frequency is not excluded from our measurements. Instead, through a process known as aliasing, high-frequency fluctuations appear in the form of spurious, "scrambled" variation at frequencies below the Nyquist frequency (see, for example, Press et al. 1992), and can thereby potentially affect our measurements.
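The relationship between sample spacing, the Nyquist frequency, and the highest resolvable harmonic can be checked numerically. The sketch below is illustrative only; the function name and the sample values are assumptions.

```python
def nyquist(n, dx):
    """Return (g, f_g): the highest resolvable harmonic number and
    the Nyquist frequency for n samples spaced dx apart."""
    f_g = 1.0 / (2.0 * dx)     # one cycle per two sample spacings
    g = (n - 1) / 2.0          # f_g / f_f, since f_f = 1 / ((n - 1) * dx)
    return g, f_g


# Hypothetical transect: 101 samples taken every 0.5 m.
g, f_g = nyquist(n=101, dx=0.5)
print(g, f_g)
```

For this hypothetical transect the grain is 1 m (two sample spacings), and the 50th harmonic of the fundamental coincides with the Nyquist frequency.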

Second, the integral of S(f) across a particular range of frequencies is equal to the variance associated with those frequencies:

(A.6)

This is the basis for calculating the variance scale (Eqs. 7, B.10) and for the calculations shown in Fig. 9. By extension, the integral of S(f) between the fundamental and Nyquist frequencies is equal to the overall measurable variance in y:

$$\int_{f_f}^{f_g} S(f)\,df = \sigma^2 \tag{A.7}$$

In other words, the total area under the autospectrum is equal to the overall variance.
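The equality between the area under the autospectrum and the variance can be verified with a direct discrete Fourier transform. The following is a minimal sketch, not the procedure used in this study: it assumes the usual DFT convention (frequency spacing 1/(nΔx), which differs slightly from 1/[(n-1)Δx]), a variance normalized by n, and a hypothetical test series built from two harmonics.

```python
import cmath
import math


def autospectrum(y, dx):
    """One-sided periodogram estimate of S(f) via a direct DFT.
    Returns (freqs, S) for harmonics 1 .. N//2."""
    n = len(y)
    mean = sum(y) / n
    df = 1.0 / (n * dx)                      # frequency spacing (DFT convention)
    freqs, S = [], []
    for k in range(1, n // 2 + 1):
        Y = sum((y[j] - mean) * cmath.exp(-2j * math.pi * j * k / n)
                for j in range(n))
        # Fold negative frequencies onto positive ones, except at Nyquist.
        weight = 1.0 if (n % 2 == 0 and k == n // 2) else 2.0
        variance_k = weight * abs(Y) ** 2 / n ** 2
        freqs.append(k * df)
        S.append(variance_k / df)            # variance per unit frequency
    return freqs, S


# Hypothetical record: harmonics 3 and 7 of a 64-point series, dx = 1 m.
n, dx = 64, 1.0
y = [2.0 * math.sin(2 * math.pi * 3 * j / n)
     + 1.0 * math.sin(2 * math.pi * 7 * j / n + 0.5) for j in range(n)]
freqs, S = autospectrum(y, dx)
df = freqs[0]
area = sum(s * df for s in S)                # area under the spectrum
mean = sum(y) / n
variance = sum((v - mean) ** 2 for v in y) / n
print(area, variance)
```

The two printed numbers agree to within rounding error, and the largest spectral estimate falls at the third harmonic, the one given the largest amplitude.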

Third, the units of S(f) take into account the units of both y and x. For example, in the case described here y is measured in °C and x is measured in meters (so that spatial frequency has units of m⁻¹). The variance in temperature, σ² (the area under the curve), has units °C². Because this area is the product of S and a range of spatial frequency (m⁻¹), S must have the units °C² m.

Because the spectrum is calculated only for harmonics of the fundamental frequency (which are therefore orthogonal [see Priestley 1981]), the individual spectral estimates are independent. In other words, the spectrum calculated as described above provides the minimum number of sinusoidal waves required to exactly reproduce the sample data.

It is common that the magnitude of individual points in the autospectrum varies across a large range. When this is true, the fine structure of the spectrum may be dwarfed by the major peaks, and can thereby escape notice. As a remedy, spectra are often plotted on log-log axes, and this convention is used here in Appendix E. Note that areas under spectra can be visually distorted by the log transformation.

We have seen here that if a variable is characterized by periodic fluctuation at one particular spatial scale, this tendency will be exposed by the existence of a peak in the autospectrum. This does not imply, however, that all variables are periodic, nor that all periodic variables are characterized by single spectral peaks. A pertinent example is shown in Fig. A1. Here the data are clearly periodic (Fig. A1A), but not sinusoidal: in this case, all negative values of a sine wave have been set to zero, while the positive values are unchanged. The hypothetical variable shown here is very similar to measurements of environmental light intensity in which it is uniformly dark at night, but solar irradiance varies approximately sinusoidally throughout the day (Fig. E1). The spectrum of this process (Figs. A1B and C) is characterized by a series of peaks. The peak with the lowest spatial frequency corresponds to the overall periodicity of the variable. The peaks at higher spatial frequencies are an indication of the additional waveforms required to give the data its nonsinusoidal shape. In this case, only the lowest-frequency peak should be used to characterize the peak scale.

This rule applies to ecological data as well. For example, consider the predator density along a hypothetical intertidal transect. The maximum density is constant for 1 m, then abruptly shifts to a higher density for 1 m, then back to the original density for 1 m. This "square wave" pattern is repeated along the shore. Thus, this hypothetical shoreline has a predation scale of 1 m. The spatial autospectrum of this square wave would look qualitatively similar to that of Fig. A1B, C, with its primary peak at 0.5 m⁻¹ (although the density is constant for 1 m, the pattern of density repeats itself every 2 m).
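The square-wave example can be reproduced numerically. The sketch below is illustrative only: the densities (5 and 15 predators per square meter), the 0.125-m sample spacing, and the 8-m transect are hypothetical, and the spectrum is a simple direct DFT rather than the segment-averaged estimate used in this study.

```python
import cmath
import math


def power_by_harmonic(y):
    """Variance contributed by each harmonic 1 .. N//2 (direct DFT)."""
    n = len(y)
    mean = sum(y) / n
    power = []
    for k in range(1, n // 2 + 1):
        Y = sum((y[j] - mean) * cmath.exp(-2j * math.pi * j * k / n)
                for j in range(n))
        weight = 1.0 if (n % 2 == 0 and k == n // 2) else 2.0
        power.append(weight * abs(Y) ** 2 / n ** 2)
    return power


dx = 0.125                                   # hypothetical sample spacing, m
n = 64                                       # 8 m of shore: four 2-m repeats
# Density alternates between 5 and 15 predators per m^2 every 1 m.
density = [5.0 if int(j * dx) % 2 == 0 else 15.0 for j in range(n)]
power = power_by_harmonic(density)
k_peak = power.index(max(power)) + 1         # harmonic number of the largest peak
f_peak = k_peak / (n * dx)                   # its spatial frequency, per m
print(f_peak)
```

The largest peak falls at 0.5 m⁻¹, the frequency of the 2-m repeat. Smaller peaks appear at odd multiples of that frequency and reflect the nonsinusoidal shape of the square wave.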

We now return to the problem of aliasing. Consider the situation shown in Fig. A2A. Two sine waves of different frequencies are sampled with a fixed spacing, Δx. In every case, the sampled points fall on both sine waves. In other words, given this sampling regime, we have no way of discerning whether our measured variation is associated with wave #1 (at a frequency below the Nyquist frequency) or wave #2 (at a frequency above the Nyquist frequency). Indeed, an infinite number of sine waves at still higher frequencies could yield the same measurements given the spacing Δx. Only if we were to sample at a smaller interval would we be able to discern exactly what frequency (or frequencies) contribute to the measured variation. This observation is the essence of the fact noted earlier: we cannot accurately discern the frequency of variation for fluctuations above the Nyquist frequency. Fig. A2 has further implications, however. If, in reality, it is wave #2 that is present in our data (rather than wave #1), the variation due to this wave is still recorded in our measurements even though its frequency lies above the Nyquist frequency. In other words, just because fluctuations occur at a frequency above that which we can accurately discern, the variation associated with these frequencies still appears in our data. The variance associated with frequencies above the Nyquist frequency is aliased to frequencies below the Nyquist frequency.

The mechanism by which aliasing occurs is shown in Fig. A2B. A fluctuation at a frequency above fg is sampled at a spacing Δx, where Δx is not a precise multiple of the wavelength of the fluctuation. As a result, if the first sample falls at a peak of the fluctuation, the second sample will be taken slightly off peak, the third sample farther off peak, and so forth. The sampled points trace out a sinusoidal fluctuation at a low frequency, a frequency well below the Nyquist frequency. This is the aliased signal. It can be shown (e.g., Bendat and Piersol 1986) that a frequency f below the Nyquist frequency can potentially be "contaminated" by fluctuations at 2fg ± f, 4fg ± f, 6fg ± f, etc. Thus, if there is variance above the Nyquist frequency, the measured spectrum may deviate substantially from reality.
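Aliasing is easy to demonstrate numerically. In the sketch below (illustrative only; the frequencies and the 1-m spacing are hypothetical, and zero-phase cosine waves are assumed for simplicity), a wave below the Nyquist frequency and two waves above it yield the same sampled values.

```python
import math

dx = 1.0                          # hypothetical sample spacing, m
f_g = 1.0 / (2.0 * dx)            # Nyquist frequency, 0.5 per m
f_low = 0.1                       # wave 1: below the Nyquist frequency
f_high = 2.0 * f_g - f_low        # wave 2: 0.9 per m, above it
f_higher = 2.0 * f_g + f_low      # a third wave: 1.1 per m

positions = [j * dx for j in range(20)]
samples_low = [math.cos(2 * math.pi * f_low * x) for x in positions]
samples_high = [math.cos(2 * math.pi * f_high * x) for x in positions]
samples_higher = [math.cos(2 * math.pi * f_higher * x) for x in positions]

# Largest difference between the sampled values of the waves.
gap_high = max(abs(a - b) for a, b in zip(samples_low, samples_high))
gap_higher = max(abs(a - b) for a, b in zip(samples_low, samples_higher))
print(gap_high, gap_higher)
```

Both differences are zero to within rounding error, so the three waves cannot be told apart at this sample spacing.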

In practice, the problem of aliasing is controlled by filtering the measured data prior to analysis. For example, our measurements of species diversity, mussel density, mussel disturbance, and predation intensity are made using a quadrat 0.21 m × 0.30 m. In essence, the values obtained from these measurements are averages over this area, and variation at smaller scales is thereby filtered out. Similarly, in the wave-force measurements made in this study, the size of the apparatus is such that measured forces are averaged over a scale xmin of approximately 20 cm. Forces with a smaller spatial scale cannot affect the apparatus, and variation associated with them therefore cannot be aliased. Measurements of larval recruitment are made using settlement plates of finite dimension, xmin. Each measurement is thus an average value tied to this minimum scale, and higher frequency (= smaller scale) variations are filtered from the data. In all real-world measurements, the properties of the measuring apparatus place an upper limit, fmax = 1/xmin, on the frequency that can be measured, and it is thus only those frequencies between fg and fmax that can be aliased. Although we cannot definitively rule out aliasing as a factor in our analyses, the shape of all of our spectra (in which the variance decreases rapidly with increased frequency within the range we have sampled) and the relatively narrow band of frequencies between fg and fmax suggest that aliasing effects are negligible.
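The filtering effect of a finite measuring apparatus can be illustrated with a one-dimensional idealization. The sketch below assumes that a measurement is a simple running mean over a width w; for a sinusoid of frequency f, such a mean reduces the amplitude by the factor |sin(πfw)/(πfw)|. Treating a quadrat as a one-dimensional running mean is an assumption made only for illustration.

```python
import math


def boxcar_gain(f, width):
    """Amplitude gain of a running mean of the given width applied to a
    sinusoid of frequency f: |sin(pi f w) / (pi f w)|."""
    arg = math.pi * f * width
    return 1.0 if arg == 0.0 else abs(math.sin(arg) / arg)


width = 0.30                      # long side of the quadrat, m
for f in (0.1, 1.0, 1.0 / width):
    print(f, boxcar_gain(f, width))
```

Under this idealization, variation at scales much larger than the quadrat passes nearly unchanged, whereas variation at a frequency of 1/w is removed almost entirely.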

The brief description here of spectral analysis has been couched in terms of spatial variation. The technique can be equally well applied to variation through time.  In this case, the abscissa of Fig. 1 would measure time, t, and the scale of variation would be characterized by harmonics of the fundamental temporal frequency, ff:

$$f_f = \frac{1}{t_{\max}} \tag{A.8}$$

The process is sampled at equally spaced intervals of time, Δt, and the total length of the time series is (n-1)Δt = tmax. In a temporal spectrum, a peak at a high frequency corresponds to a variable that fluctuates with only a short interval between occurrences, whereas a peak at low frequencies corresponds to a variable that fluctuates with a long interval between occurrences.

Nuts and Bolts

Each of our data series was divided into four segments (each with an even number of data points) with a 50% overlap among segments. A Hanning window was applied to each segment to avoid spectral artifacts due to any abrupt deviation of the data from the mean at the ends of the segments, and the spectral estimates were calculated for each segment and adjusted for the decrease in variance due to the Hanning window. See Bendat and Piersol (1986) or Priestley (1981) for a discussion of spectral windows. The overall spectral estimate at each harmonic is the average of the estimates for the four segments. 
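The segment-averaging procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this study: it uses a periodic Hann window, removes the mean of each segment, corrects for the window by dividing by the mean squared window value (a common choice), and computes the transform with a direct DFT. The function name and the test series are hypothetical.

```python
import cmath
import math


def segment_averaged_spectrum(y, dx, d=4):
    """Average Hann-windowed periodograms over d half-overlapping segments.
    Returns (freqs, S) for harmonics 1 .. seg_len//2 of one segment."""
    n = len(y)
    seg_len = 2 * n // (d + 1)               # d segments with 50% overlap
    seg_len -= seg_len % 2                   # force an even segment length
    hop = seg_len // 2
    window = [0.5 * (1.0 - math.cos(2.0 * math.pi * j / seg_len))
              for j in range(seg_len)]
    scale = sum(w * w for w in window) / seg_len   # variance lost to the window
    df = 1.0 / (seg_len * dx)
    S = [0.0] * (seg_len // 2)
    for s in range(d):
        seg = y[s * hop: s * hop + seg_len]
        mean = sum(seg) / seg_len
        tapered = [(v - mean) * w for v, w in zip(seg, window)]
        for k in range(1, seg_len // 2 + 1):
            Y = sum(tapered[j] * cmath.exp(-2j * math.pi * j * k / seg_len)
                    for j in range(seg_len))
            weight = 1.0 if k == seg_len // 2 else 2.0
            # Accumulate the window-corrected density, averaged over d segments.
            S[k - 1] += weight * abs(Y) ** 2 / (seg_len ** 2 * scale * df * d)
    freqs = [k * df for k in range(1, seg_len // 2 + 1)]
    return freqs, S


# Hypothetical record: 80 samples, 1 m apart, one sinusoid at 0.125 per m.
y = [math.sin(2.0 * math.pi * 0.125 * j) for j in range(80)]
freqs, S = segment_averaged_spectrum(y, dx=1.0)
area = sum(S) * freqs[0]                     # area under the averaged spectrum
print(freqs[S.index(max(S))], area)
```

For this test series the peak falls at 0.125 m⁻¹, and the area under the averaged spectrum recovers the variance of a unit-amplitude sinusoid (0.5). The window spreads some of that variance into the two neighboring harmonics.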

The choice of four segments was determined by the following factors. We desired to examine our data at as broad a range of scales as possible, and the maximal scale (the minimum spatial or temporal frequency) is set by the length of the data series. The fewer the number of segments we used, the longer each segment could be, and the larger the scale we could examine. However, there is a practical lower limit to the number of segments. The statistical confidence in each spectral estimate decreases as the number of segments decreases (Bendat and Piersol 1986, Priestley 1981). We found that the use of four segments yielded results that were consistent with those using higher numbers of segments, whereas the use of three or fewer segments yielded results that were unacceptably noisy. Thus, for our data the use of four segments represents the optimal trade-off between record length and statistical confidence. Common alternative methods for increasing statistical confidence (band averaging or running averaging the spectrum from the full time series) result in the loss of high-frequency spectral estimates and (for the same statistical confidence) do not retain any additional information about low frequencies.

Note that the use of multiple segments affects the number of data points necessary to examine a process. If Δx is the time or distance between measurements, xg (= 2Δx) is the smallest time or distance at which frequency-specific information is available (the Nyquist scale, the grain), and xmax is the largest scale for which information is desired (the extent), then the number of samples in a single segment encompassing all the data is (2xmax/xg) + 1. If, however, d segments are needed for the analysis (d > 1), and an overlap of 50% between segments is assumed (as used here), the total number of samples required is:

(A.9)

Thus, when d = 4 (as we have used), approximately 2.5 times as many samples are required as one might naively assume to examine variation at a given maximum scale.
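The factor of approximately 2.5 can be checked with a simple count. The sketch below is one plausible way of counting, and it is an assumption rather than a transcription of Eq. A.9: each segment spans xmax, successive segments are offset by half a segment, and one sample is added to close the final interval. The function name and the values of xmax and xg are hypothetical.

```python
def samples_required(x_max, x_g, d):
    """Samples needed for d half-overlapping segments, each spanning x_max,
    when the Nyquist scale is x_g (= 2 * sample spacing)."""
    intervals_per_segment = 2.0 * x_max / x_g
    total_intervals = intervals_per_segment * (d + 1) / 2.0
    return total_intervals + 1.0


single = samples_required(x_max=100.0, x_g=2.0, d=1)   # one segment
four = samples_required(x_max=100.0, x_g=2.0, d=4)     # four segments
print(single, four, four / single)
```

With d = 1 the count reduces to (2xmax/xg) + 1, and with d = 4 the ratio of the two counts is close to 2.5.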

The confidence limits on each spectral estimate are determined by the number of degrees of freedom associated with that estimate. When estimates are based on the average of n segments (as they are here), there are 2n degrees of freedom (Bendat and Piersol 1986). The 95% confidence limits are (Bendat and Piersol 1986, p. 286):

(A.10)
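As a rough illustration of the width of such limits, the sketch below applies the standard chi-square interval for a spectral estimate with 8 degrees of freedom (four segments). The interval form is the usual textbook one and is assumed here rather than copied from Eq. A.10; the chi-square percentage points are taken from standard tables, and the function name is hypothetical.

```python
# Chi-square percentage points for 8 degrees of freedom (standard tables).
CHI2_8_UPPER_2_5 = 17.535     # value exceeded with probability 0.025
CHI2_8_UPPER_97_5 = 2.180     # value exceeded with probability 0.975


def confidence_limits_8dof(S_hat):
    """95% confidence limits on the true spectral density, given an
    estimate S_hat with 8 degrees of freedom."""
    dof = 8
    lower = dof * S_hat / CHI2_8_UPPER_2_5
    upper = dof * S_hat / CHI2_8_UPPER_97_5
    return lower, upper


lower, upper = confidence_limits_8dof(1.0)
print(lower, upper)
```

Under these assumptions the limits span roughly 0.46 to 3.7 times the estimate, which is why spectra based on few segments look noisy.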

Lastly, we note that spectral analysis provides the same information as the analysis of autocovariance, a form of analysis with which ecologists may be more familiar (see Appendix C). The Wiener-Khinchin relationships show that the autocovariance of a process is the inverse Fourier transform of the autospectrum (Bendat and Piersol 1986). Thus, for example, the measurements of spatial autocorrelation in intertidal snails made by Underwood and Chapman (1996) provide the same type of spatial information as the spectral measurements made in this study.
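The equivalence between autocovariance and the autospectrum can be checked on a short series. The sketch below is illustrative only: it assumes a circular (periodic) autocovariance, for which the relationship is exact in a finite sample, and the data values are hypothetical.

```python
import cmath
import math


def circular_autocovariance(y):
    """Autocovariance at lags 0 .. N-1, treating the series as periodic."""
    n = len(y)
    m = sum(y) / n
    return [sum((y[j] - m) * (y[(j + lag) % n] - m) for j in range(n)) / n
            for lag in range(n)]


def autocovariance_from_spectrum(y):
    """Same quantity, obtained as the inverse DFT of the two-sided spectrum."""
    n = len(y)
    m = sum(y) / n
    power = []
    for k in range(n):
        Y = sum((y[j] - m) * cmath.exp(-2j * math.pi * j * k / n)
                for j in range(n))
        power.append(abs(Y) ** 2 / n ** 2)   # variance per harmonic
    return [sum(power[k] * cmath.exp(2j * math.pi * k * lag / n)
                for k in range(n)).real for lag in range(n)]


y = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]     # hypothetical data
direct = circular_autocovariance(y)
via_spectrum = autocovariance_from_spectrum(y)
largest_gap = max(abs(a - b) for a, b in zip(direct, via_spectrum))
print(largest_gap)
```

The two routes agree to within rounding error, and the value at zero lag is the variance of the series.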

 

LITERATURE CITED

Bendat, J. S., and A. G. Piersol. 1986. Random data: analysis and measurement procedures. Second edition. John Wiley and Sons, New York, New York, USA.

Press, W. H., S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. 1992. Numerical recipes in Fortran. Second edition. Cambridge University Press, Cambridge, UK.

Priestley, M. B. 1981. Spectral analysis and time series. Academic Press, New York, New York, USA.

Underwood, A. J., and M. G. Chapman. 1996. Scales of spatial patterns of distribution of intertidal snails. Oecologia 107:212–224.

 

 
   FIG. A1. (A) A periodic signal (a truncated sine wave) similar to the signal of solar irradiance. (B) The spectrum of the signal shown in panel A. Note the existence of a dominant peak. The secondary peak at the first harmonic of the primary peak is a result of the nonsinusoidal shape of the signal. (C) The spectrum of panel B plotted on log-log axes.

 

 
   FIG. A2. (A) The measurements taken at the dots could be due either to the low-frequency wave 1 or to the high-frequency wave 2. (B) Sampling a high-frequency wave at too low a frequency leads to an aliased signal. The variation due to the high-frequency wave (the solid line) appears to occur at a much lower frequency (the dashed line).


