Diversity
Diversity statistics
Typical application | Assumptions | Data needed |
Quantifying alpha diversity in samples | Representative samples | One or more columns, each containing counts of individuals of different taxa down the rows |
These statistics apply to association data, where numbers of individuals are tabulated in rows (taxa) and possibly several columns (associations). The available statistics are as follows, for each association:
Many of these indices are explained in Harper (1999).
Approximate confidence intervals for all these indices can be computed with a bootstrap procedure. 1000 random samples are produced (200 prior to version 0.87b), each with the same total number of individuals as in each original sample. The random samples are taken from the total, pooled data set (all columns). For each individual in the random sample, the taxon is chosen with probabilities according to the original, pooled abundances. A 95 percent confidence interval is then calculated. Note that the diversity in the replicates will often be less than, and never larger than, the pooled diversity in the total data set.
Since these confidence intervals are all computed with respect to the pooled data set, they do not represent confidence intervals for the individual samples. They are mainly useful for identifying samples where the given diversity index falls outside the confidence interval. Bootstrapped comparison of diversity indices in two samples is provided in the Compare diversities module.
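The pooled bootstrap described above can be sketched as follows. This is a minimal illustration and not the program's own code: the function names, the use of the Shannon index as the example statistic, and the percentile rule used to read off the 95 percent interval are all assumptions.

```python
import math
import random


def shannon(counts):
    """Shannon entropy H = -sum(p * ln p) over taxa with non-zero counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def pooled_bootstrap_ci(columns, index=shannon, replicates=1000, seed=0):
    """Return one (lower, upper) interval per sample column.

    columns: list of samples, each a list of counts per taxon (same taxon order).
    Replicates are drawn from the pooled abundances of all columns, each with
    the same number of individuals as the corresponding original sample.
    """
    rng = random.Random(seed)
    n_taxa = len(columns[0])
    pooled = [sum(col[i] for col in columns) for i in range(n_taxa)]
    intervals = []
    for col in columns:
        n = sum(col)
        values = []
        for _ in range(replicates):
            rep = [0] * n_taxa
            for t in rng.choices(range(n_taxa), weights=pooled, k=n):
                rep[t] += 1
            values.append(index(rep))
        values.sort()
        # Percentile interval (an assumed convention, see the lead-in).
        lower = values[int(0.025 * (replicates - 1))]
        upper = values[int(0.975 * (replicates - 1))]
        intervals.append((lower, upper))
    return intervals
```

A sample whose observed index falls outside its interval is one that differs from what random draws from the pooled data would typically give.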
Quadrat richness
Typical application | Assumptions | Data needed |
Estimating species richness from several quadrat samples | Representative, random quadrats of equal size | Two or more columns, each containing presence/absence (1/0) of different taxa down the rows |
Four non-parametric species richness estimators are included in PAST: Chao 2, first- and second-order jackknife, and bootstrap. All of these require presence-absence data in two or more sampled quadrats of equal size. Colwell & Coddington (1994) reviewed these estimators, and found that Chao 2 and the second-order jackknife performed best.
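As an illustration, the four estimators can be computed from a presence-absence matrix using the formulas commonly given in the literature (e.g. Colwell & Coddington 1994). This is a sketch under assumptions: the function name is hypothetical, the fallback used for Chao 2 when no taxon occurs in exactly two quadrats is one common convention, and the program may use different (for example bias-corrected) variants.

```python
def richness_estimators(matrix):
    """matrix: rows = taxa, columns = quadrats; any value > 0 counts as presence.

    Requires at least two quadrats.
    """
    m = len(matrix[0])                                  # number of quadrats
    occ = [sum(1 for x in row if x > 0) for row in matrix]
    occ = [k for k in occ if k > 0]                     # drop taxa never observed
    s_obs = len(occ)                                    # observed richness
    q1 = sum(1 for k in occ if k == 1)                  # taxa found in exactly one quadrat
    q2 = sum(1 for k in occ if k == 2)                  # taxa found in exactly two quadrats

    if q2 > 0:
        chao2 = s_obs + q1 * q1 / (2 * q2)
    else:
        chao2 = s_obs + q1 * (q1 - 1) / 2               # assumed fallback when q2 == 0
    jack1 = s_obs + q1 * (m - 1) / m
    jack2 = s_obs + q1 * (2 * m - 3) / m - q2 * (m - 2) ** 2 / (m * (m - 1))
    boot = s_obs + sum((1 - k / m) ** m for k in occ)
    return {"observed": s_obs, "chao2": chao2,
            "jackknife1": jack1, "jackknife2": jack2, "bootstrap": boot}
```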
Beta diversity
Typical application | Assumptions | Data needed |
Quantifying overall beta diversity in a set of samples | Representative samples | Two or more rows (samples) of presence-absence (0/1) data, with taxa in columns |
The beta diversity module in PAST can be used for any number of samples, not only two. The eight measures available are described in Koleff et al. (2003), and the table below refers to their notation:
PAST | Koleff et al. |
Whittaker | β_{w} |
Harrison | β_{-1} |
Cody | β_{c} |
Routledge | β_{I} |
Wilson-Shmida | β_{t} |
Mourelle | β_{me} |
Harrison 2 | β_{-2} |
Williams | β_{-3} |
Taxonomic distinctness
Typical application | Assumptions | Data needed |
Quantifying taxonomic distinctness in samples | Representative samples | One or more columns, each containing counts of individuals of different taxa down the rows. In addition, the leftmost column(s) must contain names of genera/families etc. (see below). |
Taxonomic diversity and taxonomic distinctness as defined by Clarke & Warwick (1998), including confidence intervals computed from 200 random replicates taken from the pooled data set (all columns). Note that the "global list" of Clarke & Warwick is not entered directly, but is calculated internally by pooling (summing) the given samples.
These indices also depend on taxonomic information above the species level, which must be entered for each species as follows. Species names go in the name column (leftmost, fixed column), genus names in column 1, family names in column 2, etc. Species counts follow in the columns thereafter. The program will ask for the number of columns containing taxonomic information above the species level.
For presence-absence data, taxonomic diversity and distinctness will be valid but equal to each other.
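The following sketch shows one way to compute the two indices and illustrates why they coincide for presence-absence data. It assumes the usual Clarke & Warwick definitions and a simple weighting scheme (1 for species in the same genus, 2 for the same family but different genera, and so on); the function name and the unscaled weights are assumptions, not taken from the program.

```python
def taxonomic_indices(counts, lineages):
    """counts: abundance per species.
    lineages: per species, a tuple of higher taxa (genus, family, ...).
    Returns (taxonomic diversity, taxonomic distinctness).
    """
    def weight(i, j):
        # Path weight = lowest rank at which the two species share a taxon.
        for level, (x, y) in enumerate(zip(lineages[i], lineages[j]), start=1):
            if x == y:
                return level
        return len(lineages[i]) + 1        # nothing shared at any entered rank

    n = sum(counts)
    weighted = 0.0                         # sum of w_ij * x_i * x_j over pairs i < j
    cross = 0.0                            # sum of x_i * x_j over pairs i < j
    for j in range(len(counts)):
        for i in range(j):
            product = counts[i] * counts[j]
            weighted += weight(i, j) * product
            cross += product
    if cross == 0:
        raise ValueError("need at least two species with non-zero counts")
    diversity = weighted / (n * (n - 1) / 2)
    distinctness = weighted / cross
    return diversity, distinctness
```

With all counts equal to 1, n(n-1)/2 equals the number of species pairs, so both denominators are the same and the two indices are equal.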
Individual rarefaction
Typical application | Assumptions | Data needed |
Comparing taxonomical diversity in samples of different sizes | When comparing samples: Samples should be taxonomically similar, obtained using standardised sampling and taken from similar 'habitat'. | One or more columns of counts of individuals of different taxa (each column must have the same number of values) |
Given one or more columns of abundance data for a number of taxa, this module estimates how many taxa you would expect to find in a sample with a smaller total number of individuals. With this method, you can compare the number of taxa in samples of different size. Using rarefaction analysis on your largest sample, you can read out the number of expected taxa for any smaller sample size (including that of the smallest sample). The algorithm is from Krebs (1989). An example application in paleontology can be found in Adrain et al. (2000).
Let N be the total number of individuals in the sample, s the total number of species, and N_{i} the number of individuals of species number i. The expected number of species E(S_{n}) in a sample of size n and the variance V(S_{n}) are then given by
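In the standard hypergeometric formulation of rarefaction, which is assumed here to be the one intended (it should be checked against Krebs 1989), the two quantities can be written with the symbols defined above as:

```latex
E(S_n) = \sum_{i=1}^{s} \left[ 1 - \frac{\binom{N - N_i}{n}}{\binom{N}{n}} \right]

V(S_n) = \sum_{i=1}^{s} \frac{\binom{N - N_i}{n}}{\binom{N}{n}}
         \left[ 1 - \frac{\binom{N - N_i}{n}}{\binom{N}{n}} \right]
       + 2 \sum_{j=2}^{s} \sum_{i=1}^{j-1}
         \left[ \frac{\binom{N - N_i - N_j}{n}}{\binom{N}{n}}
              - \frac{\binom{N - N_i}{n} \binom{N - N_j}{n}}{\binom{N}{n}^{2}} \right]
```

The ratio of binomial coefficients in the first formula is the probability that species i is absent from a random subsample of n individuals; the double sum in the variance collects the covariances between the absences of pairs of species.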
Standard errors (square roots of variances) are given by the program. In the graphical plot, these standard errors are converted to 95 percent confidence intervals.
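A small sketch of the computation, assuming the standard hypergeometric expressions for the expected number of species and its variance (the function name is hypothetical and the program's own implementation may differ):

```python
from math import comb, sqrt


def rarefy(counts, n):
    """Expected number of species and its standard error in a subsample of n individuals.

    counts: number of individuals per species in the full sample.
    """
    counts = [c for c in counts if c > 0]
    total_n = sum(counts)
    if not 0 <= n <= total_n:
        raise ValueError("n must lie between 0 and the sample size")
    denom = comb(total_n, n)
    # absent[i] = probability that species i is missing from the subsample
    absent = [comb(total_n - c, n) / denom for c in counts]
    expected = sum(1 - q for q in absent)
    variance = sum(q * (1 - q) for q in absent)
    for j in range(len(counts)):
        for i in range(j):
            both = comb(total_n - counts[i] - counts[j], n) / denom
            variance += 2 * (both - absent[i] * absent[j])
    return expected, sqrt(max(variance, 0.0))   # clamp tiny negative rounding errors
```

Calling `rarefy(counts, n)` for n = 1 up to the sample size traces out the rarefaction curve. A normal-approximation interval would be the expected value plus or minus 1.96 standard errors; that the plot uses exactly this conversion is an assumption.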
Sample rarefaction (Mao tau)
Typical application | Assumptions | Data needed |
Computing species accumulation curves as a function of number of samples | Similar to individual-based rarefaction | A matrix of presence-absence data (abundances treated as presences), with taxa in rows and samples in columns. |
Sample-based rarefaction (also known as the species accumulation curve) is applicable when a number of samples are available, from which species richness is to be estimated as a function of number of samples. PAST implements the analytical solution known as "Mao tau", with standard deviation. In the graphical plot, the standard errors are converted to 95 percent confidence intervals.
See Colwell et al. (2004) for details.
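The expected accumulation curve can be sketched as below. The code follows the combinatorial form of the Mao tau expectation as it is usually stated (each taxon contributes the probability of being found in at least one of h randomly chosen samples); the function name is hypothetical and the standard deviation is not computed here.

```python
from math import comb


def mao_tau(matrix):
    """matrix: rows = taxa, columns = samples; any value > 0 counts as presence.

    Returns a list whose entry h-1 is the expected richness in h pooled samples.
    """
    n_samples = len(matrix[0])
    occ = [sum(1 for x in row if x > 0) for row in matrix]
    occ = [k for k in occ if k > 0]           # taxa present in at least one sample
    curve = []
    for h in range(1, n_samples + 1):
        denom = comb(n_samples, h)
        # comb(n_samples - k, h) / denom = probability that a taxon occurring
        # in k samples is missed by all h chosen samples.
        curve.append(sum(1 - comb(n_samples - k, h) / denom for k in occ))
    return curve
```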
Diversity curves
Typical application | Assumptions | Data needed |
Plotting diversity curves from occurrence data | None | Abundance or presence/absence matrix with samples in rows (lowest sample at bottom) and taxa in columns |
Found in the 'Strat' menu, this simple tool allows plotting of diversity curves from occurrence data in a stratigraphical column. Note that samples should be in stratigraphical order, with the uppermost (youngest) sample in the uppermost row. Data are subjected to the range-through assumption (absences between first and last appearance are treated as presences). Originations and extinctions are in absolute numbers, not percentages.
The 'Endpoint correction' option counts a taxon as 0.5 instead of 1 in the sample containing its FAD or its LAD. A taxon with both its FAD and its LAD in the same sample counts as 0.33 in that sample.
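The range-through counting and the endpoint correction can be illustrated with the following sketch. It assumes that the correction weights apply to the diversity count only and that originations and extinctions are simply the numbers of FADs and LADs per sample; the function name and these interpretations are assumptions.

```python
def diversity_curve(matrix, endpoint_correction=False):
    """matrix: rows = samples (youngest in the first row), columns = taxa.

    Returns (diversity, originations, extinctions), one value per sample.
    """
    n_samples, n_taxa = len(matrix), len(matrix[0])
    diversity = [0.0] * n_samples
    originations = [0] * n_samples
    extinctions = [0] * n_samples
    for t in range(n_taxa):
        rows = [r for r in range(n_samples) if matrix[r][t] > 0]
        if not rows:
            continue
        lad, fad = min(rows), max(rows)       # top row is youngest, so LAD has the smaller index
        originations[fad] += 1
        extinctions[lad] += 1
        for r in range(lad, fad + 1):         # range-through: fill gaps between FAD and LAD
            weight = 1.0
            if endpoint_correction:
                if fad == lad:
                    weight = 0.33             # FAD and LAD in the same sample
                elif r in (fad, lad):
                    weight = 0.5
            diversity[r] += weight
    return diversity, originations, extinctions
```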
Compare diversities
Typical application | Assumptions | Data needed |
Comparing diversities in two samples of abundance data | Equal sampling conditions | Two columns of abundance data with taxa down the rows |
This module computes a number of diversity indices for two samples, and then compares the diversities using two different randomisation procedures as follows.
Bootstrapping
The two samples A and B are pooled. 1000 random pairs of samples (A_{i},B_{i}) are then taken from this pool, with the same numbers of individuals as in the original two samples. For each replicate pair, the diversity indices div(A_{i}) and div(B_{i}) are computed. The proportion of replicates in which |div(A_{i})-div(B_{i})| equals or exceeds |div(A)-div(B)| estimates the probability that the observed difference could have occurred by random sampling from one parent population, as estimated by the pooled sample.
A small probability value p(same) then indicates a significant difference in diversity index between the two samples.
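The bootstrap comparison can be sketched as follows, using the Shannon index as an example statistic. The function names are hypothetical, and drawing individuals with replacement according to the pooled abundances is an assumption about how replicates are taken from the pool.

```python
import math
import random


def shannon(counts):
    """Shannon entropy H = -sum(p * ln p) over taxa with non-zero counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def bootstrap_compare(a, b, index=shannon, replicates=1000, seed=0):
    """Return p(same) for the difference in a diversity index between samples a and b."""
    rng = random.Random(seed)
    pooled = [x + y for x, y in zip(a, b)]
    taxa = range(len(pooled))
    observed = abs(index(a) - index(b))

    def draw(n):
        # One replicate sample of n individuals drawn from the pooled abundances.
        rep = [0] * len(pooled)
        for t in rng.choices(taxa, weights=pooled, k=n):
            rep[t] += 1
        return rep

    hits = 0
    for _ in range(replicates):
        if abs(index(draw(sum(a))) - index(draw(sum(b)))) >= observed:
            hits += 1
    return hits / replicates
```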
Permutation
1000 random matrices with two columns (samples) are generated, each with the same row and column totals as in the original data matrix. The p value is computed as for the bootstrap test.
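One simple way to generate such matrices is to shuffle the individuals between the two samples while keeping each individual's taxon, which preserves both the per-taxon totals (rows) and the sample sizes (columns). The sketch below takes this approach; whether the program generates its matrices in the same way is an assumption, and the function names are hypothetical.

```python
import math
import random


def shannon(counts):
    """Shannon entropy H = -sum(p * ln p) over taxa with non-zero counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def permutation_compare(a, b, index=shannon, replicates=1000, seed=0):
    """Return p(same) from random reassignment of individuals to the two samples."""
    rng = random.Random(seed)
    # One taxon label per individual in the pooled data.
    labels = []
    for t, (x, y) in enumerate(zip(a, b)):
        labels.extend([t] * (x + y))
    n_a = sum(a)
    observed = abs(index(a) - index(b))
    hits = 0
    for _ in range(replicates):
        rng.shuffle(labels)
        rep_a = [0] * len(a)
        rep_b = [0] * len(a)
        for t in labels[:n_a]:                # first n_a individuals go to sample A
            rep_a[t] += 1
        for t in labels[n_a:]:                # the rest go to sample B
            rep_b[t] += 1
        if abs(index(rep_a) - index(rep_b)) >= observed:
            hits += 1
    return hits / replicates
```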
Diversity t test
Typical application | Assumptions | Data needed |
Comparing Shannon diversities in two samples of abundance data | Equal sampling conditions | Two columns of abundance data with taxa down the rows |
Comparison of the Shannon diversities (entropies) in two samples, using a t test described by Poole (1974). This is an alternative to the randomization test available in the Compare diversities module.
Note that the Shannon indices here include a bias correction term (Poole 1974), and may diverge slightly from the uncorrected estimates calculated elsewhere in PAST, at least for small samples.
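For illustration, a t test of this kind can be sketched with the variance approximation and the degrees-of-freedom formula commonly attributed to Hutcheson. That these match the procedure in Poole (1974) is an assumption, the function name is hypothetical, and the bias correction term is deliberately left out because its exact form is not given here.

```python
import math


def shannon_t_test(a, b):
    """Return (t, degrees of freedom) for the difference in Shannon index.

    a, b: abundance counts per taxon for the two samples.
    The Shannon indices used here are the uncorrected estimates.
    """
    def stats(counts):
        counts = [c for c in counts if c > 0]
        n = sum(counts)
        s = len(counts)
        p = [c / n for c in counts]
        plogp = sum(pi * math.log(pi) for pi in p)
        h = -plogp
        var = (sum(pi * math.log(pi) ** 2 for pi in p) - plogp ** 2) / n \
            + (s - 1) / (2 * n * n)
        return h, var, n

    h1, v1, n1 = stats(a)
    h2, v2, n2 = stats(b)
    if v1 + v2 == 0:
        raise ValueError("variance is zero (both samples contain a single taxon)")
    t = (h1 - h2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / n1 + v2 ** 2 / n2)
    return t, df
```

The resulting t value would then be compared with a t distribution with the computed (generally non-integer) degrees of freedom.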
Diversity profiles
Typical application | Assumptions | Data needed |
Comparing diversities in two samples of abundance data | Equal sampling conditions | Two columns of abundance data with taxa down the rows |
The validity of comparing diversities in two samples can be criticized because the choice of diversity index is arbitrary. One sample may for example contain a larger number of taxa, while the other has a larger Shannon index. It may therefore be a good idea to try a number of diversity indices in order to make sure that the diversity ordering is robust. A formal way of doing this is to define a family of diversity indices, dependent upon a single continuous parameter (Tothmeresz 1995).
PAST uses the exponential of the so-called Renyi index, which depends upon a parameter alpha. For alpha=0, this function gives the total species number. In the limit alpha=1 it gives the exponential of the Shannon index, while alpha=2 gives the reciprocal of the sum of squared proportions, which orders samples in the same way as the Simpson index.
The program plots two such diversity profiles together. If the profiles cross, the diversities are non-comparable.
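A diversity profile of this kind can be sketched as below, taking the Renyi index in its usual form ln(sum of p^alpha)/(1-alpha), so that its exponential is (sum of p^alpha)^(1/(1-alpha)). The function names and the simple sign-change check for crossing profiles are assumptions made for illustration.

```python
import math


def renyi_exp(counts, alpha):
    """Exponential of the Renyi index for the given alpha (alpha >= 0)."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    p = [c / n for c in counts]
    if abs(alpha - 1.0) < 1e-9:
        # Limit alpha -> 1: exponential of the Shannon index.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** alpha for pi in p) ** (1.0 / (1.0 - alpha))


def profiles_cross(a, b, alphas):
    """True if the difference between the two profiles changes sign over the given alphas."""
    diffs = [renyi_exp(a, x) - renyi_exp(b, x) for x in alphas]
    return any(d > 0 for d in diffs) and any(d < 0 for d in diffs)
```

For example, a sample with three fairly even taxa and a sample with four taxa dominated by one of them will give crossing profiles: the second is more diverse at alpha=0 (more taxa) but less diverse at alpha=2 (higher dominance).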