Commentary
A new picture of life's history on Earth
Mark Newman*
Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501
When most people think of paleontology, they picture the field
paleontologist digging up fossils with a toothbrush and publishing
descriptions of the anatomies of uncovered specimens in the trade
literature. New discoveries of dinosaurs even make the national headlines.
For many decades the discovery and description of a previously unknown
fossil species was the calling card that gained would-be paleontologists
professional acceptance. However, in the last quarter of a century or so,
many of the most intriguing new results in paleontology have come not from
field studies, but from the compilation and analysis of large-scale
databases of fossil species. These databases have provided us with
quantitative pictures of the pattern and size of mass extinction events,
the rate at which new species have appeared, and, crucially, the number of
species on the planet through time, the so-called standing diversity. In an
article appearing in this issue of PNAS (1), John Alroy, Charles Marshall,
and a large group of distinguished collaborators report on the creation of
a new database that catalogs fossils at the level of individual
collections. Preliminary analysis of this database reveals interesting
results, calling into question some fundamental ideas about the history of
life on Earth.
There are three principal features worthy of note in the paper by Alroy et
al. First, the paper announces the creation of the new database. Second,
the authors describe new methods of data analysis made possible by the
database that help to eliminate biases inherent in previous studies as a
result of variations in patterns of fossil preservation and collection.
Third, these new methods raise doubts about the long-held belief that
biodiversity has increased dramatically in the last 250 million years; it
may in fact be that diversity has been roughly constant, although no firm
verdict has been reached yet on this point.
Statistical analyses of species turnover and diversity in the fossil record
have been dominated in the past by the work of one man, Jack Sepkoski, who
from the early eighties until his death in May 1999 worked single-handedly
on the compilation of an encyclopedic database of occurrences of marine
invertebrates in the fossil record, using journal publications as his
primary source (2, 3). Other compilations also have been published (4, 5),
but Sepkoski's database has received more attention by far than any other.
Sepkoski's database was simple in structure: it recorded the first and last
known occurrences in the fossil record of more than 30,000 marine
invertebrate genera in about 4,000 families. Marine invertebrates have been
the focus of most statistical studies, because preservation is much more
reliable in marine environments and invertebrates are much more numerous
than vertebrates.
Time was measured in stratigraphic stages, uneven intervals defined by
using a variety of geological and paleontological markers.
Many features of the fossil record have been deduced from Sepkoski's data.
One of the most famous is shown in Fig. 1, which is a plot of the total
number of genera in the database as a function of time during the
Phanerozoic (approximately the last 540 million years), from the so-called
"Cambrian explosion" of metazoan diversity until the present day. The shape
of this curve mirrors the accepted view of life's history on the planet: a
burst of diversification in the Cambrian and Ordovician, followed by a
rough plateau in diversity for about 200 million years in the latter half
of the Paleozoic, until the dip in the center of the figure, which
represents the massive late-Permian extinction event. Following this
extinction, it appears that diversity first recovered and then increased
substantially during the Mesozoic and Cenozoic, rising to a present-day
level two or more times higher than any seen during the Paleozoic.
Sepkoski's database, although extensive and thorough, has a number of
shortcomings. In particular, it records only first and last occurrences of
taxa anywhere in the world, and no other data, such as how commonly taxa
occur or where. Thus very widely occurring taxa are accorded exactly the
same status as ones that are found rarely.
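As a rough illustration of how a curve like that of Fig. 1 can be built from first and last occurrences alone, the sketch below counts the genera whose known range spans a given time. The genus names and ages are invented for illustration, and the use of absolute ages in millions of years (Ma) is a simplifying assumption; the database described here measured time in stratigraphic stages.

```python
from typing import Dict, List, Tuple

# Hypothetical toy records: genus -> (first occurrence, last occurrence) in Ma.
# These names and ages are made up and come from no real compilation.
RANGES: Dict[str, Tuple[float, float]] = {
    "genus_a": (520.0, 300.0),
    "genus_b": (450.0, 252.0),
    "genus_c": (400.0, 0.0),
    "genus_d": (200.0, 0.0),
    "genus_e": (60.0, 0.0),
}


def standing_diversity(ranges: Dict[str, Tuple[float, float]], time_ma: float) -> int:
    """Count genera whose first-to-last occurrence range includes time_ma."""
    return sum(1 for first, last in ranges.values() if first >= time_ma >= last)


def diversity_curve(
    ranges: Dict[str, Tuple[float, float]], times_ma: List[float]
) -> List[Tuple[float, int]]:
    """Evaluate the count at each requested time, oldest to youngest."""
    return [(t, standing_diversity(ranges, t)) for t in times_ma]


curve = diversity_curve(RANGES, [500.0, 350.0, 100.0, 0.0])
for time_ma, n_genera in curve:
    print(f"{time_ma:6.1f} Ma: {n_genera} genera")
```

In a count of this kind each genus contributes exactly one unit wherever its range extends, however often or rarely it is actually found.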
Also, by the very fact that the database is as exhaustive as possible,
substantial biases are introduced. For example, it is quite possible that
the increase in diversity toward recent times seen in Fig. 1 is a result
primarily of the greater volume of rock available from recent times, and
the greater amount of effort that has been put into studying these rocks. A
number of studies over the years have presented evidence showing that
apparent diversity is closely correlated with the intensity with which
different periods of geologic time have been sampled.
The new database compiled by Alroy and coworkers (one of whom is the same
Jack Sepkoski mentioned above) attempts to correct some of these problems
by including more comprehensive data about fossil taxa, in particular
dividing data into collections (groups of fossils recovered from specific
locales by specific workers or teams), with repeated occurrences of taxa at
different times and places explicitly noted. Like the database of Sepkoski,
the new database focuses on marine invertebrates, and it is at present
incomplete; work is still continuing on the compilation. Currently it
covers two time periods of about 150 million years each, one in the middle
part of the Paleozoic, during the plateau seen in Fig. 1, and one from the
mid-Mesozoic to the mid-Cenozoic, the central portion of the diversity
increase in the right-hand part of the figure.
Because of the division of the database
into collections, Alroy and coworkers have been able to compensate for
biases in the intensity of sampling of different time intervals, and to
some extent for varying quality of fossil preservation in their data, and
so make more accurate estimates of diversity (although, as they are the
first to emphasize, biases are still present). Their technique of analysis
involves breaking the data down in two ways. First, they divide the data
into roughly equal time intervals, more uniform in length than the
intervals used by Sepkoski. Second, within each interval they attempt to
choose a constant number of actual fossil specimens, as if the intensity of
sampling across different times and places had been uniform, rather than
widely varying as it in fact is. Unfortunately, only the number of taxa is
recorded for many of their collections and not the number of specimens, so
it is not possible to fix specimen number directly. Instead, they have used
a variety of different proxy techniques to simulate uniform sampling. The
simplest such technique is to take a fixed number of collections (or
"lists" as the authors call them), being careful that the ones chosen come
from geographically distributed localities. This method works well if the
sampling intensity is roughly the same from one collection to another.
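The simplest technique just described, drawing a fixed number of lists per interval, might be sketched as follows. This is a minimal illustration and not the authors' procedure: the function name, the toy lists, and the quota of two lists are all assumptions, each list is treated simply as a set of taxon names, and the requirement that chosen lists be geographically spread out is ignored.

```python
import random
from typing import List, Set


def subsampled_count(
    lists: List[Set[str]], quota: int, trials: int = 200, seed: int = 0
) -> float:
    """Mean number of distinct taxa seen when `quota` lists are drawn at random."""
    if quota > len(lists):
        raise ValueError("interval has fewer lists than the quota")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    total = 0
    for _ in range(trials):
        drawn = rng.sample(lists, quota)  # draw lists without replacement
        total += len(set().union(*drawn))  # distinct taxa across the drawn lists
    return total / trials


# Invented toy data: one interval with six lists, one with only two.
heavily_sampled = [
    {"a", "b"}, {"b", "c"}, {"c", "d"}, {"a", "d"}, {"e", "f"}, {"a", "f"},
]
lightly_sampled = [{"a", "b"}, {"b", "c"}]

# Raw counts use every available list, so they reflect sampling effort.
raw_heavy = len(set().union(*heavily_sampled))
raw_light = len(set().union(*lightly_sampled))

# Standardized counts use the same number of lists in both intervals.
std_heavy = subsampled_count(heavily_sampled, quota=2)
std_light = subsampled_count(lightly_sampled, quota=2)

print("raw:", raw_heavy, raw_light)
print("standardized:", std_heavy, std_light)
```

In this toy case the raw counts differ by a factor of two, while the counts based on two lists per interval lie much closer together, which is the kind of correction the fixed-quota approach is meant to provide.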
Sampling intensity may in fact differ between collections, however, so
they also use several other techniques that weight lists according to their
length, and they report separate results for each of the different methods
used. Clearly, in the absence of more detailed information about which
weighting is correct, only results that are robust across different methods
should be considered to have strong support.
The diversity counts given by Alroy et al. are the total numbers of taxa
seen across all collections sampled, taken variously either during the time
intervals of study, or at the boundaries of those intervals. It is
important to notice that these counts are not expected to be directly
proportional to actual diversity (which is, in any case, not well defined).
However, the counts should increase monotonically with increasing real
diversity, and two intervals that have the same total count can be expected
to have approximately equal real diversities. That is, the results are
comparable between different geologic times.
To some extent, the principal new contributions of the present study are
the database itself and the sampling-standardized methods for measuring
diversity.
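One way to see why sampling-standardized counts can be comparable between intervals without being proportional to real diversity is a deliberately idealized model. The model is an assumption of this sketch and not the authors' calculation: if a fixed number of occurrences is drawn at random, with replacement, from a pool of equally common taxa, the expected number of distinct taxa seen has a simple closed form.

```python
def expected_count(true_richness: int, draws: int) -> float:
    """Expected number of distinct taxa seen in `draws` random draws, with
    replacement, from `true_richness` equally common taxa (idealized model)."""
    if true_richness < 1:
        raise ValueError("true_richness must be at least 1")
    # Each taxon is missed by all draws with probability (1 - 1/S)**draws.
    return true_richness * (1.0 - (1.0 - 1.0 / true_richness) ** draws)


# Same sampling effort (50 draws) applied to pools of increasing real richness.
richness_values = (50, 100, 200, 400)
counts = [expected_count(s, draws=50) for s in richness_values]

for s, c in zip(richness_values, counts):
    print(f"real richness {s:4d} -> expected count {c:5.1f}")
```

Under these assumptions the expected count rises with real richness but far more slowly than in proportion to it, so equal counts point to roughly equal richness while the counts themselves should not be read as absolute diversities.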
The preliminary results, however, also offer some interesting suggestions
of what is to come in this field. The authors make a host of different
observations about the results of their calculations, but perhaps the most
interesting is that most of their measures of biodiversity are found to
give approximately equal figures for diversity in the two time periods
studied. Recall that in the curve of Fig. 1, derived from the earlier work
of Sepkoski, the two periods showed very different behavior, the first
having a rough plateau in diversity, the second showing a marked diversity
increase. This increase is not clearly visible in the new results,
suggesting that the supposed post-Paleozoic diversification of marine fauna
may be merely an artifact of biases in the Sepkoski database. It should be
emphasized, however, that these results are by no means final, and it is
too early to draw any firm conclusions from the data.
The creation of this new database of the fossil record may well have
far-reaching effects. The mere fact that most of the previous work in this
area has made use of just a single source of data, the Sepkoski
compilation, makes the creation of an independent database an important and
worthwhile enterprise. However, the inclusion in this new database of far
more detailed information on frequency of occurrence of taxa opens the way
for statistically superior analyses of fossil biodiversity and other
quantities, which have not been possible before. The paper appearing in
this issue represents only the first effort in this direction, and we can
hope to see many new and interesting results emerging as the database and
the analytical methods applied to it mature.