Www.WorldHistory.Biz

 

13-08-2015, 10:18

The Sampling Paradox and the Archaeological Research Paradox

The goal of sampling is to obtain data that are representative of the target population(s) of research interest. Of course, if the population were completely known, there would be no need to collect additional information. This is known as the sampling paradox. (The guideline that the best sampling designs are those that maximize prior information is also part of the sampling paradox.) Properly speaking, representativeness refers to characteristics of a population, including its central tendency (mean, mode, median) and its variability (the nature of its dispersion around the central tendency). It is easy to describe the central, most common characteristics of a population (e.g., ‘mostly bifaces in test unit 5’ or ‘polychrome is dominant at scatter 101N, 13S’) because centrality can be summarized as univariate. Dispersion, on the other hand, is more difficult to describe because it is more complex in nature.

For years, simple random sampling (SRS) served as an overly simplistic, thought-free, data-free way of assuring the representativeness of a sample. SRS has also governed the thinking of archaeologists using other kinds of probabilistic sampling (e.g., stratified sampling, the cluster scheme). Under SRS, each unit must have an equal probability of selection (‘epsem’, in sociological sampling terms). Epsem was operationalized in archaeology by forming equal-sized, nonoverlapping sampling units. It is no longer thought to be a requirement. For example, with the ‘cluster sampling scheme’, units in the selected cluster have a greater chance of being selected than those outside the cluster(s). This easing of epsem was brought to light by the recent introduction of adaptive sampling.

Adaptive sampling is a procedure for selecting additional sampling units based on the values of certain variables observed during a pretest or an earlier phase of the project. It extends the principle that the best sampling designs result from maximizing prior information and that data collection results must be constantly evaluated. It is also a refinement of the pretest/pilot project and of Redman’s multistage strategies, and it is designed to add to variability in the sample. Initially applied in biological and environmental situations, it has been combined with the cluster sampling scheme, is based on a neighborhood concept, and can be summarized as follows. Prior to survey, a neighborhood of units is defined on the basis of spatial position relative to an initially selected sampling unit. After the initially selected unit has been examined, additional units that meet certain empirical conditions (e.g., more than six artifact scatters of 20 or more pieces of debitage per transect) and the original positional requirement are added to the sample and examined. Any unit that does not meet the empirical and locational conditions stops the sampling process; no further units are examined. This simplified example excludes the cluster and network concepts that are more fully explained by Orton, as well as by Thompson and Seber. The example also shows that not every unit has an equal probability of selection (epsem); units closer to an initially selected unit have a greater chance of selection.
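The simplified procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the formal design given by Orton or by Thompson and Seber: the survey area is assumed to be a grid of units keyed by (row, column), the neighborhood is assumed to be the four adjacent units, and the names `adaptive_sample`, `counts`, and `threshold` are hypothetical.

```python
from collections import deque


def adaptive_sample(counts, initial_units, threshold):
    """Return the units examined under a simplified adaptive scheme.

    counts        -- dict mapping (row, col) to the value observed when that
                     unit is examined, e.g. number of qualifying scatters
                     (hypothetical data)
    initial_units -- list of (row, col) units chosen in the first phase
    threshold     -- a unit meets the empirical condition if its observed
                     value is greater than this number (assumed rule)
    """

    def neighborhood(unit):
        # Assumed positional rule: the four units sharing an edge.
        row, col = unit
        return [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]

    examined = []
    seen = set(initial_units)
    queue = deque(initial_units)

    while queue:
        unit = queue.popleft()
        examined.append(unit)
        # Only a unit that meets the empirical condition adds its neighbors;
        # a unit that fails the condition stops the process in that direction.
        if counts[unit] > threshold:
            for neighbor in neighborhood(unit):
                if neighbor in counts and neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)

    return examined
```

Because neighbors of a productive unit are examined only as a consequence of that unit's result, units near an initially selected unit are more likely to enter the sample than units far from one, which is the departure from epsem noted in the text.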

Nested and cluster sampling are two nearly synonymous terms related to the hierarchical nature of archaeological methods and of archaeological data. The terms refer to the scale of the archaeological record, data, and their context. For example, regions are subdivided into townships and then into sections that are selected for survey to locate sites and scatters. Sites themselves are subdivided into behaviorally meaningful zones (e.g., food preparation, residential, trade, trash) that are further divided into arbitrary grid units that are selected for excavation. The site contents include three-dimensional soil strata and features that themselves contain pollen profiles, faunal evidence, and artifacts. Soil strata, features, and artifacts each carry attributes/variables of observation (at the nominal, ordinal, interval, or ratio scales) that are initially selected for study in the laboratory and then subsequently selected for description and inclusion in the monograph reporting on a project.

However, the recent recognition of the cluster sampling approach (as opposed to the cluster sampling scheme discussed in this article) may have affected the existing body of archaeological knowledge. The cluster sampling approach refers to the fact that archaeologists sample space during the research process. During survey, methodologically convenient quarter sections, for example, are surveyed to discover and recover information about cultural and environmental variables to which we ascribe prehistoric meaning. During at least the initial stages of excavation, grid units are likewise examined for cultural and environmental variables. Prior to field investigations, we have no information about sites, features, and artifacts; we must discover this information by sampling space. Sociological statisticians have long recognized (since the 1950s at least) cluster sampling as the approach in use when the unit of sampling differs from the unit of observation and measurement. They have also used a different set of statistics and formulas to interpret data collected by cluster sampling.

In archaeology, fieldwork is a sample of space, because detailed information (as specified in the research design) about the artifacts, scatters, and sites is unknown. In the big picture, it is possible that there are some false positives in the accumulated body of archaeological knowledge; some statements that have been accepted as true are actually false, a type I error. (The reason for this is that the proper statistical tests have not been used. The cluster sampling approach and the cluster sampling scheme each require their own set of statistical formulas for population estimation and for hypothesis testing. The conventional formulas for simple random samples and epsem do not apply to either the cluster sampling approach or the cluster sampling scheme.) This distortion in archaeological knowledge probably most often goes unrecognized at smaller scales, where variability would be understated. In other words, populations that are determined to be statistically different are really statistically the same. This distortion also goes unrecognized when areal (sites per environmental stratum) or volumetric (artifacts per soil or feature stratum) ratios are calculated. For these examples, propertied space (not arbitrary space) loses its methodological characteristic and becomes substantive, part of prehistoric meaning. Another ramification is that smaller sample sizes may be needed to distinguish between two samples collected.
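To illustrate why simple-random-sample formulas can understate variability when space is the sampling unit, the sketch below compares two variance estimates for a mean computed from the same data. It uses the ratio-estimator (linearization) variance commonly given in survey sampling texts for single-stage cluster samples, without a finite population correction; whether this is the specific set of formulas the text has in mind is an assumption. The function names and the measurements are hypothetical.

```python
from statistics import variance


def naive_srs_variance(clusters):
    """Variance of the mean if every observation is (wrongly) treated as an
    independent simple random draw: s^2 / (number of observations)."""
    elements = [x for cluster in clusters for x in cluster]
    return variance(elements) / len(elements)


def cluster_variance(clusters):
    """Variance of the mean treating each spatial unit as the sampling unit.

    Ratio-estimator form: n / (n - 1) * sum(e_i^2) / (sum of m_i)^2,
    where e_i = (cluster total) - (overall mean) * (cluster size m_i).
    """
    n = len(clusters)
    totals = [sum(cluster) for cluster in clusters]
    sizes = [len(cluster) for cluster in clusters]
    overall_mean = sum(totals) / sum(sizes)
    residuals = [t - overall_mean * m for t, m in zip(totals, sizes)]
    return n / (n - 1) * sum(e * e for e in residuals) / sum(sizes) ** 2


# Hypothetical measurements from three grid units; values within a unit
# resemble each other more than values from different units.
grid_units = [[2, 2, 2], [5, 5, 5], [8, 8, 8]]
naive = naive_srs_variance(grid_units)
clustered = cluster_variance(grid_units)
print(naive, clustered, clustered / naive)
```

With these made-up values the cluster-based variance is larger than the naive one, so a test built on the naive figure would declare differences significant more readily than the data warrant.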



 
