Archaeologists employ many media to record and analyze their observations, including handwritten field notes, computer spreadsheets, videotapes, and digital images. However, the sometimes massive amounts of data that result from the sorting, grouping, and measuring activities of archaeologists today call for complex systems to organize them. Typically these consist of relational databases. In some cases, archaeologists use spatially referenced databases in geographic information systems (GIS) and, increasingly, they use hypermedia, including web sites, to organize and present their results.
A flat-file database is a simple form of database that, like an old-fashioned card-file system, consists of a number of ‘records’, each of which, much like a paper form, is used to document various kinds of information in a set of ‘fields’. For example, in a file for recording information about stone tools, each record would be used to describe a particular tool, and each field would be used to record a different attribute, such as the tool’s length or position of retouch.
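The record-and-field idea can be made concrete with a minimal Python sketch. The field names (`artifact_no`, `length_mm`, `width_mm`, `retouch_position`) and all values are hypothetical illustrations, not data from any real assemblage, and a comma-separated text file is only one possible way of storing a flat file.

```python
import csv
import io

# Hypothetical flat file: one record (row) per stone tool,
# one field (column) per recorded attribute.
FLAT_FILE = """artifact_no,length_mm,width_mm,retouch_position
1,42.5,21.0,distal
2,35.1,18.4,lateral
3,58.0,30.2,none
"""


def load_records(text):
    """Parse the flat file into a list of records (dicts keyed by field name)."""
    return list(csv.DictReader(io.StringIO(text)))


records = load_records(FLAT_FILE)

# Every question is answered by scanning the records of this single file.
long_tools = [r["artifact_no"] for r in records if float(r["length_mm"]) > 40]
print(long_tools)  # ['1', '3']
```

Because everything lives in one file, any information shared by many tools would have to be repeated on every record.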
A relational database similarly employs records and fields, but is more complex and more useful because it also has connections, or ‘relations’, between files. Relational databases are more efficient than flat-file ones because they control redundancy: information of different kinds (which therefore requires quite different fields) is partitioned into different files, while the relations allow reference to information in two or more files at once. For example, it would be inefficient to document details of contextual information in a file describing stone tools. If, during excavation, we happened to find, say, 50 stone artifacts in a particular context (layer or pit), each of these artifacts would have identical contextual information, and it would not make sense to record this information repeatedly on every artifact record. A relational database more sensibly records the contextual information in one file, the information on lithic artifacts in another, and uses a relation to tie the artifact records in the latter to the contextual records in the former.
Figure 2 depicts a structure chart that illustrates this example. Both files have key attributes, here Context number and Artifact number, that uniquely identify each record. Each record in the Contexts file contains information about a particular context, organized among a number of fields, such as ‘sediment texture’ and ‘grid coordinates’. Each record in Lithics similarly records information about a particular artifact in fields such as ‘length’, ‘width’, and ‘platform type’. In addition, the Lithics file has a special field, called an attribute pointer, that connects it to the key attribute, Context number, in the Contexts file. All of the records describing artifacts from the same context would have the same context number in their attribute pointer field. This allows the software to refer back to the Contexts file if we need information about the context of any of the artifacts, or want to search for all specimens with particular contextual characteristics. For example, we might want to find all the bone fragments that were found in pits, or all the lithics found below layer 6 (Figure 2).
Figure 2 Example of a structure chart for the files ‘Contexts’ and ‘Lithics’ in a relational database. The codes next to each field label indicate the data type (e.g., ‘A’ for alphanumeric, ‘B’ for boolean, ‘C’ for character, and ‘N’ for numeric fields).
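The two-file structure described above might be sketched as follows, using Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table and column names, the sample values, and the convention that larger layer numbers lie deeper are all hypothetical. The attribute pointer corresponds to what SQL calls a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce the relation

conn.executescript("""
CREATE TABLE contexts (
    context_no       INTEGER PRIMARY KEY,  -- key attribute
    context_type     TEXT,                 -- 'pit' or 'layer'
    layer            INTEGER,              -- assumed: larger number = deeper
    sediment_texture TEXT
);
CREATE TABLE lithics (
    artifact_no   INTEGER PRIMARY KEY,     -- key attribute
    context_no    INTEGER NOT NULL
                  REFERENCES contexts(context_no),  -- attribute pointer
    length_mm     REAL,
    width_mm      REAL,
    platform_type TEXT
);
""")

# Contextual details are stored once per context, not once per artifact.
conn.executemany("INSERT INTO contexts VALUES (?, ?, ?, ?)", [
    (101, "pit", 4, "silty clay"),
    (102, "layer", 7, "sand"),
    (103, "pit", 8, "loam"),
])
conn.executemany("INSERT INTO lithics VALUES (?, ?, ?, ?, ?)", [
    (1, 101, 42.5, 21.0, "plain"),
    (2, 101, 35.1, 18.4, "faceted"),
    (3, 102, 58.0, 30.2, "plain"),
    (4, 103, 27.3, 15.0, "cortical"),
])
conn.commit()

JOIN = ("SELECT l.artifact_no FROM lithics AS l "
        "JOIN contexts AS c ON c.context_no = l.context_no ")

# All lithics found in pits.
in_pits = [row[0] for row in conn.execute(
    JOIN + "WHERE c.context_type = ? ORDER BY l.artifact_no", ("pit",))]

# All lithics found below layer 6 (under the numbering assumption above).
below_6 = [row[0] for row in conn.execute(
    JOIN + "WHERE c.layer > ? ORDER BY l.artifact_no", (6,))]

print(in_pits)  # [1, 2, 4]
print(below_6)  # [3, 4]
```

Artifacts 1 and 2 share context 101, yet its sediment texture is recorded only once; the join retrieves it whenever a query needs it.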
Today, archaeologists have a number of software options for their relational databases, and most of these include tools to help design useful and efficient databases. However, a common mistake is to sit down in front of a computer and begin defining files and fields without giving adequate attention to the database’s design. Thinking carefully about your database requirements in advance, and planning the design out on paper, will save a lot of the time and frustration otherwise spent fixing the database’s inadequacies later.