Jan SCHULTZE / Ingrid BECKMANN / Holger TÜRK

(Fakultät für Informatik, TU Dortmund, Germany)

Outline: Only the use of standardized technologies and well-known data models can ensure the long-term conservation of digital archaeological data.

The ArcheoInf project aims to provide long-term conservation and accessibility of digital archaeological data, focusing in particular on raw data about finds.
Currently, non-standardized commercial software such as Paradox, Access or FileMaker is used in combination with self-made data models. This causes two problems. First, database vendors can discontinue their support or even cease to exist, so they cannot guarantee long-term data accessibility; after a while it may prove difficult to access the data in these non-standardized files. Second, documentation of the self-made data models is almost always neglected and, in the long run, staff turnover makes it difficult to ask the creator of a schema about its semantics.
The most promising strategy to ensure long-term data accessibility is twofold. Firstly, it consists of using standardized technologies such as SQL and RDF. Secondly, it encompasses using well-known ontologies such as CIDOC CRM. This approach is suitable for newly created databases, but what about existing legacy databases?
ArcheoInf converts data: partly automated migration of legacy databases to SQL databases; manual mapping of legacy data schemata to a model based on CIDOC CRM; creation of a user interface for combined access to all databases, which displays the spatial distribution of archaeological finds; and creation and integration of a thesaurus of common vocabulary as a digital “Rosetta Stone”.
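The first two conversion steps can be illustrated with a minimal sketch. It is not the project's actual tooling: the legacy column names (FundNr, Material, Fundort), the SQL table layout, the example.org namespace and the helper functions are all hypothetical. The CIDOC CRM identifiers follow the naming of the published RDF encodings (where E22 is called "Man-Made Object" in older versions) and should be checked against the CRM version actually in use. The sketch loads legacy rows into an SQL table and then applies a hand-written column-to-property mapping to produce RDF triples in N-Triples form.

```python
import sqlite3
from urllib.parse import quote

# Hypothetical legacy export; column names are invented for illustration.
LEGACY_ROWS = [
    {"FundNr": "F-001", "Material": "Keramik", "Fundort": "Trench A"},
    {"FundNr": "F-002", "Material": "Bronze", "Fundort": "Trench B"},
]

CRM = "http://www.cidoc-crm.org/cidoc-crm/"   # verify against the CRM version in use
BASE = "http://example.org/archeoinf/"        # placeholder namespace (assumption)
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Manual schema mapping: SQL column -> (CRM property, CRM class of the value).
FIELD_MAP = {
    "material": ("P45_consists_of", "E57_Material"),
    "findspot": ("P53_has_former_or_current_location", "E53_Place"),
}


def migrate_to_sql(rows):
    """Step 1: copy legacy rows into a standard SQL table (in-memory SQLite here)."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE find (find_no TEXT PRIMARY KEY, material TEXT, findspot TEXT)"
    )
    conn.executemany(
        "INSERT INTO find VALUES (:FundNr, :Material, :Fundort)", rows
    )
    conn.commit()
    return conn


def iri(*parts):
    """Build an IRI under BASE, percent-encoding each path segment."""
    return BASE + "/".join(quote(p.strip(), safe="") for p in parts)


def export_triples(conn):
    """Step 2: apply the manual mapping to every row of the SQL table."""
    triples = []
    query = "SELECT find_no, material, findspot FROM find ORDER BY find_no"
    for row in conn.execute(query):
        subject = iri("find", row["find_no"])
        triples.append((subject, RDF_TYPE, CRM + "E22_Man-Made_Object"))
        for column, (prop, cls) in FIELD_MAP.items():
            value = row[column]
            if not value:
                continue  # leave gaps in the legacy data as gaps
            obj = iri(cls, value)
            triples.append((subject, CRM + prop, obj))
            triples.append((obj, RDF_TYPE, CRM + cls))
    return triples


def to_ntriples(triples):
    """Serialize (subject, predicate, object) IRI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


conn = migrate_to_sql(LEGACY_ROWS)
triples = export_triples(conn)
print(to_ntriples(triples))
```

Keeping the mapping in a plain table such as FIELD_MAP makes it a documented artifact in its own right, which addresses the problem of undocumented self-made schemata; a real mapping would also have to handle literals, units and vocabulary alignment through the thesaurus.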
ArcheoInf conserves data: The university libraries of Bochum and Dortmund will provide the service infrastructure to ensure data accessibility in the long run. This reflects the insight of modern library science that information has to be conserved not only on paper, but also in electronic form.
This research was partially supported by the Deutsche Forschungsgemeinschaft as part of DFG-Forschungsprojekt “ArcheoInf” (0421011010).

Keywords: long-term conservation, legacy database migration, open standards, thesaurus, CIDOC CRM