In the Metadata Corner for this issue, Michael Day reports from the Working Meeting on Electronic Records Research, held in Pittsburgh, Pennsylvania from May 29-31, 1997. This article appears in the Web, and not the print, version of Ariadne.
Archivists and records managers share an interest in the archival management and preservation of what are today known as electronic records. Recognition of important issues related to the archival management of electronic records dates back to the early 1970s when archivists began to investigate the accessioning of what were then known as machine-readable data files. It has long been recognised that the archival community and the library community have shared concerns in this area, and this was demonstrated by the recently published report of a US Task Force on Archiving of Digital Information commissioned by the Commission on Preservation and Access and the Research Libraries Group [1]. These shared concerns mean that other information professionals, including librarians, information scientists and computing scientists, will have a potential interest in the archival community's response to electronic recordkeeping.
The Working Meeting on Electronic Records Research was organised by the Pittsburgh-based Archives & Museum Informatics [2] and sponsored by the Centre for Electronic Recordkeeping & Archival Research (CERAR) at the University of Pittsburgh [3]. There were around fifty invited participants at the meeting and, as it was a working meeting, virtually all of them were assigned at some point either to present papers or to lead break-out group sessions. The meeting was held at the Embassy Suites Hotel close to Pittsburgh International Airport but twelve miles from the downtown area. This isolation was intentional, as it reduced the potential distractions for participants. Half of the participants came from the United States, the remainder representing Canadian, Australian or European organisations. The intention of the Working Meeting was to identify areas for future research and implementation.
David Bearman of Archives and Museum Informatics introduced the meeting with a brief contextual paper describing the previous ten years of electronic records research and practice. In 1987 the archival profession's interest was largely focused on appraisal techniques and on media longevity issues. Throughout the next ten years further technological development combined with an added emphasis on functional requirements led to a significant change in focus. Interest in media longevity and 'refreshing' techniques, for example, has developed into a concern with data migration in an environment of software-dependence. Bearman briefly described the various international meetings and conferences which had taken place over the last ten years and then identified the five general subjects which were to be discussed throughout the rest of the meeting.
This report will not attempt to describe the proceedings of the meeting in detail but will pick up on particular themes and hopefully demonstrate some shared concerns between the archives and records management professions, on one hand, and the library and information professions, on the other.
Wendy Duff (University of Toronto) and Richard J. Cox (University of Pittsburgh) represented the University of Pittsburgh Electronic Records Project [6]. Their presentations at the Working Meeting elaborated on the concept of "literary warrant", which can be defined as the mandate from outside the archives profession - from law, professional best-practice and other social sources - which requires the creation and maintenance of records. It is thought that the concept of warrant might be helpful in fostering the understanding of records within an organisation and might, in addition, provide the authority necessary for records professionals to perform their important role within it.
The other project looking at this general area concerned "The Preservation of the Integrity of Electronic Records" and was based at the University of British Columbia (UBC). The methodological approach of the UBC project was to determine whether the general premises about the nature of records in diplomatics and archival science were relevant and useful in an electronic environment. Diplomatics has been defined as a body of concepts and methods, dating from the seventeenth and eighteenth centuries, with the purpose of "proving the reliability and authenticity of documents" [7]. At the Working Meeting, Marcia Guercio (Ufficio Centrale Beni Archivistici, Rome) and Luciana Duranti (University of British Columbia) outlined the contribution that they felt archival science and diplomatics could make to a better understanding of electronic records.
The UBC project has adopted the concepts of reliability and authenticity from diplomatics. Duranti has defined both of these terms as follows [8]:
The UBC project has also been concerned with preserving the concept of "archival bond" in electronic records. Archival bond refers to what Duranti and MacNeil call "the link that every record has with the previous and subsequent one in the conceptual net of relationships among the records produced in the course of the same activity" [11]. It is interesting to speculate whether links in a hypertext database or Web page pose the same type of conceptual problem.
The UBC project elaborated the idea of a "record profile" which would contain "all the elements of intellectual form necessary to identify uniquely a record and place it in relation to other records belonging in the same aggregation" [12]. The record profile is therefore essentially a type of annotation (or metadata) which would be linked to the record for its lifetime. The identification of relevant metadata and its capture was also a major preoccupation of the Pittsburgh project, and it is this subject that will be dealt with next.
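To make the idea of a record profile and the archival bond a little more concrete, the following is a minimal sketch in Python. It is not the UBC project's element set: the class name, the field names and the sample identifiers are all hypothetical, and the bond is reduced to a single pointer to the preceding record in the same aggregation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RecordProfile:
    """Hypothetical profile; the fields are illustrative, not the UBC elements."""
    record_id: str                      # uniquely identifies the record
    aggregation: str                    # the file or series the record belongs to
    activity: str                       # the activity that produced the record
    previous_id: Optional[str] = None   # simplified archival bond to the prior record


def chain(profiles: List[RecordProfile], last_id: str) -> List[str]:
    """Follow the bonds back from last_id and return the ids in creation order."""
    by_id: Dict[str, RecordProfile] = {p.record_id: p for p in profiles}
    ordered: List[str] = []
    current: Optional[str] = last_id
    while current is not None:
        profile = by_id[current]
        ordered.append(profile.record_id)
        current = profile.previous_id
    return list(reversed(ordered))


# Invented sample data: three records produced in the course of one activity.
profiles = [
    RecordProfile("R-001", "Case 12", "grant application"),
    RecordProfile("R-002", "Case 12", "grant application", previous_id="R-001"),
    RecordProfile("R-003", "Case 12", "grant application", previous_id="R-002"),
]

print(chain(profiles, "R-003"))
```

A real profile would need to express links in both directions and across aggregations; the single backward pointer is used here only to keep the sketch short.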
Building on this, David Bearman's short paper at the Working Meeting on "Research issues in metadata" raised many important issues. Just a few will be outlined here:
Metadata linkage: how can the relevant metadata be securely linked to the record content itself over time?
Metadata semantics: Bearman commented that records metadata "must be semantically homogenous" but that it was desirable that it should also be "syntactically heterogeneous".
Structural metadata and migration: in what way can metadata about the structure of a record ensure "least-loss" migration of evidence over time?
The discussion following Bearman's paper indicated that there was a need for what one working group described as "generic records metadata standards", and for further analysis of metadata attributes and semantics. Interest was shown from some quarters in the Dublin Core Metadata Element Set (DC) [18], and it was suggested that research could be carried out into the minimum elements which would need to be added to DC to make it useful in the records context. It was also recognised that resolution of many of these (and other) problems depended upon intelligent implementation in test environments and this seemed to be the immediate way forward.
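One possible approach to the metadata linkage question raised above is to bind the metadata to the record content through a cryptographic digest. The sketch below, in Python, is an assumption for illustration only and was not proposed at the meeting. The function names, the envelope layout and the "x-transaction" element are hypothetical; "title", "creator" and "date" are borrowed from the Dublin Core element names.

```python
import hashlib
import json


def seal(content: bytes, metadata: dict) -> dict:
    """Return an envelope that binds metadata to content through SHA-256 digests."""
    content_digest = hashlib.sha256(content).hexdigest()
    # Canonical JSON (sorted keys) so identical metadata always hashes identically.
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    seal_digest = hashlib.sha256((content_digest + canonical).encode("utf-8")).hexdigest()
    return {"metadata": metadata, "content_sha256": content_digest, "seal": seal_digest}


def verify(content: bytes, envelope: dict) -> bool:
    """Recompute the digests and compare them with those stored in the envelope."""
    expected = seal(content, envelope["metadata"])
    return (expected["content_sha256"] == envelope["content_sha256"]
            and expected["seal"] == envelope["seal"])


# Invented sample record and metadata.
content = b"Minutes of the records committee."
metadata = {
    "title": "Committee minutes",
    "creator": "Records Office",
    "date": "1997-05-29",
    "x-transaction": "committee meeting",   # hypothetical records-specific element
}
envelope = seal(content, metadata)

print(verify(content, envelope))
print(verify(content + b" (amended)", envelope))
```

A bare digest of this kind only reveals that content or metadata has changed; anyone able to rewrite the envelope could recompute it. Secure linkage over time would also need a trusted signature or custodian, which is outside the scope of this sketch.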
Migration, in its widest sense, might also mean something more than periodic file conversion, and might include, for example: transfer to a human-readable medium like paper or microfilm; the use of software-independent formats; the creation of surrogates; and possibly the development of systems capable of emulating obsolete software and associated data [21]. These options need further research and it is unlikely that any single approach will be suitable for application to all types of electronic records.
The Working Meeting identified several areas of potential interest for research:
Defining acceptable data loss. Data migration is and will be a complex procedure, and is likely to result in some degree of data loss or degradation. What level of loss or degradation would be acceptable?
Documentation of the migration process. Subsequent users of records will need to determine which characteristics of a document were lost in each format conversion, the reasoning behind the migration strategy chosen and the authority responsible for implementing it.
The development of migration agents. "Self-migrating" records managed by artificial agents might be a long term goal, but any feasible system will have to be designed in collaboration with software engineers.
Cost models. Some research needs to be done into cost models for the different approaches to migration.
Professional collaboration. Hedstrom asked whether the requirements for the long-term preservation of electronic records are fundamentally different from the requirements for the preservation of other types of digital information. If not - and this is itself an area for legitimate research - what sort of collaborative activity would be appropriate, and with whom?
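The point above about documenting the migration process can be illustrated with a simple data structure. The following Python sketch assumes that each format conversion is logged with the characteristics lost, the reasoning and the responsible authority, as the Working Meeting suggested subsequent users would need. The class names, field names, formats and loss labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class MigrationEvent:
    """One format conversion; all field names are illustrative."""
    date: str
    source_format: str
    target_format: str
    lost: Tuple[str, ...]   # characteristics lost in this conversion
    rationale: str          # reasoning behind the strategy chosen
    authority: str          # who was responsible for implementing it


@dataclass
class MigrationHistory:
    record_id: str
    events: List[MigrationEvent] = field(default_factory=list)

    def add(self, event: MigrationEvent) -> None:
        # Each conversion must start from the format the previous one produced.
        if self.events and self.events[-1].target_format != event.source_format:
            raise ValueError("source format does not match the last target format")
        self.events.append(event)

    def cumulative_loss(self) -> List[str]:
        """List every characteristic lost so far, in the order it was lost."""
        seen: List[str] = []
        for event in self.events:
            for item in event.lost:
                if item not in seen:
                    seen.append(item)
        return seen


# Invented sample history for a single record.
history = MigrationHistory("R-001")
history.add(MigrationEvent("1997-05-29", "WordPerfect 5.1", "RTF",
                           ("macros",), "vendor format obsolete", "Records Office"))
history.add(MigrationEvent("2002-01-15", "RTF", "plain text",
                           ("fonts", "page layout"), "software independence", "Archives"))

print(history.cumulative_loss())
```

A log of this kind records what was lost but says nothing about whether the loss was acceptable; that judgement is the separate research question listed first above.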
This visit was supported by grants from the UK Electronic Libraries Programme (eLib).
[2] Archives & Museum Informatics, Pittsburgh, PA,
http://www.archimuse.com/
[3] University of Pittsburgh, Centre for Electronic Recordkeeping and Archival Research,
http://www.lis.pitt.edu/~cerar/
[4] University of Pittsburgh, School of Information Sciences. Functional requirements for evidence in recordkeeping.
http://www.lis.pitt.edu/~nhprc/
[5] University of British Columbia, School of Library, Archival and Information Studies. The preservation of the integrity of electronic records,
http://www.slais.ubc.ca/users/duranti/intro.htm
[6] Duff, W. Ensuring the preservation of reliable evidence: a research project funded by the NHPRC. Archivaria, 42, Fall 1996, 28-45.
[7] Duranti, L. and MacNeil, H. The protection of the integrity of electronic records: an overview of the UBC-MAS Research Project. Archivaria, 42, Fall 1996, 46-67, p. 47.
[8] Duranti, L. Reliability and authenticity: the concepts and their implications. Archivaria, 39, Spring 1995, 5-10.
[9] Duranti and MacNeil. The protection of the integrity of electronic records, Archivaria, 42, Fall 1996, 46-67, p. 56.
[10] Graham, P.G. Intellectual preservation: electronic preservation of the third kind. Washington, D.C.: Commission on Preservation and Access, March 1994,
http://www-cpa.stanford.edu/cpa/reports/graham/intpres.html
[11] Duranti and MacNeil. The protection of the integrity of electronic records, Archivaria, 42, Fall 1996, 46-67, p. 53.
[12] Duranti and MacNeil. The protection of the integrity of electronic records, Archivaria, 42, Fall 1996, 46-67, p. 51.
[13] Wallace, D. Metadata and the archival management of electronic records: a review. Archivaria, 36, Autumn 1993, 87-110.
[14] Wallace, D. Managing the present: metadata as archival description. Archivaria, 39, Spring 1995, 11-21.
[15] Bearman, D. Documenting documentation. Archivaria, 34, Summer 1992, 33-49.
[16] Hedstrom, M. Descriptive practices for electronic records: deciding what is essential and imagining what is possible. Archivaria, 36, Autumn 1993, 53-63, p. 55.
[17] Bearman, D. and Sochats, K. Metadata requirements for evidence. 1996,
http://www.lis.pitt.edu/~nhprc/BACartic.html
[18] The Dublin Core Metadata Element Set home page,
http://purl.org/metadata/dublin_core
[19] Task Force on Archiving of Digital Information, Preserving digital information. Washington, D.C.: Commission on Preservation and Access, May 1996, p. 6.
[20] Michelson, A and Rothenberg, J. Scholarly communication and information technology: exploring the impact of changes in the research process on archives. American Archivist, 55, Spring 1992, 236-315, p. 298.
[21] Rothenberg, J. Ensuring the longevity of digital documents. Scientific American, 272 (1), January 1995, 24-29.
[22] Bearman, D. Archival data management to achieve organizational accountability for electronic records. Archives and Manuscripts, 21 (1), May 1993, 14-28.
Material on this page is copyright Ariadne/original authors. This article last updated/links checked on 12-Sep-1997