On October 8 and 9, 1999, the Oriental Institute of the University of Chicago sponsored a conference on the ways in which electronic text processing might increasingly serve the needs of scholars studying ancient Near Eastern texts. Charles E. Jones and David Schloen report on the conference and on the past and potential futures, and in particular on the potential for XML to provide a medium for acceptable standards.
The civilizations of the ancient Near East produced the world's first written texts. In both Egypt and Mesopotamia, recognizable texts begin to appear in the late fourth millennium B.C.[1] A well developed system of numerical tabulation combined with a varied and sophisticated repertoire of sealings and seal impressions is evident even earlier across a wide geographical range in Western Asia,[2] and evidence from recent archaeological discoveries in Egypt promises to push the origins of writing even further into antiquity.[3]
For the first two millennia or so of the world's written record, Near Eastern texts were written in one of several varieties of cuneiform, or in Egyptian hieroglyphs and their cursive variant known as hieratic.[4] In the latter half of the second millennium B.C. scripts with recognizably alphabetical characteristics begin to appear, and rapidly spread among the languages and dialects of the Eastern Mediterranean world, eventually spawning a host of descendants and borrowings across Asia, Africa and Europe.[5]
Writing more than a half-century ago, A. T. Olmstead began his monumental study of the Old Persian period with the memorable statement: "When Cyrus entered Babylon in 539 B.C., the world was old. More significant, the world knew its antiquity."[6] Ancient scholars and scribes collected and catalogued historical and scientific records from their own immediate and distant pasts; they observed and organized natural phenomena; they abstracted medical, mathematical, astronomical and theological ideas; they sought to understand the world, and to preserve their understanding of it.[7] On a more mundane level, ancient scribes recorded commercial transactions on behalf of individuals, organizations, and political entities; they recorded contracts, deeds and legal proceedings; they wrote notes and letters; and they doodled in the margins. The roots of "Western" (not to say "modern") scholarship on the societies and cultures of the ancient Near East are ancient in themselves. Among the most celebrated literary compositions of western civilization are the editions, translations and interpretations of, and the commentaries on, Hebrew, Aramaic and Greek religious texts from the ancient Near East.[8] The communities which produced the scholarship on religious texts, and the societies in which they lived and flourished, maintained the languages of the Bible as living entities. A more ignominious fate befell the languages of Mesopotamia and Egypt. Texts of various sorts continued to be written in Akkadian and Egyptian into the first centuries of this era, but knowledge of them soon died out to the point where there was no longer even the recognition that the languages behind visible texts inscribed on the standing walls of ruined buildings were ancestral or cognate to living tongues, or that living tongues such as Coptic were related in any way to such ancient writings.[9]
Aside from occasional descriptions of monuments in early travellers' accounts, many of which connect the remains of ancient sites with descriptions in biblical and classical literature, and aside from the astonishing and fantastic speculations of scholars such as Athanasius Kircher,[10] almost nothing was learned of ancient Egyptian or Mesopotamian societies until the eighteenth century. The more careful observations and drawings made by such travellers as Carsten Niebuhr in Iran[11] and Robert Wood and James Dawkins in Syria[12] eventually resulted in publications which were fundamental to the early decipherments of Palmyrene and Old Persian. French and English colonial adventures in Egypt at the turn of the eighteenth-nineteenth centuries resulted in the recovery of multilingual inscriptions - notably the Rosetta Stone[13] - leading directly to the decipherment of Egyptian. English-speaking scholars, working with Niebuhr's drawings of bi- and trilingual cuneiform texts, as well as with the raft-loads of inscribed monuments and tablets which appeared from the excavations of Botta at Khorsabad and Layard at Nineveh in Assyria[14] and were shipped back to the museums in Paris and London, competed with one another for the honor of being called "the decipherer" of the languages of these inscriptions.[15]
Even before there was universal acceptance that both Egyptian and Akkadian had been deciphered, there was a growing corpus of secondary literature including text publications, editions, commentaries, catalogues, dictionaries and so on. From the start, hieroglyphs and cuneiform characters posed problems for the typographers charged with seeing manuscripts into print. Already in the first generation there were reasonably successful attempts to build typefonts capable of representing a wide variety of cuneiform characters. Such efforts were quickly followed by similar movements in Egyptological publication.[16] A parallel trend continued - and continues - in both Assyriological and Egyptological publishing: the use of handwritten text to reproduce individual characters normalized to standard forms,[17] as well as hand-drawn facsimiles or copies of the texts themselves.[18] There have never been universally accepted standards among either the Egyptological or the Assyriological communities on how to represent texts in transliteration or transcription. Particular fields have developed individual styles, as have "schools" of scholarship which, for reasons well-known in academe, tend also to fall into groups according to nationalist criteria or language of scholarship.
Until the 1960's scholarship on ancient Near Eastern texts was conducted with the long established tools of the trade: the eye, the pen and the index card. Individual scholars, as well as collaborative projects, collected data with - for the most part - specific purposes in mind. The Assyrian Dictionary of the University of Chicago, for example, had adopted and modified the procedures of the Oxford English Dictionary for the collection of lexical data - to date they have collected nearly two million cards.[19] Modifications of card-based systems, such as needle-sorted punch-cards, existed, but it was the development of electronic text processing which offered the first real promise as a tool to sort large amounts of data in complex ways.
Encouraged by the success and usefulness for ancient Near Eastern studies of such projects as the Tuckerman Tables,[20] scholars began to experiment with how computers could be harnessed to process textual data. Many early projects were individualized and often conducted in relative isolation, but in 1965 Stanislav Segert and I. J. Gelb, working respectively on South Arabic and Amorite, developed a mutually acceptable code for the representation of Semitic phonemes for use in text processing on an IBM mainframe.[21] Other projects, like those of the Sumerologists Gene Gragg and Miguel Civil,[22] exploited the big mainframes to considerable advantage in the analysis of textual corpora.
Despite such use, computers remained a tool which was relatively invisible from the point of view of the published results of research. Text had to be processed into one or another idiosyncratic machine-readable form for manipulation in the computer, and then re-rendered into forms acceptable to the reader's eye for publication. However, with the development of the personal computer in the 1970's, its wholesale adoption by the scholarly community by the end of the 1980's, and the gradual and (nearly) universal network wiring which began in the 1990's, the division between text processing, tool development and publication became less and less evident. The remarkable success of such large-scale computerized text corpus projects as the State Archives of Assyria project in Helsinki,[23] and the increasing availability of inexpensive and highly effective off-the-shelf tools for desktop text processing, encouraged the development of multi-use filing systems and the incremental accumulation of "personal" text corpora by virtually every scholar. It is the issues surrounding the development of these resources, the long-sought-for standardization of encoding, and the consequent ability to communicate and collaborate more effectively, which we hoped would be addressed in the Chicago Conference in early October 1999.
With the technology and infrastructure in place, it is appropriate for ancient Near East specialists to begin considering what is involved in publishing their data on the Web in XML format. XML itself is merely a starting point because its very simplicity and flexibility require the development of specific tagging schemes appropriate to each domain of research. It was the intention of the organizers of the conference to bring together researchers who have begun working on electronic publication in various ways using such tools as SGML, HTML, and XML, or who are interested in exploring these techniques, and to foster collaboration in the development of specific XML/SGML tagging schemes, especially for cuneiform texts, in which a number of the conference participants specialize. In addition, it was our intention to inaugurate a formal working group on cuneiform markup to provide an ongoing forum for communication and collaboration in this field. We stressed, however, that the issues under discussion are not of interest only to cuneiformists - presentations and discussion concerning other ancient Near Eastern scripts and languages were encouraged and explicitly solicited.
Similarly, we recognized the need to present "texts in context": as archaeological artifacts among other artifacts. Archaeologists and philologists share the need for efficient and flexible electronic publication of complex data. In many cases also they have overlapping interests in terms of substantive historical questions. Indeed, it is likely that cooperation on the level of technical methodology in pursuit of effective electronic publication will have the beneficial effect of reducing the tendency toward balkanization among disciplines. An ancillary goal of our conference, therefore, was to stimulate interest in interdisciplinary research projects that involve both archaeological and philological data. By facilitating electronic access to philological data by archaeologists and vice versa, and by learning a common data representation technique such as XML, we can expect to generate new ways of representing or even conceiving of the conceptual relationships not just within but also between archaeological and philological datasets, which are so often considered in isolation. The potential to store these different kinds of datasets and their interrelationships in a commonly accepted, rigorous, formal framework offers exciting prospects for subsequent linguistic, socioeconomic, and historical research.
We have no doubt that electronic publication will play an essential role in future research on the ancient Near East. Philologists and archaeologists alike work with complex, highly structured datasets consisting of visual as well as textual information which call for "hyperlinks" among different kinds of data. But devising suitable forms of electronic publication is not a trivial matter and can only be done on a collaborative basis. Suitable electronic publications will represent in a standardized fashion the large number of internal and external cross-references among the many individual elements of each dataset and will capture the semantic diversity of the many possible types of such cross-references, representing, for example, various kinds of spatial, temporal, or linguistic relationship. Furthermore, the goal of such publication is not simply to facilitate human navigation of large and complex bodies of information but also to permit automated computer-aided analyses of data derived from many disparate sources. We believe that XML will be an important medium for this because Web publication using this format promises to be a simple and effective means of merging complex datasets from multiple sources for purposes of broader scale retrieval and analysis, avoiding the problems caused by the existing proprietary, limited, and inflexible data formats which have hindered electronic publication to date. XML is a non-proprietary, cross-platform, and fully internationalized standard that has been enthusiastically embraced by the software industry in general. For this reason our conference was announced as focussing specifically on the use of XML in the publication of ancient Near Eastern texts.
A major goal of our conference was to assess the prospects for establishing a formal international standards organization charged with setting technical standards for the interchange of Near Eastern data in digital form. Both the conference and the establishment of such an organization are timely in light of the recent development of internet-oriented data standards and software that now provide a common ground for cooperation among diverse philological and archaeological projects, which have heretofore adopted quite idiosyncratic approaches. This common ground, not just for academic research but in all areas of information exchange, is created by the Extensible Markup Language (XML) and a growing array of software tools that make use of XML to disseminate information on the Internet.
The XML Standard
As we noted in our original announcement of the conference, XML is a
nonproprietary "open" or public standardized data format which
provides a simple and extremely flexible "tag"-based syntax for
representing complex information as a stream of ASCII or Unicode text and
delivering it over the World Wide Web.
Furthermore, it is based on a proven approach because it is a subset of
the ISO-ratified Standard Generalized Markup Language (SGML), which has
been used for electronic publication worldwide for more than a decade. XML
therefore makes possible powerful and efficient forms of electronic
publication via the Internet, including academic publication of
philological and archaeological data. But XML itself is merely a starting
point, for its very simplicity and flexibility, which ensure its
widespread adoption, require the development of specific XML tagging
schemes or "markup languages" appropriate to each domain of
research. Such a tagging scheme expresses the abstract logical structure
of a particular kind of data in a rigorous and consistent fashion. Thus,
for example, chemists have already created a "Chemical Markup
Language" using XML to express the structure of molecules and
chemical reactions, so that the data they work with can be easily shared
and searched on the Web. Likewise, NASA has created an "Astronomical
Instrument Markup Language," biologists have created a "Biological
Markup Language," and so on. Once such tagging schemes exist, various
kinds of software can then be developed to present different views of
logically structured data for different purposes, or to create new sets of
data structured in a particular way, with the assurance that these data
structures can be created and viewed on any computer anywhere without
special conversions or translations.
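To make the idea of a domain-specific tagging scheme concrete, here is a minimal sketch in Python. It is not a scheme proposed at the conference: the element and attribute names (`text`, `line`, `w`, `lemma`) and the lemma values are hypothetical, chosen only to show how one marked-up text can yield different views for different purposes.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for one line of a transliterated text. The tag names,
# attribute names, and lemma values are illustrative assumptions only.
SAMPLE = """
<text id="sample-1">
  <line n="1">
    <w lemma="lugal">lugal</w>
    <w lemma="e">e2</w>
    <w lemma="du">mu-du3</w>
  </line>
</text>
"""

def reading_view(root):
    """View 1: each line as a plain string of transliterated words."""
    return [" ".join(w.text for w in line.findall("w"))
            for line in root.findall("line")]

def lemma_index(root):
    """View 2: a mapping from each lemma to the line numbers where it occurs."""
    index = {}
    for line in root.findall("line"):
        for w in line.findall("w"):
            index.setdefault(w.get("lemma"), []).append(line.get("n"))
    return index

root = ET.fromstring(SAMPLE.strip())
print(reading_view(root))
print(lemma_index(root))
```

Both views are computed from the same tagged source, which is the sense in which software can "present different views of logically structured data".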
For general reference see Robin Cover's The SGML/XML Web Page
http://www.oasis-open.org/cover/
Formation of a Working Group for Text Markup
There was a consensus among the conference participants that XML should be used as the basis for future electronic publication of Near Eastern data. The establishment of a formal working group for Near Eastern text markup was also strongly endorsed, as a vehicle for the collaborative development and dissemination of suitable XML tagging schemes and associated software. Stephen Tinney of the University of Pennsylvania, the editor of the Pennsylvania Sumerian Dictionary, who has substantial experience in electronic text processing and in the use of SGML and XML, in particular, was elected to be the chair of the working group.
The name and scope of the new standards organization remain to be decided. A number of conference participants emphasized the importance of including Near Eastern languages and texts of all periods within the scope of the text markup group, rather than arbitrarily limiting it to ancient Near Eastern texts in general or cuneiform texts in particular, because comparable issues arise in dealing with non-European scripts and languages regardless of their date. Similarly, several people expressed what seemed to be a generally held desire to find ways to include electronically published archaeological data within our standards-setting effort. This would ensure maximum interoperability of textual and archaeological datasets, so that it would be easy to obtain information about the spatial provenience and the material-cultural context of excavated or monumentally inscribed texts, and conversely so that it would be easy to obtain philological information about texts viewed as artifacts from an archaeological perspective.
In the opinion of the conference organizing committee, therefore, a
suitable name for the new standards organization would be "Organization
for Markup of Near Eastern Information" (OMNEI).
This
name emphasizes the central role of XML markup as well as the
organization's potentially wide scope in terms of Near Eastern information
of all kinds, including both primary data (philological, archaeological,
and geographical) and relevant secondary literature. Even restricting the
scope to "Near Eastern" information is rather arbitrary from a
technical standpoint, but this mirrors the scope of the existing academic
infrastructure of Near and Middle Eastern departments, institutes, and
centers to which members of this organization will in most cases already
belong. OMNEI would serve as an umbrella organization for various
standards-setting efforts necessary for the interchange of Near Eastern
information, beginning with a Working Group for Text Markup chaired by
Stephen Tinney. Eventually there could be a parallel Working Group for
Archaeological Markup whose efforts would be integrated with those of the
Text Markup group. Note that OMNEI's mission is not just to devise XML
tagging schemes but also to facilitate the development of well-documented
Web browser-based software that could be widely shared among Near Eastern
projects, and to coordinate training and professional development for
researchers who want to learn how to use these tagging schemes and
software. Thus at some point it might also be desirable to create a formal
Task Force for Training and Professional Development within the OMNEI
organization.
In the aftermath of the conference, discussion is underway concerning these details, including the name and the precise scope and mode of operation of our new international organization, as well as a schedule of future meetings. Decisions will be announced in the near future, but it is clear already that there is a widespread desire to make this organization as broadly based as possible so that it can facilitate the cooperative development of effective and widely accepted technical standards. Judging by the success of the recent conference, it seems likely that many leading Near and Middle Eastern departments and institutes worldwide can be enlisted in support of this venture. The Oriental Institute of the University of Chicago will continue to do everything possible to sponsor this effort and to support it with its reputation and resources, in collaboration with the University of Chicago's Department of Near Eastern Languages and Civilizations, Center for Middle Eastern Studies, and Committee on the Ancient Mediterranean World.
Topics Discussed
What follows is a brief summary of the main points touched on in the formal presentations and in the open discussion sessions. It is not an exhaustive account of everything that was said. For further details on the formal presentations, in particular, please contact the presenters individually. Following each section is a paragraph including links to on-line or other electronic publications pertinent to the issues discussed in the section.
Friday October 8th
Stephen
Tinney of the University of Pennsylvania led off the Friday morning
session with a presentation entitled "From Dictionary to
Superdocument: XML, the Pennsylvania Sumerian Dictionary, and the
Universe." Tinney surveyed some of the basic concepts underlying
XML and the "markup" approach to electronic text
representation, and then he outlined his ideas concerning the
implementation of a corpus-based lexicon such as the Pennsylvania
Sumerian Dictionary on the Internet using XML. He pointed out that such
a lexicon can and should transcend the limitations of existing printed
dictionaries. In particular, an electronic lexicon would not be a static
entity but would be the dynamic product of three types of interlinked
and constantly updated data, comprising primary text corpora,
grammatical analyses, and secondary literature. In other words, the same
data would be reusable in different contexts, and many possible views of
the data could be constructed for different users. One such "view,"
of course, is a printed or printable version of the lexicon in the
traditional format. Tinney concluded his talk by presenting and
commenting briefly on an XML "document type definition" (DTD)
which defines a set of element (tag) types and their attributes by means
of which a corpus-based lexicon, for any language, could be represented.
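The idea of a lexicon assembled dynamically from interlinked corpus, lexical, and bibliographic data can be sketched as follows. This is only an illustration under stated assumptions: it does not reproduce Tinney's DTD, and every element name, attribute name, identifier, and bibliography entry below is hypothetical.

```python
import xml.etree.ElementTree as ET

# Three separate, interlinked datasets (all names and ids are hypothetical).
CORPUS = """
<corpus>
  <line id="t1.1"><w lemma="lugal">lugal</w><w lemma="e">e2</w></line>
  <line id="t2.4"><w lemma="lugal">lugal-e</w></line>
</corpus>
"""
LEXICON = """
<lexicon>
  <entry lemma="lugal" gloss="king"><bibref target="b1"/></entry>
</lexicon>
"""
BIBLIO = """
<bibliography>
  <item id="b1">Placeholder Author, Placeholder Title</item>
</bibliography>
"""

def render_entry(lemma, lexicon, corpus, biblio):
    """Build a printable entry by resolving links into corpus and bibliography."""
    entry = lexicon.find(f"entry[@lemma='{lemma}']")
    # Attestations are looked up in the corpus, not stored in the entry itself.
    attestations = [line.get("id") for line in corpus.findall("line")
                    if line.find(f"w[@lemma='{lemma}']") is not None]
    refs = [biblio.find(f"item[@id='{ref.get('target')}']").text
            for ref in entry.findall("bibref")]
    return "{} '{}': attested in {}; see {}".format(
        lemma, entry.get("gloss"), ", ".join(attestations), "; ".join(refs))

print(render_entry("lugal",
                   ET.fromstring(LEXICON.strip()),
                   ET.fromstring(CORPUS.strip()),
                   ET.fromstring(BIBLIO.strip())))
```

Because the entry stores only links, adding a new line to the corpus changes the rendered entry without any edit to the lexicon itself; the traditional printed format is then just one rendering among several.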
The Pennsylvania Sumerian Dictionary
http://ccat.sas.upenn.edu:80/psd/
The Index to Sumerian Secondary Literature
http://ccat.sas.upenn.edu:80/psd/www/ISSL-form.html
In the discussion that followed Tinney's presentation, and in other discussions throughout the conference, the concern was expressed that electronic publications of the type he and others envisage would be evanescent and might become inaccessible because of the notoriously rapid obsolescence of digital media, the instability of the Web addresses (URLs) for electronic publications, and the dependence of the scholarly community on a few technologically expert colleagues, such as Tinney, whose eventual departure or retirement might orphan their brainchildren. Tinney and a number of other conference participants responded to these important concerns at various times during the conference by making the following points:
The
second presentation on Friday morning was by Stephan Seidlmayer of the
Berlin-Brandenburg Academy of Sciences and Humanities, on the subject of
"The Ancient Egyptian Dictionary Project: Data Exchange and
Publication on the Internet." Seidlmayer described the history of
the Ancient Egyptian Dictionary project and outlined the plan for taking
it onto the Internet using XML. The precomputer text corpus of the
Ancient Egyptian Dictionary was stored on a large number of handwritten
index cards, produced from the 1920s to the 1960s, as was typical of
dictionary projects of this kind. This information was used to produce
the twelve-volume Wörterbuch der ägyptischen Sprache, which is
now out-of-date and in need of revision. Much of the original material
has been digitized and updated, and the current text corpus of the
Ancient Egyptian Dictionary project is stored in a DB2 relational
database with local client-server access. Once a suitable XML markup
scheme has been developed, this information will be converted to XML
format and made available on the Internet, to facilitate international
cooperation in this dictionary project.
Altägyptisches Wörterbuch
http://www.bbaw.de/vh/aegypt/index.html
Das digitalisierte Zettelarchiv des Wörterbuchs der ägyptischen Sprache
http://aaew.bbaw.de:88/dzaInfo/dzaInfo.html
Participants engaged in lively and interesting discussion of issues
relating to the on-line publication of text. Caution and concern were
expressed about copyright and intellectual property rights. Jeffrey
Rydberg-Cox from Perseus and Mark Olsen from ARTFL shared their own
experiences with the use of text over which individuals or organizations
claim ownership. It was evident that the law governing the re-use of
text is both unclear and incompletely understood by the participants.
There was a general sense that openness and collaboration were to be
encouraged, and indeed are essential, if large scale projects are to be
successful. Differences between commercial and non-commercial models of
publication and long term institutional support - whether from
commercial publishers or from universities or other non-commercial
institutions - seemed to present a source of anxiety for participants,
particularly as they have an impact on the long-term accessibility of
on-line electronic publications.
Perseus Project
http://www.perseus.tufts.edu/
Perseus Searching Tools
http://www.perseus.tufts.edu/searches.html
Teaching with Perseus
http://www.perseus.tufts.edu/Teaching.html
The
next presentation was entitled "XML and Digital Imaging
Considerations for an Interactive Cuneiform Sign Database." It was
given in three parts by Theodoros Arvanitis and Sandra Woolley of the
School of Electronic and Electrical Engineering at the University of
Birmingham in Britain, and by Tom Davis, a forensic handwriting
specialist in the Department of English at the University of Birmingham.
Dr. Arvanitis read an introductory statement by Alasdair Livingstone,
the Assyriological member of this project, who unfortunately could not
be present. The Birmingham team described the objectives of their
collaborative project, the results of the first year's work, and their
plans for future work. A major goal of their project is to experiment
with various digital image representations of cuneiform signs in order
to determine which techniques for image capture, formatting, and
compression are most effective for disseminating detailed facsimiles of
cuneiform texts on the Web for research purposes. Another aspect of
their research involves the automated analysis and categorization of
cuneiform signs and scripts. The Birmingham team has also kindly offered
to host a future meeting of the new working group on Near Eastern text
markup.
Cuneiform Database Project, University of Birmingham
http://www.eee.bham.ac.uk/cuneiform/
XML and Digital Imaging Considerations for an Interactive Cuneiform Sign Database - A Powerpoint Presentation
http://www.eee.bham.ac.uk/cuneiform/cuneiform_chicago.ppt
The
necessity of careful editorial oversight of electronic publications was
emphasized by several participants, in light of the ease of "self-publication"
on the Internet. On the other hand, it was recognized that the
electronic medium makes possible a variety of types of publications of
varying degrees of formality and completeness, ranging from the
equivalent of "privately circulated" manuscripts, by means of
which a group of colleagues informally shares ideas and data, to
official institutional publications corresponding to printed monographs
in peer-reviewed series or journals. Several participants stressed the
important role even of lightly edited individual publications on the
Web, which need not be regarded as the author's final word, and to which
electronic access might be restricted to those who understand their
limitations and can make best use of them. The line between what is "published"
versus "unpublished" is now somewhat blurred because all types
of Web publications are equally accessible from a technical standpoint.
Another point made during this discussion had to do with the role of publishers, which might seem to be threatened in the era of electronic publication. Jim Eisenbraun pointed out that printing, binding, and distributing printed books is not the major expense in publishing, in any case. The major expense is incurred at the editorial stage, and the traditional role of publishers in this and the associated expenses will not be diminished, regardless of the medium of distribution. The financial basis for Web-based electronic publication will be some kind of subscription system, however, rather than the purchase of physical media.
Recapitulating a point made in the morning discussion session, Patrick Durusau made an explicit call for open source development of resources and tools.
The Oriental Institute Web site
http://www-oi.uchicago.edu/OI/default.html
Scholars Press
http://scholarspress.org/
Eisenbrauns
http://www.eisenbrauns.com/
Oriental Institute Publications
http://www-oi.uchicago.edu/OI/DEPT/PUB/Publications.html
Achaemenid Royal Inscriptions Project
http://www-oi.uchicago.edu/OI/PROJ/ARI/ARI.html
The Afroasiatic Index Project
http://www-oi.uchicago.edu/OI/PROJ/CUS/AAindex.html
The
second evening presentation was by Hans van den Berg of the Center for
Computer-aided Egyptological Research at Utrecht University. In a talk
entitled "Egyptian Hieroglyphic Text Processing, XML, and the New
Millennium," van den Berg noted the substantial progress that has
been made in Egyptology in developing standardized character encodings
of hieroglyphic signs, to the point where there is now a proposal before
the Unicode consortium for a 16-bit character encoding system that
covers most of the known signs. The need for XML arises when
representing the palaeographic characteristics of hieroglyphic texts, in
terms of both character anomalies and specific positional information
(i.e., the juxtaposition or superposition of individual signs). Van den
Berg presented a set of XML tags that can represent such palaeographic
characteristics.
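How nested tags might capture positional information can be sketched as below. The tag set here is not van den Berg's; the element names (`group`, `stack`, `row`, `sign`), the `anomaly` attribute, and the separator characters in the linear output are all assumptions made for illustration, and the Gardiner-style sign codes are used only as placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: <stack> places its children one above another,
# <row> places them side by side, and an "anomaly" attribute flags an
# unusual rendering of a single sign.
GROUP = """
<group>
  <stack>
    <sign code="D21"/>
    <row>
      <sign code="X1"/>
      <sign code="N35" anomaly="reversed"/>
    </row>
  </stack>
</group>
"""

# Separators for the linear view; chosen for this sketch only.
SEPARATORS = {"stack": ":", "row": "*", "group": "-"}

def linearize(node):
    """Flatten the nested arrangement into a one-line string."""
    if node.tag == "sign":
        # Mark anomalous signs with "!" so the flag survives linearization.
        return node.get("code") + ("!" if node.get("anomaly") else "")
    return SEPARATORS[node.tag].join(linearize(child) for child in node)

print(linearize(ET.fromstring(GROUP.strip())))
```

The point of the nesting is that the arrangement of signs is recorded as structure, separate from the character codes themselves, so the same codes can be searched regardless of layout.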
Centre for Computer-aided Egyptological Research (CCER)
http://www.ccer.ggl.ruu.nl/ccer/ccer.html
Project for American and French Research on the Treasury of the French Language, University of Chicago (ARTFL)
http://humanities.uchicago.edu/ARTFL/ARTFL.html
ARTFL Experiments and Development Projects Page
http://tuna.uchicago.edu/homes/ARTFL.experiments.html
The
first presentation on Saturday morning was given by Jeremy Black and
Eleanor Robson of the University of Oxford. They discussed "The
Electronic Text Corpus of Sumerian Literature," a Web-based project
whose aim is to make accessible to a wide variety of readers,
specialists and laypeople alike, hundreds of Sumerian literary works.
Black presented the philological and pedagogical rationale for the
project, while Robson discussed its operating procedure. This procedure
involves the use of SGML tags and a simple wordprocessing macro
interface for the entry and markup of transliterated texts by
Sumerologists, and hence a minimum of custom software development.
Robson showed the project's Web browser interface via an online Internet
connection, emphasizing the project's use of basic HTML generated from
the underlying SGML version of the texts. Because the intended audience
goes beyond scholars at major research universities, users of the
electronic Sumerian text corpus should not and do not need the latest
version of Web browser software running on the fastest computers with
high-speed Internet connections in order to use the texts effectively.
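The step of generating basic HTML from an underlying marked-up version can be sketched as follows. This is not the ETCSL project's actual pipeline or tag set: the source here is written as XML rather than SGML so that Python's standard library can parse it, and the element names (`composite`, `l`) and attributes are hypothetical.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical source markup for a short composite text.
COMPOSITE = """
<composite title="A sample composition">
  <l n="1">lugal-e e2 mu-du3</l>
  <l n="2">[...] x &lt;broken&gt;</l>
</composite>
"""

def to_basic_html(root):
    """Render the text as a heading plus a numbered list, using only
    elements that very old browsers can display."""
    items = "".join(
        '<li value="{}">{}</li>'.format(
            html.escape(line.get("n")), html.escape(line.text or ""))
        for line in root.findall("l"))
    return "<h1>{}</h1><ol>{}</ol>".format(
        html.escape(root.get("title")), items)

print(to_basic_html(ET.fromstring(COMPOSITE.strip())))
```

Keeping the generated page to headings and lists reflects the design goal described above: the richly tagged source stays on the server side, while readers need nothing more than a basic browser.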
The Electronic Text Corpus of Sumerian Literature (ETCSL)
http://www-etcsl.orient.ox.ac.uk/
The ETCSL document type definition for composite texts
http://www-etcsl.orient.ox.ac.uk/technical/compdtd.htm
The ETCSL document type definition for English prose translations
http://www-etcsl.orient.ox.ac.uk/technical/transdtd.htm
The ETCSL document type definition for bibliographies
http://www-etcsl.orient.ox.ac.uk/technical/bibliodtd.htm
SGML Declaration for all ETCSL SGML files
http://www-etcsl.orient.ox.ac.uk/technical/sgmldecl.htm
The
next presentation was by Miguel Civil of the University of Chicago's
Oriental Institute, who drew on his decades-long experience in cuneiform
text encoding to comment on the history of efforts in this area. Civil
gave an overview of his own approach and the software he has developed
to integrate text corpora with grammatical and lexical information.
During the course of the conference a number of participants
congratulated Civil for his influential pioneering work and for his
generosity in supplying otherwise unavailable editions of texts in
digital form to a wide variety of colleagues.
Sumerian Lexical Archive (SLA)
http://www-oi.uchicago.edu/OI/PROJ/SUM/SLA/SLA1.html
David
Schloen, an archaeologist in the University of Chicago's Oriental
Institute, gave the final formal presentation on Saturday afternoon,
entitled "Texts and Context: Using XML to Integrate and Retrieve
Archaeological Data on the Web." Schloen noted that XML is as
suitable for representing archaeological databases as it is for
representing ancient texts. But whether the information is expressed in
XML or in some other data format (e.g., a relational database),
archaeologists need an appropriate data model that captures in a
rigorous and consistent fashion the idiosyncrasies of units of
archaeological observation, as well as the spatial and temporal
interrelationships among them. Schloen proposes a hierarchical, "item-based"
data model, rather than the "class-based" (tabular) data model
which currently prevails. The item-based data model has the advantage of
being straightforwardly represented in XML as a nested hierarchy of
tagged elements with their attributes. Moreover, texts can be treated
like any other type of artifact, as items in a spatial hierarchy with
their own properties. Schloen concluded by presenting an XML tagging
scheme dubbed ArchaeoML ("Archaeological Markup Language")
which can represent any kind of archaeological data on any spatial
scale, including the vector map shapes and raster images which belong to
individual archaeological items.
In the discussion that followed, the question arose of the precise relationship between electronically represented texts and archaeological data disseminated on the Web using XML. Schloen's response was that the physical characteristics and archaeological context of a text would be represented as for any other artifact, but the XML "item" element representing a given text as an archaeological item would have a link to another Web location containing the contents of the text from a philological perspective. The same kind of link would operate from the other direction, so that each XML "text" element in an electronically represented corpus of texts would be able to retrieve its geographical location and archaeological context from an archaeological dataset.
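The two-way linking described here can be sketched in a few lines of Python. This is a simplified illustration, not ArchaeoML itself: the element names, the `textref` and `itemref` attributes, and all identifiers are hypothetical, and the links are resolved between two in-memory documents rather than between Web locations.

```python
import xml.etree.ElementTree as ET

# Archaeological dataset: a nested spatial hierarchy of items.
SITE = """
<item id="site" type="site">
  <item id="area-A" type="area">
    <item id="room-3" type="locus">
      <item id="obj-17" type="tablet" textref="text-5"/>
      <item id="obj-18" type="vessel"/>
    </item>
  </item>
</item>
"""
# Philological dataset: each text points back to its artifact.
TEXTS = """
<texts>
  <text id="text-5" itemref="obj-17">transliteration goes here</text>
</texts>
"""

def context_path(site, item_id):
    """Return the ids from the top of the hierarchy down to the item."""
    parent_of = {child: parent
                 for parent in site.iter("item")
                 for child in parent.findall("item")}
    node = next((el for el in site.iter("item") if el.get("id") == item_id), None)
    path = []
    while node is not None:
        path.append(node.get("id"))
        node = parent_of.get(node)
    return path[::-1]

def context_for_text(texts, site, text_id):
    """From a text, follow its link to the archaeological context."""
    text = texts.find(f"text[@id='{text_id}']")
    return context_path(site, text.get("itemref"))

def text_for_item(site, texts, item_id):
    """From an artifact, follow its link to the text contents, if any."""
    item = next(el for el in site.iter("item") if el.get("id") == item_id)
    ref = item.get("textref")
    return None if ref is None else texts.find(f"text[@id='{ref}']").text

site = ET.fromstring(SITE.strip())
texts = ET.fromstring(TEXTS.strip())
print(context_for_text(texts, site, "text-5"))
print(text_for_item(site, texts, "obj-17"))
```

The tablet is stored once as an artifact and once as a text, and neither dataset duplicates the other's information; each side holds only a link.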
Abzu: Guide to Resources for the Study of the Ancient Near East Available on the Internet
http://www-oi.uchicago.edu/OI/DEPT/RA/ABZU/ABZU.HTML
The program of the conference is also available.
Charles E. Jones
Research Associate and Archivist - Bibliographer
The Oriental Institute
University of Chicago
1155 E. 58th St. Chicago IL 60637-1569
USA
Voice (773) 702-9537
Fax (773) 702-9853
Email: cejo@midway.uchicago.edu
David Schloen
Assistant Professor
The Oriental Institute
and The Department of Near Eastern Languages and Civilizations
University of Chicago
1155 E. 58th St. Chicago IL 60637-1569
USA
Voice (773) 702-1382
Fax (773) 702-9853
Email: d-schloen@uchicago.edu