Web Magazine for Information Professionals

SETIS: Electronic Texts at the University of Sydney Library

Creagh Cole describes a project dedicated to providing in-house access to a large number of electronic texts on CD-ROM.

The University of Sydney Library has acquired a large number of primary texts in digital form over the last few years. These texts include numerous versions of The Bible, the works of Shakespeare, Goethe and Kant, more than 700 classical Greek texts in the Thesaurus Linguae Graecae, the enormous Patrologia Latina Database of the Church Fathers, the English Poetry Full-Text Database, and the Intelex Philosophy Texts. To these texts and others like them must be added the texts available from remote sites such as the collection of some 2,000 French literary, scientific and philosophical texts at the Frantext Web site [1], and the many public domain texts available at the University of Virginia Electronic Text Centre [2] and the Oxford Text Archive [3]. This is the story of my Library’s attempt to provide the best possible support for this growing collection of electronic literary, philosophical and religious texts.

Whether commercially available (at substantial cost) or freely available on the Web, electronic texts such as these require a significant commitment of resources. If the Library chooses to provide the texts at all, it must also provide, at a minimum, computers and associated hardware, search software (continually updated), auxiliary text analysis software and staff expertise.

To address these problems my Library has initiated a new service devoted to the needs of textual scholars throughout the University. The service will not only maximise the utility of electronic texts currently held by the Library, but also provide valuable skills, knowledge and resources which can in turn feed into further initiatives in the digitisation of textual material within the Library and at the University more generally. The new service has no local precedent and has depended heavily on help from overseas sources and models, such as the University of Virginia.

This, then, is the story of the Scholarly Electronic Text and Image Service (SETIS) [4] at the University of Sydney Library [5]. The acronym SETIS refers to the Egyptian goddess of the inundation, and for no better reason than this the image of the goddess has become an icon for the new service at the University.

At the beginning of 1995 the Library approached a number of staff in the Arts Faculty on the feasibility of establishing an electronic text centre similar to centres already established at a number of universities in the United States. (The Centre for Electronic Texts in the Humanities maintains a Directory of Electronic Text Centres [6] worldwide.) Consequently, a successful submission was presented to the University’s Information Technology Committee on the basis of strong academic support for the proposal, the University’s traditional involvement in and depth of research within the arts and humanities, and, finally, the Library’s strong record in introducing, managing and supporting new information technology at the University. From the initial library group concerned with the project a full-time Coordinator was appointed for the service and space within the Library was found for the new service.

SETIS was officially opened by the Vice-Chancellor in September 1996, although it had been functioning for some months prior to that time. SETIS and its Coordinator were placed under the direct administrative control of the Associate Librarian largely responsible for its creation and the service currently operates with some independence from the departmental structure of the Library.

SETIS intended from the start to achieve results on a number of different levels:

1: Electronic Texts Within the Library

SETIS provides in-house access to a large number of electronic texts on CD-ROM, for use with IBM Pentium and Power Macintosh computers with high speed CD-ROM drives, large monitors and ample memory. The texts may be used in conjunction with auxiliary software: word processing, text analysis and text comparison programs such as TACT (Text Analysis and Computing Tools) and Collate. Furthermore, the Library now provides specialist training and support for the use of the texts. Previously the texts could only be used at slower machines already groaning under the weight of other functions required of them in the Reference Collection, and there was no access to auxiliary software programs. Staff could only learn about electronic texts as part of their other Reference responsibilities. Since the texts were relatively little used in comparison with many of the bibliographic databases, they received little priority among those responsibilities.

The gains have been important. It should, however, be mentioned that a disadvantage of the new service dedicated to electronic texts has been, paradoxically, reduced access to the texts within the Library since they are now located separately from the other services and SETIS has more restrictive hours than the library as a whole. This situation will change as staffing levels for the service are reviewed. More importantly, the intention of the SETIS project was to concentrate on networking the texts so that their physical availability within the Library became irrelevant to questions of access. The networking of the texts is the central development in this project.

2: Networking the Texts

A visit to the University of Sydney Library in 1995 by David Seaman from the University of Virginia Electronic Text Centre gave us a strong impression of the value of networking our texts. At the University of Virginia and a number of other sites, large bodies of text encoded in SGML (Standard Generalised Markup Language) are made available via a web interface with the Open Text search engine (originally developed to search the Oxford English Dictionary). The results of the search are filtered “on the fly” to HTML format for display on the user’s web browser.

The advantages of this approach are great. One of the major publishers of electronic text, Chadwyck-Healey, recently acknowledged them, releasing late last year a new web site of literary and reference texts (the Lion Web site [7]). Through this site the publisher is signalling its intention to move towards leasing access to its products rather than selling the texts themselves for libraries to mount independently.

Libraries that mount the texts themselves will avoid the annual leasing costs involved in such plans but will need to solve some major problems of their own. Nevertheless, the advantages of networking the texts in this way seem to outweigh the disadvantages, not least because skills learned in the process will feed into the Library’s own text creation and publishing ventures.

SETIS makes available a growing number of SGML encoded texts for use via web browsers. Many of these are commercial texts and are restricted to use at the University of Sydney only (this is regulated by the IP address of the client machine). However, some public domain texts and texts created at SETIS will be unrestricted and, therefore, available globally.
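The two mechanisms described above, filtering marked-up search results to HTML on the fly and restricting commercial texts by client IP address, can be illustrated with a minimal Python sketch. This is not the Open Text engine or the SETIS configuration: the tag mapping, the function names and the campus address range (a reserved documentation network) are all assumptions made for the example, and a real filter would also need to handle SGML entity references and a full DTD.

```python
import re
from html import escape
from ipaddress import ip_address, ip_network
from typing import Tuple

# Hypothetical campus range: 192.0.2.0/24 is a reserved documentation
# network, used here only as a stand-in for a real university range.
CAMPUS_NETWORKS = [ip_network("192.0.2.0/24")]

# Illustrative mapping from a few TEI-style element names to HTML tags.
TAG_MAP = {"head": "h2", "p": "p", "l": "div", "hi": "em"}

# Matches an opening or closing tag, with or without attributes.
TAG_PATTERN = re.compile(r"<(/?)([A-Za-z][\w.]*)[^>]*>")


def is_campus_client(client_ip: str) -> bool:
    """Return True if the client address falls inside a campus network."""
    address = ip_address(client_ip)
    return any(address in network for network in CAMPUS_NETWORKS)


def sgml_to_html(fragment: str) -> str:
    """Rewrite mapped tags, drop unmapped tags, and escape the text between.

    Attributes are discarded and entity references are not interpreted.
    """
    parts = []
    position = 0
    for match in TAG_PATTERN.finditer(fragment):
        parts.append(escape(fragment[position:match.start()], quote=False))
        closing, name = match.group(1), match.group(2).lower()
        if name in TAG_MAP:
            parts.append(f"<{closing}{TAG_MAP[name]}>")
        position = match.end()
    parts.append(escape(fragment[position:], quote=False))
    return "".join(parts)


def serve_hit(client_ip: str, fragment: str, restricted: bool = True) -> Tuple[int, str]:
    """Return an (HTTP status, HTML body) pair for one search result."""
    if restricted and not is_campus_client(client_ip):
        return 403, "<p>Access is restricted to campus machines.</p>"
    return 200, sgml_to_html(fragment)
```

A restricted text requested from outside the assumed campus range yields a 403 status, while a public domain text would be served with `restricted=False` regardless of the client address.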

3: Text Creation

SETIS provides IBM Pentium and Power Macintosh machines with flatbed scanners, imaging and OCR software as well as SGML editing software with compiled TEI and EAD DTDs for encoding of texts and library finding aids respectively. X-terminals provide direct access to texts on the main server and are expected to become increasingly important to the functioning of the service as we become more familiar with working directly on large bodies of texts and image databases in the UNIX environment.

SETIS is engaged in a number of text creation projects, and this has involved acquiring knowledge and skills not only about scanning and text recognition software but, more significantly, about Standard Generalised Markup Language (SGML) and the Text Encoding Initiative guidelines for humanities texts. Current projects include work on lecture notes by Professor John Anderson from the 1930s to the 1950s, which are held in the University of Sydney Archives; an edition of Lord Shaftesbury’s Characteristics of Men, Manners, Opinions, Times held in the Rare Books collection at Fisher Library; and digital images of the New Australia Journal, the Rare Books copies of which are in a state of decomposition. SETIS is also engaged in encoding the novels identified for digitisation by the Australian Co-operative Digitisation Project [8]. These projects will give us the expertise to provide support for similar initiatives at the University among academic staff and research students.
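To give a sense of what encoding a scanned text involves, the following Python sketch wraps paragraphs of OCR output in a skeletal TEI-style document. It is an illustration only, not the SETIS workflow: the function name is hypothetical, the element names follow the TEI conventions of the period (`TEI.2`, `teiHeader`, `text`, `body`, `p`), and the header is deliberately incomplete, since a valid TEI header also requires publication and source descriptions and the output is not validated against the DTD.

```python
from typing import List
from xml.sax.saxutils import escape


def encode_tei(title: str, author: str, paragraphs: List[str]) -> str:
    """Wrap plain-text paragraphs in a skeletal TEI-style document.

    The header here is a simplification and would not validate as-is.
    """
    lines = [
        "<TEI.2>",
        "<teiHeader>",
        "<fileDesc>",
        "<titleStmt>",
        f"<title>{escape(title)}</title>",
        f"<author>{escape(author)}</author>",
        "</titleStmt>",
        "</fileDesc>",
        "</teiHeader>",
        "<text>",
        "<body>",
    ]
    for paragraph in paragraphs:
        # Collapse the line breaks and runs of spaces left by OCR.
        cleaned = " ".join(paragraph.split())
        if cleaned:  # skip paragraphs that are empty after cleaning
            lines.append(f"<p>{escape(cleaned)}</p>")
    lines += ["</body>", "</text>", "</TEI.2>"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample input invented for the example.
    print(encode_tei("Sample Notes", "Unknown", ["First  line\nwrapped.", "Tom & Jerry"]))
```

In practice the structural markup (divisions, headings, page breaks, editorial notes) is where most of the encoding effort lies; the sketch covers only the outermost wrapper and paragraph level.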

Collaboration

From the beginning the SETIS project has sought and obtained support outside the Library. Academic staff within the Arts Faculty were closely involved from the early stages of the project and an Academic Advisory Group was formed with representatives from the Information Technology Committee, the Arts, Economics and Science Faculties and from SUPRA, the University’s Postgraduate Representative Association.

When we began our project we thought we might be breaking new ground in Australia. We quickly discovered that in this endeavour we were in friendly competition with the University of Western Australia’s Scholar’s Centre [9] which, under Dr Toby Burrows, had launched its own project centred on the networking of electronic texts but using a different search engine, Dynaweb. In fact, their use of Dynaweb gave us an opportunity to compare the two main search engines currently used for this purpose and, more generally, we have gladly exchanged information and experiences on a range of common problems.

SETIS was in large part modelled upon the initiatives taken at the University of Virginia by David Seaman. Following his visit to Sydney in 1995 we agreed to act as a mirror site for our region for the large collection of public domain texts provided by the University of Virginia. Since that time, similar agreements have been reached with the Oxford Text Archive and the Stanford University online Encyclopedia of Philosophy.

Conclusion

The Scholarly Electronic Text and Image Service began as an attempt to provide the best support possible for our growing collection of electronic texts. In the process, the Library has taken on tasks and responsibilities that have not traditionally been asked of it. These include supplying text analysis software which mediates the scholar’s use of his or her source texts, and participating in the creation of electronic versions of primary source material from the Library’s collection. It is interesting that from the initial meeting of academic and Library staff there was a general consensus that this was an entirely appropriate thing for the Library to be involved in. Indeed, academic staff voiced very strong support for the project, acknowledging the importance of Library involvement in the University’s adoption of new information technology.

References

  1. Frantext Web Site,
    http://www.ciril.fr/~mastina/FRANTEXT/
  2. University of Virginia Electronic Text Centre,
    http://www.lib.virginia.edu/etext/ETC.html
  3. Oxford Text Archive,
    http://sable.ox.ac.uk/ota/
  4. Scholarly Electronic Text and Image Service (SETIS),
    http://setis.library.usyd.edu.au/
  5. University of Sydney Library,
    http://www.library.usyd.edu.au/
  6. Directory of Electronic Text Centres,
    http://www.ceth.rutgers.edu/info/ectrdir.html
  7. Lion Web Site,
    http://lion.chadwyck.com/
  8. Australian Co-operative Digitisation Project,
    http://www.nla.gov.au/ferg/fergproj.html
  9. University of Western Australia’s Scholar’s Centre,
    http://www.library.uwa.edu.au/libweb/w_sch/

Author Details

Creagh Cole
SETIS Coordinator
Email: creagh@library.usyd.edu.au
Phone: +61 02 9351 7408
Fax: +61 02 9351 7290
SETIS Home Page: http://setis.library.usyd.edu.au/
Address: University of Sydney Library, University of Sydney 2006, Australia