Hans W. Groenewegen, Debbie Hedger and Iris Radulescu describe Monash University's Electronic Reserve Project, which is at the core of the electronic library project at the University's new Berwick campus. This paper was written for presentation at CAUSE in Australia '97, Melbourne, 13-16 April 1997. This article appears in the Web version only of Ariadne.
During the second half of 1995, as the building was slowly beginning to take shape on the University's Berwick campus, the Library started work on an electronic "reserve" [1] facility that was to be at the core of the electronic library. It was decided to base the e-reserve [2] on an existing image system that had been developed largely in-house by Library systems staff to give access to past examination papers. This system stores digitised images of examination papers, which can be retrieved for viewing and printing. Most of the functionality required for an electronic reserve system was already incorporated in the examination papers system, including a module which would let the Library charge a fee to recover printing costs. The major new features required were more direct integration with the Library's on-line public access catalogue (OPAC) and a copyright management facility.
However, it did not appear likely that a satisfactory arrangement would be arrived at in time to fit in with Monash University's timetable for the Berwick campus. It was therefore decided to adopt as an interim solution the approach previously taken, reportedly with success [3], by San Diego State University, and to contact publishers direct.
To this end a form letter was developed and this was used in all initial approaches to publishers. The following conditions were offered by the Library in return for permission to scan and store the material in question:
RP x TP x U.
In this formula the value of RP is the Recommended Retail Price of the monograph, or, in the case of a journal the single issue price of an issue of the journal, or the annual subscription price.
The value of TP is based on a straightforward page count, i.e. number of pages scanned as a fraction of the total number of pages of text contained in the monograph, or in the case of a journal, the total number of pages of editorial contained in the entire issue (if RP is the single issue price) or the entire volume (if RP is the annual subscription price).
The value of U is the number of students enrolled in the course divided by 10, or 15, whichever is the greater [4].
The rationale for dividing the number of potential users by 10 is that this would be the likely number of uses of a physical copy of the material that the library would expect to get in the conventional reserve system. The rationale for introducing a minimum value of 15 for U is that otherwise the royalty payable for small classes (i.e. classes of fewer than 150 students) would become trivial.
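The formula can be illustrated with a short worked sketch. The function name and the sample figures (a 50-dollar monograph, 20 pages scanned out of 400, classes of 300 and 80 students) are hypothetical and chosen only to show how the minimum value of U takes effect; they are not taken from the Library's records.

```python
def royalty(rp, pages_scanned, total_pages, students_enrolled):
    """Royalty under the RP x TP x U formula described above.

    rp                 recommended retail price (or issue/subscription price)
    pages_scanned      number of pages scanned
    total_pages        total pages of text in the monograph, issue or volume
    students_enrolled  number of students enrolled in the course
    """
    tp = pages_scanned / total_pages      # fraction of the work scanned
    u = max(students_enrolled / 10, 15)   # enrolment / 10, with a floor of 15
    return rp * tp * u


# Hypothetical examples: a large class and a small class.
large_class = royalty(rp=50.0, pages_scanned=20, total_pages=400,
                      students_enrolled=300)   # U = 30
small_class = royalty(rp=50.0, pages_scanned=20, total_pages=400,
                      students_enrolled=80)    # U = 15 (the minimum applies)
print(f"large class: {large_class:.2f}, small class: {small_class:.2f}")
```

For any class of fewer than 150 students the enrolment divided by 10 falls below 15, so the minimum applies and the royalty no longer shrinks with class size.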
As is clear from Table 1, the approach taken by the Library was successful to the extent that the majority of publishers who replied gave their approval without charging a royalty. The minority who wished to charge a fee were generally happy to accept the "Monash formula". The major problem encountered was that a very large number of publishers did not reply at all, in spite of several follow-up letters. Almost certainly, in the majority of cases the letter ended up in their "too hard basket", although inevitably there were also problems locating a current address for many smaller publishers, particularly of older publications.
| Number of requests sent: | 1,235 |
| Number of requests approved (no fee): | 489 |
| Number of requests approved (fee charged): | 53 (*) |
| Number of requests denied: | 45 |
| No reply: | 648 |
Table 1: Fate of Requests for Approval to Scan (as at 28/2/1997)
The multipage feature of this format also means that documents are much more manageable, as all the related pages are stored within a single disk file, rather than in multiple files that must be put together by program or other means. PDF (Adobe Acrobat) also handles multipage documents. However, later statistics show that it is very labour intensive to create PDF documents, which may themselves include other binary formats and even TIFF, with the added drawback that speeding up their printing is not possible, unlike with TIFF - more about this later. It is generally accepted in imaging circles that TIFF Group 4 is the best option for imaging books or documents that are black and white text, even when these include the occasional diagram.
There are drawbacks to the use of this TIFF format, one being that multipage is not handled well by most of the available freeware viewers for Windows 3.11. Also, the specification [5] allows for a huge range of variations, including the hardware-specific fill-order of bytes and even bits within the byte. These problems can, however, be overcome by software, and we have been singularly successful at that.
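How software can cope with the byte-order variation, and with several pages held in one file, can be shown with a minimal sketch. This is not the code used in the Library's system; it is a Python illustration with a hypothetical function name. It relies only on the basic layout laid down in the TIFF specification: a two-byte order mark ("II" or "MM"), the number 42, the offset of the first image file directory (IFD), and, at the end of each IFD, the offset of the next one (zero after the last page). It does not deal with the bit fill-order within a byte.

```python
import struct


def count_tiff_pages(data):
    """Count the pages (IFDs) in the bytes of a classic TIFF file."""
    order = data[:2]
    if order == b"II":
        endian = "<"    # little-endian byte order
    elif order == b"MM":
        endian = ">"    # big-endian byte order
    else:
        raise ValueError("not a TIFF file: unknown byte-order mark")

    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file: magic number is not 42")

    pages = 0
    seen = set()
    while offset != 0:
        if offset in seen:
            raise ValueError("corrupt file: IFD chain loops back on itself")
        seen.add(offset)
        # An IFD is a 2-byte entry count, then 12 bytes per entry,
        # then the 4-byte offset of the next IFD.
        (n_entries,) = struct.unpack_from(endian + "H", data, offset)
        (offset,) = struct.unpack_from(endian + "I", data,
                                       offset + 2 + 12 * n_entries)
        pages += 1
    return pages
```

Reading a file with `open(path, "rb").read()` and passing the result to `count_tiff_pages` would give the page count regardless of which byte order the scanner or bureau used.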
Scanning: Very few programs for the IBM PC exist that interact with a scanner to allow the creation of the required TIFF format. Utility tools from Kofax, Watermark and Accusoft were trialled and eventually the Accusoft engine was included in the final product. However, there are other issues in scanning large numbers of pages, such as the cost and availability of high-speed scanning equipment and the number of staff needed to handle what is, essentially, labour-intensive work. For this reason, it was decided to send large batches of pages or volumes to be scanned by an outside bureau specialising in this sort of work. The Library has received excellent service from its chosen contractor, the Australian Securities Commission, which uses specialised high-speed scanners and a properly run quality control workflow, so that results are always of the highest quality.
In the context of the day-to-day operation of the e-reserve section, however, many book and article excerpts are trickling in. These can easily be scanned in-house, using a simple scanning utility built into the viewing module, or using the off-the-shelf Watermark program. The images are stored directly on the computer's hard disk. To overcome the limitations of flat-bed scanners (the Library uses an HP ScanJet 3c), a photocopy of the chapter or article is first made. These A4 pages are then fed into the scanner's ADF for automatic chaining within the multipage TIFF file.
For both in-house and bureau scanning, the very simple idea was adopted of barcoding the article or excerpt and using the barcode number as the file name. This has worked extremely well for both imaging implementations. The graphic resolution used is quite low, at 200x200 or 300x300 dots per inch; this yields perfectly acceptable quality both on screen and in print and leads to very small file sizes. For example, 20 densely-printed pages, scanned in-house at 300 dots per inch, are stored in a 1 Megabyte file.
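To put that file size in perspective, the following back-of-the-envelope sketch estimates what the same 20 pages would occupy without compression. It assumes full A4 pages (210 mm by 297 mm), one bit per pixel as for black and white text, and takes 1 Megabyte as one million bytes; the function name is hypothetical.

```python
MM_PER_INCH = 25.4


def uncompressed_bilevel_bytes(width_mm, height_mm, dpi):
    """Size in bytes of a 1-bit-per-pixel page image with no compression."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px / 8   # 8 pixels per byte


page_bytes = uncompressed_bilevel_bytes(210, 297, 300)   # one A4 page
raw_total = 20 * page_bytes                              # 20 pages
stored_total = 1_000_000                                 # the 1 Megabyte file
ratio = raw_total / stored_total

print(f"uncompressed: {raw_total / 1e6:.1f} MB, "
      f"compression ratio about {ratio:.0f} to 1")
```

Under these assumptions each page is a little over one Megabyte uncompressed, so the stored file is roughly one twentieth of the raw size.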
Storage: When large batches of documents are scanned by the bureau, the resulting CD-ROM is used both as a transfer medium and as a back-up medium. For speed of access, images are transferred to the hard disk of the server, which is periodically backed up to DAT tape. In-house scans go directly to the hard disk, so they are also picked up on the tape backup.
Retrieval of imaged documents: At the point when it comes to handling more than 100 documents, one needs a database and a search algorithm. In the case of the Monash Library, PALS, the Library's on-line catalogue, already holds the searchable records of all required course reading, making it an ideal candidate for the on-line e-reserve database. The Library also has a web-based interface to this mainframe catalogue, implemented via CGI programming to pass search parameters from a web user to the mainframe, capture the search results, format them as HTML, then display them to the user's browser. For imaged course materials, a tag is added to the catalogue record, containing the full path name of the associated image. The CGI program, in turn, was modified to look for this tag before displaying the search results, making this into a hot link with the caption "View this item".
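The link-generation step can be sketched as follows. This is an illustration in Python rather than the Library's CGI program: the record layout, the `image_path` field name and the base URL are all hypothetical stand-ins for the catalogue tag and the e-reserve server address.

```python
from html import escape
from urllib.parse import quote

# Hypothetical address of the e-reserve web server.
BASE_URL = "http://ereserve.example.edu/images/"


def format_record(record):
    """Format one catalogue search result as an HTML paragraph.

    If the record carries an image path, a "View this item" hot link
    is appended; otherwise the record is displayed as usual.
    """
    html = f"<p>{escape(record['author'])}: <cite>{escape(record['title'])}</cite>"
    path = record.get("image_path")
    if path:
        url = BASE_URL + quote(path)   # quote() leaves "/" untouched
        html += f' <a href="{escape(url, quote=True)}">View this item</a>'
    return html + "</p>"


# Hypothetical records: one with a scanned image, one without.
scanned = {"author": "Smith, J.", "title": "Chapter 3",
           "image_path": "ABC1000/31234567.tif"}
on_shelf = {"author": "Jones, K.", "title": "Readings"}
print(format_record(scanned))
print(format_record(on_shelf))
```

Because the link is produced at display time from a field in the catalogue record, no separate index of images has to be maintained.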
When the user clicks on the hyperlink, the browser requests a full URL, and the e-reserve WWW server automatically retrieves the image file and serves it directly to the user's browser. The delivery mechanism is built into HTTP, and so is the subsequent handling of the image file: if the user's browser has a viewer configured to handle TIFF files, then this is invoked by the browser; otherwise the user is prompted to get and install a multipage TIFF viewer. This kept programming and development work to a minimum, while utilising the best and most up-to-date tools to achieve the purpose.
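The hand-off from server to viewer depends on the content type that the server attaches to the file. A minimal sketch of that labelling step is given below; it assumes the server maps file extensions to content types and uses `image/tiff`, the registered media type for TIFF. The function name and the extension table are hypothetical, and a real web server would normally do this from its own configuration.

```python
import os

# Hypothetical extension-to-type table; a real server reads its own.
CONTENT_TYPES = {".tif": "image/tiff", ".tiff": "image/tiff"}


def response_headers(file_name, size_in_bytes):
    """Build the HTTP response header lines for a stored image file."""
    extension = os.path.splitext(file_name)[1].lower()
    content_type = CONTENT_TYPES.get(extension, "application/octet-stream")
    return [
        "HTTP/1.0 200 OK",
        f"Content-Type: {content_type}",
        f"Content-Length: {size_in_bytes}",
    ]


for line in response_headers("31234567.tif", 1_000_000):
    print(line)
```

On receiving a content type it cannot display itself, the browser looks up the helper application registered for that type, which is how a separately installed multipage TIFF viewer comes to be launched.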
Viewing: Because of the lack of commercially available viewers capable of handling multipage TIFF under Windows 3.11, one was developed in-house: Monview, a standalone viewer based on the Accusoft imaging engine. This is available for FTP download from the campus web server [6], and has been downloaded by about 500 users world-wide in the past 9 months, since its development. The viewer also includes options for scanning and printing TIFF images, and can also work as a DDE server.
Printing: The thorniest issue in imaging implementations is the very slow speed of printing, since any graphics files are converted by the Windows printing engine before being submitted to the printer. While it does that, the workstation is kept busy, sometimes for a very long time. The Library solved this problem by implementing a "fast-print" method, which relies on using a specialised printer capable of doing the decompression and printing at the same time. QMS printers with the ImageServer option accept the raw compressed TIFF and process it extremely fast, freeing up the computer within a fraction of a second [7]. There is specialised code added to the Monview viewer which detects the presence of a QMS printer, then encapsulates the raw TIFF in PostScript and submits it directly to the printer queue, bypassing the Windows conversion mechanisms. Although all new models of QMS also accept raw TIFF, it was decided to stay with EPS because it permits separation of the pages, so they can be counted for internal accounting purposes.
Print charges are levied via the University's networked print accounting system, itself using a shareware program called PCOUNTER, or via Unicard charge cards on slaved QMS printers.
When the library opened in late March 1996, there were 24 networked Pentium workstations and a QMS printer. There were 72 documents scanned on the database and available for viewing. A further 73 articles were available in "hard copy" format, as permission to scan had not yet been received. By the end of 1996 there were around 500 items available for viewing on the electronic reserve database.
The workstations at Berwick are set up with the web interface to the Library catalogue. To catalogue new additions to the e-reserve system, the conventions of the traditional reserve system were followed. An item record is created with author, title, description and identification fields. When the publisher's permission to scan an item is received, the item is scanned using the document feeder on the scanner. It takes approximately 20 seconds per page to scan in an article [8]. The resulting TIFF image is saved to the image drive, under the appropriate course code directory. To link the catalogue record and the image file, the identification field in the catalogue record is updated and then the image is viewable immediately over the web.
Responses from users have been varied. The best use of e-reserve seems to occur when lecturers encourage their students to use it. For example, when they brought their students into the library for classes at the beginning of semester, this resulted in the highest use of e-reserve by these students. Other students learned to browse by asking library staff for help and soon became quite successful in browsing for themselves. Mature age students found the whole system quite overwhelming, and many of them made a point of coming into the library for lunch time user education classes and even after examinations were over, for refresher lessons, to prepare themselves for the next academic semester. Some students could not understand why the e-reserve articles were not on the shelf, like the ones for which the Library had not received permission to scan and which were on the shelves in folders.
The advantages of electronic reserve were explained to them, such as the item never being "out on loan", and its protection against theft and mutilation by anti-social students. Also, in "conventional" reserve, students can spend a good deal of time queuing to use an item that is in great demand, and then, quite probably, again to use the photocopier. Having e-reserve cuts down on the chaos created when one or two hard copies of a document are required by an entire class.
Material on this page is copyright Ariadne/original authors. This article last updated/links checked on 12-Mar-1997