In her regular appearance in Ariadne, Sue Welsh introduces a new experiment in network indexing underway at OMNI.
OMNI is by no means the first or only one of the Electronic Libraries Programme Access to Network Resources projects to experiment with using the well-known and popular Harvest software to create descriptions of networked resources automatically. EEVL (the Edinburgh Engineering Virtual Library), for example, recently announced the availability of its own Harvest experiment [1], and non-eLib gateway projects had, in any case, beaten us to it some time ago.
In the light of this activity, setting up a Harvest for OMNI is neither a difficult nor a revolutionary thing to do, and in fact one has existed in the development area of the OMNI server for some time. The best way to deploy the Harvester, though, was not immediately clear, especially given the prominence OMNI has given throughout the Project’s existence to the issue of the quality of Internet resources. This article explains how these concerns were resolved and describes the new OMNI Harvester. Finally, if you are a researcher or practitioner looking after a list of links in your own subject area which you’d like to share, we’d very much like to hear from you, so read on!
Commonly, Harvest has been used to make an automated, full-text version of gateways containing records created by hand. In this scenario, the URLs contained in the gateway records are used as starting points for Harvest. After indexing the document identified by a URL, Harvest is usually allowed to follow the URLs contained in that document and continue indexing.
As all the starting points are relevant, it is likely that the URLs in the corresponding documents will point to other relevant resources. Of course, the further Harvest is allowed to wander, the less relevant the documents become to the subject gateway. It is assumed, however, that most documents only a few hops from the originals selected by the subject gateway will be useful.
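The hop-limited approach described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Harvest's own code or configuration: the function name `crawl`, the `max_hops` parameter and the in-memory `links` table (standing in for real Web pages) are all assumptions made for the example.

```python
from collections import deque

def crawl(start_urls, links, max_hops):
    """Return {url: hops} for every page within max_hops of a starting point.

    `links` is a hypothetical table mapping each URL to the URLs it cites;
    a real indexer would fetch and parse each document instead.
    """
    hops = {url: 0 for url in start_urls}
    queue = deque(start_urls)
    while queue:
        url = queue.popleft()
        if hops[url] == max_hops:
            continue  # hop limit reached: index this page, follow nothing
        for target in links.get(url, []):
            if target not in hops:  # visit each page only once
                hops[target] = hops[url] + 1
                queue.append(target)
    return hops

# Hypothetical link graph: one gateway record and the pages reachable from it.
links = {
    "http://example.org/record": ["http://example.org/a", "http://example.org/b"],
    "http://example.org/a": ["http://example.org/c"],
    "http://example.org/c": ["http://example.org/d"],
}
reached = crawl(["http://example.org/record"], links, max_hops=2)
```

With `max_hops=2` the page three hops away is left out; with `max_hops=0` only the starting points themselves would be indexed.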
OMNI has always made evaluation and selection cornerstones of its approach to creating its gateway to biomedical resources. If we created this sort of Harvest database, it would be necessary to make quite clear to our users that they were using something quite different; nothing in the Harvest database could be said to have been reviewed, evaluated or selected. After much discussion, there was no consensus on whether this was a useful thing to do. Was there another option?
The OMNI Harvester is an experiment based around these two strands of activity. It is a full-text database of resources taken from listings compiled by subject experts (in virology, nutrition, pediatrics, orthopaedics and neuroscience) or by projects such as PharmWeb [2] (for pharmacology) and the UK Human Genome Mapping Project [3] website (for genetics/molecular biology). Seven subject areas are covered so far, and the scope of the Harvester will be extended. Because we constrain the Harvest software so that it does not travel away from the resources contained in the lists, the key element of selection is retained.
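As a rough illustration of why this constraint preserves selection, the sketch below builds a small full-text index from expert listings only. It is not the Harvest software or the OMNI implementation: the names `build_index` and `search`, the sample listings and the page texts are all hypothetical.

```python
import re
from collections import defaultdict

def build_index(listings, pages):
    """Index the words of listed pages only; unlisted pages are never read."""
    index = defaultdict(set)
    for subject, urls in listings.items():
        for url in urls:
            text = pages.get(url, "")
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].add((subject, url))
    return index

def search(index, term):
    """Return the (subject, url) pairs whose page text contains the term."""
    return sorted(index.get(term.lower(), set()))

# Hypothetical listings supplied by subject experts.
listings = {
    "nutrition": ["http://example.org/nutrition/vitamins"],
    "virology": ["http://example.org/virology/influenza"],
}
# Hypothetical page texts; the last page is on no list, so it is not indexed.
pages = {
    "http://example.org/nutrition/vitamins": "Vitamin requirements in adults",
    "http://example.org/virology/influenza": "Influenza virus structure",
    "http://example.org/unlisted": "Unreviewed page mentioning vitamin supplements",
}
index = build_index(listings, pages)
hits = search(index, "vitamin")
```

Here a search for "vitamin" returns only the page chosen by the nutrition expert, although an unlisted page also contains the word.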
In the future we hope to integrate the Harvester and the main OMNI databases so that both can be searched simultaneously.
Even better, if you are a subject expert maintaining a list of resources, contact us and we’ll include your resources next time the Harvester is updated. All the listings involved are used with permission, and are prominently linked from the Harvester pages.
[1] EEVL, the Edinburgh Engineering Virtual Library, <http://eevl.ac.uk/>
[2] PharmWeb: Pharmacy Information on the Internet, <http://www.pharmweb.net/>
[3] UK MRC HGMP Resource Centre, <http://www.hgmp.mrc.ac.uk/>
[4] The OMNI Harvester, <http://omni.ac.uk/general-info/harvest.html>
Material on this page is copyright Ariadne/original authors. This article last updated/links checked on January 27th 1997