Web Magazine for Information Professionals

Distributed Services Registry Workshop

John Gilby reports on the UKOLN/IESR two-day workshop at Scarman House, University of Warwick on 14-15 July 2005.

The number of available online digital collections is growing all the time, and with this comes the need to discover these collections, both by machine (m2m) and by end-users. There is also a trend towards service-orientated architectures, and service registries are likely to be a critical part of this, assisting with the discovery of services and their associated collections. UKOLN and the JISC Information Environment Services Registry Project (IESR) [1] organised a two-day workshop to look at some of the issues that are likely to arise in building a distributed approach. All presentations from the workshop are available on the UKOLN Web site [2].

Presentations

Andy Powell, UKOLN, began proceedings by outlining the purpose of the workshop. The first day was an opportunity via a number of presentations for sharing knowledge of current approaches to service registries and was not limited to the UK. Andy then went on to say that delegates would be required to be more active on Day 2 by taking part in the planned breakout sessions considering the many issues with service registries that are distributed. It was hoped that future work could be agreed and potential partnerships and funding sources identified at the end of the workshop.

The next session was a series of three short presentations from the IESR Project. Amanda Hill, MIMAS gave an overview of the project, currently in its third phase and holding data on around 260 electronic resources contained within the JISC Information Environment. Pete Johnston, UKOLN, then described the background and outline workings of the RSLP Collection Description Schema [3] which forms the basis of the IESR collection metadata. He then concluded his session by outlining the Dublin Core Collection Description Application Profile and the NISO Metasearch Initiative.

The final IESR presentation was given by Ann Apps, MIMAS, who spoke in more detail about how the IESR works. The metadata describes the resources and includes information on how to access them by various methods such as Z39.50, SOAP (Simple Object Access Protocol), OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting), Web/CGI and by location URL. All IESR records are covered by Creative Commons licences and all resource providers have agreed to this as it effectively advertises their services in the registry. IESR records can be accessed at present via the Web, Z39.50 and OAI-PMH. Ann concluded her presentation by sharing some thoughts on some current issues identified by the project (scope, scalability, manageability, data ownership, relationship with library portal products) and a distributed service registry model.
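For readers unfamiliar with m2m access over OAI-PMH, the following is a minimal sketch of how a harvester might build a ListRecords request. The base URL is a placeholder rather than the real IESR endpoint, and the function name is an assumption made for illustration; only the verb and argument names (metadataPrefix, from, set) come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url is a hypothetical registry endpoint; oai_dc is the metadata
    format every OAI-PMH repository is required to support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Selective harvesting: only records changed on or after this date.
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Placeholder endpoint, not a real service address.
url = oai_list_records_url("http://registry.example.org/oai", from_date="2005-07-14")
```

A harvester would then fetch this URL and parse the XML response, following any resumption token the repository returns.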

Following a sumptuous lunch and the opportunity to network, Jeremy Frumkin, Oregon State University, detailed what they were doing in creating the OCKHAM Digital Library Services Registry (DLSR) [4]. The broad goals of the project were to create a registry for all possible digital library services and to enable m2m digital library service resolving. The DLSR is distributed from the outset, having many nodes over the network, and the approach was partly based on the DNS with its hierarchical design and de-centralised administration. Each node has a local copy of the complete registry and OAI is used for propagating data across the network. Like the IESR, the DLSR needs to be scalable and manageable and to use existing standards and technologies; it offers OAI-PMH, Z39.50, SRU/W (Search and Retrieve URL Service/Search and Retrieve Web Service) and Web interfaces to its metadata.
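The idea of every node holding a complete copy, kept in step by harvesting, can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions and not the OCKHAM implementation: the Node class, its methods and the record layout are all hypothetical, and a date-filtered method call stands in for a real OAI-PMH selective harvest.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical registry node holding a full local copy of the registry."""
    name: str
    # identifier -> (ISO datestamp, description)
    records: dict = field(default_factory=dict)

    def register(self, identifier, datestamp, description):
        self.records[identifier] = (datestamp, description)

    def list_records(self, since=""):
        # Stand-in for a selective harvest: ISO dates compare correctly as strings.
        return {i: r for i, r in self.records.items() if r[0] >= since}

    def harvest_from(self, other, since=""):
        # Copy in anything new, keeping whichever version has the later datestamp.
        for identifier, (stamp, desc) in other.list_records(since).items():
            current = self.records.get(identifier)
            if current is None or current[0] < stamp:
                self.records[identifier] = (stamp, desc)


a, b = Node("node-a"), Node("node-b")
a.register("svc:1", "2005-07-14", "Z39.50 target")
b.register("svc:2", "2005-07-15", "OAI repository")
b.harvest_from(a)
a.harvest_from(b)  # both nodes now hold both records
```

Resolving conflicts by latest datestamp is one simple choice; a real network would also need to handle deletions and clock differences between nodes.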

Rob Sanderson, University of Liverpool, then gave an overview of the NISO Metasearch Initiative, concentrating on one of its three task groups, Collection and Service Descriptions. The purpose of the group was defined as being "To enable the discovery of appropriate, remotely maintained content and a means of retrieving that content", and the scope of the group was enabling the retrieval of items. The group has devised a draft Collection Description Schema based on the Dublin Core Collection Description Application Profile, which provides a core set of collection description properties suitable for collection discovery (rather than item discovery). Service descriptions form the other element of the group's work, and it has been decided to use the ZeeRex [5] schema for service information. Rob concluded by commenting that the Metasearch Initiative does not consider service registries to be within its remit, tending rather to concentrate on recommending best practice.

Next up was Thomas Habing, University of Illinois at Urbana-Champaign (UIUC), who described the UIUC OAI registry. The catalyst for the registry was an identified need for finding relevant repositories to harvest. The registry itself is not distributed but receives regular metadata feeds from distributed services. Registry entries come from various existing registries, from use of OAI Protocol features of Friends and Provenance, from periodic searching of Google Web indexes and by manual addition. There was a desire to make the registry's contents available m2m and this is achieved via OAI-PMH, an RSS feed (to notify changes in the registry) and via an SRU service for searching the registry.
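As an illustration of what searching a registry over SRU looks like from the client side, here is a minimal sketch that builds a searchRetrieve request. The endpoint is a placeholder and not the UIUC registry's address, and the CQL index used in the example query is an assumption; the parameter names (operation, version, query, maximumRecords) are those defined by SRU 1.1.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, maximum_records=10):
    """Build an SRU 1.1 searchRetrieve URL for a CQL query.

    base_url is a hypothetical SRU endpoint for a service registry.
    """
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


# The dc.title index is assumed; a real server advertises its indexes
# in its explain record.
search = sru_search_url("http://registry.example.org/sru", 'dc.title = "physics"')
```

The response is an XML document containing the matching registry records, which a client could then use to decide which repositories to harvest.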

Following a short break, Wilbert Kraan, CETIS, presented some thoughts on service registries and e-learning, starting off by describing the Content Object Repository Discovery and Registration/Resolution Architecture Project (CORDRA) [6] that has developed a model of how to create local federations of repositories and has also built practical implementations of the model. Wilbert then spoke about registries in the eFramework, some lightweight solutions and finished by raising a number of open issues including relationships between national and local services with local and national registries respectively, authentication, authorisation and access control.

Matthew Dovey, Oxford University e-Science Centre, spoke about an evaluation of Universal Description Discovery and Integration (UDDI) [7] for the UK e-Science GRID and began by giving an overview of UDDI. Following details of the evaluation, Matthew concluded that UDDI could be used to provide an infrastructure for UK e-Science but that there also remained issues which it was hoped would be addressed by the next version of UDDI.

The last presentation was given by Jeffrey Young, OCLC Office of Research, and introduced their work on WikiD (Wiki Data). WikiD extends the Wiki model to "..support the creation and maintenance of structured data..". WikiD thus could support such things as MARC data and field level data editing and searching. Jeffrey did a walkthrough of a normal WikiPage creation and contrasted this with the creation of a WikiD 'collection' which also entailed the addition of a new XML schema. WikiD supports a variety of protocols including OAI-PMH, SRW/U, OpenURL and RSS.

The eating theme returned during the evening with the Workshop Dinner which, with fantastic food and suitable lubrication, provided a good forum for discussing the day's presentations, networking and generally chatting about issues as diverse as rock groups and Z39.50!

Breakout Groups and Discussions

After yet more food at the breakfast bar, the second day required a greater degree of delegate participation. Three breakout groups were formed to consider what issues exist in creating a viable, globally distributed service registry. In case anyone thought that task was easy, delegates were also asked where possible to suggest appropriate solutions to the issues. Feedback from the breakout sessions has been grouped together under three headings for the purposes of this report. As often happens, there are more questions raised than there are answers identified.

Usage Issues

The purpose and value of a Distributed Service Registry (DSR) are unclear: is there a cost benefit in creating a DSR? Who are the likely users and how will they want to use the registry? To provide evidence of need, it was suggested that different user communities be asked what they want. It is likely that the different communities would require different approaches and standards, so perhaps some form of minimum input standard could be devised which suits the needs of most.

Technical Issues

Should the registry be service- or collection-driven? Perhaps co-locating service and collection descriptions should be avoided? Start with services, as services have associated collections; but in terms of the DSR, what is a service? How would the DSR be searched? It may be better to avoid cloning the whole registry: there could be local registries as sub-sets of the global whole, or there could be 'core' metadata for sharing with extensions for specific user groups. How feasible will it be to agree a reference model? There is also the matter of the appropriate record: there are likely to be multiple records describing the same resource, so which one should be used and where? Is there a role for ebXML (Electronic Business using eXtensible Markup Language)?
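The 'core plus extensions' suggestion can be made concrete with a short sketch. Nothing here was agreed at the workshop: the set of core field names and the helper function are hypothetical, chosen only to show how a record might be divided into a shared part and a community-specific part.

```python
# Hypothetical core fields that every registry node would share.
CORE_FIELDS = {"identifier", "title", "description", "access_url"}


def split_record(record):
    """Separate a record into shared core metadata and local extensions."""
    core = {k: v for k, v in record.items() if k in CORE_FIELDS}
    extensions = {k: v for k, v in record.items() if k not in CORE_FIELDS}
    return core, extensions


record = {
    "identifier": "svc:1",
    "title": "Example catalogue",
    "access_url": "http://catalogue.example.org/z3950",
    "elearning_level": "undergraduate",  # extension for one user community
}
core, extensions = split_record(record)
```

Under such a scheme only the core part would be propagated globally, while extensions stay with the local registry that needs them.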

Management Issues

Intellectual property rights (IPR) of the metadata within the DSR are also an important issue: can the records be used (or reused) by system vendors? What about access control to the records: will there be public and private elements of the records, and will the same access control and IPR cover the whole record? Who sets up nodes in the DSR, who is responsible for updating records, and ultimately, who pays for it all? How can organisations be motivated to keep entries up to date? What should be done about ensuring consistency with regard to the quality of records?

Conclusions

Where do we go from here? I might have missed something, but to me there did not seem to be an obvious way forward. Certainly delegates agreed that the framework in which the DSR would sit will benefit from being clarified, and use cases need to be investigated across digital libraries, eScience, eLearning, museums, etc. Development of an agreed reference model for service registries would also be an important task, along with practical experimentation between projects/services such as OCKHAM, IESR and eScience. Leona Carpenter, JISC, summed up the workshop and commented that funding appeared to be the biggest issue. It was not clear which future activities would require funding and which were covered by existing project budgets. There was support from delegates for collaborative (cross domain/country) funding and this is an area that ought to be explored further.

References

  1. The IESR Project Web site: http://iesr.ac.uk/
  2. Workshop presentations: http://www.ukoln.ac.uk/events/dsr-workshop-2005/programme.html
  3. RSLP Collection Description Schema: http://www.ukoln.ac.uk/metadata/rslp/schema/
  4. The OCKHAM Digital Library Service Registry: http://www.ockham.org/
  5. ZeeRex Web site: http://explain.z3950.org/
  6. CORDRA Web site: http://cordra.lsal.cmu.edu/cordra/
  7. UDDI Web site: http://www.uddi.org/

Author Details

John Gilby
Project Manager M25 Systems Team
London School of Economics

Email: j.gilby@lse.ac.uk
Web site: http://www.m25lib.ac.uk/M25link/
