Web Magazine for Information Professionals

Towards Library Groupware With Personalised Link Routing

Daniel Chudnov, Jeremy Frumkin, Jennifer Weintraub, Matthew Wilcox and Raymond Yee describe a potential groupware framework for integrating access to diverse information resources and distributed personal collection development.

'Library groupware' - a set of networked tools supporting information management for individuals and for distributed groups - is a new class of service we may choose to provide in our libraries. In its simplest form, library groupware would help people manage information as they move through the diversity of online resources and online communities that make up today's information landscape. Complex implementations might integrate equally well with enterprise-wide systems such as courseware and portals on a university campus, and desktop file storage on private individual computers. Ideally, successful library groupware should provide individuals and groups with a common set of information functions they may apply to any information they find anywhere.

In this article we make a simple case for library groupware as a unifying service model across disparate information environments. We consider the distributed, personalised collection development model that groupware would serve, and propose an architectural model which might provide a first step in an evolutionary path from today's commonplace digital library services towards integrated library groupware.

Why Do We Need Groupware?

Consider three networked applications that are already used constantly: link resolvers [1], which short-cut access from one Web resource to related resources or library services; bibliographic reference managers, which enable users to manage records about information resources they might need to reference again; and weblogs, which let anyone write whatever they want about anything they like. Support for each of these applications varies widely in today's libraries. Link resolvers are centralised tools used via the Web by users and library staff to connect licensed resources and library services; traditional tools for reference management are desktop applications introduced to library visitors via bibliographic instruction, although recent versions and new products make Web-based reference management possible; weblogs are typically managed by weblog users themselves, with only a few examples of weblog support provided by libraries or campus computing services to be found.

It is interesting to examine the relationships between these tools and what they help users to do. For example, is following a cited reference link to a link resolver the same kind of action as following a link on someone's weblog? Are citing a work in a peer-reviewed paper and citing a work on a weblog the same action, or are they different somehow? Because the support levels libraries provide for each kind of application vary widely, it might seem natural to consider that these applications and their functions are quite different. But it also seems likely that to the library users following and citing many references from many sources as they manage the bibliographic lifecycle of their ongoing work, the functions these applications provide are quite similar.

In a fluid world where users move regularly between informal discussion and scholarly/research domains, we can consider the functional areas of linking, reference management, and weblogging to be service points on a single continuum of information gathering, study, and creation. Following a reference from a weblog or from a scholarly article are each similar steps in exploring threads of related ideas. Capturing a reference in your own weblog or reference library indicates that the citation somehow relates to your own thought process. Publicly citing a reference more closely associates your thinking with that of others.

The link resolver solution works because it simplifies navigation through diverse library resources. There are so many online resources with so many different interfaces that it can become nearly impossible for users to move naturally through the threads of ideas embodied in the content of those resources without link resolvers. Libraries that provide reference-linking services with link resolvers provide navigational clarity to this sea of interface complexity. Resolver services also let librarians customise the connections between the formally published resources contained within the centralised information space defined by library collections. Although these are major improvements for users and librarians, the benefits are limited to the use of centrally collected library resources.

The broader information landscape - including library resources among weblogs, pre-print archives, and decentralised information resources and repositories mingling with enormous desktop computing power and storage on private devices - is where users and groups find, collect, and use information today. We would do well to consider how we might bring better navigational clarity and the ability to customise connections to this more diverse and decentralised information landscape.

Formalising Personal Collection Development

The increasing network-savvy of information consumers, always connected in multi-user gaming, chat, and file-sharing environments, symbolises a shift away from a model of centralised collection development [2]. To consider the relationship between these newer patterns of information usage and traditional library collection development is to realise quickly that we have enlarged the idea of what collection development means. More than ever, libraries are sharing collection development responsibilities with library users. As decision-making about how to organise information expands from the centre (libraries) to the edge (users and user groups), we need to find ways to make the resources libraries provide fit more easily into a larger and more dynamic information landscape.

We are beginning to see efforts addressing this need. The Interactive University at UC Berkeley is building the Scholar's Box [3] application to enable users better to integrate digital resources from libraries with other information sources and tools. The Scholar's Box makes it easier to create personalised and themed collections of digital cultural objects for use as research and learning materials. Benefits of such a tool include simplifying integration of digital primary source materials into teaching and learning, and simplifying integration of user-built collections with other end-user and institutional tools for managing and sharing information.

The Scholar's Box application enables these functions by bundling

A core motto of the Scholar's Box project is 'Gather, Create, Share.' This motto speaks of the need to put more control in the hands of individuals to select information from a diversity of sources, to collect and organise that information as they see fit, and to enable broad use in a manner not limited by the boundaries of traditional systems and individual applications. These objectives match those already sought by the aforementioned information consumers as they navigate their own information landscapes.

Managing Information Across Communities

The most prominent current example of individuals and groups defining the shape of their own information landscapes is the tremendous growth of weblogs. Weblogs are no longer just the realm of undergraduates talking about their online friends and social lives; scholars now use them for both scholarly and avocational reasons [4]. Some use weblogs to keep up with their own academic work and that of their colleagues. Other academics use weblogs for personal material or to write about personal opinions that may or may not have to do with their scholarly work. These sites can be particularly illuminating, as academics seem to feel freer to express their opinions in their own places on the Web. Among technologists and scientists, Web pages and weblogs are (and have been for a while) quite common. As scientific communication has moved online, scientists have begun to post pages with reprints, supplemental materials, and other publications more frequently. In the last few years a larger community of social scientists, humanists, and scholars from many other disciplines has also moved online, especially as new software has made starting and contributing to weblogs easier.

We are in the early stages of understanding how people use weblogs for research, but it seems clear that weblogs have already become essential methods of interaction in academia, where weblogs help academics to connect, augmenting formal scholarly communications. In a way, weblogs represent the informal end of the continuum from formal to informal scholarly communications: starting at the other end with peer-reviewed publications, we can envision pre-print and e-print archives, institutional repositories, online community forums, mailing lists, and weblogs as tending to have varying degrees of formality depending on the level and character of administrative policy, peer review, and institutional stewardship brought to each. Weblogs can also be seen as a new tool for controlling and personalising both formal and informal aspects of research and teaching. Keeping weblog pages with results of saved searches or tables of contents, for instance, is an easy way to link storage and sharing with traditional information-seeking tools. Course weblogs are also proving to be useful, directing students to Internet resources on certain topics, and allowing teachers to post material and get feedback (often through comments) from students. In this context, the boundaries between scholarly communication systems, weblogs, and dedicated courseware systems as teaching tools are not so clearly drawn.

Indeed, much like bibliometric techniques for formal communication systems, tools for connecting weblogs to each other and to other information services are already in widespread use. 'TrackBack,' for example, allows a weblog author to connect their own comments directly to others' posts:

"In a nutshell, TrackBack was designed to provide a method of notification between websites: it is a method of person A saying to person B, 'This is something you may be interested in.' To do that, person A sends a TrackBack ping to person B... the TrackBack ping has created an explicit reference between my site and yours. These references can be utilized to build a diagram of the distributed conversation. Say, for example, that another weblogger posted her thoughts on what I wrote, and sent me a TrackBack ping. The conversation could then be traced from your original post, to my post, then to her post. This threaded conversation can be automatically mapped out using the TrackBack metadata." [5]

TrackBack looks much like the citation practices followed in scholarly and other publishing contexts for generations. That members of the blogosphere have defined techniques for accomplishing this indicates that people are perhaps more willing than ever to speak informally, and to speak publicly, in ways that bolster connections forward (by leaving TrackBack ping URLs) for others as readily as backward (by citing preceding sources).
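
The 'distributed conversation' described in the quotation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ping records, the post names, and the functions build_thread and trace are hypothetical, and the sketch models only the resulting references between posts, not the TrackBack protocol itself.

```python
from collections import defaultdict


def build_thread(pings):
    """Index pings by the post they point at.

    Each ping is a hypothetical (source_post, target_post) pair meaning
    that the author of source_post sent a ping to target_post.
    """
    replies = defaultdict(list)
    for source, target in pings:
        replies[target].append(source)
    return replies


def trace(replies, root, depth=0, seen=None):
    """Return the conversation below root as indented lines."""
    seen = set() if seen is None else seen
    if root in seen:  # guard against posts that ping each other in a loop
        return []
    seen.add(root)
    lines = ["  " * depth + root]
    for child in replies.get(root, []):
        lines.extend(trace(replies, child, depth + 1, seen))
    return lines


# The example from the quotation: my post responds to yours, hers to mine.
pings = [("my-post", "your-post"), ("her-post", "my-post")]
for line in trace(build_thread(pings), "your-post"):
    print(line)
```

Walking the references from the original post recovers the thread in order: your post, then mine, then hers.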

Other tools gaining prominence include Blogdex [6], which generates a summary of popular links anywhere on the Web by analysing the outward link patterns from weblogs, and Technorati [7], which provides an impact factor-like ranking of weblogs by inbound links from other weblogs. Delicious [8], Furl [9] and the authors' unalog [10] are 'shared link logs' allowing distributed individuals and groups to quickly categorise and share bookmarks and recently read links. Biologging [11] directly connects weblogging to the PubMed database by allowing users of a custom PubMed interface to add entries for interesting citations to a shared weblog.
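
To make the idea of ranking weblogs by inbound links concrete, here is a minimal sketch in Python. It is not the algorithm used by Technorati or any other service named above; the function name, the link data, and the choice to count each linking weblog only once are all assumptions made for illustration.

```python
from collections import Counter


def rank_by_inbound(links):
    """Rank weblogs by the number of distinct other weblogs linking to them.

    links is an iterable of hypothetical (source_blog, target_blog) pairs.
    Repeated links and self-links are ignored.
    """
    distinct = {(s, t) for s, t in links if s != t}
    counts = Counter(target for _, target in distinct)
    # Sort by descending count, then by name so ties are stable.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


links = [("a", "b"), ("c", "b"), ("a", "b"), ("b", "c"), ("b", "b")]
print(rank_by_inbound(links))
```

The same counting, applied to outward links to arbitrary Web pages rather than to weblogs, would give a rough popularity summary of the kind Blogdex is described as producing.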

These new services and tools indicate that people increasingly want to share information about what they are reading, what they have to say about it, and what others have to say about what they say. Many new services such as weblogs are quickly becoming mainstream, as major institutions such as Harvard Law School and MIT are bringing up public weblog services for their community members. Information-sharing innovations are also coming from within academia. The University of Minnesota Libraries, for example, has added their community weblogging system UThink as a target in their link resolver system, so users can post a citation directly onto their own weblogs [12].

In the current library software marketplace, where digital library services can include metasearch portals with citation clipboards and UThink with private weblogs connected to link resolvers, it seems clear that we are in the middle of a wave of innovation and integration of these new services. The common thread running through these innovations is that each new service helps individuals move and connect more kinds of information from more diverse resources through the various information communities in which they participate. We are still at a stage where each innovation adds value within a well-defined community or information context, even while we are learning that we will have to meet the needs of users who regularly move between formal and informal communities, and between public and private contexts. Before long, our ability to meet these users' needs will be limited by our inability to allow users to create and connect information sources and services as they see fit.

A Simple Architectural Solution: Personalised link routing

Because these services have so much in common, it seems likely that one or more common architectural patterns could help formalise the roles and relationships of each. One view of how to build these services can be found in a reconception of our first-generation link resolution systems. In most deployments (aside from UThink), link resolvers take a single anonymous user at a single library from one information source to one of many services of use for that source: from a reference to a full-text article, for instance, or from an article to a cited reference list, or from a metadata record to an inter-library loan form. Hence the term 'resolver': the system reports which services - as pre-defined by librarians - are available relative to a given information object, and resolves a user service choice by redirecting users to their chosen site. The entire transaction is stateless: the service one user chooses for a given source has no effect on that user's next choice for a different source, or on the next user's available choices for the same source. For each source item, one set of suggested services for that source appears, and usually one service request is then resolved.
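
The stateless pattern just described can be reduced to a small Python sketch. Everything named here is an assumption for illustration: the SERVICES table, the example.edu addresses, and the functions list_services and resolve do not correspond to any particular resolver product or to the OpenURL syntax.

```python
from urllib.parse import urlencode

# Hypothetical librarian-defined table: item genre -> (service name, base URL).
SERVICES = {
    "article": [
        ("full_text", "https://fulltext.example.edu/get"),
        ("interlibrary_loan", "https://ill.example.edu/request"),
    ],
    "book": [
        ("catalogue", "https://catalogue.example.edu/search"),
    ],
}


def list_services(item):
    """Report which pre-defined services apply to this item."""
    return [name for name, _ in SERVICES.get(item.get("genre"), [])]


def resolve(item, choice):
    """Turn the user's choice into a redirect URL for the chosen service."""
    for name, base in SERVICES.get(item.get("genre"), []):
        if name == choice:
            return base + "?" + urlencode(sorted(item.items()))
    raise KeyError(choice)


item = {"genre": "article", "issn": "1234-5678"}
print(list_services(item))
print(resolve(item, "full_text"))
```

Neither function takes a user argument or keeps anything between calls, so every user sees the same list for the same item. That is the sense in which the transaction is stateless.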

There are many potentially interesting artifacts from these transactions, such as usage logs and analyses thereof, or anecdotal user feedback. But typically there is no remembered state, in that there is no attempt within each separate transaction for the system to recognise the user involved (aside from simple authentication and authorisation in, say, an off-campus proxy context), and there is no attempt to determine any preferences that user might have for potential service targets. A user cannot specify her own source and target categories; she is left to enter the process only from - and exit only to - sources and targets defined by library staff. There is no opportunity for insertion of per-user rules that will trigger secondary services for a given source link, such as link logging to a private or group weblog, or automated subject-specific indexing based on the referring source. Users cannot configure resolvers to automate services to be performed, for instance, in the background at the same time as they select a resolution service (e.g. 'log all my links automatically but this time I want to read the full text'). And users cannot stipulate that, perhaps, they want resolution to happen within a different library's resolver (e.g. 'I'm just visiting this university library for the week, please forward these requests to the resolver at my home institution').

These missing features can be summarised in the phrase 'personalised link routing.' 'Personalised' means the addition of functions that will vary depending on the user, either as predetermined by librarians or as specified by the user. 'Link routing' means arbitrary rewiring of the current single transaction paradigm (source -> service list -> target). 'Personalised link routing' would allow hooks at and between each phase of the current resolver pipeline, which would support multiple, arbitrary, parallel, or sequential actions to be specified by either librarians, as at present, or by users.
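
One way to picture such hooks is the Python sketch below. It is a hypothetical illustration, not a proposed implementation: the UserProfile fields, the route and log_link functions, and the example.org and example.edu addresses are all invented here. It shows two of the missing features discussed above: background actions that run on every link, and forwarding to a home institution's resolver.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode


@dataclass
class UserProfile:
    """Hypothetical per-user routing preferences."""
    name: str
    background: list = field(default_factory=list)  # actions run on every link
    home_resolver: str = ""  # forward requests here when set


def log_link(profile, item):
    """Example background action: pretend to log the link to a weblog."""
    return f"logged {item['id']} to weblog of {profile.name}"


def route(item, choice, profile, default_resolver):
    """Run the user's background actions, then pick the resolver to use."""
    performed = [action(profile, item) for action in profile.background]
    resolver = profile.home_resolver or default_resolver
    query = urlencode({"id": item["id"], "service": choice})
    return f"{resolver}?{query}", performed


visitor = UserProfile(
    name="visitor",
    background=[log_link],
    home_resolver="https://resolver.home.example.org/resolve",
)
target, performed = route(
    {"id": "item-42"}, "full_text", visitor,
    "https://resolver.host.example.edu/resolve",
)
print(target)
print(performed)
```

Here the visiting user's request is logged in the background and then forwarded to the home resolver, while a user with an empty profile would fall through to the host library's resolver exactly as at present.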

Scenarios

To explore this model further, here are examples of how some personalised and routable actions might be wired in to the various steps:

This 'personalised routing' model seems very conducive to imagining additional scenarios and information paths, with a variety of potential connections between information and services in any of these systems. For instance, an instructor using a system like Scholar's Box to build collections for use in teaching could seamlessly route found items to colleagues, or to a reference library for later use in authoring research articles. The same instructor could also route information in the other direction, from a reference library or a colleague's weblog back into one or more teaching collections.

Implementation Choices

To build personalised link routers, two implementation paths are available which involve enhancements to existing systems. The first, and most straightforward, would involve layering personalised routing services onto existing resolvers. Enhancing link resolution rule engines to add new hooks in the different request phases and more flexible routing/bouncing/chaining should be feasible. Layering in user and group management services should also be feasible, especially if we leverage recent work such as the Open Knowledge Initiative [13] or Shibboleth [14] specifications, among others, for which enterprise-class implementations are available or under development.

A second implementation path might involve integration with MyLibrary and UPortal-type systems, which are 'personalisation' services by definition. Interesting questions along this path might include how close a binding might be necessary between personalised link routers and portals. Should all the personalisation happen in a portal, with routers just serving as rule engines for service resolution? Or should portal systems just be tuned to be well-behaved sources and targets, with personalised routing functions living in the routers? It is easy to imagine different institutions with different I.T. administration models preferring a design that allows different pieces to live under different management branches.

In considering either of these implementation models, issues of distributed storage, security, portability and descriptive models quickly come to the fore. Fortunately, significant progress is being made on each of these issues, and it seems possible that modular solutions might be ready for integration very soon. Indeed, modular separation of services and easy integration was a core design principle largely responsible for the success of the link resolver paradigm. Ideally, it should also be possible to integrate next-generation groupware services with external non-library toolkits (with 'non-library' meaning 'blogosphere and otherwise from the general Internet community'). As highlighted earlier in this article, many of these technical innovations occur far away from libraries. Supporting users who define and implement their own processing models reinforces the pattern of increasingly distributed collection development.

Conclusion

By deploying link resolvers, federated search engines, courseware servers, and other contemporary systems, the library community has solved one set of problems related to collection development and navigation. At the same time, we have amplified the integration problem by investing in our own incompatible resources with their own administration and navigation nightmares. If we can deliver a second-generation user front-end to these disparate services and resources that successfully integrates with how users move through and manage information in 2004, we will have taken a significant step toward a vision of integrated groupware. The idea of 'library groupware' suggests a change in library services and philosophy, but helping users manage information across their diverse personal collections and information communities remains true to the core mission of libraries.

References

  1. The canonical overview of the general link resolver model is Van de Sompel H., Hochstenbach P., "Reference Linking in a Hybrid Library Environment: Part 1: Frameworks for Linking", D-Lib Magazine 5(4), April 1999, http://dlib.org/dlib/april99/van_de_sompel/04van_de_sompel-pt1.html
  2. An excellent overview of the social shift toward "information consumption" can be found in 2003 OCLC Environmental Scan: Pattern Recognition, available online at http://www.oclc.org/membership/escan/
  3. More information on Scholar's Box can be found at http://iu.berkeley.edu/IU/SB and http://raymondyee.net/wiki/ScholarsBox
  4. Glenn D., "Scholars Who Blog: The soapbox of the digital age draws a crowd of academics." The Chronicle of Higher Education 49(39), June 2003, available online at http://chronicle.com/free/v49/i39/39a01401.htm
  5. Trott M., Trott B., "A Beginner's Guide to TrackBack." Available online at http://www.movabletype.org/trackback/beginners/
  6. Blogdex, available online at http://blogdex.net/
  7. "Top 100 Technorati," available online at http://technorati.com/cosmos/top100.html
  8. Delicious, available online at http://del.icio.us/
  9. Furl, available online at http://furl.net/
  10. unalog, available online at http://unalog.org/
  11. Biologging, available online at http://www.biologging.com/
  12. Nackerud S., "Post Database Citations in Your Blog!" http://blog.lib.umn.edu/archives/000477.html
  13. Open Knowledge Initiative specifications, available online at http://web.mit.edu/oki/specs/
  14. Shibboleth, available online at http://shibboleth.internet2.edu/

Author Details

Daniel Chudnov
Librarian/Programmer
Yale Center for Medical Informatics

Email: daniel.chudnov@yale.edu
Web site: http://curtis.med.yale.edu/dchud/

Jeremy Frumkin
Gray Family Chair for Innovative Library Services
Oregon State University Libraries

Email: jeremy.frumkin@oregonstate.edu

Jennifer Weintraub
Digital Collections Specialist
Yale University Library

Email: jennifer.weintraub@yale.edu

Matthew Wilcox
Epidemiology and Public Health Librarian
Yale University School of Public Health

Email: matthew.wilcox@yale.edu

Raymond Yee
Technology Architect
University of California, Berkeley Interactive University Project

Email: yee@berkeley.edu
Web site: http://iu.berkeley.edu/rdhyee
