Web Magazine for Information Professionals

Subject Portals

Judith Clark describes a three-year project to develop a set of subject portals as part of the Distributed National Electronic Resource (DNER) development programme.

The vision that created the Distributed National Electronic Resource (DNER) grew out of the Joint Information Systems Committee (JISC)'s history of engagement with Higher and Further Education Institutions and significant research libraries in the UK. The DNER has an ambitious goal: to empower the HE/Post-16 community by providing quick, coherent and reliable access to a managed information environment that is geared to supporting learning and teaching activities. The JISC has established a great number of services that are helping to fulfil this vision of an integrated information environment. It also sponsors many initiatives that are developmental in nature.

The Subject Portals Project (SPP) is one of these. It is a JISC-funded initiative led by the Resource Discovery Network (RDN) for the DNER. The project aims to enhance resource discovery by developing a series of portals focussed on the requirements of end-users located in a variety of learning environments within the further and higher education sectors, who are the main clients of the DNER. The first phase of the project ran from Nov 2000 to Sep 2001 and was mainly concerned with building a Z39.50 cross-search prototype at each of three RDN Hubs: SOSIG, EEVL and BIOME. The second phase, which ends in 2003, involves two further Hubs, HUMBUL (in collaboration with the Arts and Humanities Data Service) and PSIgate, and will examine the feasibility of delivering a set of subject portal services that build on the research network already established by the RDN.

Enhancing the RDN

The RDN and its component subject Hubs are well known to librarians in the higher education community. The RDN's roots go back to the early days of networked information services. It is now one of the more successful components of an academic and research information infrastructure that continues to evolve rapidly. The RDN's services today fulfil an expressed goal of the eLib programme, which in 1994 funded a series of demonstrator services designed to create a national infrastructure capable of generating significantly more widespread use of networked information resources [1]. Those services, then known as subject gateways, developed in response to community interests specific to each gateway. The RDN itself was established in 1999 to bring the gateways together under a federated structure. The Hubs are based around faculty-level subject groupings, chosen with a view to potential for partnership, sustainability, and growth, while preserving legacy investments [1]. The RDN's Internet resource catalogues include records describing almost 40,000 web sites and are growing steadily as new subject areas are encompassed.

The Hubs offer a discovery tool, enabling users to quickly locate the most relevant Internet resources. They were built around many of the same concepts that underpin the library catalogue. The Hubs direct their users towards content that is freely available although difficult to find using a non-specific search engine. Like a library, a Hub is inherently reliable because it has applied standardised policies and procedures. Sites are selected on the basis of selection criteria developed in partnership with the research libraries and universities that contribute to the RDN, they are catalogued following consistent practices, and they are analysed by people with expertise in the relevant subject discipline. Links are checked daily in an automated process and all entries are updated regularly by subject specialists.

Because only high-quality Internet resources are included, and these are classified using an appropriate controlled vocabulary, the RDN is an ideal environment for resource discovery. Undergraduate students with only a rudimentary knowledge of a discipline can be confident that their search results will be authoritative and appropriate to their learning context. Researchers and professionals who are comfortable using advanced search techniques with vocabulary terms particular to their specialised field will also retrieve relevant records. This makes the RDN a valuable tool for teaching and learning. Students can use the Hubs to find appropriate resources fast, while at the same time developing their own abilities to identify and locate relevant information. Lecturers, librarians and tutors can quickly identify key web sites for their students to explore. Usage statistics, however, indicate that the RDN is under-utilised [2].

Several research studies have shown that tutors and lecturers are disinclined to direct their students to explore the Web. From necessity, most opt for reading lists and/or prescribed resources. A Managed Learning Environment (MLE) could reinforce this tendency, by making it even easier to embed pre-selected resources alongside other electronic curriculum material. The subject portal is the perfect complement to packaged courseware. It will enrich the digital learning environment, not only by providing structured opportunities for learners to access web-based information relevant to their immediate needs, but also by providing opportunities to interpret, compare and interact with resources. A portal could even provide something broadly equivalent to the social context of an old-style departmental library, encouraging self-study and imbuing discipline-based values.

Hubs to portals

The RDN is a resource discovery engine tailored to the needs of scholars and researchers. Around this core service, each Hub offers additional related services including news and alerting services, and conference information. So are the Hubs already portals?

A portal is a type of Web site. There are many definitions of what makes a site a portal. The typical early commercial portals were consumer Web destinations such as Yahoo! and Excite, which aggregated news and information and offered other enticements, such as free e-mail and calendars, designed to keep people at the site and to draw repeat visitors. This led to the commonly accepted concept of a portal as any Web site that offers a community of users the comfort of a home base, a single entry point to resources from a range of other Web places. The RDN Hubs have done this for some time and do it very successfully. In the terms of one definition of a portal, they provide an exciting, interesting and focussed view of a special part of the Internet [3]. In other quarters, the defining attribute of a portal is the ability for users to personalise the site according to their own preferences. Here again the RDN meets the criteria. SOSIG's Grapevine has over 5,000 registered users and a number of channels that can be customised according to the user's preferences.

A more rigorous use of the term portal has developed within the Knowledge Management (KM) industry. The concept of an Enterprise Information Portal (EIP) is based on a simple idea: to offer end-users one-stop shopping for business information. A number of portal solutions have emerged offering fast, easy and consistent access to corporate intelligence, but despite the hype, most companies are finding that the one-stop shop remains an elusive idea. In many cases the EIP has proved to be an ever more costly goal, and a far more complex undertaking than first envisioned. Considerable resources have been invested by the universities that have developed portals, such as the MIT Web Portal project, the uPortal implementation at the University of British Columbia and myMonash in Australia [4], [5], [6].

The RDN subject portals share certain functionality with EIPs/MLEs. This includes

In the same way that enterprise portals aim to enhance corporate productivity, portal-enhanced RDN Hubs seek to support research, learning and teaching in the academic context. A successful portal environment is one that extends the user's capabilities; it has a capacity-building aspect that is really exciting. There is no doubt that in offering advanced aggregation mechanisms, RDN subject portals have potential to stimulate learning and teaching in ways that we can't predict. The Hubs currently provide a basic level of aggregation using HTTP. The SPP seeks to take this to the next level, to create an integrated system that is both dynamic and stable. It is very much a developmental project, and will involve much testing to see how such a system might be utilised by learners and teachers.

Enterprise information portals have three basic components: context, content and activity. The RDN portals can be modelled in this way too. The context in both cases is a technological infrastructure that supports resource discovery. Typically this includes a set of communication protocols that support data exchange between different services. Within this context, resource discovery is enabled according to the extent to which the content is structured or unstructured; in other words, discovery can only be effective if the appropriate metadata tags have been applied to the resources themselves. In the SPP case, we are not just concerned with resources as items, but also as collections of items. Further, these items are far more dynamic than those typically described by the traditional library catalogue. Implied in Lorcan Dempsey's description of the web as a pervasive social, research and business medium, home to the full reach of intellectual product [7] is a complexity driven by a whole range of emerging uses of information resources. User-driven activity ultimately defines what the portal does, but what makes any portal project so unpredictable is that as the content and context are changed, new behaviours are enabled.

The RDN portals are primarily concerned with technologies that broker subject-oriented access to resources. Effective cross-searching depends on consistent metadata standards, but these are still under development, and although the RDN's collection is governed by sophisticated metadata schemas, this is not the case for many of the other resources targeted by the portals. Z39.50 is the standard that has been adopted for the preliminary cross-search functionality. Further portal functionality is being developed using RSS (Rich Site Summary) and OAI (Open Archives Initiative). Other standards that underpin the portals include, notably, Dublin Core and a variety of subject-specific thesauri such as the CAB Thesaurus and MeSH.
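The cross-search idea described above can be illustrated with a minimal sketch. It does not speak Z39.50 or OAI; each target is a hypothetical in-memory stand-in, and the record fields, target names and example.org URLs are all assumptions made for illustration. What it shows is the general pattern: send one query to several targets, map the results onto a few shared Dublin Core elements, and merge them into a single de-duplicated list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class DCRecord:
    """A record reduced to a handful of Dublin Core elements."""
    identifier: str  # dc:identifier, here the resource URL
    title: str       # dc:title
    subject: str     # dc:subject, ideally from a controlled vocabulary
    source: str      # name of the target that returned the record


# A target is anything that accepts a search term and returns records.
# In a real portal this would wrap a remote service; here it is a stub.
Target = Callable[[str], List[DCRecord]]


def make_target(name: str, rows: List[Dict[str, str]]) -> Target:
    """Build a hypothetical in-memory target from a list of row dicts."""
    def search(term: str) -> List[DCRecord]:
        needle = term.lower()
        return [
            DCRecord(r["identifier"], r["title"], r["subject"], name)
            for r in rows
            if needle in r["title"].lower() or needle in r["subject"].lower()
        ]
    return search


def cross_search(term: str, targets: List[Target]) -> List[DCRecord]:
    """Query every target and merge results, keeping the first record
    seen for each identifier so duplicates across targets collapse."""
    merged: Dict[str, DCRecord] = {}
    for search in targets:
        for record in search(term):
            merged.setdefault(record.identifier, record)
    return list(merged.values())


hub_a = make_target("hub_a", [
    {"identifier": "http://example.org/soil-atlas",
     "title": "Soil atlas", "subject": "Soil science"},
    {"identifier": "http://example.org/bridges",
     "title": "Bridge design", "subject": "Civil engineering"},
])
hub_b = make_target("hub_b", [
    {"identifier": "http://example.org/soil-atlas",
     "title": "Soil Atlas online", "subject": "Soils"},
    {"identifier": "http://example.org/soil-chem",
     "title": "Soil chemistry notes", "subject": "Soil science"},
])

for record in cross_search("soil", [hub_a, hub_b]):
    print(record.source, record.identifier, record.title)
```

De-duplicating on the identifier alone is a deliberate simplification. Where targets describe the same resource with inconsistent metadata, which is the situation the paragraph above warns about, a real service would need a more careful matching rule.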

Back to the drawing board, again and again

The problem that the RDN portals address is that there are now so many information resources available at the desktop that the proliferation itself has become a barrier to effective use of information technologies. Having to separately check a number of different types of commercial and institutional databases, each with its own access instructions and search interface, is at best annoying and at worst completely confusing. Portals seek to lower the barriers by operating across a variety of different systems in such a way as to make it appear to the user as if they are using a single coherent resource.

The project started with an assumption that the wealth of resources that make up the DNER contains quality content that learners need to be able to discover easily and quickly. We assume that we have the skills and tools to create the technological context to enable rich resource discovery. We assume that the subject approach to resource discovery is one that will be attractive and valuable to learners. What we have no idea about is how learners will use a retrieval tool that can cope with resources in many different formats, of varying size and scope, from a number of different owners. What's more, we are now dealing with resources that can be used and reused, reformatted, reassembled, annotated, and shared. Portal users will be able to interact with and contribute to DNER content (Pinfield and Dempsey, in their description of the DNER in Ariadne 26, propose that personal information spaces will increasingly be visible alongside other content in the DNER [8]).

This raises a difficulty faced by any large digital library, and in this case, an issue that is also one of the fundamental challenges faced by the DNER: how to make it useful to different communities of users and for different purposes. The SPP reflects the DNER's vision of an information landscape. Each portal will present a different view of that landscape, shaped by each Hub's unique understanding of the needs of its own user community. The RDN is familiar with the concept of users who don't have any loyalty to a library in the sense of it being a place that they visit. Hub or portal users are likely to be sitting at their own desks at home, at their place of work, or on the train, and they may never enter a library building. On the other hand, they may be using a public access workstation in their own library, where they have certain rights, or at another library where as a guest they have a more limited set of rights. They may belong to several institutions that each provide access to a different set of information resources. In some cases these institutions may subscribe to the same content but via different providers who compete on the basis of the specialised interface features each has to offer.

The funding proposals for the SPP sought to exploit this and other strengths of the RDN organisation. However, the RDN has an explicit mandate to reduce the Hubs' dependence on central (JISC) funding. Each Hub is actively seeking commercial arrangements and linkages that will help to sustain service provision. There is a dynamic tension between the income-generation goals of each Hub and the wider vision of subject portals as a set of windows to the DNER's current collection, in which the SPP is viewed as part of the work of creating a unified national digital resource [9].

The JISC has a long history of funding projects with the expectation that innovation and research work will lead smoothly on to fully operational services in the public arena, and the Subject Portals Project is a case in point. SPP funding has been allocated to the Hubs to support research and development effort that is inseparable from the ongoing strategic business development of an existing service. There are bound to be difficulties with this approach. A way of minimising future problems is to ensure that the desired outcomes are clearly spelled out at the outset. It may be that for the SPP these are in fact less to do with providing subject portals in the DNER sense and more to do with leveraging previous and current investment across the RDN to serve stakeholders better. Nevertheless, there is a genuine R&D component to this work. The outcomes are unpredictable and portals development has to be iterative. It is unrealistic to expect that at the end of the project, the RDN will have achieved the JISC's goal of being able to provide seamless access to existing services through a variety of entry points [9]. While it is important that portal functionality is developed with an eye to scalability, it will be necessary to start small, by first releasing a limited set of functions that meet the needs of a specific sub-set of users, and thoroughly testing that before offering additional functions.

Searching for the needles in a field of haystacks

Access to a broad range of high-quality materials underpins successful performance [of universities]. Easy access to electronic information from desktops is expected, and required, by students and staff [9]. The JISC's portfolio already includes a superb range of the major bibliographic databases, electronic journals and multimedia information resources used in universities and colleges across the UK. Increasingly this includes a variety of primary resources and course materials available online, including hybrid local collections held in museums, archives, academic departments and other institutions across the country. In the DNER architecture, portals are seen as one means of overcoming the paradox of in-depth electronic resources that remain invisible to everyone except those who are already aware of their existence.

Each of the target information services available through any RDN portal is likely to be organised under a different system, accessed according to conditions and property rights specific to the particular service, and managed by a different organisation. The end-user may not care about the supply chain but needs to be led through it in such a way as to preserve the integrity of each and every link in the chain. In terms of interface design, there are many options in how results are presented to users; for example, it may be better to rank results in some predetermined order, or to group them by supplier. The portals may offer users the ability to customise how their result sets are displayed; for example, a user may prefer the briefest description in the first instance as opposed to the more detailed records that the RDN Hubs currently display. Even if this level of user-driven functionality can be offered, decisions will need to be made as to the default presentation format. Experience at SOSIG suggests that only a small percentage of users take up the options available to them to customise the screen appearance of the Grapevine services.
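The presentation choices discussed above (ranked or grouped by supplier, brief or detailed, with a default for users who never customise) can be sketched as follows. This is only an illustration under stated assumptions: the hits, supplier names, scores and preference keys are invented, and nothing here reflects how the Hubs actually render results.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Hypothetical merged result set; suppliers, scores and text are invented.
HITS = [
    {"title": "Soil erosion datasets", "supplier": "Supplier A", "score": 0.61,
     "description": "Regional survey data with documentation."},
    {"title": "Journal of Soil Studies", "supplier": "Supplier B", "score": 0.87,
     "description": "Full-text electronic journal."},
    {"title": "Soil science gateway", "supplier": "Supplier A", "score": 0.74,
     "description": "Catalogue of evaluated web sites."},
]

# The default matters most, since few users change their settings.
DEFAULT_PREFS = {"order": "ranked", "style": "full"}


def ranked(hits: List[dict]) -> List[dict]:
    """Order hits by descending relevance score."""
    return sorted(hits, key=lambda h: h["score"], reverse=True)


def grouped_by_supplier(hits: List[dict]) -> Dict[str, List[dict]]:
    """Group hits by supplier, keeping ranked order within each group."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for hit in ranked(hits):
        groups[hit["supplier"]].append(hit)
    return dict(groups)


def render(hit: dict, style: str) -> str:
    """Return a brief (title only) or full display line for one hit."""
    if style == "brief":
        return hit["title"]
    return f"{hit['title']} [{hit['supplier']}]: {hit['description']}"


def present(hits: List[dict], prefs: Optional[dict] = None) -> List[str]:
    """Apply user preferences on top of the defaults and build display lines."""
    settings = {**DEFAULT_PREFS, **(prefs or {})}
    if settings["order"] == "grouped":
        lines: List[str] = []
        for supplier, group in grouped_by_supplier(hits).items():
            lines.append(supplier)
            lines.extend("  " + render(h, settings["style"]) for h in group)
        return lines
    return [render(h, settings["style"]) for h in ranked(hits)]


print("\n".join(present(HITS)))
print("\n".join(present(HITS, {"order": "grouped", "style": "brief"})))
```

Keeping the defaults in one place, separate from per-user overrides, reflects the point made above: if only a small percentage of users customise their display, the default format is effectively the interface.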

Portals are but one means to present the DNER's content and services to end users, and there is as yet little understanding of user behaviours in the context of a wide-reaching managed information environment. Usability testing thus has to consider some of the fundamental issues and is not a simple matter of determining the look and feel of the interface. It is not possible to predetermine whether different functionality will be required by different subject communities. There are also questions about how many subject portals are possible and how they relate to each other. The SPP is working on broad faculty-level groupings. It is unclear whether these can co-exist with more specific portals for more specialised communities, for example, nursing or veterinary studies, or indeed if this is desirable.

A question that follows from this is whether it may be feasible to offer pre-defined views of the DNER landscape for a particular organisation, in which case the interface would be at the institution rather than at the RDN. To the end-user, the subject portals are essentially an aggregation of services on one web site. Does it matter which web site that is? It is possible that an MLE could be an alternative entry point to a subject portal. It remains to be seen how universities will handle the potential proliferation of portals on campus. Channel systems based on tools such as RSS present a whole range of new opportunities to be explored, and may support delivery of subject-focused RDN content via a variety of institutional channels [10]. A key to the feasibility of such services for the RDN is the extent to which necessary maintenance processes can be automated.
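As a rough illustration of how catalogue records might be turned into a channel automatically, the sketch below builds a small RSS document with Python's standard library. The channel details, the record fields and the example.org URLs are hypothetical, and the choice of RSS 2.0 element names is an assumption made for the example; the article does not say which RSS version the project uses.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_channel(title: str, link: str, description: str,
                  records: List[Dict[str, str]]) -> str:
    """Serialise a list of catalogue records as a minimal RSS channel."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for record in records:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = record["title"]
        ET.SubElement(item, "link").text = record["url"]
        ET.SubElement(item, "description").text = record["summary"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical newly catalogued records for one subject area.
new_records = [
    {"title": "Soil atlas", "url": "http://example.org/soil-atlas",
     "summary": "Maps and survey data."},
    {"title": "Soil chemistry notes", "url": "http://example.org/soil-chem",
     "summary": "Lecture notes & exercises."},
]

feed_xml = build_channel(
    "New soil science resources (example)",
    "http://example.org/hub/soil",
    "Recently catalogued resources, generated without manual editing.",
    new_records,
)
print(feed_xml)
```

Because the feed is generated directly from the records, an institution could display it in its own portal or MLE without any extra maintenance at the Hub, which is the kind of automation the paragraph above identifies as key to feasibility.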

New models for collection development

The RDN seeks to provide subject portals that will showcase the diversity and quality of electronic resources available for UK academic teaching, learning and research. In the first instance this means the JISC current collections. These are the resources made available to institutions and colleges on the basis of deals struck on behalf of the sector by the JISC. However, portals present a new set of issues to add to the negotiating table. For this reason, it is probably not appropriate that each RDN Hub negotiate with each supplier/distributor/owner/vendor individually. A collection development policy can be established at the Hub only in so far as it is part of the business plan for the delivery of services to a constituent group of users.

One of the requirements for the portals is that they present a user with a discovery landscape that comprises only those resources that he or she is eligible to use. This can be achieved via an ATHENS account [11]. There is an argument that suggests that it would be better if the user was shown the environment as selected by the RDN's teams of subject specialists. There could be several ways of handling this. Either a user would see a full set of collections/services (targets) with some greyed out, or a search could be run across all the targets and metadata returned from all, with the user challenged only when attempting to make further use of the metadata.
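The two options just described can be contrasted in a short sketch. It makes several assumptions: the target names and titles are invented, and the set of licensed targets is passed in as a plain Python set, standing in for whatever an authentication service such as ATHENS would actually report. No real ATHENS interface is modelled.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical targets and the titles each one holds.
CATALOGUE: Dict[str, List[str]] = {
    "Target A": ["Soil atlas", "Bridge design"],
    "Target B": ["Soil chemistry notes"],
    "Target C": ["Veterinary soil pathogens"],
}


def landscape_with_greyed_out(all_targets: List[str],
                              licensed: Set[str]) -> List[Tuple[str, bool]]:
    """Option 1: show every target, flagging which ones are usable.
    A False flag means the interface would grey that target out."""
    return [(target, target in licensed) for target in all_targets]


def search_all_then_challenge(term: str, catalogue: Dict[str, List[str]],
                              licensed: Set[str]) -> List[dict]:
    """Option 2: search every target and return metadata from all of them,
    marking hits whose further use would trigger an access challenge."""
    needle = term.lower()
    hits = []
    for target, titles in catalogue.items():
        for title in titles:
            if needle in title.lower():
                hits.append({
                    "target": target,
                    "title": title,
                    "needs_challenge": target not in licensed,
                })
    return hits


user_licences = {"Target A", "Target C"}  # assumed to come from authentication

print(landscape_with_greyed_out(list(CATALOGUE), user_licences))
for hit in search_all_then_challenge("soil", CATALOGUE, user_licences):
    print(hit)
```

The second option exposes the full landscape chosen by subject specialists, at the cost of showing users some records they cannot immediately follow up.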

Another challenge raised by the SPP is that of RDN use statistics and their role in decision-making. Potentially libraries could draw on a whole new set of usage data that would indicate not just what had been used, but what would have been useful had the institution held an appropriate licence. Publishers and other content providers may also be interested in the data that a portal will be able to provide that shows comparative use of products. Portals offer a different type of marketplace and have potential to change some of the ground rules of competition between scholarly content providers.

New business models: distributed, networked

The SPP team has not just taken on the challenge of breaking new technological ground, but also needs to consider effective business models to underpin effective services to learners and teachers. The RDN has an innovative organisational structure. In the business world there is a lot of talk about an emerging networked organisational form that transcends the traditional boundaries of place, company loyalties and hierarchical structures. The RDN is an outstanding example of how to share human and financial resources to deliver an integrated service. A great many HEIs and other institutions collaborate to provide the RDN's services. These institutions are distributed across the UK and their remits vary widely. Yet the efforts of the diverse team of content providers and Hub managers come together to provide benefits for each institution that are far over and above what they could achieve working independently. The cataloguers (content providers) represent over 50 different universities, plus museums and many of the peak research bodies, and this broad constituency ensures that the RDN remains appropriate to changing needs in teaching, learning and research communities.

The Subject Portals development exploits the software applications that drive the RDN's Internet resource catalogues, its distributed administrative and organisational structure and the value of the existing content. These have resulted from significant investment by JISC over the past seven years. The SPP will add further value in ways that have already been explored in the eLib Hybrid Library projects.

What software?

The recent study by Andy Powell and Liz Lyon outlined a technical framework to underpin the DNER [12]. The Subject Portals are an integral part of this overall architecture. It was acknowledged in the proposal that significant in-house development effort would be required at the Hubs because there was no suitable out-of-the-box portals product available. Several options were investigated, and we decided to take two differing approaches to the development work. At the SOSIG Hub, the test portal was built using Java servlets, relying upon various open source packages (Velocity, SiRPAC and ZAP!). Tomcat was used to run the application, and Jetspeed and MySQL or PostgreSQL are being explored to provide additional portal-like functions. At BIOME and EEVL, a prototype portal was developed using SiteSearch, an OCLC product. Further work will continue with open source software, using a distributed development model.

The DNER technical architecture provides the framework for machine-to-machine (M2M) interoperability. A mass of agreements, licences and contracts underpins the inter-institutional (B2B) collaboration and interaction. Below that are the formal and informal person-to-person (P2P) communication networks that are required to operate distributed services. There is no doubt that maintaining effective relationships between so many different partners has a high overhead cost. The SPP needs to quantify what additional resource will be required to support fully functional portals across the RDN.

Final words

Subject portals offer unique potential to support innovative learning and enhanced resource discovery. The SPP is seeking to create the tools and applications that are required to give users the ability to tap content from across a range of sources via a single interface. By building portals that are integrated into the Hubs, the RDN will be able to provide timely, reliable high-quality access to a greatly enriched range of highly respected content providers and information services, while maintaining its trusted subject focus.

Connecting these vast resources to the greatest number of students in such a way as to encourage learning, exploration and discovery is what the portals seek to achieve. The project is working with many sectors of the education community to realise this goal. Much progress has been made but the business issues remain a challenge.

References

  1. Lorcan Dempsey, 'The subject gateway: experiences and issues based on the emergence of the Resource Discovery Network', Online Information Review, 24 (1), 2000, 8-23
  2. The First Annual Report of the JISC User Behaviour Monitoring and Evaluation Framework by Jennifer Rowley, is at: http://www.jisc.ac.uk/pub00/m&e_rep1.html#top
  3. Practical Portals, http://www.practicalportals.com/
  4. Social context of the construction of an MIT Web Portal, by Christopher Beland, at: http://web.mit.edu/beland/www/papers/STS.092.htm
  5. http://my.ubc.ca/
  6. http://www.its.monash.edu.au/services/flt/portal/
  7. Lorcan Dempsey, 'The subject gateway: experiences and issues based on the emergence of the Resource Discovery Network', Online Information Review, 24 (1), 2000, 8-23
  8. http://www.ariadne.ac.uk/issue26/dner/
  9. http://www.jisc.ac.uk/dner/
  10. Forthcoming article by Tessa Griffiths and Simon Jennings, in Vine, http://litc.sbu.ac.uk/publications/vine.html
  11. http://www.athens.ac.uk
  12. Andy Powell and Liz Lyon, The technical architecture of the DNER, at: http://www.ukoln.ac.uk/distributed-systems/dner/arch/

Author Details

Judith Clark
Subject Portals Project Manager
Resource Discovery Network

Email: judi.clark@kcl.ac.uk
Web site: www.portal.ac.uk