ARROW and the RQF: Meeting the Needs of the Research Quality Framework Using an Institutional Research Repository
This paper describes the work of the ARROW Project to meet the requirements of the forthcoming Research Quality Framework (RQF). The RQF is an Australian Federal Government initiative designed to measure the quality and impact of Australian research, and is based partly on the existing Research Assessment Exercise (RAE) held in the UK. The RQF differs from the RAE in its reliance on local institutional repositories for the provision of access to research outputs, and this paper will explain how it is envisaged that this role will be filled, and the challenges that arise from this role.
The Australian Research Repositories Online to the World (ARROW) Project came into existence in response to a call for proposals issued in June 2003 by the Australian Commonwealth Department of Education, Science and Training (DEST). DEST was interested in furthering the discovery, creation, management and dissemination of Australian research information in a digital environment. Specifically, it wanted to fund proposals that would help promote Australian research output and build the Australian research information infrastructure through the development of distributed digital repositories and the common technical services supporting access and authorisation to them.
In response, a consortium consisting of Monash University (lead institution), University of New South Wales, Swinburne University and the National Library of Australia submitted a bid and was successful in attracting $A3.66M over three years (2004-6), with follow-up funding of $4.5M for ARROW2 and a sub-project called Persistent Identifier Linking Infrastructure (PILIN) in 2007.
The ARROW Project has been working with VTLS Inc to develop a supported repository software solution called VITAL. This software has been licensed by fifteen universities in Australia, and ARROW's work at present is focussed on refining and supporting this software, and aiding these universities in their use of it.
The ARROW Community
It was decided that a framework was needed to provide this support, and this has taken the form of a developing ARROW Community. The Community was established in 2006 to enable the sharing of knowledge and experiences by institutions using the ARROW software solution. This sharing has occurred through regular meetings and update sessions, the establishment of virtual contact processes and attempts to co-ordinate the common needs of the repositories. Working groups (ARROW Repository Managers Group, ARROW Development Group and Metadata Advisory Committee for Australian Repositories) have been established to create structures and relationships that will survive beyond the end of the project funding. Each of these groups regularly discusses RQF issues as they relate to repositories. It is expected that, at the conclusion of the formal project, these activities will continue through self-funded co-operation and sharing of resources by the community members. This is not dissimilar to other Internet-related activities that rely on a shared commitment to a successful outcome.
The Australian Federal Department of Education, Science and Training (DEST) began the process of establishing the RQF in 2004, based on the model adopted by the Research Assessment Exercise (RAE), used in the UK for many years.
The rationale and intentions for the RQF are described on the DEST Web site:
"The aim of the Research Quality Framework initiative is to develop the basis for an improved assessment of the quality and impact of publicly funded research and an effective process to achieve this. The Framework should:
- be transparent to government and taxpayers so that they are better informed about the results of the public investment in research;
- ensure that all publicly funded research agencies and research providers are encouraged to focus on the quality and relevance of their research; and
- avoid a high cost of implementation and imposing a high administration burden on research providers." 
The RQF has been undergoing a substantial planning and development process since then, with an Expert Advisory Group providing the original framework in 2005. This was then reviewed and revised by an RQF Development Advisory Group in 2006. The ARROW Project provided technical and structural advice on the potential use of repositories to the latter group. DEST is continuing to work on final specifications for the process at time of writing.
When it was originally announced, DEST stated that there would be both a Research Quality Framework and a Research Accessibility Framework. Latterly it has been emphasising the accessibility components of the RQF. The RQF is expected to create a greater level of accessibility to Australian research, as the best of it will be stored in repositories, which should then make a considerable amount of it freely available for harvesting and discovery.
The Proposed RQF Model
While the details of the RQF as a whole are complicated and at time of writing not completely settled, the basic model as it relates to repositories is quite straightforward:
- Each institution that is subject to the process collates and assembles information about "research groups" from within the institution. All members of a group are required to nominate their four "best" research outputs for assessment. It is expected that the substantial majority of these outputs will be journal articles or refereed conference papers. Copies of these outputs and their metadata are stored within an institutional repository. DEST has not mandated any specific software solution for this, and DSpace, EPrints, VITAL, Fez and Digital Commons are all expected to be used.
- The institution then packages the information about groups, together with statements about the quality and impact of their research, as well as other information that might be relevant to the assessment (such as funding attracted or usage metrics), and stable links to the research outputs (rather than copies of the outputs themselves) into a defined XML format. This is submitted electronically to the DEST Information Management System (IMS). An online submission process will also be available.
- Nominated assessors log in to the IMS to view this information. Within the IMS the information is displayed for the assessors in a uniform style, to exclude the danger of presentation being used to outshine inadequate content. Each research output that can be displayed electronically will have a link back to the repository that holds it. The link will need to resolve directly to the research output, without displaying any intermediate pages from the repository, and the IMS will pass on anonymised authentication information. The reason for specifying no intermediate pages is to ensure that institutions do not try to 'gild the research output lily'. The reason for anonymising access is discussed below.
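As an illustration of the packaging step in the model above, the following is a minimal sketch in Python of how an institution might assemble group information and stable links into XML. The DEST schema is not given in this paper, so every element name, attribute name and URL below is a hypothetical placeholder, not the real format.

```python
import xml.etree.ElementTree as ET


def build_submission(group_name, statement, outputs):
    """Package one research group's details into an XML string.

    All element and attribute names are hypothetical; the real DEST
    schema is not described in this paper.
    """
    root = ET.Element("researchGroup", name=group_name)
    ET.SubElement(root, "qualityAndImpactStatement").text = statement
    outputs_el = ET.SubElement(root, "researchOutputs")
    for out in outputs:
        item = ET.SubElement(outputs_el, "output", researcher=out["researcher"])
        ET.SubElement(item, "title").text = out["title"]
        # Only a stable link is packaged, never a copy of the output itself.
        ET.SubElement(item, "stableLink").text = out["link"]
    return ET.tostring(root, encoding="unicode")


# Placeholder data for illustration only.
example_xml = build_submission(
    "Example Research Group",
    "Placeholder statement about quality and impact.",
    [
        {
            "researcher": "A. Researcher",
            "title": "An example journal article",
            "link": "https://repository.example.edu/handle/0000/1",
        }
    ],
)
print(example_xml)
```

The point of the sketch is that the submission carries a link per output, so the repository remains the only place where the output itself is held.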
The Role of Repositories in the RQF
As can be seen from the model described above, institutional repositories have a key role to play in the RQF. They will serve as the predominant source of research outputs for assessors to view. This has been a deliberate choice by DEST, as it ensures that the outputs are available to assessors throughout the world, without the necessity of making paper copies available, as was done in the RAE model. It also helps to fill repositories, thus avoiding the oft-cited problem that academics are reluctant to submit work to repositories themselves. This fits into DEST's broader desire to encourage the population of repositories, and make more Australian research accessible to the world.
Challenges Arising from the RQF Model
There are a number of key challenges that arise from this model for making outputs available. These include:
- Copyright
It is the belief of ARROW and DEST that, for most assessors, the preferred version of most research outputs will be one which has clearly been vetted by a peer review process, and which takes the form of the final published version (i.e. with the formatting and pagination of a standard journal article). This is in conflict with the copyright policies of most academic publishers, especially if the outputs are made freely available online, as part of a repository.
- Access control
The assumption therefore is that if publishers are to give permission to use their articles in this form, some type of access control will be required to restrict viewing of them to the designated assessors. For DEST's purposes however, this access control will have to be implemented in a scalable and anonymous fashion. The process is more workable if assessors can be identified to the repository without the institution owning the repository being obliged to create an identity for all possible assessors. Furthermore, DEST does not want institutions to be able to identify any assessors, as this may compromise the process; nor does it want assessors to have to log in to each repository individually. So the repository will need to be able to restrict access, based on a machine-to-machine transfer of a pre-defined identity.
- Integration with other institutional systems
The research outputs are only one part of the process of information collection, as detailed in the previous section. Therefore the more integration there is between the repository and the other data collection systems used at an institution, the less work there is for those collecting the data. The data collection process is a substantial piece of work, especially as this is the first time this exercise has been run.
- Integration with the DEST IMS
A repository being used for the RQF must be able to provide a stable link to the standard demanded by the DEST IMS (i.e. one that resolves directly to the output, not to an intermediate page, for the reasons given above). It must also respond correctly to the HTTPS-based authentication challenge that is delivered by the IMS when following a link.
- Storage and access to research outputs other than journal articles
The RQF is intended to measure the quality and impact of research outputs outside the traditional areas of journal articles and conference papers. Performances, artwork, reports and installations, among other things, can be submitted. The fact that such outputs need to be viewable online presents particular challenges for the repository manager.
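The access-control challenge listed above asks for restriction based on a machine-to-machine transfer of a pre-defined identity, with no per-assessor accounts at the repository. The sketch below shows one way such a check could work in Python. It assumes a shared secret and an HMAC signature, which is an illustrative choice and not the mechanism specified by DEST; the identity name, secret and output identifiers are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical values: the paper does not say how the IMS identifies itself.
SHARED_SECRET = b"example-secret-agreed-with-ims"
ASSESSOR_IDENTITY = "rqf-assessor"  # one pre-defined identity for all assessors


def sign_request(identity, output_id, secret=SHARED_SECRET):
    """What the IMS side might do: sign the identity and the requested output."""
    message = f"{identity}:{output_id}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def may_view(identity, output_id, signature, restricted_ids, secret=SHARED_SECRET):
    """Repository-side check; no individual assessor account is consulted."""
    if output_id not in restricted_ids:
        return True  # open-access outputs need no check
    if identity != ASSESSOR_IDENTITY:
        return False
    expected = sign_request(identity, output_id, secret)
    # Constant-time comparison of the expected and supplied signatures.
    return hmac.compare_digest(expected, signature)


restricted = {"output-42"}
token = sign_request(ASSESSOR_IDENTITY, "output-42")
assessor_allowed = may_view(ASSESSOR_IDENTITY, "output-42", token, restricted)
stranger_allowed = may_view("anonymous", "output-42", "", restricted)
print(assessor_allowed, stranger_allowed)
```

Because the repository only ever sees the shared identity, it cannot tell which assessor is viewing an output, which is the anonymity property the model requires.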
ARROW's Responses to the Challenges
- Copyright
This remains a significant issue for the RQF. At time of writing, DEST is examining how a sector-wide solution can be achieved. This is not a problem that ARROW can or should attempt to solve in isolation. Fall-back positions could include the use of digital object identifiers (DOIs) that resolve to publisher versions, or the use of author versions of papers where available. However, DEST has indicated that it would prefer not to use these methods, as both involve extra work and/or authentication issues.
- Access control
ARROW has been testing the XACML access control language as implemented in Fedora 2.2. This will be used to provide datastream-level control of the research outputs. VTLS, the provider of the VITAL software used by ARROW, has also enhanced its access control implementation to enable the software to understand the authentication information sent by the DEST IMS.
- Integration with other university systems
A large number of Australian institutions use the Research Master management software to record and manage their research performance and outputs. ARROW has been working with Research Master P/L to provide a seamless input mechanism for copies of the outputs. The intention is that an output can be attached briefly to a record in the Research Master software, then sent to the repository, and then a handle (a unique, permanent identifier) is returned to Research Master and recorded. The output itself will only be stored in the repository, and not in Research Master. This is designed to simplify the ingest process. Once an output is submitted it will still be subject to review by the repository manager so that access control can be applied if needed, and metadata can be checked.
- Integration with the DEST IMS
VITAL can be configured to allow for a direct link to a research output stored within the repository, using a handle. As discussed above, work is underway to ensure that the DEST authentication standard can be understood by VITAL.
- Storage and access to research outputs other than journal articles
ARROW has been working on this issue in a number of mock exercises. A key strategy has been trying to understand what an assessor might be looking for by asking academics what aspects of their own work they would want to see in order to assess it. ARROW has also been experimenting with different file types, and has been able to load and display most popular formats. There is an ongoing tension associated with this approach - some outputs are being loaded in a format that has a limited life, either because preservation standards are still unknown or because the format is developing rapidly. However, the short-term needs of the RQF have to be met, even if this means that some objects are less permanent than is ideal.
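The Research Master integration described above (attach briefly, send to the repository, record the returned handle, then review) can be pictured with a small in-memory sketch in Python. The class names, method names and handle prefix are hypothetical; the actual Research Master and VITAL interfaces are not described in this paper.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Repository:
    """Stand-in for the institutional repository (hypothetical interface)."""
    prefix: str = "0000"  # placeholder handle prefix
    items: dict = field(default_factory=dict)

    def deposit(self, title, content):
        # Mint a new handle and store the output, pending manager review.
        handle = f"{self.prefix}/{len(self.items) + 1}"
        self.items[handle] = {
            "title": title,
            "content": content,
            "reviewed": False,
            "restricted": False,
        }
        return handle

    def review(self, handle, restricted):
        # The repository manager checks the item and applies access control.
        self.items[handle]["restricted"] = restricted
        self.items[handle]["reviewed"] = True


@dataclass
class ResearchRecord:
    """Stand-in for a record in the research management system."""
    title: str
    attachment: Optional[bytes] = None
    handle: Optional[str] = None


def submit(record, repo):
    """Send the attached output to the repository and keep only the handle."""
    record.handle = repo.deposit(record.title, record.attachment)
    record.attachment = None  # the output now lives only in the repository
    return record.handle


repo = Repository()
record = ResearchRecord("An example article", attachment=b"%PDF placeholder")
handle = submit(record, repo)
repo.review(handle, restricted=True)
print(handle, record.attachment, repo.items[handle]["restricted"])
```

Keeping only the handle in the research management record means there is a single stored copy of each output, and the handle is what later serves as the stable link.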
There is a considerable amount of work that still needs to be done before the RQF goes 'live' in April 2008, and it is quite probable that some aspects of this paper will be superseded as events unfold. However, the basic model as proposed above seems sound, and should provide easy online access to research outputs, as well as promoting an expanding use of repositories within Australian institutions.
The ARROW Project wishes to acknowledge the support of the Systematic Infrastructure Initiative as part of the Australian Commonwealth Government's Backing Australia's Ability - An Innovation Action Plan for the Future (BAA).
- For more details on the ARROW Project see: Treloar, A and Groenewegen, D, "ARROW, DART and ARCHER: A Quiver Full of Research Repository and Related Projects" Ariadne, Issue 51, April 30, 2007, http://www.ariadne.ac.uk/issue51/treloar-groenewegen/
- Australian Government Department of Education, Science and Training Web site: Research Quality Framework: Assessing the quality and impact of research in Australia http://www.dest.gov.au/sectors/research_sector/policies_issues_reviews/key_issues/research_quality_framework/default.htm
- Note that 'quality' and 'impact' have very specific meanings within the RQF. 'Impact' in particular is used quite differently in this context to the commonly understood use of the term in academia. In the RQF it refers to the impact of research on the wider community and Australian society in particular.
- Wikipedia entry on eXtensible Access Control Markup Language, retrieved 29 July 2007
- ResearchMaster Pty Ltd http://www.researchmaster.com.au/