Web Magazine for Information Professionals

Open Archiving Opportunities for Developing Countries: Towards Equitable Distribution of Global Knowledge

Leslie Chan and Barbara Kirsop explore some of the implications of using the OAI Protocol.

Although the World Wide Web is less than a decade old, it already has had a profound impact on scientific publishing and scholarly communication. In particular, open standards and low-cost networking tools are opening many possibilities for reducing and even eliminating entirely the cost barriers to scientific publications. (1)

One development that has great potential value for poorly-resourced countries is “open archiving”, or the deposition of scholarly research papers into networked servers accessible over the Internet.(2) This process allows scientists in the south to retrieve research results from the north through an online interoperable mechanism. Equally, it allows scientists in the south to contribute to the global knowledge base through participation. The purpose of this article is to inform scientists and publishers in the developing world about this and related initiatives, so that informed decisions can be made about participation. Our intention is not to provide technical details about electronic publishing and the setting up of “eprint” servers for open archiving, but rather to focus on the strategic significance of open archiving for scientists from developing countries.

Since 1991, researchers in high-energy physics around the world have been connected through an eprint archive set up by Paul Ginsparg at the Los Alamos National Laboratory in New Mexico. Since its inception, the scope of the archive (now known as arXiv) has expanded to include many areas of physics, mathematics and computer science, and archived papers can now be accessed free of charge from over a dozen worldwide mirror sites.(3) The eprint archive receives two-thirds of its two million weekly hits from institutions outside the United States, including many research facilities in developing regions. The archive has become indispensable to researchers worldwide, but in particular to research institutions that would otherwise be excluded from the front line of science for economic and sociological reasons.(4)

The success and wide adoption of arXiv have prompted new thinking about the reform of scientific publishing in other disciplines. Scientists have become aware of the many benefits conferred by open archiving, such as the removal of the cost barrier to high-priced journals, the reduction of time in announcing research findings, and the provision of access to all with Internet capability. As a result, other e-servers have been set up (5) and the movement to free scientific publishing from financial restrictions has been growing steadily.(6)

Among the best-known proponents of these developments is Stevan Harnad, who advocates that authors self-archive their published papers (postprints), a practice which, if adopted widely, would lead to the ultimate removal of cost barriers for the exchange of publicly funded research information.(7) These developments have generated much debate and a number of international initiatives have evolved to refine and standardise the archiving procedure.

One important international movement is the Open Archives Initiative (OAI), which aims to develop and promote the use of a standard protocol, known as the Open Archives Metadata Harvesting Protocol (OAMHP), designed for better sharing and retrieval of eprints residing in distributed archives.(8) With the OAI harvesting protocol, articles in OAI-compliant servers will form a global library that facilitates searching, data retrieval and cross-linking, as well as stable long-term archiving.(9)
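For readers curious about what harvesting looks like in practice, the following is a minimal sketch rather than a definitive implementation. It assumes the request parameters and XML namespaces of the later version 2.0 of the protocol (published by the OAI as the Protocol for Metadata Harvesting, OAI-PMH) and unqualified Dublin Core metadata. The archive address archive.example.org, the function names and the sample response are hypothetical and written by hand for illustration; no real archive is contacted.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces assumed here: OAI-PMH 2.0 responses and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_harvest_url(base_url, from_date=None):
    """Build a ListRecords request asking an archive for Dublin Core metadata."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if from_date:
        # Optional: only ask for records added or changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


def extract_titles(response_xml):
    """Return the Dublin Core titles found in a harvested response."""
    root = ET.fromstring(response_xml)
    return [title.text for title in root.findall(".//dc:title", NS)]


# Hand-written sample of what a compliant archive might return (hypothetical).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:archive.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example eprint</dc:title>
          <dc:creator>A. Researcher</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(build_harvest_url("http://archive.example.org/oai", "2001-01-01"))
    print(extract_titles(SAMPLE_RESPONSE))
```

Because every compliant archive answers the same kind of request in the same format, a service provider can run this one routine against many archives and merge the results into a single searchable index.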

Types of Archiving

There are various forms of open archiving. The term ‘self archiving’ is often used to refer to the process whereby individual authors submit their own papers to a server or archive of their choice. There are ‘institutional archives’, where authors submit eprints to a server administered by an organisation or scholarly society, commonly their university or research institute; there are also discipline-based archives and other speciality archives.(10)

An important example of a speciality archive is the Electronic Research Archive in International Health (ERA), set up by the long-established international medical journal, The Lancet. (11) This archive allows medical researchers to deposit papers of special relevance to health issues met in many developing countries. Papers submitted are reviewed before acceptance and are thereafter archived and available online free to all.

Benefits for developing countries

The archiving initiatives described above are of great importance to all scientists, but particularly to those in the developing world. Free access to research information from the north would have incalculable benefits for local research.(12) Of equal importance is the opportunity for researchers in these countries to contribute to the global knowledge base by archiving their own research literature, thereby reducing the south-north knowledge gap and professional isolation.(13) Open archives also offer an increasingly available means of distributing local research in a way that is highly visible and avoids the difficulties sometimes met in publishing in journals (e.g. bias in the treatment of submissions from the south relative to those from the north).

A key benefit for developing country scientists is that global participation could take place without further delay. The academic communities in poorer countries can take advantage of servers anywhere in the world offering OAI services, without the need to set up or maintain their own independent servers. Establishing partnerships, either south-north (S/N) or south-south (S/S), can minimise infrastructure costs, allow expertise to be shared and let participants readily become part of the international interoperable effort.

Common misconceptions

Some of the recently established archives have not been as well supported as was hoped because of a number of misconceptions about the nature of the archives and the professional consequences of collaboration. Since any individual could ‘publish’ material online, there is a concern that self-archiving could lead to a ‘vanity press’ of material that has not undergone quality control procedures. However, scholarly archives, while possibly containing both refereed material (postprints) and non-refereed material (preprints), nevertheless provide clear options for readers to retrieve material selectively.(14) The experience of physicists and mathematicians who have used open archiving for a number of years shows that the quality of research is not jeopardised by the process, since researchers who submit material are concerned with their reputation and professional credibility and their work is open for review by their peers around the world. Therefore, ‘vanity publishing’ by individuals must be distinguished from the institutional or author-archiving of preprints of papers submitted for peer review.

Another concern is that the volume of material available online makes it more difficult to find and retrieve required material. However, efforts such as the OAMHP, with its emphasis on common metadata standards, are designed specifically to address the issues of accurate and efficient retrieval and interoperability with other OAI-compliant servers.(15) It should be mentioned that a number of large-scale initiatives offering free access to papers, such as PubMedCentral,(16) do not use the OAI protocols and are therefore not necessarily interoperable with OAI-compliant archives. But increasingly, these open access archives are moving towards becoming OAI-compliant as the power of interoperability is now widely recognised.(17)

Indeed, owing to the growing popularity of the OAI movement, commercial publishers are in some cases adopting the OAI protocols so that their publications are interoperable with free-to-all archived documents. Unlike true open archives, however, such servers restrict access to those who can afford the high cost of subscription, which creates access barriers even for some of the research institutions in the north. It is therefore important to remember that an “open” archive does not necessarily mean that the content is available free of charge, as “open” refers to the open technical framework and the open architecture of the archive that promotes easy exchange of information between compatible servers.(18)

Copyright is also seen by many as a major concern. In the paper era, researchers routinely signed away their copyright to publishers in exchange for the opportunity to make their research known and to gain career advancement. However, in the electronic era authors are becoming increasingly aware of their rights and of their professional need to distribute their own research as widely as they can for maximum impact, while retaining academic credibility through peer review. Authors in developing countries who wish to publish in printed journals should ensure they retain the rights to submit to archives, either at once or subsequently. Some authors who are unable to obtain such rights from their preferred journal have elected to publish elsewhere,(19) and increasing numbers of major journals are relaxing their restrictions on author self-archiving or institutional archiving.(20) Increasingly, authors archive the preprint before submission to their chosen journal and, if rights to archive the refereed version are not obtained, later archive linked corrigenda alongside it.

Limitations

For scientists in poor countries, a major obstacle to participating in these developments is the lack of awareness of the different mechanisms available for distributing and accessing research documents. Since most of the developments and services are on the Internet, the lack of awareness is caused mainly by the lack of telecommunications infrastructure in the developing world. However, there are major international and local efforts to invest in the infrastructure and there is growing optimism that with time the problem of the ‘digital divide’ will be resolved.(21) Additionally, the development of telecentres, way stations or staging posts, radio communication and similar efforts will help regional development and encourage participation.(22)

Another cause of the lack of awareness is a lack of concerted effort from the archive institutions and the development agencies to inform and promote the new practices regarding the use of the technology. It is therefore important that the information in this short briefing is distributed as widely as possible.

Where are we now?

The OAI is consulting widely with institutions and library communities in refining standards and protocols that serve researchers’ needs. New open archives are being established in many universities and libraries and will ultimately become part of the network of archives accessible to all [see the directory of archives at www.openarchives.org]. Open source and free software has been developed and is currently being improved for use by institutions wishing to set up their own archives in an interoperable way.(23)

While this international movement is spreading rapidly and its potential is increasingly recognised, the process is at an early stage. Active testing of many of these developments is ongoing and it is important that the needs of developing countries are considered during the refinements.

Conclusion

This is an encouraging time for scientists everywhere as means of communication improve. Opportunities are great, but to ensure that the needs of academic communities in the developing world are not left out, further awareness, consultation and partnership building are required. We recommend that scientists stay aware of these initiatives, keep all publishing options open and inform colleagues, through regional discussions, of the opportunities now available. The Electronic Publishing Trust for Development (EPT) will continue to monitor progress and post new developments on its web site.(24) An experimental server has been set up by one of the EPT Trustees and is ready for evaluation.(25) The issues involved in open archiving and the movement to free scholarly literature are hotly debated on several online discussion fora hosted, for example, by American Scientist and Nature; past contributions are archived and available online.(26) Readers who wish to familiarise themselves with these issues may wish to consult these archives and the references provided below.

(1) Declan Butler, ‘The writing is on the web for science journals in print’. Nature 397, 195 - 200 (1999).
(2) Networked servers are often referred to as “repositories” or as “archives”, hence the term open archiving. However, the servers are not archives in the technical sense or the library community’s understanding of repositories or archives.
(3) Los Alamos Preprint Archive (arXiv): http://www.arxiv.org. Note that this archive moved to Cornell University in July 2001.
(4) Subbiah Arunachalam has written extensively on the obstacles scientists face in developing countries. See “Accessing information published in the Third World: Should spreading the word from the Third World always be like swimming against the current?” Journal of Scientific and Industrial Research, 53, 408-417. 1994.
(5) Examples include RePEc (Research Papers in Economics, http://www.repec.org/), CogPrints (for cognitive science, http://cogprints.soton.ac.uk/) and the PhilSci Archive (for philosophers of science, http://philsci-archive.pitt.edu/).
(6) The Free Online Scholarship (FOS) newsletter published by Peter Suber is a highly useful source for keeping up to date with developments in all areas related to the electronic scholarly publishing: http://www.earlham.edu/~peters/fos/index.htm
(7) The many informed writings by Stevan Harnad on the movement to free the refereed literature are available on his personal website: http://cogsci.soton.ac.uk/~harnad/intpub.html. Read in particular his more recent paper, “For Whom the Gate Tolls? How and Why to Free the Refereed Research Literature Online Through Author/Institution Self-Archiving, Now.” http://www.cogsci.soton.ac.uk/~harnad/Tp/resolution.htm
(8) See key documents on the OAI web site: http://www.openarchives.org
(9) See Lynch, Clifford (2001) Metadata Harvesting and the Open Archives Initiative. ARL Bimonthly Report 217 August 2001. http://www.arl.org/newsltr/217/mhp.html
(10) Op cit.
(11) The Lancet’s NetPrints for Clinical Medicine and Health Research: http://clinmed.netprints.org
(12) Godlee et al. Global information flow: Publishers should provide information free to resource poor countries. BMJ 2000; 321: 776-777 (30 September). Online: http://www.bmj.com/cgi/content/full/321/7264/776
(13) Canhos et al. “Close the South-North knowledge gap”, Nature, Vol 397, pg. 201, Jan. 1999
(14) See for example the Open Citation (OpCit) project at the University of Southampton: http://opcit.eprints.org/. See also Hitchcock and Hall, “How Dynamic E-journals can Interconnect Open Access Archives.” Online: http://www.ecs.soton.ac.uk/~sh94r/elpub01-online.html
(15) See note 9.
(16) PubMedCentral: http://www.pubmedcentral.org
(17) Two notable services that recently announced their compliance to the OAI protocols are BioMedCentral http://www.biomedcentral.com, and the Chemistry Preprint Server http://preprint.chemweb.com/
(18) See the FAQ on the OAI web site: http://www.openarchives.org
(19) The ‘Public Library of Science’, http://www.publiclibraryofscience.org/, and the Scholarly Publishing and Academic Resources Coalition supported by the Association of Research Libraries, are significant movements that are hastening significant reforms in scholarly publishing and lowering and removing barriers to access to publicly funded research results. See http://www.arl.org/sparc/home/ for details.
(20) The prestigious journals Nature and Science, and the American Psychological Association have all recently relaxed their policies regarding author self-archiving.
(21) The United Nations has been greatly concerned about the imbalance in access to communication facilities. An ICT Task Force of the United Nations has recently been set up by Secretary-General Kofi Annan “to find new, creative and quick-acting means to spread the benefits of the digital revolution and avert the prospect of a two-tiered world information society.” See http://www.unicttaskforce.org
(22) Telecentres, Waystations/Staging Posts: http://www.waystations.org
(23) The free ‘eprints software’ released by the University of Southampton, http://www.eprints.org, is designed to run centralised, discipline-based archives as well as distributed, institution-based archives of scholarly publications. ‘Kepler’ is a simple OAI repository tool that claims to allow individual researchers to participate in the OAI with a minimum of effort. Details about Kepler and how it is implemented can be found at: http://www.dlib.org/dlib/april01/maly/04maly.html
(24) The EPT web site: http://www.epublishingtrust.org. See also The Electronic Publishing Trust for Development (EPT): putting developing country journals online. Proceedings of Scientific Communication & Publishing in the Information Age. 1999. Online: http://www.inasp.info/psi/scpw/papers/kirsop.html
(25) The eprint server is located at: http://eprints.utsc.utoronto.ca
(26) American Scientist forum on freeing the refereed scientific literature: http://amsci-forum.amsci.org/archives/september98-forum.html. Nature’s Forum on future e-access to the primary literature: http://www.nature.com/nature/debates/e-access/

Author Details

 
Leslie Chan,
Centre for Instructional Technology Development,
University of Toronto at Scarborough;
a Trustee of the EPT and Associate Director of Bioline International, which distributes journals from developing countries.
Email: chan@scar.utoronto.ca
 
Barbara Kirsop
is the Secretary of the EPT and an advisory board member for Bioline International.
Email: barbara@biostrat.demon.co.uk