Web Magazine for Information Professionals

JSTOR

Daniel Holden reports on his trip to the United States to visit colleagues at JSTOR, a not-for-profit organisation creating a digital archive collection of scholarly journals

In August I was fortunate enough to spend a week visiting the JSTOR offices in the United States. This invaluable experience provided me with the opportunity to discuss the progress being made since the launch of the UK JSTOR Mirror Service [1] at Manchester Information Datasets and Associated Services (MIDAS) [2] and to plan future developments.

History and Background

For those who have not come across JSTOR before, it is an electronic collection of core research journals. JSTOR is an acronym for Journal STORage and the focus is on older volumes rather than current issues. The coverage starts from first issues and runs up to between 1 and 7 years before current publication. In the case of some titles users can browse and search over 100 years' worth of material (see Figure 1). JSTOR was developed from a Mellon Foundation [3] funded project which had been designed to test whether it was possible for libraries to save space by digitising back-runs of journals. The libraries that tested the prototype database were so enthusiastic about the project that a full service [4] was launched in January 1997 in the US. JSTOR is now an independent, charitable organisation dedicated to helping the academic community and publishers make the most of new information technologies. JSTOR aim to:

* Build a reliable and comprehensive archive of important research journals
* Improve access to these journals
* Help fill gaps in existing back-runs
* Address preservation issues such as mutilated pages and long-term deterioration of paper copy
* Reduce long-term capital and operating costs of libraries associated with the storage and care of journal collections
* Assist scholarly associations and publishers in making the transition to electronic modes of publication
* Study the impact of providing electronic access on the use of these scholarly materials


Staff and students from over 300 North American institutions use the database regularly and, in the US, JSTOR logged approximately 20 million hits in the first half of this year.

Figure 1 Volumes of The Philosophical Review Available for Browsing
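
The coverage rule described above, from a title's first issue up to between 1 and 7 years before current publication, amounts to simple date arithmetic. The following Python sketch is illustrative only: the function name, the use of whole calendar years and the sample gap of 5 years are assumptions, not details of how JSTOR or any publisher agreement actually defines the cut-off.

```python
def last_archived_year(current_year, gap_years):
    """Return the most recent year of a title expected to be in the archive.

    gap_years is the distance between the archive and current publication;
    the article gives a range of 1 to 7 years.
    """
    if not 1 <= gap_years <= 7:
        raise ValueError("gap_years must be between 1 and 7")
    return current_year - gap_years


# Hypothetical title with a 5-year gap, viewed from 1998.
print(last_archived_year(1998, 5))  # 1993
```

Under these assumptions, a title with a 5-year gap would be browsable up to its 1993 volumes in 1998, and the cut-off would move forward by one year each year.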

UK Mirror Service

Universities and colleges in the United Kingdom have been able to access JSTOR, on a site licence basis, since March 1998 when the UK Mirror Service was launched as a result of negotiations between JSTOR and the Joint Information Systems Committee (JISC) [5]. Based at MIDAS, it is JSTOR’s first overseas mirror and is their first step towards becoming a global resource. It also marks the start of a long-term relationship with the JISC. MIDAS are responsible for providing user support, documentation and training for the UK audience. I learnt during my visit that JSTOR’s second overseas mirror will be located at the Central European University in Budapest, Hungary. It is being established with support from the Soros Foundation [6] and the Open Society Institute and will serve institutions in Central and Eastern Europe. I also learnt that universities and colleges in South America are testing the two servers in the US (located at the University of Michigan and Princeton University) to determine whether the level of performance is sufficient to facilitate their inclusion in JSTOR.

JSTOR Production Process

During my visit I spent time at the JSTOR offices in New York and the University of Michigan. The business administration is based in New York and includes the Publisher Relations and Library Relations teams. User Services, Production Services and Technology Support and Development are based in Michigan. There is also a JSTOR office at Princeton University where other Production Services and Technology Development staff are based. The workforce totals 35, the largest department being Production Services with a staff of 18. Visiting both New York and Michigan allowed me to follow the production process from initial journal selection through to the addition of new titles to the JSTOR collection. JSTOR select journals on the basis of four criteria:

* Citation impact factor data
* Ranking by experts in the field
* Number of institutional subscriptions
* Length of run

Once titles have been chosen, publishers have generally been keen to collaborate with JSTOR because of the new interest it generates in their older materials. In addition, it will soon be possible for publishers to link their current electronic issues with the digitised back-runs so users can search seamlessly through entire journals. At the moment 54 publishers are participating in JSTOR.

After an agreement has been reached, JSTOR acquire the back-run and each page is scanned at a resolution of 600 dots per inch (dpi). The resulting images are then processed using Optical Character Recognition (OCR) software to create text files. The text files (which are reviewed with spell-checking software) and table of contents files (which are double keyed) enable browsing and full-text, author, title and abstract searching. The page images are stored as TIFFs and, when a user retrieves an article, delivered as GIFs. To reduce the size of the TIFFs, the Cartesian Perceptual Compression (CPC) format, developed by Cartesian Products [7], has been adopted. An overview of the technical aspects of JSTOR will be available on their Web site in the near future.
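
Double keying, mentioned above for the table of contents files, generally means that two operators type the same text independently and the two versions are then compared, with any disagreement flagged for review. The Python sketch below shows the comparison step only. It is not JSTOR's tooling: the function name, the line-by-line comparison and the sample entries are all assumptions made for illustration.

```python
from itertools import zip_longest


def double_key_mismatches(first_keying, second_keying):
    """Compare two independent keyings line by line.

    Returns a list of (line_number, first_text, second_text) tuples for
    every line on which the two versions disagree.
    """
    mismatches = []
    pairs = zip_longest(first_keying, second_keying, fillvalue="")
    for line_number, (first, second) in enumerate(pairs, start=1):
        # Ignore differences in surrounding whitespace only.
        if first.strip() != second.strip():
            mismatches.append((line_number, first, second))
    return mismatches


# Invented sample entries, not taken from any real journal.
keying_a = ["Article One, pp. 1-20", "Article Two, pp. 21-35"]
keying_b = ["Article One, pp. 1-20", "Article Tvvo, pp. 21-35"]

for line_number, first, second in double_key_mismatches(keying_a, keying_b):
    print(f"line {line_number}: {first!r} != {second!r}")
```

The appeal of the approach is that two operators are unlikely to make the same mistake in the same place, so lines on which the keyings agree can be accepted with some confidence and only the flagged lines need a human check.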

Articles may be printed using Adobe Acrobat, JPrint (which is a helper application developed by JSTOR) or as a PostScript file. The new data is added to the servers on a monthly basis. JSTOR have taken this unique approach to the storage and delivery of journals to ensure that the appearance of the electronic copy exactly matches that of the printed equivalent (see Figure 2).

Figure 2 Sample Page from Political Science Quarterly

Phase I

JSTOR is currently in Phase I of development. The aim is to include 100 titles by the end of 1999, in 15 subject clusters: African-American Studies, Anthropology, Asian Studies, Ecology, Economics, Education, Finance, History, Literature, Mathematics, Philosophy, Political Science, Population Studies, Sociology and Statistics. The database is weighted towards the arts, humanities and social sciences because JSTOR recognised that these areas were not as well served by electronic journals as other disciplines. Ninety-seven journals in all subject clusters have been signed so far and 61 are available in the database. As progress is ahead of schedule, JSTOR are already planning Phase II and I was able to witness some of these preparations.

Phase II

Phase II will involve the addition of extra subject clusters. Users from participating sites in the US were recently surveyed to determine which new clusters they would like and which existing clusters they would like strengthened. The results of the survey make interesting reading. In the arts and humanities the fields that received the greatest number of ‘Essential’ and ‘Important’ responses were:

* Literary Criticism
* Literature, Literary Journal
* Latin American Studies

In the social sciences:

* History
* Economics
* Psychology
* Political Science

In the sciences:

* Ecology
* Biology
* Chemistry
* General Sciences

JSTOR plan to extend this survey in the US to non-participants, but I discovered that the first two new subject clusters in Phase II will be General Science and Botany/Ecology. The new journals will be chosen using criteria based on the scheme used for Phase I. One of the first titles that will be available to American users is Science, published by the American Association for the Advancement of Science. During Phase II institutions in the US will also be able to select and license the clusters that are most appropriate to their needs. Feedback from librarians has indicated that this would be more popular than the blanket licensing arrangement devised for Phase I.

In order to be ready for Phase II and to keep ahead of schedule for Phase I, JSTOR is expanding its production capacity. Extra office space has been acquired in Michigan, a new Production Services department has been opened at Princeton and a second scanning vendor is being sought. Although over 2 million pages have been processed so far and 14 new journals have been released online since the UK Mirror was launched, JSTOR are aiming to increase their digitisation rate from 100,000 to 300,000 pages per month by Christmas 1998.
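
As a rough illustration of what the higher rate means, the figures quoted above can be turned into processing times. This Python sketch assumes a steady monthly rate and uses 2 million pages purely as a reference workload; the function name is hypothetical.

```python
import math


def months_to_process(pages, pages_per_month):
    """Whole months needed to digitise `pages` at a steady monthly rate."""
    return math.ceil(pages / pages_per_month)


PAGES = 2_000_000        # "over 2 million pages" processed so far
CURRENT_RATE = 100_000   # pages per month, current rate
TARGET_RATE = 300_000    # pages per month, target for Christmas 1998

print(months_to_process(PAGES, CURRENT_RATE))  # 20
print(months_to_process(PAGES, TARGET_RATE))   # 7
```

On these assumptions, a workload the size of everything processed so far would take 20 months at the current rate and about 7 months at the target rate.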

Conclusion

The launch of JSTOR’s first overseas mirror has been quite an experience. It was extremely useful to finally meet my colleagues face-to-face, discuss all that has happened and plan for the future. They were particularly keen to hear how the mirror had been received and what extra support they could provide for MIDAS. Much of what JSTOR have learnt in the UK will be applied to the development in Budapest. I will be following this very closely and look forward to working with my colleagues at the Central European University. It will be interesting to see how their experiences compare with mine and also which countries participate in JSTOR next. It will also be interesting to see how Phase II is received in the US and which subject clusters are added after General Science and Botany/Ecology. The many developments taking place at JSTOR meant that I had a hectic visit. However, this fact, together with the warm welcome I received, made the trip thoroughly enjoyable.

Sources of Further Information

The primary source of information regarding JSTOR is the set of JSTOR About Pages on the JSTOR Web site.

JSTOR documentation and details about training and user support are available on the MIDAS Web site.

References

[1] UK JSTOR Mirror Service
http://www.jstor.ac.uk/
[2] MIDAS
http://www.midas.ac.uk/
[3] The Andrew W. Mellon Foundation
http://www.mellon.org/
[4] JSTOR
http://www.jstor.org/
[5] JISC
http://www.jisc.ac.uk/
[6] Soros Foundations Network
http://www.soros.org/
[7] Cartesian Products, Inc.
http://www.cartesianinc.com/

Author Details

Daniel Holden
JSTOR Information Officer
MIDAS
Email: Daniel.Holden@mcc.ac.uk
URL: http://midas.ac.uk/jstor/