The previous article about the Overlay Journal Infrastructure for Meteorological Sciences (OJIMS) Project introduced the concept of overlay journals and their potential impact on the meteorological sciences. It also discussed the business cases and requirements that must be met for overlay journals to become operational as data publications.
There is significant interest in data journals at this time as they could provide a framework to allow the peer-review and citation of datasets, thereby encouraging data scientists to ensure their data and metadata are complete and valid, and granting them academic credit for this work. This would also benefit the wider community, as data publication would ensure that expensive (and often irreproducible) data are archived and curated appropriately. Science, as a discipline, benefits from publishing processes that facilitate the appropriate application of data and the reproducibility of experiments.
The OJIMS Project aimed to develop the mechanisms that could support both a new (overlay) Journal of Meteorological Data and an Open-Access Repository for documents related to the meteorological sciences. Its work was conducted by a partnership between the Royal Meteorological Society (RMetS) and two members of the National Centre for Atmospheric Science (NCAS), namely the British Atmospheric Data Centre (BADC) and the University of Leeds.
This article goes into more technical detail about the OJIMS Project, giving details of the software used to deploy a demonstration data journal and operational document repository and the form of the submission processes for each.
At the start of the OJIMS Project, there were three fundamental aims:
The third aim has been detailed in our previous article, so this contribution will concentrate on the details of the first two aims.
The specific objectives of the project were detailed as below.
Set up a repository for meteorology and atmospheric sciences capable of preserving documents relating to the subject area with the following in mind:
Create a demonstration overlay journal system with the following aspects addressed:
Most of these objectives remained the same over the course of the project, though time spent working on the prototype 'star-rated' journal was reduced in order to spend more time on the construction of the prototype data journal. This was decided after in-depth user surveys (as reported in  ) suggested that the meteorological and atmospheric science communities were more interested in a data journal than the provision of a 'star-rated' overlay journal (mainly due to the low levels of documents in pre-existing repositories). It should be pointed out that the software developed to provide the overlay documents for the data journal is nonetheless equally applicable to the 'star-rated' journal.
However, after examining the business models, we discovered that the creation and operation of the data and 'star-rated' journals themselves stood explicitly outside the project scope, as such work required a long-term commitment from a journal publisher.
The main project issues were:
Figure 1 gives an overview of the components required for this project and their interactions. It is worth noting that the software requirements for the data journal and the overlay subject repository are very similar, hence the same basic software (with minor modifications) can be used for both the data journal and overlay subject repository.
The OJIMS Project Web site was produced to act as a dissemination point for the results of the project, and as a collaboration tool for the project partners. The Web site will remain operational for several years after the project ends to publicise the project results.
The work of the OJIMS Project was conducted by a partnership between the Royal Meteorological Society (RMetS) and two members of the National Centre for Atmospheric Science (the British Atmospheric Data Centre and the University of Leeds).
A key deliverable of the OJIMS Project was to create a discipline-based open access document repository embedded within the BADC. There were two main requirements for the subject repository:
The overlay document requirements are considered in the data journal developments (see Creating the Infrastructure for Overlay Journals) so the subject repository development concentrated on identifying how to provide a suitable place to lodge grey literature.
The deposit policy, documentation and training process for maintenance of the repository system were all developed during the project. The full deposit policy is available on the repository site . It is broken down into separate metadata, data, content, submission and preservation policies. Key parts of the policy are that anyone can access the metadata, full-text and other full data items stored in the repository free of charge, and that items stored in the repository will be retained indefinitely.
Implementation of the subject repository was done by installing the EPrints software (version 3) on a Xen (virtual server) platform running Red Hat Enterprise. The basic configuration was supplemented by:
After populating the repository with some sample content, and training BADC staff to administer the repository, the repository was launched on 30 October 2008, and advertised to BADC users. Documents already held by the BADC and NEODC were added to the repository. The repository has been running operationally since launch as the Centre for Environmental Data Archival Document Repository (CEDA Docs).
The repository has the standard EPrints interface with the addition of the tags and comments extensions from the SNEEP Project. The standard repository workflows apply. The repository currently has over 200 items mainly added by BADC staff from existing material held within the data centre. 27 users are registered with the repository.
The OJIMS Project provided the funding to run the CEDA document repository for a year, with the principal expenditure devoted to moderating the deposit of new items into the repository. The sustainability and cost modelling of the repository were also investigated, and the costs of running the repository within the BADC in the long term were not found to be prohibitive. Hence the repository will be maintained for the foreseeable future now that the OJIMS Project has ended.
The infrastructure requirements for the overlay journals are similar, regardless of whether the overlay journal is a data journal or a 'star-rated' journal. The project team examined current overlay infrastructure tools and technologies and chose the Open Journal Systems (OJS) because of its open source nature and its ease of adaptation. A series of interfaces and forms was generated for the publishers and authors, including a peer-review management interface and an issue construction interface for publishers, and a submission interface form for authors.
An overlay document is a structured document that is created to annotate another resource with information on the quality of the resource. This document can be referred to as the data description document. However, it contains more than just a description of the data, including, for example, details of the review process context for which it is constructed. It is for this reason that the term 'overlay document' has been coined. The document has three basic elements:
When considering how to encode this information, project staff considered various implementation methods; as this is an annotation document, RDF seemed appropriate. It is potentially harder to render RDF documents for human readers because of RDF's more complex data representation, but as the structure of these documents is not overly complex, it can be done. We took inspiration from annotations of Flickr photos by Masahide Kanzaki .
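To make the idea of an RDF annotation concrete, the following is a minimal sketch of how an overlay document could be serialised as RDF/XML with Dublin Core fields, using only the Python standard library. It is not the schema the project used: the function name, the example URIs and the choice of `dc:description` and `dc:relation` to carry the review-process text and the dataset link are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core elements.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def build_overlay_document(overlay_uri, dataset_uri, title, creator, review_process):
    """Return an RDF/XML string for one (hypothetical) overlay document.

    The field mapping is an assumption for illustration, not the OJIMS schema.
    """
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": overlay_uri}
    )
    # Metadata about the overlay document itself.
    ET.SubElement(desc, f"{{{DC_NS}}}title").text = title
    ET.SubElement(desc, f"{{{DC_NS}}}creator").text = creator
    # Free-text description of the review process (assumed mapping).
    ET.SubElement(desc, f"{{{DC_NS}}}description").text = review_process
    # Link to the overlaid dataset held in the data centre (assumed mapping).
    ET.SubElement(
        desc, f"{{{DC_NS}}}relation", {f"{{{RDF_NS}}}resource": dataset_uri}
    )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(
        build_overlay_document(
            "http://journal.example.org/overlay/1",   # hypothetical URI
            "http://datacentre.example.org/data/42",  # hypothetical URI
            "Rain gauge measurements",
            "A. Author",
            "Assessed by two independent reviewers.",
        )
    )
```

Because the annotation has only a handful of properties attached to a single resource, a flat structure of this kind remains straightforward to transform into a page for human readers.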
Only openly available software was used to create the overlay document editor and the structure for the data journal. Any modifications made to the software during the project have been made freely available in the Subversion repository on the OJIMS Web site.
The creation of the overlay documents used in the overlay journals required a custom-built editor system. This was written using the Pylons Web application framework. The editor system supported creation of documents with an XML schema, Dublin Core fields for the overlay documents themselves and, for the overlaid dataset, metadata for the data centre. The OJIMS editor is also freely available from the Subversion repository on the OJIMS site and will remain there for the foreseeable future.
This work, led by the RMetS, concentrated on producing viable business plans, as well as submission and acceptance policies for the data and 'star-rated' journal.
The main tasks for the data journal included:
For the 'star-rated' overlay journal, the tasks included:
Both types of overlay journal required sustainability and business modelling. Full details of the policies and procedures for data and star-rated journals can be found in the business models report .
For the data journal, the acceptance policy for datasets depends on two things: the subject area covered by the data journal, and whether the datasets are stored in an existing data centre that satisfies standards of good practice in archiving and data management and is registered with the data journal. For example, for a data journal specialising in meteorological data, a dataset of rain gauge measurements stored in the BADC (or other accredited data centre) would be appropriate for publication, while a dataset on road traffic flows would not.
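The two conditions in this policy can be expressed as a simple check. The sketch below is illustrative only: the `Dataset` class, the subject labels and the contents of the two sets are assumptions, and a real journal would apply editorial judgement rather than a lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical record describing a dataset submitted to the journal."""
    title: str
    subject: str
    data_centre: str


# Assumed values for illustration; a journal would maintain its own lists.
JOURNAL_SUBJECTS = {"meteorology", "atmospheric science"}
REGISTERED_DATA_CENTRES = {"BADC"}


def is_eligible(dataset, subjects=JOURNAL_SUBJECTS, centres=REGISTERED_DATA_CENTRES):
    """A dataset qualifies only if it is in scope AND held in a registered centre."""
    return dataset.subject in subjects and dataset.data_centre in centres


if __name__ == "__main__":
    rain = Dataset("Rain gauge measurements", "meteorology", "BADC")
    traffic = Dataset("Road traffic flows", "transport", "BADC")
    print(is_eligible(rain), is_eligible(traffic))
```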
The contents of the data journal could be categorized in the following ways:
For the overlay journal and document repository, two types of ratings for the referenced documents were proposed. The first rating advises readers on how far the material has gone through the independent peer-review process, giving four ratings as explained in Figure 3.
The second form of rating comes from the users of the overlay journal (Figure 4), where users could rate the entry out of 10. The average rating would be displayed alongside the number of reviews and number of downloads.
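As a rough illustration of the user rating display, the summary shown alongside an entry could be computed as follows. The function name and the returned dictionary keys are assumptions; the text specifies only that ratings are out of 10 and that the average appears with the number of reviews and downloads.

```python
def summarise_ratings(ratings, downloads):
    """Summarise user ratings (each out of 10) for display beside an entry.

    Returns the average rating (None if nobody has rated the entry yet),
    the number of reviews and the number of downloads.
    """
    for rating in ratings:
        if not 0 <= rating <= 10:
            raise ValueError(f"rating out of range: {rating}")
    average = sum(ratings) / len(ratings) if ratings else None
    return {"average": average, "reviews": len(ratings), "downloads": downloads}


if __name__ == "__main__":
    print(summarise_ratings([8, 6, 10], downloads=42))
```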
A demonstration overlay journal system used to produce a data journal has the following requirements:
The production of an overlay document repository can be done using an analogous process.
Figure 5 gives a schematic view of the data journal structure. The data journal contains a database of XML documents relating to various published datasets. These XML data description documents contain links to the datasets as they are published in various accredited data repositories. The data journal editor edits these XML files, but does not make any changes whatsoever to the underlying datasets.
The tactic taken in the development of the demonstration system was to use as many standard online journal technologies as possible, thereby introducing all the functions of journals without engineering new solutions. Various online journal systems were considered, including the Open Journal Systems (OJS), Digital Publishing System (Dpubs) and Hyperjournal. OJS was chosen because of its open source nature and its ease of adaptation. The RIOJA Project also used this software for exactly these reasons.
The approach used was to add the data description documents into the standard workflow of the journal software. The additional elements needed were a tool to author the data description documents and a method to render the documents.
To create these documents, a Web-based authoring tool was developed. This was done using the Pylons Web application framework, which allows the rapid development of Web applications in the Python programming language. The code for this application is available from the Subversion repository on the OJIMS Web site. The editor requires input of metadata about the overlaid dataset and other information such as the author of the document. It also adds information set and constrained by the data journal's review processes. For example, a text description of the review process is the same for all documents and is simply inserted from the editor's configuration.
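The split between author-supplied fields and journal-controlled fields can be sketched as below. This is not the OJIMS editor's code: the configuration dictionary, its key and the function name are hypothetical, and the only behaviour taken from the description above is that the review-process text comes from configuration rather than from the author.

```python
# Hypothetical editor configuration: values fixed by the journal's review process.
EDITOR_CONFIG = {
    "review_process": "Each dataset description is assessed by independent reviewers.",
}


def assemble_fields(author_input, config=EDITOR_CONFIG):
    """Combine author-entered metadata with journal-controlled fields.

    Authors may not supply a field that the journal configuration controls,
    so the review-process text is identical across all documents.
    """
    clashes = set(author_input) & set(config)
    if clashes:
        raise ValueError(f"fields controlled by the journal: {sorted(clashes)}")
    return {**author_input, **config}


if __name__ == "__main__":
    fields = assemble_fields(
        {"title": "Rain gauge measurements", "creator": "A. Author"}
    )
    print(fields["review_process"])
```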
The XML documents produced by the editor were rendered into a human-readable document using an XSLT style sheet when viewed through the data journal interface (see screenshots below).
The main project achievements have included:
A significant part of the OJIMS Project work was the survey of scientists and organisations, which served to introduce the work of the project while also capturing the requirements for the data journal and document repository. The results from these surveys are documented in the reports OJIMS Survey of Organisations and OJIMS Survey of Scientists.
These surveys, together with presentations at conferences and meetings, served to kick-start a community debate on what materials need archiving and which should be regarded as 'publication-quality'. The OJIMS Project has a high profile within the repository and atmospheric science communities. At the recent NERC Data Management Workshop (February 2009) the OJIMS Project was mentioned in more than one keynote speech, with special emphasis on the data journal and its potential ability to provide academic credit for those data scientists who publish their data.
The OJIMS Project has demonstrated that standard online journal technologies are suitable for the development and operation of a data journal as they allow the use of all the functions of journals without the need to engineer new solutions.
OJIMS also showed that there is a significant desire in the meteorological sciences community for a data journal, as this would allow scientists to receive academic recognition (in the form of citations) for their work in ensuring the quality of datasets. The funders of the research that produces these data also benefit from data publication as it raises the profile of the data, ensuring reuse. Furthermore, such publication encourages the scientists involved to submit to accredited data repositories, where their data will be properly archived.
With regard to standards, the data journal system chosen for OJIMS was the Open Journal Systems (OJS) and the repository software was EPrints. Both OJS and EPrints were chosen because of their open source nature and their ease of adaptation. However, they also offer standard interfaces such as OAI-PMH.
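For readers unfamiliar with OAI-PMH, a harvesting request is an ordinary HTTP query whose `verb` parameter selects one of six protocol operations. The sketch below only builds such a URL; the base URL is a placeholder rather than the address of the CEDA repository, and the helper function is hypothetical.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH version 2.0.
OAI_PMH_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def oai_pmh_request_url(base_url, verb, arguments=None):
    """Build an OAI-PMH request URL (hypothetical helper, no network access)."""
    if verb not in OAI_PMH_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = {"verb": verb}
    query.update(arguments or {})
    return f"{base_url}?{urlencode(query)}"


if __name__ == "__main__":
    # Placeholder endpoint; ask for all records as unqualified Dublin Core.
    print(
        oai_pmh_request_url(
            "http://repository.example.org/oai",
            "ListRecords",
            {"metadataPrefix": "oai_dc"},
        )
    )
```

A harvester would issue the resulting request and parse the XML response, which lets external services index the contents of either system without any custom integration.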
The overlay document schema incorporated Dublin Core metadata and used RDF to encode the needed information.
The project endeavoured to make use of pre-existing and mature software to implement the document repository and the overlay journal infrastructure, modifying it as appropriate. This was to ensure ease of use and stability of the resulting software.
The OJIMS Project recommends that further work be done on the implementation and operation of a data journal. The authors are aware of one data journal currently in operation, the Earth System Science Data Journal (ESSD), which has four papers in its library at the time of writing.
The authors would like to acknowledge the Joint Information Systems Committee (JISC) as the principal funder of the OJIMS Project under the JISC Capital Programme call for Projects, Strand D: - 'Repository Start-up and Enhancement Projects' (4/06). Complementary funding was provided by NCAS through the BADC core agreement, and also by the Natural Environment Research Council.