With an increasing number of publications being made available digitally, and new supply chains and business models emerging for trading them, an urgent need has been identified for a standard way of expressing and communicating usage terms, and linking those terms to the publications.
Reflecting the development pattern of the markets, this need was first identified in the scholarly journals sector. More recently, a similar requirement has been articulated for the communication of usage terms between publishers' digital repositories and search engines such as Google.
EDItEUR, the international trade standards organisation for the book and journals sectors, has been working with stakeholders to develop ONIX for Licensing Terms, a new set of formats that will enable the full range and complexity of licensing terms to be expressed in a structured machine-readable form and communicated between systems using a standard XML schema. This article explains why and how these standards have been developed.
EDItEUR formed a joint Rights Metadata Working Party with Book Industry Communication (BIC) in the UK and the National Information Standards Organization (NISO) in the US as long ago as 1998 'to collaborate with other bodies to help define an international standard for rights metadata elements'. This work was taken up by the EU <indecs> (Interoperability of Data in Electronic Commerce Systems) Project between 1998 and 2000, culminating in the influential but rather theoretical <indecs> Metadata Framework.
Meanwhile, as the number of digital resources in library collections continued to grow, libraries were having difficulty in complying with the widely differing licence terms applied to those resources by their creators and publishers. The ability to receive these terms into a library's electronic resource management system in a machine-readable form, link them to the appropriate digital resources and communicate them to users was becoming a pressing need.
In the United States, following a report by Tim Jewell at the University of Washington on the selection and presentation of commercially available electronic resources, an informal working group was set up to work on the functionality and data elements required to manage these resources.
The Digital Library Federation (DLF), a grouping of the major US academic research libraries, co-sponsored with NISO a workshop on Standards for Electronic Resource Management. It also set up the Electronic Resource Management Initiative (ERMI) to aid the rapid development of library systems by providing a series of papers to help both to define requirements and to propose data standards for the management of electronic resources.
EDItEUR commissioned an evaluation of the ERMI work from the Rightscom Consultancy. The aim was to assess the extent to which it might provide a basis for standard XML formats that would take into account the requirements of all the stakeholders in the supply chain, provide for the full complexity of licence expression, and be flexible enough to support any business model as well as all media types. The assessment paper concluded that the ERMI work would provide extremely valuable input but would require further development in order to meet all these requirements.
With funding from the Publishers Licensing Society (PLS) and the JISC, Rightscom were commissioned to undertake a 'proof of concept' project, working with the EDItEUR ONIX team (David Martin and Francis Cave) to explore the possibility of developing an ONIX for Licenses message that could be used by publishers and online hosts to communicate licence terms to libraries and subscription agents.
The aim of the project was to produce a prototype XML message for communicating in a computable form the terms of a Licence agreement for the use, by libraries, of a publisher's digital works. The main use case was the licensing of eJournals (electronic Journals), but the structure of the message was to be flexible enough to be extensible to any other type of digital media and licence in the future by adding to its semantics but not significantly changing its structure. The message therefore needed to be generic in structure but successfully demonstrate an initial, specialised application.
The prototype message was produced as an XML schema and had, in conceptual terms, a relatively simple structure containing the following four main elements:
In addition, the message incorporated the relationship structure needed between Usages and Requirements to make clear their dependencies on one another as Conditions and Exceptions.
The prototype message demonstrated that each element of the example Licensing clauses could be fully modelled in this way. The modelling also highlighted two things: the range of possible variations within even apparently simple licensing clauses, and the limitations of the current ERMI approach of defining only a 'typical' set of Usages with no mechanism for variation.
The use of an underlying ontology meets the 'future-proofing' requirements for flexibility and extensibility. In its current form, the prototype message could express Licensing Terms for any kind of content or use, given the necessary ontology.
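The way Usages, Requirements, Conditions and Exceptions depend on one another can be pictured with a small data model. The following is only an illustrative sketch in Python, not the ONIX for Licensing Terms schema: the class names, field names, the `ANY` wildcard and the sample terms such as `AuthorizedUser` and `EntireIssue` are all assumptions made for the example, standing in for whatever terms an underlying dictionary would supply.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

ANY = "Any"  # hypothetical wildcard term, not taken from any ONIX dictionary


@dataclass
class Requirement:
    """Something the licensee must do; attached to a Usage as a Condition."""
    description: str


@dataclass
class Usage:
    """A statement that a verb is permitted or prohibited (hypothetical shape)."""
    verb: str
    status: str                      # "Permitted" or "Prohibited" in this sketch
    user: str = ANY
    resource: str = ANY
    conditions: List[Requirement] = field(default_factory=list)
    exceptions: List["Usage"] = field(default_factory=list)

    def applies_to(self, user: str, verb: str, resource: str) -> bool:
        # A field matches when it is the wildcard or equals the requested term.
        return (
            self.verb == verb
            and self.user in (ANY, user)
            and self.resource in (ANY, resource)
        )


def evaluate(usages: List[Usage], user: str, verb: str,
             resource: str) -> Tuple[str, List[str]]:
    """Return (status, conditions) for a requested use.

    The first matching Usage decides, unless one of its Exceptions also
    matches, in which case the Exception takes precedence.
    """
    for usage in usages:
        if not usage.applies_to(user, verb, resource):
            continue
        for exception in usage.exceptions:
            if exception.applies_to(user, verb, resource):
                return exception.status, [c.description for c in exception.conditions]
        return usage.status, [c.description for c in usage.conditions]
    return "NotStated", []  # the licence is silent on this use


# Sample clause: authorised users may print, with a copyright notice,
# except that printing an entire issue is prohibited.
licence = [
    Usage(
        verb="Print",
        status="Permitted",
        user="AuthorizedUser",
        conditions=[Requirement("Include copyright notice on each copy")],
        exceptions=[Usage(verb="Print", status="Prohibited", resource="EntireIssue")],
    ),
]

print(evaluate(licence, "AuthorizedUser", "Print", "Article"))
print(evaluate(licence, "AuthorizedUser", "Print", "EntireIssue"))
print(evaluate(licence, "WalkInUser", "Print", "Article"))
```

Because the terms are plain strings drawn from a vocabulary rather than fixed fields, new kinds of user, resource or verb can be added without altering the structure, which is the sense in which a message of this kind is extended through its semantics rather than its shape.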
The prototype 'ONIX for Licensing Terms' message was produced within the constraints of the above requirements. It was demonstrated at a dedicated 'proof of concept' workshop held in London in April 2005 which was attended by representatives of the sponsoring organisations, publishers, librarians, agents and systems vendors as well as members of the DLF ERMI team. There was a consensus view that the approach taken had the potential to fulfil all its objectives.
This led to two further JISC-funded projects. The first, undertaken by BIC, Cranfield University and John Wiley, aimed to create an XML expression of a complete sample Licence, ensuring that any questions of interpretation of the semantics of the Licence were as far as possible resolved with the participation of both publisher and library representatives. At the same time, the terms found necessary to express the Licence would be added to the ONIX Licensing Terms (OLT) Dictionary, which supports the ONIX for Licensing Terms formats.
The second project, undertaken by BIC, Loughborough University and the Association of Learned and Society Publishers (ALPSP), was to promote the benefits of electronic expression of licensing terms to both libraries and publishers, examine the difficulties that not-for-profit and smaller publishers, including learned societies, might have in generating an XML version of their library licences, and show how tools and services could be developed to support them. A particularly valuable deliverable was the specification of tools to help publishers draft XML formats of their licences.
The first manifestation of ONIX for Licensing Terms arising out of these projects is an ONIX Publications Licence format, ONIX-PL, intended to support the communication of licensing terms for electronic resources from a publisher to a user institution (e.g. an academic library or consortium), either directly or through a subscription agent. The purpose is to enable the licence terms to be loaded into an electronic resources management system maintained by the receiving institution. The ONIX-PL format may also be used for the communication of licensing terms from a content host system to a user institution; and it should also be possible to extend it for the communication of licensing terms from a publisher to a content host system that carries the publisher's materials.
In order to ensure input and buy-in from the US library and system vendor community, a joint EDItEUR / DLF / NISO / PLS License Expression Working Group was set up with members from all stakeholding sectors including publishers, hosts, agents, libraries and systems vendors. This is a very large group with sixty members that meets, so far, only by teleconference. Its role is to monitor and make recommendations regarding the further development of standards relating to electronic resources and licence expression, including, but not limited to, the ERMI and EDItEUR work and to engage actively in the development of the ONIX-PL licence messaging specification.
Most of the major library systems vendors developing electronic resource management systems have already indicated their intention to implement the ONIX-PL format to a greater or lesser extent. As we write, a workshop is being arranged in Boston at which the EDItEUR consultants, David Martin and Francis Cave, will work with the developers of Electronic Resource Management systems on implementation of ONIX-PL.
On the publisher side, funding has been provided by JISC and PLS for the development of prototype publisher drafting tools, as specified in the second JISC project above. As readers will see if they look at the ONIX Publications Licence documentation on the EDItEUR Web site, the message appears very complicated. This is inevitable if it is to be able to express the full complexity of a conventional written licence. These drafting tools will enable publishers to choose from, and where necessary extend, a menu of clauses and terms, and create a machine-readable ONIX-PL licence without needing to engage with the format on a technical level.
It was always envisaged that the underlying design of ONIX for Licensing Terms should support the development not just of ONIX-PL but of a whole family of formats for communicating licensing terms between rights owners and any licensed party. It was therefore identified by some major projects in the electronic book, periodical and newspaper industries as an existing platform on which to build the necessary standards to provide permissions information to search engine "crawlers" so that search engines are able to comply with those terms systematically and, in turn, grant the appropriate access to their users. All the stakeholders acknowledge that the search engines can only be expected to accept this model if the content providers provide this information in a standard form.
To meet this requirement, ACAP (Automated Content Access Protocol) is being developed as an industry standard, sponsored by the World Association of Newspapers, European Publishers Council and International Publishers Association, working with search engines and other technical and commercial partners.
ACAP will enable the providers of all types of content published on the World Wide Web to communicate permissions information (relating to access and use of that content) in a form that can be automatically recognised and interpreted, so that business partners can systematically comply with the publishers' policies. ACAP will provide a framework that will allow any publisher, large or small, to express access and use policies in a language that search engines' robot crawlers can be taught to understand. ACAP has already expressed its intention to work with EDItEUR and base its protocol on ONIX for Licensing Terms, although it will also be essential to develop simpler methods of expressing permissions that can be interpreted 'on the fly' by crawlers.
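To make the idea of permissions that a crawler can interpret 'on the fly' more concrete, here is a minimal sketch in Python. It is not ACAP syntax and makes no claim about how ACAP will work: the policy layout (URL path prefixes mapped to permitted uses), the use names such as `index` and `display-snippet`, and the longest-prefix rule are all assumptions chosen purely for illustration.

```python
from typing import Dict, Set

# Hypothetical policy shape: URL path prefix -> uses the publisher permits.
Policy = Dict[str, Set[str]]


def permitted_uses(policy: Policy, path: str) -> Set[str]:
    """Return the uses allowed for `path`, taking the longest matching prefix."""
    best = None
    for prefix in policy:
        if path.startswith(prefix) and (best is None or len(prefix) > len(best)):
            best = prefix
    # Where the policy is silent, this sketch grants nothing.
    return set(policy[best]) if best is not None else set()


def crawler_may(policy: Policy, path: str, use: str) -> bool:
    """True if the publisher's policy permits `use` for the resource at `path`."""
    return use in permitted_uses(policy, path)


# Sample publisher policy (invented for the example).
policy: Policy = {
    "/": {"crawl", "index"},
    "/archive/": {"crawl"},
    "/news/": {"crawl", "index", "display-snippet"},
}

print(crawler_may(policy, "/news/2007/story.html", "display-snippet"))
print(crawler_may(policy, "/archive/1999/a.html", "index"))
print(crawler_may(policy, "/about.html", "index"))
```

A lookup of this kind is cheap enough to run for every page a crawler visits, which is the practical reason a simpler expression is needed alongside a full licence message.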
In a separate development, EDItEUR is working with the reproduction rights organisations, both in the UK and, through the International Federation of Reproduction Rights Organisations (IFRRO), internationally, to develop ONIX for Licensing Terms-based messages for the communication of rights information between such organisations and, potentially, between them and publishers.