<a href="/issue70/lewis-et-al#author1">Stuart Lewis</a>, <a href="/issue70/lewis-et-al#author2">Richard Jones</a> and <a href="/issue70/lewis-et-al#author3">Simeon Warner</a> explain some of the motivations behind the development of the ResourceSync Framework.

This article describes the motivations behind the development of the ResourceSync Framework. The Framework addresses the need to synchronise resources between Web sites. Resources cover a wide spectrum of types, such as metadata, digital objects, Web pages, or data files. There are many scenarios in which the ability to perform some form of synchronisation is required. Examples include aggregators such as Europeana that want to harvest and aggregate collections of resources, or preservation services that wish to archive Web sites as they change.

<a href="/issue65/thompson-hs#author1">Henry S. Thompson</a> describes how recent developments in Web technology have affected the relationship between URI and resource representation and the related consequences.

<p>URI stands for Uniform Resource Identifier, the official name for those things you see all the time on the Web that begin 'http:' or 'mailto:', for example http://www.w3.org/, which is the URI for the home page of the World Wide Web Consortium [1]. (These things were called URLs (Uniform Resource Locators) in the early days of the Web, and the change from URL to URI is either hugely significant or completely irrelevant, depending on who is talking; I have nothing to say about this issue in this article. If you have never heard of URIs (or IRIs, the even more recent fully internationalised version), but are familiar with URLs, just think 'URL' whenever you see 'URI' below.)</p>
<p>Historically, URIs were mostly seen as simply the way you accessed Web pages. These pages were hand-authored, relatively stable and simply shipped out on demand. More and more often that is no longer the case, in at least three different ways:</p>
<ul>
<li>Web pages for reading have been complemented by pictures for viewing, videos for watching and music for listening;</li>
<li>The Web is now more than a conduit for information; it is a means to a variety of ends, and we use it to <em>do</em> things: purchase goods and services, contribute to forums, play games;</li>
<li>The things we access on the Web are often not hand-authored or stable, but are automatically synthesised from 'deeper' data sources on demand. Furthermore, that synthesis is increasingly influenced by aspects of the way we initiate the access.</li>
</ul>
<p>It is against this background that I think it is worth exploring with some care what URIs were meant to be, and how they are being used in practice. In particular, I want to look at what is to be gained from a better understanding of how other kinds of identifiers work.</p>
<h2 id="The_Official_Version">The Official Version</h2>
<p>Insofar as there are definitive documents about all this, they all agree that URIs are, as the third initial says, <strong>identifiers</strong>, that is, names. They identify <strong>resources</strong>, and often (although not always) allow you to access <strong>representations</strong> of those resources. (Words in <strong>bold</strong> are used as technical terms; their ordinary-language meaning is in many cases likely to be more confusing than helpful.)</p>
<p>'Resource' names a role in a story, not an intrinsically distinguishable subset of things, just as 'referent' does in ordinary language. Things are resources because someone created a URI to identify them, not because they have some particular properties in and of themselves.</p>
<p>'Representation' names a pair: a character sequence and a media type. The <strong>media type</strong> specifies how the character string should be interpreted.</p>

<a href="/issue62/nicholas-et-al#author1">Nick Nicholas</a>, <a href="/issue62/nicholas-et-al#author2">Nigel Ward</a> and <a href="/issue62/nicholas-et-al#author3">Kerry Blinco</a> present an information model of digital identifiers, to help bring clarity to the vocabulary debates from which this field has suffered.

<p>Discussion of digital identifiers, and persistent identifiers in particular, has often been confused by differences in underlying assumptions and approaches. To bring more clarity to such discussions, the PILIN Project has devised an abstract model of identifiers and identifier services, which is presented here in summary. Given such an abstract model, it is possible to compare different identifier schemes, despite variations in terminology; and policies and strategies can be formulated for persistence without committing to particular systems. The abstract model is formal and layered; in this article, we give an overview of the distinctions made in the model. This presentation is not exhaustive, but it presents some of the key concepts represented, and some of the insights that result.</p>
<p>The main goal of the Persistent Identifier Linking Infrastructure (PILIN) project [<a href="#1">1</a>] has been to scope the infrastructure necessary for a national persistent identifier service. There are a variety of approaches and technologies already on offer for persistent digital identification of objects. But true identity persistence cannot be bound to particular technologies, domain policies, or information models: any formulation of a persistent identifier strategy needs to outlast current technologies, if the identifiers are to remain persistent in the long term.</p>
<p>For that reason, PILIN has modelled the digital identifier space in the abstract. It has arrived at an ontology [<a href="#2">2</a>] and a service model [<a href="#3">3</a>] for digital identifiers, and for how they are used and managed, building on previous work in the identifier field [<a href="#4">4</a>] (including the thinking behind URI [<a href="#5">5</a>], DOI [<a href="#6">6</a>], XRI [<a href="#7">7</a>] and ARK [<a href="#8">8</a>]), as well as semiotic theory [<a href="#9">9</a>]. The ontology, as an abstract model, addresses the questions 'what is (and isn't) an identifier?' and 'what does an identifier management system do?'. This more abstract view also brings clarity to the ongoing conversation of whether URIs can be (and should be) universal persistent identifiers.</p>
<h2 id="Identifier_Model">Identifier Model</h2>
<p>For the identifier model to be abstract, it cannot commit to a particular information model. The notion of an identifier depends crucially on the understanding that an identifier only identifies one distinct thing. But different domains will have different understandings of what things are distinct from each other, and what can legitimately count as a single thing. (This includes aggregations of objects, and different versions or snapshots of objects.) In order for the abstract identifier model to be applicable to all those domains, it cannot impose its own definitions of what things are distinct: it must rely on the distinctions specific to the domain.</p>
<p>This means that information modelling is a critical prerequisite to introducing identifiers to a domain, as we discuss elsewhere [<a href="#10">10</a>]: identifier users should be able to tell whether any changes in a thing's content, presentation, or location mean it is no longer identified by the same identifier (i.e. whether the identifier is restricted to a particular version, format, or copy).</p>
<p>The abstract identifier model also cannot commit to any particular protocols or service models. In fact, the abstract identifier model should not even presume the Internet as a medium. A sufficiently abstract model of identifiers should apply just as much to URLs as it does to ISBNs, or names of sheep; the model should not be inherently digital, in order to avoid restricting our understanding of identifiers to the current state of digital technologies.
This means that our model of identifiers comes close to the understanding in semiotics of signs, as our definitions below make clear.</p>
<p>There are two important distinctions between digital identifiers and other signs which we needed to capture. First, identifiers are managed through some system, in order to guarantee the stability of certain properties of the identifier. This is different to other signs, whose meaning is constantly renegotiated in a community. Those identifier properties requiring guarantees include the accountability and persistence of various facets of the identifier, most crucially what is being identified. For digital identifiers, the <strong>identifier management system</strong> involves registries, accessed through defined services. An HTTP server, a PURL [<a href="#11">11</a>] registry, and an XRI registry are all instances of identifier management systems.</p>
<p>Second, digital identifiers are straightforwardly <strong>actionable</strong>: actions can be made to happen in connection with the identifier. Those actions involve interacting with computers, rather than other people: the computer consistently does what the system specifies is to be done with the identifier, and has no latitude for subjective interpretation. This is in contrast with human language, which can involve complex processes of interpretation, and where there can be considerable disconnect between what a speaker intends and how a listener reacts. Because the interactions involved are much simpler, the model can concentrate on two actions which are core to digital identifiers, but which are only part of the picture in human communication: working out what is being identified (<em>resolution</em>), and accessing a representation of what is identified (<em>retrieval</em>).</p>
<p>So to model managing and acting on digital identifiers, we need a concept of things that can be identified, names for things, and the relations between them. (Semiotics already gives us such concepts.) We also need a model of the systems through which identifiers are managed and acted on; what those systems do, and who requests them to do so; and what aspects of identifiers the systems manage.</p>
<p>Our identifier model (as an ontology) thus encompasses:</p>
<ul>
<li><strong>Entities</strong>, including actors and identifier systems;</li>
<li><strong>Relations</strong> between entities;</li>
<li><strong>Qualities</strong>, as desirable properties of entities (actions are typically undertaken in order to make qualities apply to entities);</li>
<li><strong>Actions</strong>, as the processes carried out on entities (and corresponding to <strong>services</strong> in implementations).</li>
</ul>
<p>An individual identifier system can be modelled using concepts from the ontology, with an identifier system model.</p>
<p>In the remainder of this article, we go through the various concepts introduced in the model under these classes. We present the concept definitions under each section, before discussing issues that arise out of them. <em>Resolution</em> and <em>Retrieval</em> are crucial actions for identifiers, whose definition involves distinct issues; they are discussed separately from other Actions.</p>

<a href="/issue58/eckes-segbert#author1">Georg Eckes</a> and <a href="/issue58/eckes-segbert#author2">Monika Segbert</a> describe a Best Practice Network funded under the eContentplus Programme of the European Commission, which is building a portal for access to film archival resources in Europe.

The European Film Gateway (EFG) [1] is one in a series of projects funded by the European Commission, under the eContentplus Programme, with the aim of contributing to the development and further enhancement of Europeana, the European digital library, museum and archive [2].

<a href="/issue57/voss#author1">Jakob Voss</a> combines OpenSearch and unAPI to enrich catalogues.

In recent years the principle of Service-oriented Architecture (SOA) has grown increasingly important in digital library systems.

<a href="/issue56/gatenby#author1">Janifer Gatenby</a> identifies criteria for determining which data in various library systems could be more beneficially shared and managed at a network level.

<a href="/issue56/tonkin#author1">Emma Tonkin</a> looks at the current landscape of persistent identifiers, describes several current services, and examines the theoretical background behind their structure and use.

What Is a Persistent Identifier, and Why?

Persistent identifiers (PIs) are simply maintainable identifiers that allow us to refer to a digital object – a file or set of files, such as an e-print (article, paper or report), an image or an installation file for a piece of software.

<a href="/issue54/allinson-et-al#author1">Julie Allinson</a>, <a href="/issue54/allinson-et-al#author2">Sebastien Francois</a> and <a href="/issue54/allinson-et-al#author3">Stuart Lewis</a> describe the JISC-funded SWORD Project which has produced a lightweight protocol for repository deposit.

<a href="/issue48/chudnov-et-al#author1">Dan Chudnov</a> and a team of colleagues describe unAPI, a tiny HTTP API for serving information objects in next-generation Web applications.

Common Web tools and techniques cannot easily manipulate library resources.

<a href="/issue46/dcc-fpw-rpt#author1">Maureen Pennock</a> reports on a two-day workshop on Future-Proofing Web Sites, organised by the Digital Curation Centre (DCC) and the Wellcome Library at the Wellcome Library, London, over 19-20 January 2006.

This DCC [1] and Wellcome Library [2] workshop sought to provide insight into ways that content creators and curators can ensure ongoing access to reliable Web sites over time.

<a href="/issue37/ecdl-web-archiving-rpt#author1">Michael Day</a> reports on the 3rd ECDL Workshop on Web Archives held in Trondheim, August 2003.

On 21 August 2003, the 3rd ECDL Workshop on Web Archives [1] [2] was held in Trondheim, Norway in association with the 7th European Conference on Digital Libraries (ECDL) [3].

<a href="/issue24/metadata#author1">Paul Miller</a> on Digital Object Identifiers.

People, places, and things are identified in any number of different ways. (See Clare McClean's report in Ariadne issue 6 for more details <a href="#REF1">[1]</a>).

<a href="/issue10/web-focus#author1">Brian Kelly</a> reports on the TALiSMAN seminar: Copyright and the Web.

In April a former colleague of mine from Leeds University sent a message to me about a strange copyright statement she had come across on the web, and asked for my comments.

<a href="/issue9/trenches#author1">Jon Knight</a> looks at how the Web is currently undergoing the sometimes painful internationalization process required if it is to live up to its name of the World Wide Web.

Performance and Security - Notes for System Administrators: <a href="/issue8/unix-security#author1">Andy Powell</a> offers some hints and tips on the performance and security aspects of running electronic library services on UNIX based machines.

The eLib Technical Concertation day last November brought together techies from many of the eLib projects.

<a href="/issue8/canberra-metadata#author1">Paul Miller</a> and <a href="/issue8/canberra-metadata#author2">Tony Gill</a> offer a view of the recent Dublin Core metadata workshop in the Australian capital, Canberra.

Continuing a long and glorious tradition, the 4th Dublin Core Workshop [1] last month went to a really nice country and picked one of the least lively settlements in which to meet.

<a href="/issue5/securing-forms#author1">Jon Knight</a> discusses some of the options available to the designers and implementors of HTML FORMs for providing authentication of users in a library environment.

There are now many HTML FORMs in use in libraries of all types.