You may recall in Issue 61 an open and sincere investigation by Michael Kennedy into his views of the wider involvement of non-professionals in the generation of information for archival entries. This article is based on a presentation given at the Innovations in Reference Management workshop, January 2010. They generally comprise resources that can be neatly and discretely bound in the covers of a book or journal, or their electronic analogues, like the Portable Document Format (PDF): objects in established library or database systems, with ISBNs and ISSNs underwritten by the authority of formal publication and legal deposit.

Yet, increasingly, native Web resources are also becoming eminently citable, and managing both the resources, and references to them, is an ongoing challenge. Moreover, the issues associated with referencing this kind of material have received comparatively little attention, beyond introducing the convention that includes the URL and the date it was accessed in bibliographies. While it may be hard to quantify the "average lifespan of a web page" [1], what is undeniable is that Web resources are highly volatile and prone to deletion or amendment without warning.

Web Preservation is one field of endeavour which attempts to counter the Web's transient tendency, and a variety of approaches continue to be explored. The aim of this article is to convey the fairly simple message that many themes and concerns of Web preservation are equally relevant in the quest for effective reference management in academic research, particularly given the rate at which our dependence on Web-delivered resources is growing.

Digital preservation is, naturally, a strong theme in the work of the University of London Computer Centre (ULCC)'s Digital Archives Department, and Web preservation has featured particularly strongly in recent years. This article will draw upon several initiatives with which we have been involved recently. These include: the 2008 JISC Preservation of Web Resources Project (JISC-PoWR) [2], on which we worked with Brian Kelly and Marieke Guy of UKOLN; our work for the UK Web Archiving Consortium; and the ongoing JISC ArchivePress Project [3] (itself, in many ways, a sequel to JISC-PoWR).

Another perspective that I bring is as a part-time student myself, on the MSc E-Learning programme at Edinburgh University. As a consequence I have papers to read, and write, and a dissertation imminent. So for this reason too I have a stake in making it easier to keep track of information for reading lists, footnotes and bibliographies, whether with desktop tools or Web-based tools, or through features in online VLEs, databases and repositories.

An Attack on Professionalism and Scholarship? Democratising Archives and the Production of Knowledge
http://www.ariadne.ac.uk/issue62/flinn

Andrew Flinn describes some recent developments in democratising the archive and asks whether these developments really deserve to be viewed as a threat to professional and academic standards.

This article was originally delivered as a paper for the 'Archives 2.0: Shifting Dialogues Between Users and Archivists' conference organised by the University of Manchester's ESRC Centre for Research on Socio-Cultural Change (CRESC) in March 2009. The paper came at an opportune time. I was absorbed in a research project examining independent and community archival initiatives in the UK and exploring the possibilities of user- (or community-)generated and contributed content for archives and historical research [1]. Furthermore I had just received referees' comments on a proposed research project examining the potential impact of the latter developments on professional archival practice. Whilst two of the reports were very positive, one was more than a little hostile. The reviewer was scathing about the focus of the proposed research on the democratisation of knowledge production, dismissing the notion as part of a short-term political agenda that was detrimental to the idea of scholarship and one with which the archive profession should not concern itself. In particular, scorn was reserved for the idea that, in future archive catalogues, many 'voices' might be enabled 'to supplement or even supplant the single, authoritative, professional voice', an idea which was described as being, in extremis, 'a frontal attack on professionalism, standards and scholarship'.

At the time of receiving this review and considering my response, I was also beginning to write my paper for the conference and had already decided that my theme would be democratising the archive. However I realised that these comments neatly encapsulated a powerful and genuine strand of thinking within the archive profession and academia more generally, which one might loosely term 'traditional'. Although there are now many user-generated content archive and heritage projects in existence, and terms such as participatory archives, Archives 2.0 and even History 2.0 are an increasingly common part of professional discourse [2], some, perhaps many, archivists and scholars remain deeply sceptical about the need for a democratisation of the archive and of scholarship.

In the end the research project was supported by the AHRC despite the critical review and has now commenced [3]. However, in this brief article I will try to respond to this strand of thinking by first identifying what is meant by the democratisation of the archive and why advocates of such a thing believe it to be important. I will then briefly introduce two different but linked developments (independent or community archives and user- or community-generated content), which in harness with new technologies might play a role in such a democratisation, and in so doing challenge aspects of traditional archival thinking and practice. Finally I will offer a few thoughts on the shifts in our understanding of the archive and the resistance to those shifts. Ultimately, I will suggest that rather than viewing this debate as one between the expert (or the academic or the professional) and the crowd, it is in the concept of communities that the key might be found. A successful democratised and participatory archive is one which recognises that all those who come into contact with the archive (directly or indirectly), the 'community of the record', can and do affect our understanding and knowledge of that archive.

Access, management and sharing of information about research activities and researchers (who, what, when and where) lie at the heart of all these needs and driving forces for improvements.

To bring more clarity to such discussions, the PILIN Project has devised an abstract model of identifiers and identifier services, which is presented here in summary. Given such an abstract model, it is possible to compare different identifier schemes, despite variations in terminology; and policies and strategies can be formulated for persistence without committing to particular systems. The abstract model is formal and layered; in this article, we give an overview of the distinctions made in the model. This presentation is not exhaustive, but it presents some of the key concepts represented, and some of the insights that result.

The main goal of the Persistent Identifier Linking Infrastructure (PILIN) project [1] has been to scope the infrastructure necessary for a national persistent identifier service. There are a variety of approaches and technologies already on offer for persistent digital identification of objects. But true identity persistence cannot be bound to particular technologies, domain policies, or information models: any formulation of a persistent identifier strategy needs to outlast current technologies, if the identifiers are to remain persistent in the long term.

For that reason, PILIN has modelled the digital identifier space in the abstract. It has arrived at an ontology [2] and a service model [3] for digital identifiers, and for how they are used and managed, building on previous work in the identifier field [4] (including the thinking behind URI [5], DOI [6], XRI [7] and ARK [8]), as well as semiotic theory [9]. The ontology, as an abstract model, addresses the questions 'what is (and isn't) an identifier?' and 'what does an identifier management system do?'. This more abstract view also brings clarity to the ongoing conversation about whether URIs can be (and should be) universal persistent identifiers.

Identifier Model

For the identifier model to be abstract, it cannot commit to a particular information model. The notion of an identifier depends crucially on the understanding that an identifier identifies only one distinct thing. But different domains will have different understandings of what things are distinct from each other, and what can legitimately count as a single thing. (This includes aggregations of objects, and different versions or snapshots of objects.) In order for the abstract identifier model to be applicable to all those domains, it cannot impose its own definitions of what things are distinct: it must rely on the distinctions specific to the domain.

This means that information modelling is a critical prerequisite to introducing identifiers to a domain, as we discuss elsewhere [10]: identifier users should be able to tell whether any changes in a thing's content, presentation, or location mean it is no longer identified by the same identifier (i.e. whether the identifier is restricted to a particular version, format, or copy).
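The point about version, format and copy can be made concrete with a small sketch. It is not part of the PILIN model: the names `Snapshot`, `IdentityPolicy` and `fixed_facets` are hypothetical, and the three facets are simply the ones named above (content, presentation as format, and location). The assumption is that a domain declares, per identifier, which facets must stay unchanged for the identifier to keep identifying the same thing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record of a thing's state at one moment."""
    content: str
    fmt: str
    location: str


@dataclass(frozen=True)
class IdentityPolicy:
    """Hypothetical domain rule: facets whose change breaks identification."""
    fixed_facets: frozenset

    def still_identifies(self, registered: Snapshot, current: Snapshot) -> bool:
        # The identifier still applies only if every fixed facet is unchanged.
        return all(
            getattr(registered, facet) == getattr(current, facet)
            for facet in self.fixed_facets
        )


registered = Snapshot(content="draft 1", fmt="pdf", location="server-a")
moved = Snapshot(content="draft 1", fmt="pdf", location="server-b")
revised = Snapshot(content="draft 2", fmt="pdf", location="server-b")

# An identifier for the work as a whole: no facet is fixed.
work_policy = IdentityPolicy(fixed_facets=frozenset())
# An identifier restricted to one version: the content is fixed.
version_policy = IdentityPolicy(fixed_facets=frozenset({"content"}))

work_ok = work_policy.still_identifies(registered, revised)
version_ok_after_move = version_policy.still_identifies(registered, moved)
version_ok_after_edit = version_policy.still_identifies(registered, revised)
```

Under these assumptions, a move to another server leaves the version-level identifier valid, while a revision of the content does not; the work-level identifier survives both.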

The abstract identifier model also cannot commit to any particular protocols or service models. In fact, the abstract identifier model should not even presume the Internet as a medium. A sufficiently abstract model of identifiers should apply just as much to URLs as it does to ISBNs, or names of sheep; the model should not be inherently digital, in order to avoid restricting our understanding of identifiers to the current state of digital technologies. This means that our model of identifiers comes close to the understanding in semiotics of signs, as our definitions below make clear.

There are two important distinctions between digital identifiers and other signs which we needed to capture. First, identifiers are managed through some system, in order to guarantee the stability of certain properties of the identifier. This is different to other signs, whose meaning is constantly renegotiated in a community. Those identifier properties requiring guarantees include the accountability and persistence of various facets of the identifier—most crucially, what is being identified. For digital identifiers, the identifier management system involves registries, accessed through defined services. An HTTP server, a PURL [11] registry, and an XRI registry are all instances of identifier management systems.

Second, digital identifiers are straightforwardly actionable: actions can be made to happen in connection with the identifier. Those actions involve interacting with computers, rather than other people: the computer consistently does what the system specifies is to be done with the identifier, and has no latitude for subjective interpretation. This is in contrast with human language, which can involve complex processes of interpretation, and where there can be considerable disconnect between what a speaker intends and how a listener reacts. Because the interactions involved are much simpler, the model can concentrate on two actions which are core to digital identifiers, but which are only part of the picture in human communication: working out what is being identified (resolution), and accessing a representation of what is identified (retrieval).
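The difference between resolution and retrieval can be illustrated with a minimal sketch. The `Registry` class, its method names and the example identifiers are all hypothetical, chosen for illustration only; a real identifier management system would sit behind defined services rather than an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """Hypothetical in-memory identifier management system."""
    # Identifier name -> description of the thing identified.
    things: dict = field(default_factory=dict)
    # Identifier name -> a stored representation of that thing, if any.
    representations: dict = field(default_factory=dict)

    def register(self, name, thing, representation=None):
        self.things[name] = thing
        if representation is not None:
            self.representations[name] = representation

    def resolve(self, name):
        """Resolution: work out what is being identified."""
        return self.things[name]

    def retrieve(self, name):
        """Retrieval: access a representation of what is identified."""
        return self.representations[name]


registry = Registry()
# A physical thing can be identified, but has no digital representation here.
registry.register("sheep:dolly", "a particular sheep")
# A document can be both identified and retrieved.
registry.register("doc:42", "a report", representation=b"report text")

what = registry.resolve("sheep:dolly")
data = registry.retrieve("doc:42")
```

In this sketch `registry.retrieve("sheep:dolly")` raises `KeyError`: the identifier resolves, but nothing is registered to retrieve. Keeping the two lookups separate is one way to show that an identifier can be resolvable without being retrievable.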

So to model managing and acting on digital identifiers, we need a concept of things that can be identified, names for things, and the relations between them. (Semiotics already gives us such concepts.) We also need a model of the systems through which identifiers are managed and acted on; what those systems do, and who requests them to do so; and what aspects of identifiers the systems manage.

Our identifier model (as an ontology) thus encompasses:

Entities - including actors and identifier systems;
Relations between entities;
Actions, as the processes carried out on entities (and corresponding to services in implementations);
Qualities, as desirable properties of entities. Actions are typically undertaken in order to make qualities apply to entities.

An individual identifier system can be modelled using concepts from the ontology, with an identifier system model.
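As a rough illustration of how the four classes relate, the sketch below expresses them as plain data types. It is an assumption-laden simplification, not the PILIN ontology itself: the class fields, the `promotes` attribute, and the example labels (including the 'maintain registration' action) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Anything in the model: a thing, a name, an actor, an identifier system."""
    label: str


@dataclass(frozen=True)
class Relation:
    """A named link between two entities."""
    label: str
    source: Entity
    target: Entity


@dataclass(frozen=True)
class Quality:
    """A desirable property of an entity."""
    label: str


@dataclass(frozen=True)
class Action:
    """A process carried out on an entity, usually to establish qualities."""
    label: str
    acts_on: Entity
    promotes: tuple = ()  # hypothetical field: qualities the action aims at


# A toy identifier system model assembled from the four classes.
name = Entity("name 'doc:42'")
thing = Entity("a report")
system = Entity("example registry")

identifies = Relation("identifies", source=name, target=thing)
manages = Relation("manages", source=system, target=name)

persistence = Quality("persistence")
maintain = Action("maintain registration", acts_on=name, promotes=(persistence,))

promoted_labels = [quality.label for quality in maintain.promotes]
```

Here `promoted_labels` holds the single label `persistence`, reflecting the idea that an action is undertaken in order to make a quality apply to an entity.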

In the remainder of this article, we go through the various concepts introduced in the model under these classes. We present the concept definitions under each section, before discussing issues that arise out of them. Resolution and Retrieval are crucial actions for identifiers, whose definition involves distinct issues; they are discussed separately from other Actions. We briefly discuss the standing of HTTP URIs in the model at the end.