The development of the Dublin Core Application Profiles (DCAPs) has been closely focussed on the construction of metadata standards targeted at specific resource types, on the implicit assumption that such a metadata solution would be immediately and usefully implementable in software environments that deal with such resources. The success of an application profile would thus be an inevitable consequence of correctly describing the generalised characteristics of those resources. Yet despite the earlier success of application profiles, more recent growth in usage of the DCAPs funded by the Joint Information Systems Committee (JISC) has been slow by comparison. It has become quite clear that even the JISC DCAP with the best established user community, the Scholarly Works Application Profile (SWAP), has not so far enjoyed the expected level of implementation.
The term 'application profile' within Dublin Core was described in 2001 as follows:
"A set of metadata elements, policies, and guidelines defined for a particular application. The elements may be from one or more element sets, thus allowing a given application to meet its functional requirements by using metadata from several element sets including locally defined sets. For example, a given application might choose a subset of the Dublin Core that meets its needs, or may include elements from the Dublin Core, another element set, and several locally defined elements, all combined in a single schema. An Application profile is not complete without documentation that defines the policies and best practices appropriate to the application." 
There exist a good many application profiles designed for areas as diverse as agriculture, government documents and folklore. As many of them are application profiles of Dublin Core, the term DCAP would appear to be an appropriate designation. However, the definition of the term has more recently been extended by the Dublin Core Metadata Initiative (DCMI) to require that such an application profile be based on the Dublin Core Abstract Model (DCAM) and include a Description Set Profile (DSP).
Dublin Core Application Profiles are intended to be based upon an application model, which can be extremely simple. This article concentrates on the recent set of JISC-funded application profiles, which make use of application models based on variants of FRBR, and which follow the Singapore Framework for Dublin Core Application Profiles. While application profiles are by no means limited to repositories and can for instance be implemented in such wide-ranging software environments as Virtual Learning Environments (VLEs), Virtual Research Environments (VREs) and eAdmin, this paper focusses in the first instance on digital repositories. However, these wider areas are within the broader scope of this study and it is intended that future work will address them more specifically.
However successful such an approach may be for developing high-quality metadata standards, it has become increasingly clear that implementation in repositories is not guaranteed without a far more sophisticated understanding of what such systems can practically achieve, and by what means. There is no single answer to this question, as the aims and working practices of every repository can differ markedly from its contemporaries in other institutions. It is broadly a matter of diverse institutional policies and management, being grounded in individual circumstances and an institutional context. However, in some cases, those practices may derive historically from valid but arbitrary decisions that have been taken since the inception of the repository. In either case, this article sets out a methodology by which to assess the effects of this diversity on the particular metadata requirements of individual repositories and to examine and facilitate practical local solutions. Before any given application profile can be adopted in a particular situation, a strong case needs to be made to justify its suitability.
It is the contention of this article that application profiles, in order to be workable, living standards, need to be re-examined in their constituent parts in far greater detail than before, and that a range of implementation methods needs to be practically tested against the functional and user requirements of different software systems. This can only be achieved through a process of engagement with users, service providers (such as repository managers), technical support staff and developers. While the ideal target audience is the end-users of the repository service, it is in practice difficult to engage them in the abstract with unfamiliar, possibly complex, metadata schemas. So much of the process must inevitably be mediated through the repository managers' invaluable everyday experience in dealing directly with users, at least until the stage in the process when test interfaces, test repositories or live services can be demonstrated. In order to engage developers in the process of building and testing possible implementation methods, it is absolutely crucial to collect and present tangible evidence of user requirements. It is highly likely that practical implementation in repositories will vary greatly between individual services based on different software platforms. However, it may well be that there are other more significant factors in individual cases.
The assertion is often made that repository software environments in the UK Higher Education (HE) sector, and indeed in institutions globally, consist of the 'Big Three' open source repositories: EPrints, DSpace, and Fedora. Statistics for usage of these items of software in the UK from ROAR and OpenDOAR suggest that EPrints is most widespread with between 70 and 80 registered instances, while DSpace has approximately 35, but surprisingly Fedora has only 3 registered live instances. It is often overlooked that a number of universities have deployed Digital Commons, the main commercial competitor from BEPress, while others have DigiTool, the offering from Ex Libris. Many additionally use home-grown or less well known systems, both open source and commercial. So while it is reasonably possible to implement and demonstrate methods within the major open source platforms, it is much harder, if not impossible, to implement and demonstrate outside that scope.
Increasing numbers of universities are also seeking to deploy Current Research Information Systems (CRISs), whose remit overlaps with those of both repositories and Virtual Research Environments (VREs). As yet there are no open source CRISs or VREs available, and their practical engagement with application profiles must remain an area for future research and development.
There are both significant differences and remarkable similarities in the way that the 'Big Three' model their content, which affect their capacity to represent the JISC DCAPs. At this stage it is worth summarising the data models of these three applications, and examining their strengths and weaknesses with regard both to each other and to their capacity to support the complex data models of DCAPs. Some of the observations below are derived from the work at Symplectic in integrating their commercial CRIS with a variety of repository systems.
Fedora models 'objects' as its primary archival unit, and each object contains a Dublin Core metadata file, an RDF document describing the relationships relevant to the object, and an arbitrary number of 'datastreams', which represent any form of content. It is the most enabled in that it is capable of supporting entity-relationship (E-R) models, with some caveats. Relationships from one Fedora object to some other entity (identified by its URI) or some object literal are permitted. This means the internal relationships between the object as a whole and its parts are supported, and relationships between the object as a whole and other external entities are likewise supported. Problems arise when attempting to create fine structure inside a single Fedora object, such as datastream-to-datastream relationships. This can be worked around by including home-grown solutions to the requirements, but they are not then regarded as special by the overall Fedora framework. Fedora objects are also exclusive with regard to their owned content, in that a datastream can only 'belong' to a single object in a sense that Fedora natively understands. Any such extensions require programmatic intervention on the part of the adopter.
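The object structure described above can be pictured with a small data model. This is an illustrative sketch only and not Fedora's own API: the class names `FedoraObject`, `Datastream` and `Relationship`, the predicate string and the example URIs are all assumptions made for the purpose of illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Datastream:
    """Any form of content held by an object (hypothetical class)."""
    ds_id: str
    mime_type: str


@dataclass(frozen=True)
class Relationship:
    """An RDF-style statement whose subject is always the whole object."""
    predicate: str
    target: str  # URI of another entity, or a literal value


@dataclass
class FedoraObject:
    uri: str
    dublin_core: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)
    datastreams: dict = field(default_factory=dict)

    def add_datastream(self, ds: Datastream) -> None:
        self.datastreams[ds.ds_id] = ds

    def relate(self, predicate: str, target: str) -> None:
        # Object-to-entity and object-to-literal statements are supported.
        self.relationships.append(Relationship(predicate, target))


# Hypothetical example object with two datastreams and one relationship.
article = FedoraObject("info:example/article-1", {"title": "An example article"})
article.add_datastream(Datastream("PDF", "application/pdf"))
article.add_datastream(Datastream("SOURCE", "application/x-tex"))
article.relate("isPartOf", "info:example/collection-1")

# In this model there is no place to state "PDF is derived from SOURCE":
# the subject of every relationship is the object, not a datastream.
subjects = {article.uri for _ in article.relationships}
```

In this simplified picture, any datastream-to-datastream statement would have to be stored somewhere the framework does not interpret, which is the kind of home-grown workaround mentioned above.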
DSpace calls its primary archival units 'items', and each contains some metadata in a flat schema, and a number of 'bundles' which equate approximately to FRBR 'manifestations'. These bundles then contain 'bitstreams', which are the content elements of the item. DSpace is a little more flexible in terms of the shared internal structure of the items in the archive, but less flexible in terms of formal support for E-R models. Bitstreams can belong to more than one bundle, and bundles to more than one item, although the native UI tools for describing these relationships are somewhat lacking. Relationships to other entities are produced ad hoc using URIs embedded in a flat qualified Dublin Core style metadata record, and these relationships are not trivially machine-readable. There is also no built-in concept of versioning, so while it can be applied using Dublin Core fields such as 'replaces' and 'isReplacedBy', such fields do not have any impact on the behaviour of the rest of the application: it will not offer the user navigable version chains, or index only the most recent version of the content.
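To illustrate why version fields held in a flat record have no effect by themselves, the following sketch shows the kind of additional code an adopter would need in order to follow a version chain. It is a minimal sketch under stated assumptions: records are represented as plain dictionaries, and the field names `dc.relation.replaces` and `dc.relation.isreplacedby`, the identifiers and the function `latest_version` are chosen for illustration rather than taken from any particular installation.

```python
def latest_version(records: dict, start: str) -> str:
    """Follow 'isReplacedBy' links from `start` to the newest record."""
    seen = set()
    current = start
    while current not in seen:
        seen.add(current)
        successor = records[current].get("dc.relation.isreplacedby")
        if successor is None or successor not in records:
            return current  # end of the chain, or a dangling link
        current = successor
    raise ValueError(f"version chain containing {start!r} loops")


# Hypothetical flat records keyed by identifier.
records = {
    "hdl:example/1": {
        "dc.title": "Draft",
        "dc.relation.isreplacedby": "hdl:example/2",
    },
    "hdl:example/2": {
        "dc.title": "Revised",
        "dc.relation.replaces": "hdl:example/1",
        "dc.relation.isreplacedby": "hdl:example/3",
    },
    "hdl:example/3": {
        "dc.title": "Final",
        "dc.relation.replaces": "hdl:example/2",
    },
}

newest = latest_version(records, "hdl:example/1")
```

A search index that should return only the most recent version would need to apply a function of this kind to every record, since the fields themselves are simply stored values.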
EPrints has a data model perhaps unremarkably similar to that of DSpace; they both grew from similar use cases: attempting to fill the Open Access and Institutional Repository space; they even share a common original software developer. EPrints archival units are called 'eprints', and they contain metadata in a hierarchical schema. Each eprint can contain an arbitrary number of 'documents' which again are similar to the FRBR 'manifestation', such that each document may contain a number of files. The key difference from DSpace is that documents can only be associated with a single eprint, the same limitation from which Fedora also suffers. Again, entity relationships of most kinds are done ad hoc using URIs embedded in the hierarchical metadata record. There is only one special kind of identifier in the standard EPrints application, which is the one which supports versioning (although it should be noted that the identifier is not a URI). The version metadata are still just metadata, but the user interface is capable of presenting a navigable version chain, and only the most recent version of the item is returned in search results.
There is ongoing work on each of these repositories, responding to user requirements, and to contributions from their communities. It is possible, for example, that DSpace 2.0 and EPrints 3.2 will provide many of the more sophisticated E-R features that are required for the DCAPs. Meanwhile, EPrints has an implementation of SWAP as an export plug-in. This alternative approach effectively treats SWAP as an exchange format rather than an integrated metadata set and structural model, and could be seen as relatively agnostic towards the future community acceptance of the application profile. At present, this is a reasonable compromise position, and is one that DSpace could reasonably follow, given the similarities of the two systems.
However, it may be too simplistic merely to argue that these architectural differences alone account for the lack of spontaneous interest on the part of repositories in implementing the DCAPs. It is not clear that the mismatch between internal data model and exposed, serialised logical model actually presents a significant barrier to implementation, a matter that deserves further investigation as part of the process outlined in this article. Although no working demonstrator has been tested, it is difficult to see any insurmountable technical barrier to this in theory. There are, after all, repository systems, largely in mainland Europe, that implement the Common European Research Information Format (CERIF), which has a far more complex data model than FRBR. It seems likely that the benefits that might be derived from the DCAPs for working repositories have not been sufficiently demonstrated in order to justify the effort involved in overcoming these technical difficulties. If those benefits were evident, then drivers from the user communities themselves would compel the implementation. If the benefits are less clearly obvious to the repository communities, it may be the case that simultaneous implementation should be funded and coordinated in a similar way to the SWORD standard, which enjoys widespread adoption and implementation across all major repository platforms.
In practical implementation within a repository context, it is evidently possible to use the metadata elements from these DCAPs even where the data model of the repository does not support FRBR, which would lead to some level of semantic interoperability despite the structural incompatibilities. Of course, the suitability of such elements in pre-existing DCAPs should properly be a matter for thorough usability testing. Yet the use of metadata elements from the DCAPs without the structural components has only to date been explored in the WRAP repository in Warwick, at the time based on EPrints 3.0, whose SWAP implementation was reported as one of the less successful aspects of the repository. As its funding by the JISC was partially dedicated to demonstrating SWAP, it provides no independent evidence that the repository community feels either anxious or able to implement the application profile for the benefits it might offer them.
This impasse has become a barrier to progress with the recently developed set of DCAPs. The case has not been made for the practical benefits of implementing them, nor is there any clear demonstration of how individual repositories could do so either at the data-model or interface levels. In repositories whose digital objects are at present incompatible with a structured data model based on a complex entity-relationship model, it would either be necessary to provide backwards compatibility with existing records or else to develop a method for batch processing records to the new format, a process that has not been explored in most cases. In any event, both a motivation and a means need to be provided for this to happen.
In response to this situation, UKOLN has initiated a collaborative programme of user engagement and practical testing. The initiative contributes to several JISC projects, including the Application Profiles Support Project, the Information Environment Metadata Registry (IEMSR), and the Start-up and Enhancement (SuE) and Shared Information Services (SIS) projects. Discussions are under way with both the JISC and the Dublin Core Metadata Initiative (DCMI) about this proposed approach. The principal researchers at UKOLN are Talat Chaudhri, Julian Cheal, Mahendra Mahey, Emma Tonkin, and Paul Walk.
It is recognised that the resource types described by the various JISC DCAPs are far from homogeneous in nature and scope, and that solutions that may be appropriate for one may not apply to another. Whereas scholarly publications comprise a relatively narrowly defined resource type (described by SWAP), it is arguable that some resources may be far wider in scope, e.g. time-based media (TBMAP), images (IAP), teaching and learning materials (LMAP) and scientific data (SDAP). In domains with such wide variation of resource types, it is possible that a single application profile may not cover the whole domain effectively, a matter that deserves further deconstruction as part of the development of any targeted solution. The scoping studies for LMAP and SDAP have raised this as a possible concern. Another exceptional case is geospatial information (GAP), which can be attached to resources of almost any type, supporting the use of that application profile as a modular extension to other DCAPs. It is therefore essential to work together with domain experts in order to test implementation methods to assess their practical benefit to users.
Application profiles usually comprise a number of discrete layers that can be examined and tested separately. Of these, perhaps the two most significant structural components are: (1) the metadata elements as a vocabulary for describing a resource, or a series of resource or entity types; and (2) the structure that describes the relationships between different parts of that metadata, both within a resource and between multiple digital objects. The latter also encompasses how elements are distributed between different metadata entities, since this may affect how resources can be described as complex digital objects.
UKOLN is currently developing a methodology for paper prototyping in both of these areas, in addition to usability testing via rapidly developed software prototypes. This is intended as an iterative process of engagement with the community, with repository managers and, wherever appropriate, with end-users. This will provide specific evidence for user requirements, required by developers as a basis for effective implementation. There are two main, related areas of study: (1) testing established, documented application profiles to determine whether they meet the requirements that these methods establish; (2) testing the requirements for resource types at the outset of the development of prototype application profiles, in order to establish a sound methodology for the future development of similar application profiles.
It is not sufficient merely to promote standards that cannot be shown to provide useful services to end-users, an approach that rarely leads to real usage. From a technical perspective, there is a need for considerably more collaboration with developers from across the community, particularly with those of the major repository platforms in the first instance. It is likely that success will depend upon integration into ongoing software releases of the most popular software upon which live services are based.
Common approaches to user-centred design in information architecture include contextual enquiry, ethnographic methods, and card sorting. We also make use of a fourth approach developed in-house that combines elements of several approaches to explore data structures (entity-relationship models).
Conceiving information architecture in general as a user-centred process is complicated by the fact that designers must balance a number of issues, enumerated by Sinha and Boutelle as: the need to develop an understanding of user conceptual structures; the need to incorporate understanding of business goals and concerns; and the need to ensure that the design is neither quickly rendered obsolete, nor designed in too inflexible a manner to incorporate future additions of content and functionality. A simple conceptual model describes user perceptions of how an object or system operates. As abstract generalisations, conceptual models are generally difficult to explain and to understand. This is alleviated through provision of examples and scenarios, which ground the conceptual model in a practical context and enable exploration of the model through concrete examples.
The perception that a single conceptual model is shared by the designers, the developers and the users is very likely to be inaccurate. Many seemingly simple interfaces conceal a complex data model; the fact that the complexity is hidden is simply an artefact of good design practices. Paper prototyping, then, supports the exploration of the user's existing, accessible conceptual model of the area; if it can be captured, it is possible to compare and contrast it with alternative models. This area of research enables designers to establish whether the views on the resource or resource set with which the user is most comfortable, can be appropriately supported by the candidate model. It also enables them to establish the manner in which the complex model must be 'folded', or simplified, to present only that subset of characteristics that satisfy the user's data modelling needs. In short, this area of research explores cheap, simple means to elicit information from the user relating to the way in which the user would describe the resource(s) in the context of a given set of tasks; information that can feed directly into the development of candidate interfaces for software testing.
In addition, practical implementation of these major metadata components introduces further areas that also require considerable usability testing:
If metadata entities are simply presented as multiple input forms, the demands upon users for manual metadata entry are likely to be high: a factor which is often perceived to be a major disincentive to self-archiving, at least in the case of scholarly publications. Whether or not this is in fact the case deserves further study as part of the usability testing process, and may well depend on the users and specific resource type in question. For example, it is likely that more detailed metadata would need to be supplied by the depositor in the case of technical images.
It may be possible to generate automatically or programmatically deduce some of the structural metadata in order to alleviate these problems, and to develop simpler input forms to distribute the metadata into the appropriate places in the underlying metadata model without the need for the user to explore the full complexity of a given structure directly. For example, as discussed above, the more common resource types and configurations to be archived may be presented to users in the form of a greatly simplified model that closely resembles their own perception of the resource. This facilitates the task by building on users' own understanding of the domain area. The general approach of building upon a simplified description, often based on an 'interface metaphor', is a classic tactic in human-computer interaction.
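The idea of a single simple form whose values are distributed into a more complex underlying model can be sketched as follows. This is an illustration under assumptions rather than a description of any existing deposit interface: the entity names, the form field names, the mapping `FIELD_TO_ENTITY` and the function `distribute` are all hypothetical. The only deduced value here is the file format, which is guessed from the file name using Python's standard `mimetypes` module.

```python
import mimetypes

# Hypothetical mapping from simple form fields to entities in the model.
FIELD_TO_ENTITY = {
    "title": "work",
    "author": "work",
    "version": "expression",
    "language": "expression",
    "filename": "manifestation",
}


def distribute(form: dict) -> dict:
    """Place each form value in the entity that owns it."""
    entities = {"work": {}, "expression": {}, "manifestation": {}}
    for name, value in form.items():
        entities[FIELD_TO_ENTITY[name]][name] = value
    # Deduce the file format rather than asking the depositor for it.
    filename = form.get("filename")
    if filename:
        mime_type, _ = mimetypes.guess_type(filename)
        if mime_type:
            entities["manifestation"]["format"] = mime_type
    return entities


# The depositor fills in one short form and never sees the entity structure.
record = distribute({
    "title": "An example article",
    "author": "A. Depositor",
    "version": "accepted manuscript",
    "filename": "article.pdf",
})
```

Which fields belong on the simplified form, and which can safely be deduced, is exactly the kind of question that the usability testing described here would need to answer.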
In order to design such an interface, it is necessary to assess which metadata are of significance to the largest number of users, and which metadata satisfy relatively minor use cases. To achieve this, it is also necessary to establish the most common uses for the system, and to elicit information from the user community regarding commonplace models of resource, task and system function. Once this is established it becomes possible to reduce the problem to one of optimising an existing interface for use by the defined user groups: reducing the number of key strokes required to complete a deposit process; improving the learnability and the overall usability of the interface; further improving the interface through user evaluations of each prototype. These may then feed back into the optimisation process.
It is our contention that the interface design cannot be relegated entirely to a post-hoc engineering process, and that this in fact forms a valuable part of the process of evaluating the application profile itself. The process allows evidence from user evaluation of views of the data structure (essentially special-case simplifications of the general model) to be fed back into the development of the DCAP, providing a useful first glimpse of the way in which the DCAP may be applied in practical contexts. Issues ranging from the most basic, such as time taken to fill out the fields manually, to the more technical, such as evaluation of the cognitive load involved in the process of grasping the metadata model as shown to the user, are all potentially significant in evaluating the system's chances of acceptance in a real-world context of use.
Methodologies used in exploring interface design can vary widely, from methods based on efficiency evaluation such as GOMS (Goals, Operators, Methods, and Selection rules) and its derivatives to lightweight methods such as heuristic evaluation and Wizard-of-Oz walkthroughs of paper prototypes. Cost can vary depending on the level of implementation required to support the evaluation, from a minimal cost of a pure paper prototype to a modest cost for fast prototyping of a sample interface in a framework capable of supporting rapid development, such as Ruby on Rails.
The scope and purpose of various serialisation formats and their schemas, such as the Resource Description Framework (RDF), eXtensible Mark-up Language (XML), Open Archives Initiative - Object Re-use and Exchange (OAI-ORE) and Description Set Profile (DSP), must be addressed from the perspective of delivering practical functionality to users. The Singapore Framework for DCAPs requires the provision of a DSP, based on the Dublin Core Abstract Model (DCAM). To date, however, the functional benefits of the standard have not been sufficiently examined, not least because it has not been implemented in a functioning repository.
The particular technologies, serialisations and schemas that will be used to implement the JISC DCAPs may well vary between different software packages, and even between individual local deployments of that software. It is therefore sensible to seek to maximise the possible implementation methods available, so that the best method for any given software scenario can be arrived at through testing.
In recognition of the simple 'flat' data model currently in use in many repositories, notably all instances of DSpace including the current version 1.5.2 and all instances of EPrints up to 3.0, it is apparent that there exists a considerable need to consider the practical demands of backwards compatibility with such records; or else to consider how they could be batch-processed to fit a hypothetical future complex data model based on an E-R model. EPrints 3.1 does allow for relationships between digital objects, but not at present for an E-R model as complex as FRBR, on which the JISC DCAPs are based. Given that the future data models of DSpace and EPrints are speculative, and considering that the bulk of repositories in the UK HE sector are based on one of these two platforms, it seems wise to take the practical position that backwards compatibility will have to be enabled.
To resolve this, work has been undertaken at UKOLN to develop a possible means to emulate the complex data models based on FRBR that have been adopted by these DCAPs, by re-mapping the FRBR Group 1 entities. By removing the inheritance pattern from parent entities, it would become necessary to unify the Group 1 entities and thus duplicate metadata in related digital objects that would otherwise have been shared by related child entities. This would essentially be the approach that was recommended by Rachel Heery in her Digital Repositories Roadmap Review, by which FRBR is essentially removed from the data model. However, it is possible to re-factor the Group 1 entities as ordinary 'sideways' relationships between entities at the same level in related digital objects. This method would allow for backwards compatibility with records in the majority of existing repositories, but also allow for the relationships implied by FRBR to be preserved.
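One possible reading of this re-mapping can be sketched in a few lines. The sketch is an interpretation made for illustration and not the UKOLN implementation: the nested input structure, the function `flatten`, the identifier scheme and the relation names `sameWorkAs` and `sameExpressionAs` are all assumptions. Each manifestation becomes one flat record that carries a copy of the metadata it would otherwise have inherited, together with sideways links to the records with which it shares a work or an expression.

```python
def flatten(work: dict) -> list:
    """Turn one nested work into flat records linked by sideways relations."""
    records = []
    for e_idx, expression in enumerate(work["expressions"]):
        for m_idx, manifestation in enumerate(expression["manifestations"]):
            record = {"id": f"{work['id']}/{e_idx}/{m_idx}", "_expression": e_idx}
            # Metadata that child entities would have inherited is copied in.
            record.update(work["metadata"])
            record.update(expression["metadata"])
            record.update(manifestation["metadata"])
            records.append(record)
    for record in records:
        others = [r for r in records if r is not record]
        record["sameWorkAs"] = [r["id"] for r in others]
        record["sameExpressionAs"] = [
            r["id"] for r in others if r["_expression"] == record["_expression"]
        ]
    for record in records:
        del record["_expression"]  # internal bookkeeping only
    return records


# Hypothetical work with two expressions and three manifestations in total.
work = {
    "id": "w1",
    "metadata": {"title": "An example article"},
    "expressions": [
        {"metadata": {"version": "preprint"},
         "manifestations": [{"metadata": {"format": "application/pdf"}},
                            {"metadata": {"format": "text/html"}}]},
        {"metadata": {"version": "published"},
         "manifestations": [{"metadata": {"format": "application/pdf"}}]},
    ],
}

flat = flatten(work)
```

Each resulting record is an ordinary flat record of the kind existing repositories already hold, while the sideways links retain enough information to rebuild the nested structure at a later date.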
It has been acknowledged in the predecessor to this article that opinions about the appropriateness of FRBR in the repository environment differ widely, and that one of the present authors doubts that its usefulness as a model is as universal as it has often been assumed. The appropriateness of any E-R model and the relationships that it implies may be closely tied to the particular requirements of users within a particular application, and this may change as that application develops over time. It must be admitted, in all fairness, that neither view has yet been demonstrated, and that both views require testing. The E-R models that the original framers of the JISC DCAPs chose represent the status quo in the meantime. Moreover, a strength of this modular approach is that multiple entity models could potentially be supported, even concurrently, or easily replaced. It seems best to take the view that usability and suitability testing, evaluated within the real-world repository landscape, may be the best final arbiter.
In addition to providing a potential means to support FRBR-based DCAPs in existing repository architecture and working towards the gradual improvement of these records over time to comply with such complex data models, the proposed method allows the metadata elements and the entity-relationship structure to be treated as modular elements of a toolkit. It would be possible to allow for a basic level of implementation restricted to just the metadata elements, while allowing for a later upgrade to use the full structural model. Such an approach is not dissimilar to that described in the Dublin Core Interoperability Levels document, adding a level that could be described as 'structural interoperability'.
There is, however, an exchange to be made for this neat modular arrangement. The reduction in the complexity of the data model may also create unsynchronised duplication of metadata between related digital objects, whereas the inheritance model of the E-R model creates no duplication. The problem might be likely to arise in repositories that use traditional database design, e.g. EPrints and DSpace, rather than RDF triple stores, e.g. Fedora, where relationships are bidirectional and would be synchronised automatically. Even in the former hypothesis, the situation would be no worse than that in which most EPrints and DSpace repositories currently find themselves, where relationships between digital objects are often expressed purely as a URI in the dc:relation field, or in one of its qualified elements. Moreover, it would be possible to develop software that could check such relationships against a list of elements that should agree with each other, if such an entity is thus emulated, and to alert the repository manager in order to correct any errors that might have arisen. The means to upgrade to an eventual data model that natively supports entity-relationship structures would seem to be supported by such a mapping, creating the possibility of batch-processing of old records once newer releases of the software supported it.
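The checking software suggested above could take a form along the following lines. This is a minimal sketch under stated assumptions: records are plain dictionaries keyed by identifier, related records are listed in a `dc.relation` field, and the list `SHARED_ELEMENTS` of elements that should agree, like the function `find_disagreements`, is hypothetical.

```python
# Hypothetical list of elements that related records should agree on.
SHARED_ELEMENTS = ("title", "creator")


def find_disagreements(records: dict, relation: str = "dc.relation") -> list:
    """Report (record, related record, element) triples that disagree."""
    problems = []
    seen = set()
    for rec_id, record in records.items():
        for other_id in record.get(relation, []):
            other = records.get(other_id)
            pair = tuple(sorted((rec_id, other_id)))
            if other is None or pair in seen:
                continue  # skip dangling links; report each pair only once
            seen.add(pair)
            for element in SHARED_ELEMENTS:
                if record.get(element) != other.get(element):
                    problems.append((pair[0], pair[1], element))
    return problems


# Two related records whose duplicated titles have drifted apart.
records = {
    "r1": {"title": "An example article", "creator": "A. Author",
           "dc.relation": ["r2"]},
    "r2": {"title": "An example article (corrected)", "creator": "A. Author",
           "dc.relation": ["r1"]},
}

problems = find_disagreements(records)
```

The resulting list could be presented to the repository manager as a report, leaving the decision about which value is correct to a person rather than to the software.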
It must be stressed that the emulation method is offered purely as one possible tool to implement the JISC DCAPs in certain repositories that cannot support E-R models. In order to assess whether this and other methods work in practice, UKOLN aims to arrange for these implementation methods to be demonstrated in live repository instances, either in test repositories or live services as may be appropriate, preferably across the range of repository software platforms in common use. This should be considered best practice, as technical repository staff, where available, can rarely justify the development time without prior evidence of practical benefits and are required to avoid disruption to live services. By providing demonstrations compatible with those services, it is hoped that the possible benefits of implementing an appropriate DCAP in a repository environment would be substantially easier to realise than is presently the case.
It cannot be stressed enough that implementation of metadata standards should be upon the basis of their demonstrated usefulness to a community of professionals, such as repository managers, who have the responsibility for offering and maintaining live services to their institution and to the general public. Theoretical models will be of no use to such a community unless they can be given not only the means to implement them, but also the clear benefits of doing so. If the testing process is to be successful, it must engage in an ongoing dialogue with service providers, and ideally users, and proceed on the basis of the evidence gathered from any usability testing to which they contribute. This means that no standard should be immune to radical review and re-engineering should user requirements demand it.
Furthermore, the interests of users in the process need to be considered at all times, including their immediate purposes for attending events as well as their longer-term benefit from the eventual outcomes. The process needs to draw upon the hard-won experience of a wide variety of individuals in outreach projects within the community, and to serve the educational needs of service providers such as repository managers in terms of their professional experience with metadata and software issues. It is difficult to see how else they will be able to justify contributing to a programme of usability testing, if it does not offer jam today as well as tomorrow.
The aim of the iterative testing, development and user engagement effort that has been outlined here is to complement the plan for the development of DCAPs that was advanced in the Singapore Framework. The functional requirements, domain model and DSP were advanced as mandatory elements of a DCAP. It is proposed here that functional requirements are a fundamental pre-condition for the other two, and consequently they require considerable, ongoing analysis and usability testing.
It is also suggested in the Singapore Framework that usage guidelines and encoding syntax guidelines should be offered as optional elements. Ideally, of course, they should be provided wherever possible. However, this is made more complex by the variety of the circumstances in which different repositories may need to implement any particular DCAP, if it should prove its benefit to the service offered by that institution. Sufficient user documentation should however remain a core aim, and should be focussed on practical, implementable guidelines.
The lack of thoroughgoing evaluation within an interface engineering context and the lack of implementation to date should not necessarily be seen as a criticism of the current DCAPs. The effort to develop these standards was pioneering, and it was not yet fully understood that substantial time for such hands-on testing and community engagement needed to be allocated. The present initiative draws upon the experience of that work, and aims to uncover the practical benefits to software services, notably repositories, that can be derived from it.
The authors would particularly like to acknowledge the ongoing contributions and collaboration of Paul Walk, Technical Manager, UKOLN. The hard work done by the framers of the various DCAPs has made this effort possible, particularly in the case of SWAP, which has served as a precedent for the others to follow.