Web Magazine for Information Professionals

DC 2005

Robina Clayphan reports on the International Conference on Dublin Core and Metadata Applications: Vocabularies in Practice held at the University of Carlos III, Madrid in September 2005.

This year's DC Conference took place over four days in the excellent facilities of the University of Carlos III in Leganes, which is a few minutes' train ride south of Madrid in Spain. More than two hundred people attended, coming from 34 countries around the world. As always it was a busy conference with many parallel strands to choose between. This report therefore covers only a fraction of the papers and work presented there: it is a fairly personal selection drawn from some of the sessions I attended.

The Dublin Core Metadata Initiative (DCMI) celebrated its tenth birthday in March this year. From small beginnings in 1995, when the idea of defining a core set of elements to facilitate resource discovery on the Internet was articulated, the activity developed from a series of workshops through to annual conference status in 2001. The DCMI workshops were open events and, as awareness of the emergent standard spread, people turned up to them in increasing numbers. Their reasons for being there were not homogeneous - some were motivated to contribute to the development work, some to learn and some to share information about their particular implementation and the lessons it might have for other practitioners. The conference series was established to provide the parallel strands necessary to meet the needs of this growing community.

Padre Soler Building of the University Carlos III of Madrid

The main impact of accommodating these differing needs is that the annual DC conference usually has a lot going on at any one time. This year, for those coming to learn, a programme of Tutorials was offered at the start of each day. These covered the basics of syntax, semantics and application profiles, plus a couple more covering this year's theme, vocabularies in practice. The plenary sessions each morning opened with a keynote speaker, to set the landscape and context, followed by papers from practitioners addressing the main theme of the day. The afternoon schedule required attendees to make some tough decisions between three different strands: listening to the short papers describing projects and other metadata-related activities; participating in one of the working group meetings to help drive forward the work of DCMI; or attending one of the Special Sessions focusing on a related activity. It is an encouraging development that the number of Special Sessions has increased over the years. DCMI and other groups working in areas such as semantics, sectoral standards and metadata have recognised converging or complementary interests and sought to harmonise development where possible.

Keynotes and Papers

Diverse Vocabularies in a Common Model: Dublin Core at 10 Years

Tom Baker, DCMI Director of Specifications and Documentation

To open the substantive part of the proceedings, Tom Baker's address looked back at the evolution of the Dublin Core Metadata Element Set (DCMES), its semantics, models and vocabularies, and outlined the priorities for the future.

With the realisation in 1994 that this new information resource, the World Wide Web, was not going to go away, librarians recognised that their normal practices in resource description and discovery were not going to scale to what was coming. Something simpler was needed that could be understood and created by non-experts. Several lively, cross-sectoral workshops later, the original core 15 terms were established and DCMES took its place amongst the many established metadata formats used to improve access to resources.

Early implementations indicated a desire for greater precision in the metadata: "Not just any Date, but Date Created, not just any subject but a Library of Congress Subject Heading." So the original DC vocabulary grew into a set of 15 terms each of which could be qualified either by a refinement that narrowed the meaning of the term or by use of an encoding scheme to indicate something about the interpretation of the associated value.
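As an illustration of the two kinds of qualifier described above, the following minimal sketch models a metadata statement as a term, a value and an optional encoding scheme. The class and function names are hypothetical and the refinement table is only an illustrative subset; it is not a DCMI-supplied data structure. The `to_simple` function shows one consequence of this design: a qualified statement can be reduced to one of the 15 core terms by mapping the refinement to its parent element and discarding the scheme.

```python
from dataclasses import dataclass
from typing import Optional

# Refinement -> the core element it narrows (illustrative subset only).
REFINES = {"created": "date", "modified": "date", "abstract": "description"}


@dataclass(frozen=True)
class Statement:
    term: str                     # a core element or one of its refinements
    value: str
    scheme: Optional[str] = None  # encoding scheme, e.g. "LCSH" or "W3CDTF"


def to_simple(stmt: Statement) -> Statement:
    """Map a refinement to its parent element and drop the encoding scheme."""
    return Statement(REFINES.get(stmt.term, stmt.term), stmt.value)


# "Not just any Date, but Date Created; not just any subject but an LCSH heading."
record = [
    Statement("created", "2005-09-12", scheme="W3CDTF"),
    Statement("subject", "Metadata", scheme="LCSH"),
]
simple = [to_simple(s) for s in record]
```

A consumer that understands only the core terms can work with `simple`, while a more specialised application can keep using the qualified `record`.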

With increasing use of DC it became apparent that a small number of prescribed terms could never meet the requirements of diverse applications. Implementers were developing terms around the edges to meet the need for more precision for specialised purposes and adding local rules and guidelines. They were taking what they wanted from DC and creating what they needed to supplement it, the end result being a customised profile of DC.

These bottom-up developments have been complemented by top-down work to produce an abstract model against which such application profiles can be validated. The DCMI Abstract Model [1] was approved earlier this year and will underpin the DCMI shift in emphasis. This move will be away from the development of the underlying vocabularies to the establishment of application profiles that can be used by others and the provision of guidelines for how to make such application profiles. The profile being produced by the Collection Description working group will probably be the first to be put through the formal Usage Board review procedure.

All the foregoing results in a model of DCMI with a core vocabulary, an underlying data model and the development of application profiles. These elements reflect three identified legs of interoperability: shared semantics, a shared model and content-level agreement (for example, the same way to write a date (2005-09-12) or a name (Baker, Thomas)). This last point regarding content-level rules was of particular interest to me in the context of the DC Libraries activity described below.
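To make the idea of content-level agreement concrete, here is a small sketch that rewrites dates and personal names into the two forms quoted in the talk (2005-09-12 and "Baker, Thomas"). The function names and the list of accepted input layouts are assumptions made for this example; they are not rules published by DCMI or any cataloguing body.

```python
from datetime import datetime

# Input date layouts this sketch accepts (an assumption, not an agreed list).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d %B %Y")


def normalise_date(text: str) -> str:
    """Return the date as YYYY-MM-DD, trying each accepted layout in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")


def normalise_name(text: str) -> str:
    """Return a personal name in inverted 'Family, Given' order."""
    text = " ".join(text.split())  # collapse runs of whitespace
    if "," in text:
        family, given = (part.strip() for part in text.split(",", 1))
    else:
        # Treat the last word as the family name (a simplification).
        given, _, family = text.rpartition(" ")
    return f"{family}, {given}" if given else family
```

Treating the last word as the family name is a deliberate simplification: it fails for many real names, which is precisely why communities need explicit content rules rather than guesswork.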

From User Queries and Actions to Metadata

Ricardo Baeza-Yates, University of Chile.

The second keynote was an interesting paper with the central theme that, in the world of information, context is everything. Ricardo Baeza-Yates started with the observation that the use of filenames imposes the functions of human memory onto computers, when computers can handle remembering a lot better than we can. They can not only retrieve directly from the content but can also generate a lot of other metadata about dates and usage. Going further, some knowledge of users and their context (who and where they are, what they are doing and when) allows personalisation in information retrieval. He referred to the book "The Social Life of Information" by John Seely Brown and Paul Duguid, which I have since acquired and can commend to anyone interested in the wider context of information and its uses.

The Semantic Web in Practice

Eric Miller, W3C

Eric is very familiar with DCMI and started this final keynote presentation with enthusiastic congratulations to DCMI for what it has achieved in the last ten years. DC is now a pervasive phrase in the world of metadata and has done a lot to raise awareness of the importance of metadata in that time. He continued with an overview of the key aspects of the semantic web including: data integration across applications, organisations and community boundaries; the need to represent data in a way that is free of the application that created it so it can be reused in many services; the need to relate concepts in a clean and consistent way with the use of URIs for each term; in summary, the move from a web of human readable documents to a web of data - exposing the data that is within the documents.

To help achieve this he identified a few key collaborations for DC amongst which are: Resource Description Framework (RDF), Simple Knowledge Organisation System (SKOS) in relation to DC Subject, the representation of Functional Requirements for Bibliographic Records (FRBR) in RDF (to bring together the library world and RDF) and the need for citation information for scholarly use.

Using Dublin Core Application Profiles to Manage Diverse Metadata Developments

Robina Clayphan and Bill Oldroyd, The British Library

The paper presented by my colleague and me exemplified the concept of Dublin Core application profiles. It is a solution we have been working with at the British Library (BL) for some years now in the context of a wider resource discovery strategy. The BL has a legacy of many different systems and bibliographic formats arising from its history of incorporation of a number of separate organisations. This mix has been compounded by the output of the on-going digitisation activities that began in the more recent past - many produced before interoperability became the watchword for digital information.

In an attempt to control the proliferation of such metadata formats and to bring a degree of harmonisation to the older legacy systems, a British Library application profile (B-LAP) has been developed. Consisting largely of terms from the DC vocabulary, it is supplemented by a few terms coined locally and is documented according to the CEN DCAP Guidelines [2]. All digitisation projects are now asked to base their own metadata profiles on the B-LAP and to create additional terms only where needed. This ensures that there will always be an interoperable core of metadata to facilitate cross-searching of the library's resources.

To be useful, an application profile which is applied across an organisation's metadata requires a technical implementation that will take account of the complete database infrastructure. Using the recently developed SRU protocol in conjunction with both the B-LAP and a gateway that can translate between Z39.50 and SRU, the BL has been able to experiment with the provision of a uniform method of access across its collections.
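For readers unfamiliar with SRU, the sketch below shows roughly what a searchRetrieve request looks like: an ordinary URL carrying a CQL query and a handful of named parameters. The parameter names are those of SRU version 1.1; the base URL, the `dc.title` index and the `dc` record schema name are placeholders chosen for this example and do not describe the British Library's actual service.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, record_schema="dc", start=1, maximum=10):
    """Build an SRU 1.1 searchRetrieve URL for the given CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,              # CQL, e.g. dc.title = "dublin core"
        "startRecord": start,            # 1-based position of the first record
        "maximumRecords": maximum,       # upper limit on records returned
        "recordSchema": record_schema,   # schema the records should be returned in
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint; substitute the address of a real SRU server.
url = sru_search_url("http://sru.example.org/catalogue", 'dc.title = "dublin core"')
```

Because the request is a plain URL and the response is XML, a gateway of the kind described above can accept requests in this form and translate them into Z39.50 searches against older systems.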

Working Group Sessions

Libraries Working Group

Amongst the objectives in the Charter of this working group are the following two aspirations: to foster increased interoperability between DC-based metadata and other metadata used in libraries, and to provide a platform for feedback from other bibliographic bodies. It was therefore rewarding to spend one of the working sessions of the meeting considering how the DC community (seen as embodying an emergent standard) could best contribute to the work of the Anglo-American Cataloguing Rules (AACR) community (regarded as a traditional standard).

Earlier this year the AACR authorities had elected to reassess the principles behind the rules and develop a standard applicable to a wider set of resources than traditional library materials and designed for use in a digital environment. In this light Matthew Beacom had been invited to outline the development plans for what is now being called Resource Description and Access (RDA) [3]. Characterised as a library domain content-level agreement, this departure converges with one of the three aspects of interoperability referred to by Tom Baker and may help in identifying a way for users to share practice about how values are expressed within descriptions.

This work will be taken forward by a sub-group of DC Libraries which will participate in the review of the drafts of RDA.

Collection Description (CD) Working Group

Collection descriptions are seen as essential if meta-searching and portals are to work effectively. This working group has been very active over the past year developing an application profile to describe collections [4] and is closely aligned with the Collection Description Sub-Group of the NISO Metasearch Initiative (MI). The first three functional requirements for collection description (to discover, identify and select a collection) coincide with those defined for bibliographic records. The models then diverge, as the CD requirement is to identify the location of the collection and the services that supply access to it. This recognises the reality in the meta-searching world that more than one service can provide access to the same digital collection. A profile for describing such services is beyond the scope of the DC CD but is under development by the NISO initiative.

It is anticipated that the CD application profile will be the first to undergo the new review process proposed by the DCMI Usage Board. A few outstanding issues remain to be resolved in the coming year.

Conclusion

Through project work in partnership with other national libraries and UKOLN I learned about DC at a relatively early stage in its development. In 1999 I had the opportunity to attend what turned out to be one of the last of the workshop series and I have since been fortunate enough to attend the conferences. My principal interest in DC reflected the objectives of the initiative in that I needed to find a simple metadata set that could be applied in various digital contexts. Another aspect of my interest has been as an observer of the process: seeing an idea born in the early days of the Internet grow into a significant global activity responsible for an ISO standard. A process that started as an idea pursued by a small group of individuals has evolved into an entity with all the trappings of a formal international organisation and links with the other significant communities in the information world. DCMI still retains the open, participative methods it started out with and, although global in its reach, as demonstrated by the provenance of participants at this conference, it remains an admirable, grass-roots, collaborative initiative.

References

  1. DCMI Abstract Model http://dublincore.org/documents/abstract-model/
  2. CEN Dublin Core Application Profile Guidelines http://www.cenorm.be/sh/mmi-dc
  3. Resource Description and Access Prospectus http://www.collectionscanada.ca/jsc/rdaprospectus.html
  4. Current draft of the DC Collection Description Application Profile http://www.ukoln.ac.uk/metadata/dcmi/collection-application-profile/2005-08-25/

Author Details

Robina Clayphan
Co-ordinator of Bibliographic Standards
The British Library

Email: robina.clayphan@bl.uk
Web site: http://www.bl.uk