Metadata and Interoperability in a Complex World
The 2003 Dublin Core conference, DC-2003, took place in Seattle, Washington, USA, from 28 September to 1 October. This was the eleventh Dublin Core meeting: the first eight events were categorised as 'workshops', and this was the third time it had taken the form of a 'conference' with peer-reviewed papers and posters, and a tutorial track. The 2003 conference attracted some 300 participants from over 20 countries.
The event took place in the Bell Harbor Conference Centre, located directly on the waterfront overlooking Elliott Bay on the Puget Sound. The facilities were generally excellent, with wireless Internet access available in all the meeting rooms. The organisers had thoughtfully arranged the rooms' seating so that late afternoon speakers did not have to contend with the challenge of their audience's attention wandering to the spectacular views across the bay.
Stuart Sutton, University of Washington, and Jane Greenberg, University of North Carolina at Chapel Hill, Co-Chairs of the Programme Committee, welcomed us to Seattle, and sketched out the programme for the event. Stuart described the role of Dublin Core as contributing to the 'management of a messy world', a phrase that seemed to encapsulate several of the threads that surfaced over the course of the week. Jane emphasised the varied nature of the sessions that made up the programme, particularly the six topic-based 'Special Sessions' that were intended to be less formal than the conference paper sessions and structured so as to encourage dialogue. Tutorial sessions were available on encoding DC metadata in various syntaxes; application profiles; Creative Commons; and the Faceted Application of Subject Terminology (FAST).
At events such as this, it can be quite difficult to develop a feel for underlying common themes, and perhaps my strongest impression from DC-2003 was one of just how many distinct and diverse communities are working under (or around the edges of!) the DC 'umbrella' - so the following is necessarily a personal impression of a few selected topics and themes that surfaced over the four days.
The opening plenary session was given by Mary Lee Kennedy, Director of the Knowledge Network Group at Microsoft, and concentrated on how she and her team were tackling the problem of connecting Microsoft staff with the information they needed via their corporate intranet services.
She suggested that staff should be able to:
- find the right quantity and quality of information
- determine the relevance of information and understand its context
- trust the authority of information
- find out who else has relevant knowledge
- find the same information from multiple starting points
- learn about new resources of relevance as they become available
The focus of the presentation was on the steps taken to date to develop good practices for 'information excellence'. These included both technical measures (e.g. developing a directory of intranet sites, including metadata about the audience, subject, language and lifespan of the content) and non-technical (e.g. instilling an understanding of the information lifecycle). Particular emphasis was placed on capturing and disclosing information about people and their relationships to information resources: a user may be just as likely to want a human being to talk to as a document or dataset.
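As a rough illustration of the kind of directory entry described above, the following Python sketch models one record with the four metadata facets mentioned in the presentation. The class name, field names, example URLs and the choice to represent 'lifespan' as an expiry date are all assumptions made for illustration; nothing here reflects the actual Microsoft system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SiteRecord:
    """One hypothetical entry in an intranet site directory."""
    url: str
    audience: str
    subject: str
    language: str
    expires: date  # assumption: 'lifespan' is modelled as an expiry date

    def is_current(self, today: date) -> bool:
        """True while the content is still within its stated lifespan."""
        return today <= self.expires


def find_sites(directory, subject, today):
    """Return the records on a subject whose content has not yet expired."""
    return [r for r in directory if r.subject == subject and r.is_current(today)]


# Hypothetical example data.
directory = [
    SiteRecord("http://intranet.example/hr", "all staff", "benefits", "en", date(2004, 6, 30)),
    SiteRecord("http://intranet.example/old-hr", "all staff", "benefits", "en", date(2002, 12, 31)),
]

current = find_sites(directory, "benefits", date(2003, 10, 1))
```

Recording the lifespan alongside the subject lets a search filter out stale sites instead of returning everything that matches a topic.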
Mary Lee acknowledged that at Microsoft the task was incomplete, and several challenges remained:
- The expectation that all information is managed effectively, both for reasons of productivity and for regulatory purposes, is still only partly met
- Users need information content placed in context, and contexts change rapidly
- The use of automated tools can certainly help, but the key is in how individual tools are used together
- Increasingly information must be integrated from multiple sources, both inside and outside the organisation
Neil McLean, Director, IMS Australia, revisited some of these challenges in a session later in the week. Neil sought to move the focus beyond the organisation and to place these issues in a broader context of cross-community, cross-domain interoperability. He reiterated the importance of recognising the human and cultural facets of that interoperability. He also emphasised the 'multi-dimensional' nature of an information environment in which complex relations exist between people, services and information content, and the challenge of developing shared models for that world.
He noted that we continue to face the problems that arise because different communities have different perceptions and understandings of the challenges. He acknowledged that progress was being made: he gave the example of the widespread sense of the value and usefulness of 'shared services'. However, a note of caution was in order: in practice many existing services remain community- or domain-specific, even though many are potentially more broadly useful. During subsequent discussion, it was highlighted that while many parties may wish to use shared services, they are perhaps less ready to provide the resources to develop and sustain those services! Similarly, the concept of the 'application profile' - the idea that metadata standards are necessarily localised and optimised for specific contexts - was clearly valuable and had been widely adopted. But the development of application profiles should not be interpreted as a guarantee of interoperability; in many cases there remained a tension between solving local problems and addressing the wider service environment.
He also raised the question of the cost-effectiveness and sustainability of the approaches we adopt - the tension between the need for high quality metadata and the costs involved in producing that metadata was a recurring theme throughout the conference. Neil stressed that funders expect that as researchers and implementers we demonstrate the value of their investment, and he urged us to take practical measures to ensure that (good) tools do actually make the transition from the research labs to the user communities where they are urgently required.
An 'abstract model' for Dublin Core metadata
In the first of two meetings of the Architecture WG in Seattle, Andy Powell (UKOLN, University of Bath) presented a revised, more generic, version of his draft 'Dublin Core Abstract Model' document, which had been the subject of considerable discussion during the weeks before the conference.
The document seeks to present a description of what constitutes a Dublin Core metadata description, and it does so without reference to the characteristics or constraints of any one representational form or syntax. Andy argued that we need to understand what information we are seeking to represent or encode in order to assess the relative capabilities of different syntactical forms to express that information. Furthermore, looking beyond the Dublin Core metadata community, such a model provides a firmer basis for making comparisons between DC and other metadata vocabularies - and the models within which those other vocabularies are formulated.
This work also highlights that some anomalies exist within the wording of the current definitions of DC metadata elements. A good deal more discussion will be required to move this work forward and to explore its implications - but perhaps facing that "messy world" does mean starting close to home.
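The idea of describing a record independently of any one syntax can be illustrated with a small Python sketch. It is a simplification, not the draft Abstract Model itself: a description is held as a plain list of property-value pairs and then rendered in two familiar forms, HTML meta elements and simple XML in the Dublin Core element set namespace. The function names and the enclosing 'metadata' wrapper element are assumptions made for illustration; the example values come from this article's own details.

```python
from xml.sax.saxutils import escape, quoteattr

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# A description held as (property, value) pairs, with no syntax implied.
description = [
    ("title", "Metadata and Interoperability in a Complex World"),
    ("creator", "Pete Johnston"),
    ("date", "2003-10-30"),
]


def to_html_meta(pairs):
    """Render the pairs as HTML meta elements using the 'DC.' name prefix."""
    return "\n".join(
        f'<meta name="DC.{prop}" content={quoteattr(value)}>'
        for prop, value in pairs
    )


def to_simple_xml(pairs):
    """Render the same pairs as XML; the wrapper element is hypothetical."""
    lines = [f'<metadata xmlns:dc="{DC_NS}">']
    lines += [f"  <dc:{prop}>{escape(value)}</dc:{prop}>" for prop, value in pairs]
    lines.append("</metadata>")
    return "\n".join(lines)


html_form = to_html_meta(description)
xml_form = to_simple_xml(description)
```

Because both renderings are produced from the same pairs, any information that one syntax cannot carry shows up as a difference between outputs rather than as an ambiguity in the description itself.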
Balancing cost, quality and functionality
Neil McLean emphasised that in the real world interoperability involves trade-offs. Two related issues emerged in a number of discussions: the quality of metadata required to support functional services, and the costs associated with generating metadata of that quality.
Sarah Currier (Centre for Academic Practice, University of Strathclyde) and Jessie Hey (University of Southampton) presented a well-received paper on issues of metadata quality within two communities of practice (the learning objects community and the e-Prints community). Drawing on a number of case studies, Sarah, Jessie and co-author Jane Barton argued that metadata creation is not a trivial task and that metadata quality has a serious impact on the functionality of services that can be built. Perhaps contrary to expectations that resource creators will assume responsibility for metadata creation, some mediation and control - perhaps a collaborative approach between resource creators and information specialists - may be beneficial. While post-creation processing might offer some improvement, it is unlikely to compensate fully for the impact of low quality metadata. And, most importantly, many of these issues surrounding metadata creation are under-researched and require urgent attention.
The issues of the cost and complexity of metadata creation were noted in a number of other papers. In his consideration of metadata required to support preservation, Michael Day (UKOLN, University of Bath) acknowledged the potential costs, but also the need to balance those costs against the risk of data becoming inaccessible: there were, however, means of avoiding or minimising unnecessary costs (capturing 'the right metadata', automating capture where possible and reusing existing metadata). Elaine Westbrooks emphasised that cost-effectiveness and the removal of redundancy were key imperatives for the Cornell University Geospatial Information Repository (CUGIR) initiative in the development of their approaches to generating metadata for geospatial objects.
As part of his concluding comments, Makx Dekkers, Managing Director, DCMI, emphasised cooperation between DCMI and other initiatives.
Several members of the DC community have contributed to the activity of the MMI-DC Workshop under the Information Society Standards System (ISSS) of CEN, the European Committee for Standardization, and further collaborative effort is planned for the future.
For the first time the tutorial programme at the conference included sessions by external contributors (one on Creative Commons from Mike Linksvayer and one on FAST by Ed O'Neill). DC-2003 also benefited from the fact that the IEEE Learning Technology Standards Committee (LTSC) held its meeting in Seattle at the same time, enabling some joint meetings and discussions of opportunities for collaborative work in the area of metadata for describing educational resources.
Finally, the DCMI Affiliate Programme represents a channel to develop firmer links between local communities of practice and the DCMI, with Affiliates becoming the local custodians of the global DCMI brand and assuming a role in the governance of the DCMI.
In his introduction to the conference proceedings, Stuart Weibel, former Executive Director of DCMI, expressed his hope that the conference was an event of 'significant scholarship as well as community building', and DC-2003 certainly went a long way towards meeting those twin aims. The organisation of the event was excellent, both the content and delivery of the papers were very good, and the context provided plenty of opportunities for discussion.
My own experience of DC-2003 probably emphasised the 'workshop' elements more than the 'conference' aspects, and I had a firm sense that activities in Seattle were part of an ongoing process that had existed before the conference and would continue afterwards. There was also a clear sense that Dublin Core as a metadata standard and DCMI as an organisation were situated within a complex landscape, alongside other metadata vocabularies, within the frameworks of other standards and protocols, and the communities and organisations that develop and sustain them. The challenge of ensuring that the deployment of DC is effective is only partly technological. The successful development of truly shared services depends just as much on the ability of those communities and individuals working on standards to come together to articulate, compare and debate the assumptions and expectations that underlie them.
The papers and posters from DC-2003 are available from the conference Web site.
Thanks to Michael Day, Rachel Heery and Andy Powell (UKOLN, University of Bath) for sharing their comments on sessions that I did not attend.
- DC-2003: 2003 Dublin Core Conference: Supporting Communities of Discourse and Practice: Metadata Research & Applications
- Creative Commons
- Faceted Application of Subject Terminology (FAST)
- DC Architecture Working Group
- Andy Powell, Dublin Core Abstract Model. DCMI Working Draft. 11 August 2003
- Andy Powell, An abstract model for DCMI metadata descriptions. Presentation to DC Architecture WG meeting, DC-2003. 30 September 2003
- Jane Barton, Sarah Currier & Jessie M.N. Hey. "Building Quality Assurance into Metadata Creation: an Analysis based on the Learning Object and e-Prints Communities of Practice", DC-2003
- Michael Day. "Integrating metadata schema registries with digital preservation systems to support interoperability: a proposal", DC-2003
- Elaine Westbrooks. "Efficient Distribution and Synchronization of Heterogeneous Metadata for Digital Library Management and Geospatial Information Repositories", DC-2003
- CEN MMI DC Workshop
- IEEE Learning Technology Standards Committee
Pete is a Research Officer at UKOLN, University of Bath. He is a member of the Dublin Core Advisory Board and Chair of the Dublin Core Collection Description Working Group.
Web site: http://www.ukoln.ac.uk/ukoln/staff/p.johnston/
Article Title: "Metadata and Interoperability in a Complex World"
Author: Pete Johnston
Publication Date: 30-October-2003
Publication: Ariadne Issue 37
Originating URL: http://www.ariadne.ac.uk/issue37/dc-2003-rpt/