Metadata for the Masses

Paul Miller describes Dublin Core and several ideas for how it can be implemented.

Metadata. The word is increasingly to be found bandied about amongst the Web cognoscenti, but what exactly is it, and is it something that can be of value to you and your work? This article aims to explore some of the issues involved in metadata and then, concentrating specifically upon the Dublin Core, move on to show in a non-technical fashion how metadata may be used by anyone to make their material more accessible. A collection of references at the end of the article provides pointers to some of the current work in this field.


What is metadata?

The concept of metadata predates the Web, the term having purportedly been coined by Jack Myers in the 1960s (Howe 1996) to describe datasets effectively. Metadata is data about data, and therefore provides basic information such as the author of a work, the date of creation, links to any related works, etc. One recognisable form of metadata is the card index catalogue in a library; the information on each card is metadata about a book. Perhaps without knowing it, you use metadata in your work every day, whether you are noting down the publication details of a book that you want to order, or wandering through SINES or the History Data Unit in the hope of finding a particular data set of value to your research project.

Metadata exists for almost every conceivable object or group of objects, whether stored in electronic form or not. A paper map from the Ordnance Survey of Great Britain, for example, has associated metadata such as its scale, the date of survey and date of publication. With products such as maps, the metadata is often clearly visible on the map itself, and is expressed using standard conventions that are easily interpretable by the experienced user (Miller 1995).


Figure 1: A simple example of map metadata for the Vale of York (after Miller 1996a).

In the unfathomable maze that is the Internet, things are not always so easy. Such generalised standards do not yet exist, and it can be surprisingly difficult to find the information for which you are searching. The current generation of search engines is undoubtedly powerful, and capable of returning a large number of suggestions in response to any search, but it is almost impossible to cut through the irrelevant suggestions to find the ones you are actually interested in. A search for Ariadne on Alta Vista, for example, found 5,468 references, and returned 3,000 links. On the first page of links, there was a pointer to Issue 3, but nothing else relevant to my needs turned up until the very bottom of the third page. In this case, it was fairly straightforward to distinguish between the (relevant);

 

    Ariadne: Issue 2 - Contents
    Contents Page for Issue 2. Welcome to issue 2 of Ariadne on the Web, the World Wide Web version of the magazine for the discerning UK Library and...
    http://www.ukoln.ac.uk/ariadne/issue2/contents.html - size 6K - 25 May 96

and the (irrelevant?);

    Ariadne
    Ariadne --- A further development. 9th semester in Computer Science. by: Henning Andersen. Jan M. Due. Peter D. Fabricius. Flemming Sørensen. Supervisor:..
    http://www.iesd.auc.dk/general/DS/Reports/1989/ariadneFurther.abstract.html - size 1K - 28 Jun 94
    This simple example illustrates some of the problems with finding information on the Web. It is perhaps analogous (or perhaps not!) to a paper-based list of contacts which, rather than being sorted conventionally by surname, is sorted simultaneously by the contents of every field (surname, company, street, etc). Of course, when you attempt to look up an address in this contact list, you have no way of knowing which field the result is coming from. Assuming you wish to contact our esteemed web editor to offer an article for Ariadne (hint!) and search for his surname (Kirriemuir), you don't really know whether the result you have found is really him, or part of the address of some long-forgotten relative from a small Scottish town just west of Forfar.

    To make your contact list useful, you need some metadata to describe what each string of text relates to (ie Kirriemuir is a SURNAME or Kirriemuir is a TOWN).
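The contact list idea can be sketched in a few lines of Python. This is only an illustration: the records, the field names and both function names are hypothetical, invented here to show the difference between searching unlabelled strings and searching labelled fields.

```python
# Hypothetical contact records: each string is labelled with the field it belongs to.
CONTACTS = [
    {"SURNAME": "Kirriemuir", "TOWN": "Bath"},
    {"SURNAME": "Smith", "TOWN": "Kirriemuir"},
]


def search_any_field(records, text):
    """Unlabelled search: return every record holding the text in any field."""
    return [record for record in records if text in record.values()]


def search_field(records, field, text):
    """Labelled search: return only records whose named field equals the text."""
    return [record for record in records if record.get(field) == text]


if __name__ == "__main__":
    # Without metadata, both records come back and cannot be told apart.
    print(search_any_field(CONTACTS, "Kirriemuir"))
    # With the SURNAME label, only the person is returned.
    print(search_field(CONTACTS, "SURNAME", "Kirriemuir"))
```

The labels do no more than say what each string is, yet that is enough to separate the web editor from the Scottish town.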

     

    Most applications are, of course, more complex than this, but it is at least possible to demonstrate the principles using this simple case study. How, then, are the 'experts' currently approaching the description of metadata?


    Existing approaches to metadata

    A large number of standards have evolved for describing electronic resources, but the majority are concerned with describing very specific resources, and often rely upon complicated subject-specific schemas that make either widespread adoption or easy access to these records unlikely. Rachel Heery (forthcoming) reviews some of the major metadata formats.

    In an environment such as the traditional library, where cataloguing and acquisition are the sole preserve of trained professionals, complex metadata schemes such as MARC (MAchine-Readable Cataloguing) are, perhaps, acceptable means of resource description. In the more chaotic online world, however, new resources appear all the time, often created and maintained by interested individuals rather than large centrally funded organisations. As such, it is difficult for anyone to locate information and data of value to them easily, and the large search engines - with all their faults - are often the only means by which new information may be found.

    In such an environment, there is an obvious requirement for metadata, but this metadata must be of a form suitable for interpretation both by the search engines and by human beings, and it must also be simple to create so that any web page author may easily describe the contents of their page and make it immediately both more accessible and more useful. As such, compromises must be made in order to provide as much useful information as possible to the searcher while leaving the technique simple enough to be used by the maximum number of people with the minimum degree of inconvenience.

    The expert approach

    A large number of techniques exist for the description of resources in an electronic medium, ranging from the various flavours of MARC (British Library 1980, Library of Congress 1994, Heery forthcoming) used in library cataloguing to the more specialised Directory Interchange Format (DIF) which provides metadata for satellite imagery and the like (GCMD 1996).

    Developments such as the Text Encoding Initiative (TEI) have gone a long way towards allowing a standardised description of electronic texts, and the ongoing review of the US National Spatial Data Infrastructure (NSDI) will hopefully succeed in realising a similar scheme for the complex issues involved in describing spatial data. In the United Kingdom, the provisionally named National Geospatial Database (Nanson et al 1995) is aiming to increase the integration between governmental and non-governmental spatial data holdings, and careful thought will need to be given to the construction of rational metadata schemes for this project over the next year or two.

    Each of these formats has been developed to operate within a narrowly defined field of work, and is poorly suited to the description of a wider range of resources. Many of these existing metadata schemes are also extremely complex, and are geared towards creation by experts and interpretation by computers, rather than both creation and interpretation by as wide a range of interested parties as possible.

    In cutting through the morass of existing - and often conflicting - metadata approaches, the work of eLib projects such as ROADS, ADAM et al will be well worth watching, as will the efforts of the Arts & Humanities Data Service (AHDS) to create a pan-subject metadata index that encompasses the current AHDS projects for Archaeology, History, Text and the Performing Arts, as well as any future projects. It is interesting to note that several of these projects (ADS, ADAM) have already adopted a form of Dublin Core description for at least some of their pages. As with this document, Dublin Core metadata is often stored in the <HEAD> </HEAD> area of a Web page, and may be viewed simply by selecting View... | Document Source from your Web browser's menu bar.

     

    The search engine approach

    Recognising the need for a means by which searches may be better tailored to actual user interests, a number of the current search engines have begun to include the ability to make use of the HTML <META> tag in Web documents. Alta Vista, for example, makes use of DESCRIPTION and KEYWORDS qualifiers to the <META> tag in order to index a given page. The DESCRIPTION is returned in response to a search, rather than the default (but usually far less useful) first couple of lines of text.

     

    eg

     

    <META NAME="description" CONTENT="The most useful paper on metadata ever written">
    <META NAME="keywords" CONTENT="Dublin Core, metadata">

     

    in the <HEAD> area of this document would cause Alta Vista to return the following in response to a search on any of the words stored in either DESCRIPTION or KEYWORDS;

    Metadata for the masses
    The most useful paper on metadata ever written.
    http://www.ukoln.ac.uk/ariadne/issue5/metadata-masses/ - size 51K - 9 Sept 96
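To make the mechanics concrete, the following is a minimal Python sketch of how an indexer might pull the DESCRIPTION and KEYWORDS values out of a page. It is not Alta Vista's actual method, which is not described here; the class name `MetaCollector` and the function `describe` are hypothetical, and only the standard library `html.parser` module is used.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect NAME/CONTENT pairs from <META> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def describe(html_text):
    """Return (description, list of keywords) found in the page's <META> tags."""
    collector = MetaCollector()
    collector.feed(html_text)
    collector.close()
    raw_keywords = collector.meta.get("keywords", "")
    keywords = [word.strip() for word in raw_keywords.split(",") if word.strip()]
    return collector.meta.get("description"), keywords


PAGE = """<HEAD>
<META NAME="description" CONTENT="The most useful paper on metadata ever written">
<META NAME="keywords" CONTENT="Dublin Core, metadata">
</HEAD>"""

if __name__ == "__main__":
    print(describe(PAGE))
```

A page with neither tag simply yields no description and an empty keyword list, at which point an engine would presumably fall back on the first lines of text.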

    The Dublin Core

    Notably different from many of the other metadata schemes due to its ease of use and interpretability is the so-called Dublin Core Metadata Element Set, or Dublin Core. This approach to the description of 'Document Like Objects' is still under development, and is the focus of a great deal of activity worldwide as researchers work to produce the most useful model they can, capable of describing the majority of resources available on the Internet as a whole, and suitable for inserting into a wide range of file types from the simple HyperText Markup Language (HTML) of the Web to Postscript files and other image formats (eg Knight 1996, Beckett 1996). Despite the emphasis of this, and other, papers (A.P. Miller 1996b, E. Miller 1996a, E. Miller 1996b, Weibel 1996) on the HTML implementation of Dublin Core, readers should remember that the concepts are equally applicable to virtually any other file format. In the case of this article, the HTML implementation is stressed because it is felt that this is the area in which the underlying concepts may most easily be demonstrated, and because it is in the provision of metadata for the many thousands of personal pages out on the Web that a structure such as Dublin Core may most rapidly make an impact of value to readers of Ariadne. With luck, once you have followed the examples here and filled your text web pages with Dublin Core metadata, you will then feel both sufficiently enthused and competent to further explore the references in order to add metadata to your more complex file formats.

    As Dempsey argues (1996b), Dublin Core metadata descriptions exist between the crude metadata currently employed by search engines and the complex mass of information encoded within records such as those for MARC or the Federal Geographic Data Committee (FGDC 1994).

    The Core Element Set

    The Dublin Core itself consists of thirteen core elements, each of which may be further extended by the use of SCHEME and TYPE qualifiers;

     

     

    Element Name   Element Description
    Subject        The topic addressed by the object being described
    Title          The name of the object
    Author         The person(s) primarily responsible for the intellectual content of the object
    Publisher      The agent or agency responsible for making the object available
    OtherAgent     The person(s), such as editors and transcribers, who have made other significant intellectual contributions to the work
    Date           The date of publication
    ObjectType     The genre of the object, such as novel, poem, or dictionary
    Form           The data format of the object, such as Postscript, HTML, etc
    Identifier     String or number used to uniquely identify the object
    Relation       Relationship between this and other objects
    Source         Objects, either print or electronic, from which this object is derived
    Language       Language of the intellectual content
    Coverage       The spatial locations and temporal duration characteristic of the object

    Table 1: The fields of the Dublin Core Metadata Element Set

    In creating metadata for insertion into Web pages, the HTML <META> tag is used to place the description within the page's <HEAD> </HEAD> area, as shown below;

     

    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    
    <HTML>
    <HEAD>
    
    <TITLE>Metadata for the masses</TITLE>
    
    <META NAME="package" CONTENT="(TYPE=begin) Dublin Core">
    
    <META NAME="DC.title" CONTENT="(TYPE=long) Metadata for the masses: what is it, how 
    can it help me, and how can I use it?">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#title">
    
    <META NAME="DC.title" CONTENT="(TYPE=short) Metadata for the masses">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#title">
    
    <META NAME="DC.subject" CONTENT="(SCHEME=keyword) Dublin Core, Metadata, Warwick 
    Framework, Resource Description, Resource Discovery">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#subject">
    
    <META NAME="DC.author" CONTENT="(TYPE=name) Paul Miller">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author">
    
    <META NAME="DC.author" CONTENT="(TYPE=email) A.P.Miller@newcastle.ac.uk">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author">
    
    <META NAME="DC.author" CONTENT="(TYPE=postal) University Computing Service  
    University of Newcastle  Newcastle upon Tyne  NE1 7RU  UK">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author">
    
    <META NAME="DC.author" CONTENT="(TYPE=phone) +44 191 222 8212">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author">
    <META NAME="DC.author" CONTENT="(TYPE=fax) +44 191 222 8765">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author">
    
    <META NAME="DC.author" CONTENT="(TYPE=affiliation) University of Newcastle upon 
    Tyne">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author">
    
    <META NAME="DC.author" CONTENT="(TYPE=affiliation) Archaeology Data Service">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author">
    
    <META NAME="DC.author" CONTENT="(TYPE=homepage) http://www.ncl.ac.uk/~napm1/">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author">
    
    <META NAME="DC.publisher" CONTENT="(TYPE=name) Ariadne">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#publisher">
    
    <META NAME="DC.publisher" CONTENT="(TYPE=email) ariadne@ukoln.bath.ac.uk">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#publisher">
    
    <META NAME="DC.publisher" CONTENT="(TYPE=homepage) 
    http://www.ukoln.ac.uk/ariadne/">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#publisher">
    
    <META NAME="DC.date" CONTENT="(TYPE=creation) (SCHEME=ISO31) 1996-09-02">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#date">
    <LINK REL=SCHEMA.iso31 REFERENCE="ISO 31-1:1992 Quantities & Units -- Part 1: space 
    & time">
    
    <META NAME="DC.date" CONTENT="(TYPE=current) (SCHEME=ISO31) 1996-09-09">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#date">
    <LINK REL=SCHEMA.iso31 REFERENCE="ISO 31-1:1992 Quantities & Units -- Part 1: space 
    & time">
    
    <META NAME="DC.form" CONTENT="(SCHEME=imt) text/html">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#form">
    <LINK REL=SCHEMA.imt HREF="http://sunsite.auc.dk/RFC/rfc/rfc1521.html">
    
    <META NAME="DC.identifier" CONTENT="(TYPE=url) 
    http://www.ukoln.ac.uk/ariadne/issue5/metadata-masses/">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#identifier">
    
    <META NAME="DC.relation" CONTENT="(TYPE=IsChildOf) (IDENTIFIER=url) 
    http://www.ukoln.ac.uk/ariadne/issue5/">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#relation">
    
    <META NAME="DC.language" CONTENT="(SCHEME=iso639) en">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#language">
    <LINK REL=SCHEMA.iso639 REFERENCE="ISO 639:1988 Code for the representation of names 
    of languages">
    
    <META NAME="package" CONTENT="(TYPE=end) Dublin Core">
    
    </HEAD>
    
    
    <BODY>
    ...{body of document}...
    

    In writing metadata such as this, the user may include as many of the elements from Table 1 as necessary, and each of these fields may be repeated several times in order to describe all relevant details. In the example above, elements such as Coverage and ObjectType have not been used at all, while those such as Author and Publisher have been used several times.
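Because elements may be omitted or repeated, any program reading such a description has to treat each element as a list of zero or more values. The Python sketch below shows one way this might be done, under the assumption that element names always carry the DC. prefix used in this article; the class `DublinCoreCollector` and the function `collect_dublin_core` are hypothetical names.

```python
from collections import defaultdict
from html.parser import HTMLParser


class DublinCoreCollector(HTMLParser):
    """Gather the CONTENT of every <META NAME="DC.xxx"> tag, grouped by element."""

    def __init__(self):
        super().__init__()
        self.elements = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        prefix, _, element = name.partition(".")
        # Accept any mixture of case, since case is not strictly defined.
        if prefix.upper() == "DC" and element:
            self.elements[element.lower()].append(attributes.get("content") or "")


def collect_dublin_core(html_text):
    """Return a dict mapping each element used to the list of its values."""
    collector = DublinCoreCollector()
    collector.feed(html_text)
    collector.close()
    return dict(collector.elements)


SAMPLE = """<HEAD>
<META NAME="DC.title" CONTENT="(TYPE=short) Metadata for the masses">
<META NAME="DC.author" CONTENT="(TYPE=name) Paul Miller">
<META NAME="DC.author" CONTENT="(TYPE=email) A.P.Miller@newcastle.ac.uk">
<META NAME="DC.language" CONTENT="(SCHEME=iso639) en">
</HEAD>"""

if __name__ == "__main__":
    for element, values in collect_dublin_core(SAMPLE).items():
        print(element, len(values))
```

Elements that were never used, such as Coverage, are simply absent from the result, while Author appears once with two values.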

     

    As Beckett (1996) notes, the use of case (ABC... as opposed to abc...) and whitespace (A B C... as opposed to ABC...) is not strictly defined within the Dublin Core, and may be modified to suit individual user and project requirements.

     

    While not formally part of the Dublin Core definition, a recognised 'good practice' is evolving, whereby the Dublin Core element name is given in lower case, preceded by an identifier in upper case to denote that the element is from Dublin Core (DC.author, rather than DC.AUTHOR, dc.AUTHOR, DC.Author, etc). Also, META, NAME, CONTENT, TYPE and SCHEME should be given in upper case, while the values of each should normally be given in lower case (or a mixture of the two, where proper names etc are involved).

     

    At the most basic, a Dublin Core entry coded within HTML should therefore take the form;

     

    <META NAME="DC.element name" CONTENT="value of element">

     

    eg

     

    <META NAME="DC.author" CONTENT="Paul Miller">

     

    Note the initial '<' and the final '>', as well as the use of " " to enclose the values of NAME and CONTENT.
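For anyone generating pages by program rather than by hand, the basic form might be produced as in the Python sketch below. The function name `dc_meta` is hypothetical; it applies the 'good practice' on case described above and escapes any quotation marks in the value so that the enclosing " " stay balanced.

```python
from html import escape


def dc_meta(element, value):
    """Build one <META> tag for a Dublin Core element.

    The element name is lower-cased and prefixed with an upper-case DC.,
    and characters such as the double quote and & are escaped in the value.
    """
    name = "DC." + element.strip().lower()
    return '<META NAME="{}" CONTENT="{}">'.format(
        escape(name, quote=True), escape(value, quote=True)
    )


if __name__ == "__main__":
    print(dc_meta("Author", "Paul Miller"))
```

Calling `dc_meta("Author", "Paul Miller")` reproduces the example given above.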

     

    Although undoubtedly easier for the casual viewer to understand than many metadata schemes, the Dublin Core still presents scope for ambiguity in understanding, both of the core elements themselves and in the many SCHEMEs involved in adding extra information.

     

    The solution adopted for overcoming these ambiguities is to include a reference to further information through the HTML <LINK> tag (Weibel 1996, A.P.Miller 1996b). For each occurrence of a Dublin Core element, a <LINK> is provided to the definition of that element on the Dublin Core page at http://purl.org/metadata/dublin_core_elements, and for each use of a SCHEME a link is provided to an on- or off-line definition of the syntax used within that scheme.

     

    e.g.

     

    <META NAME="DC.identifier" CONTENT="(TYPE=url) 
    http://www.ukoln.ac.uk/ariadne/issue5/metadata-masses/">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#identifier">
    

    shows a simple use of the Dublin Core element, Identifier, with a <LINK> to its definition, while

     

     

    <META NAME="DC.language" CONTENT="(SCHEME=iso639) en">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#language">
    <LINK REL=SCHEMA.iso639 REFERENCE="ISO 639:1988 Code for the representation of names 
    of languages">
    

    illustrates a use of the Dublin Core element, Language. As this example includes the use of a SCHEME, an extra <LINK> is included to a definition of this schema.

     

    A <LINK> pointer to further information may take the form of a REFERENCE to an offline source or an HREF to another web page.

     

    eg

     

    <LINK REL=SCHEMA.iso639 REFERENCE="ISO 639:1988 Code for the representation of names of languages">

     

     

    <LINK REL=SCHEMA.imt HREF="http://sunsite.auc.dk/RFC/rfc/rfc1521.html">
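The two kinds of <LINK> can be generated in the same spirit. In the Python sketch below, `element_link` and `scheme_link` are hypothetical helper names; the definitions page address is the one quoted above, and the unquoted REL value follows the style of the examples in this article.

```python
from html import escape

# Address of the Dublin Core element definitions, as given in the article.
DC_DEFINITIONS = "http://purl.org/metadata/dublin_core_elements"


def element_link(element):
    """Build the <LINK> pointing at the definition of one Dublin Core element."""
    return '<LINK REL=SCHEMA.dc HREF="{}#{}">'.format(DC_DEFINITIONS, element.lower())


def scheme_link(scheme, href=None, reference=None):
    """Build the <LINK> for a SCHEME, using HREF (online) or REFERENCE (offline)."""
    if (href is None) == (reference is None):
        raise ValueError("give exactly one of href or reference")
    if href is not None:
        attribute, target = "HREF", href
    else:
        attribute, target = "REFERENCE", reference
    return '<LINK REL=SCHEMA.{} {}="{}">'.format(
        scheme.lower(), attribute, escape(target, quote=True)
    )


if __name__ == "__main__":
    print(element_link("Language"))
    print(scheme_link("iso639", reference="ISO 639:1988 Code for the representation of names of languages"))
    print(scheme_link("imt", href="http://sunsite.auc.dk/RFC/rfc/rfc1521.html"))
```

Insisting on exactly one of the two targets is a design choice of this sketch rather than a rule stated anywhere in the Dublin Core documents.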

     

    SCHEMEs and TYPEs

    In order to better describe the resource, the basic thirteen elements may be further enhanced by the use of SCHEME and TYPE qualifiers. As special cases, OtherAgent also has a Role qualifier, and Relation an Identifier.

     

    The SCHEME qualifier identifies any widely recognised coding system used in the description of a specific Dublin Core element, and allows a degree of consistency and standardisation to be introduced to Dublin Core records. Instead of describing (in the Form element) a web page as being "a web page", "HTML" or "HyperText Markup Language", for example, it is far easier and more consistent to use the existing Internet Media Types (IMT) and describe it as "text/html". In Dublin Core's HTML syntax, this would be represented as;

     

     

    <META NAME="DC.form" CONTENT="(SCHEME=imt) text/html">

     

    and should also be provided with the necessary <LINK>s, as discussed above.

     

    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#form">
    <LINK REL=SCHEMA.imt HREF="http://sunsite.auc.dk/RFC/rfc/rfc1521.html">

     

    A SCHEME should only refer to the name of an existing coding system such as the Internet Media Type (IMT), or the International Organization for Standardization standard on dates (ISO31), and should not be used for identifying, for example, that a use of the Author element is referring to a name, e-mail address, or whatever. For tasks such as this, the TYPE qualifier should be used. This suggestion differs from that given in the most comprehensive list of SCHEMEs and TYPEs currently available (Knight & Hamilton 1996), but appears to create a more logical use of the two qualifiers.

     

    Knight & Hamilton (1996) suggest including the vast majority of qualifiers to a metadata entry within SCHEME and using TYPE in only a few cases. This author would suggest a different division, whereby only references to coding schemes appear in SCHEME and most other qualifiers appear in TYPE. As a simple rule of thumb, if a <LINK> can be included to an on- or off-line definition, then it is a SCHEME and if not, it is a TYPE. An early implementation of this model was produced by the author (1996b), and the beginnings of a second may be seen evolving at http://www.ncl.ac.uk/~napm1/ads/DC_scheme_type.html, where a comprehensive list of SCHEMEs and TYPEs will soon be available, along with guidance on usage for each.

     

    The TYPE qualifier, then, is mainly used where a Dublin Core element occurs more than once in a metadata description. You may, for example, use the Author element several times in order to provide name, address and telephone information. In a case such as this, the TYPE qualifier would be used to differentiate between each occurrence of Author.

     

    eg

     

    <META NAME="DC.author" CONTENT="(TYPE=name) Paul Miller">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author"> 

     

     

    <META NAME="DC.author" CONTENT="(TYPE=email) A.P.Miller@newcastle.ac.uk">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author"> 

     

     

    <META NAME="DC.author" CONTENT="(TYPE=postal) University Computing Service University of Newcastle Newcastle upon Tyne NE1 7RU UK">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author"> 

     

     

    <META NAME="DC.author" CONTENT="(TYPE=phone) +44 191 222 8212">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author"> 

     

     

    <META NAME="DC.author" CONTENT="(TYPE=fax) +44 191 222 8765">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author"> 

     

     

    <META NAME="DC.author" CONTENT="(TYPE=affiliation) University of Newcastle upon Tyne">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author"> 

     

     

    <META NAME="DC.author" CONTENT="(TYPE=affiliation) Archaeology Data Service">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author"> 

     

     

    <META NAME="DC.author" CONTENT="(TYPE=homepage) http://www.ncl.ac.uk/~napm1/">
    <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#author"> 

     

    Note that TYPEs and SCHEMEs may be used several times within a Dublin Core description in the same manner as the Core Elements themselves. In the example above, Affiliation appears twice in order to effectively describe the affiliations affecting work on this project.
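Since the qualifiers travel inside the CONTENT string, software reading a record has to separate them from the value itself. The Python sketch below shows one way of doing so; it assumes that qualifiers always appear as leading (NAME=value) groups, as in the examples in this article, and the names `QUALIFIER` and `split_content` are hypothetical.

```python
import re

# One leading "(NAME=value)" group, followed by optional whitespace.
QUALIFIER = re.compile(r"\(\s*([A-Za-z0-9]+)\s*=\s*([^)]*?)\s*\)\s*")


def split_content(content):
    """Split a CONTENT string into (dict of qualifiers, remaining value)."""
    qualifiers = {}
    position = 0
    while True:
        match = QUALIFIER.match(content, position)
        if match is None:
            break
        # Qualifier names are stored in upper case, values are left untouched.
        qualifiers[match.group(1).upper()] = match.group(2)
        position = match.end()
    return qualifiers, content[position:].strip()


if __name__ == "__main__":
    print(split_content("(TYPE=creation) (SCHEME=ISO31) 1996-09-02"))
    print(split_content("(TYPE=affiliation) Archaeology Data Service"))
```

A CONTENT string with no qualifiers at all comes back with an empty dictionary and the whole string as its value.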

     


    Extending the Dublin Core

    Even with the great flexibility afforded by SCHEMEs and TYPEs, the thirteen elements of the Dublin Core are not capable of describing all eventualities. If the core element set were extended in order to attempt this, it would rapidly become large and unwieldy, and ultimately one of the incomprehensibly complex metadata schemes that Dublin Core was created to avoid.

     

    The currently held view of Dublin Core is that it should not be directly extended itself, but that any necessary extensions should be included in a separate 'package', as proposed in the Warwick Framework (Lagoze et al 1996). Descriptions stored within this new 'package' may then either be from a totally different metadata scheme, such as DIF or FGDC, or they may be simple extensions to the thirteen Dublin Core elements, and described in a Dublin Core-like syntax.

    In the same way as the package of metadata known as the Dublin Core is enclosed within

     

    <META NAME="package" CONTENT="(TYPE=begin) Dublin Core">
    
    ...
    
    <META NAME="package" CONTENT="(TYPE=end) Dublin Core">
    

    so should any other package of metadata be denoted. Where the metadata scheme used is Dublin Core-like in syntax, a form for element names similar to the SCHEME.element name (eg DC.author) of Dublin Core should also be used.

     

    eg

     

    <META NAME="package" CONTENT="(TYPE=begin) Dublin Core"> 

     

     

    ...Dublin Core metadata in here... 

     

     

    <META NAME="package" CONTENT="(TYPE=end) Dublin Core"> 

     

     

    <META NAME="package" CONTENT="(TYPE=begin) ahdsDescriptor"> 

     

     

    <META NAME="AD.precision" CONTENT="(TYPE=spatial) (TYPE2=recorded) 2">
    <LINK REL=SCHEMA.ad HREF="http://www.ncl.ac.uk/~napm1/ads/ahds_descriptor_elements#precision"> 

     

     

    <META NAME="package" CONTENT="(TYPE=end) ahdsDescriptor"> 
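A reader of such a page needs to know which package each entry belongs to. The Python sketch below groups a sequence of NAME/CONTENT pairs by their enclosing begin and end markers. It assumes that packages follow one another and are never nested, which is a simplification made for this illustration rather than anything required by the Warwick Framework; `MARKER` and `split_packages` are hypothetical names.

```python
import re

# Matches the CONTENT of a package marker, eg "(TYPE=begin) Dublin Core".
MARKER = re.compile(r"\(TYPE=(begin|end)\)\s*(.+)", re.IGNORECASE)


def split_packages(meta_pairs):
    """Group (NAME, CONTENT) pairs by the package that encloses them."""
    packages = {}
    current = None
    for name, content in meta_pairs:
        if name.lower() == "package":
            marker = MARKER.match(content.strip())
            if marker is None:
                raise ValueError("unrecognised package marker: " + content)
            kind = marker.group(1).lower()
            label = marker.group(2).strip()
            if kind == "begin":
                if current is not None:
                    raise ValueError("nested package: " + label)
                current = label
                packages.setdefault(label, [])
            else:
                if label != current:
                    raise ValueError("unexpected end of package: " + label)
                current = None
        elif current is not None:
            # Entries outside any package are ignored in this sketch.
            packages[current].append((name, content))
    return packages


PAIRS = [
    ("package", "(TYPE=begin) Dublin Core"),
    ("DC.author", "(TYPE=name) Paul Miller"),
    ("package", "(TYPE=end) Dublin Core"),
    ("package", "(TYPE=begin) ahdsDescriptor"),
    ("AD.precision", "(TYPE=spatial) (TYPE2=recorded) 2"),
    ("package", "(TYPE=end) ahdsDescriptor"),
]

if __name__ == "__main__":
    for label, entries in split_packages(PAIRS).items():
        print(label, entries)
```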

     


    What the future holds...

    Given rapid changes both in metadata and in the Web itself, it is difficult to predict exactly what the future holds, but for the Web/HTML version of Dublin Core described here to be most useful, the following developments need to be pursued:

     

    HTML

    The current practice of inserting Dublin Core metadata within HTML's <META> tag certainly works, but enhancements to the existing definition of this tag should be encouraged in order to enable more legible representations whereby the current

     

    <META NAME="DC.author" CONTENT="(TYPE=email) A.P.Miller@newcastle.ac.uk">

     

    might be replaced by

     

    <META NAME = "DC.author"
          TYPE = "email"
          CONTENT = "A.P.Miller@newcastle.ac.uk">
    

    Whilst the latter form is accepted by the current generation of Web browser, it breaks the Document Type Definition (DTD) for HTML, and therefore does not pass the majority of HTML validation tools currently used by Web authors.
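Should such an enhancement ever be adopted, existing descriptions could be converted mechanically. The Python sketch below rewrites the current form into the proposed one, on the assumption that the TYPE qualifier is the first thing in the CONTENT string; the function name `to_proposed_form` is hypothetical, and the output is written on a single line rather than three.

```python
import re
from html import escape

# A leading "(TYPE=...)" qualifier followed by the value proper.
TYPE_PREFIX = re.compile(r"^\(TYPE=([^)]+)\)\s*(.*)$", re.DOTALL)


def to_proposed_form(name, content):
    """Rewrite NAME/CONTENT so that a leading TYPE becomes its own attribute."""
    match = TYPE_PREFIX.match(content)
    if match is None:
        # No TYPE qualifier: the tag is left in its basic form.
        return '<META NAME="{}" CONTENT="{}">'.format(name, escape(content, quote=True))
    return '<META NAME="{}" TYPE="{}" CONTENT="{}">'.format(
        name, escape(match.group(1), quote=True), escape(match.group(2), quote=True)
    )


if __name__ == "__main__":
    print(to_proposed_form("DC.author", "(TYPE=email) A.P.Miller@newcastle.ac.uk"))
```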

    Metadata creation

    At present, although tools exist for the creation of metadata conforming to some of the more complex schemes, Dublin Core-style metadata must be entered by hand. Work is currently underway within projects such as the European-funded DESIRE (McDonald pers comm) to investigate means by which much of this metadata creation may be automated (McDonald 1996). Such automation will undoubtedly make the creation and upkeep of useful metadata more straightforward, and therefore hopefully more commonplace.

     

    Search Engines

    As discussed above, many of the web search engines allow the inclusion of limited metadata within the <HEAD> </HEAD> area, but this metadata is only fully used if it is in the syntax recommended for that particular engine. While representatives of several of the search engine producing companies are involved in Dublin Core development, none has yet modified their software to make full use of Dublin Core-compliant web pages. Such a development cannot be far off.


    Conclusion

    The world of digital metadata is a complex one, currently in a state of rapid flux. As I sit in sunny Newcastle typing the last of this paper, e-mail messages continue to arrive from various lists that threaten to force a rethink of my ideas. With deadlines looming, and demonstrating a remarkable degree of willpower, I ignore these latest ideas in order to actually get this article finished in time.

    As such, it is impossible to say that the implementation of Dublin Core demonstrated here is exactly the one that will be recommended six months down the road, but given all the hard work that has gone into deriving the current offering any evolution is likely to be slight. The next stage is to continue exploring different uses of the Dublin Core idea, and to approach standards bodies with a view to ratifying something in the near future.

    Ariadne readers are exactly the type of people for whom Dublin Core could offer so much, so it would be extremely useful if you could begin to implement Dublin Core metadata in your web pages, and report back on any shortcomings that you discover. If you start now, you'll be a part of a growing and exciting trend, whereby all the data available out on the Web might actually become information, and therefore of use to the wider community.


    A selection of useful references

    Not all of these references are actually cited in the article, but they do form a useful introduction to some of the issues behind the use of metadata in resource description.

    Beckett, D.J., 1996, Using Dublin Core Metadata, Draft 0.1, URL: http://www.hensa.ac.uk/pub/metadata/dc-encoding.html.

    Beckett, D.J., Knight, J., Miller, E. & Miller, A.P., forthcoming, A guide to implementation of the Dublin Core Element Set, URL: http://www.ncl.ac.uk/~napm1/ads/DC_implementation.html.

    British Library, 1980, UKMARC manual 2nd edition, British Library: London.

    Burnard, L., Miller, E., Quin, L. & Sperberg-McQueen, C.M., 1996, A syntax for Dublin Core metadata: Recommendations from the second metadata workshop, URL: http://info.ox.ac.uk/~lou/wip/metadata.syntax.html.

    Day, M., 1996, The UKOLN Metadata page, URL: http://www.ukoln.ac.uk/metadata/.

    Dempsey, L., 1996a, Meta Detectors, Ariadne 3, URL: http://www.ukoln.ac.uk/ariadne/issue3/metadata/.

    Dempsey, L., 1996b, ROADS to Desire: Some UK and Other European Metadata and Resource Discovery Projects, D-Lib Magazine July/August, URL: http://www.ukoln.ac.uk/dlib/dlib/july96/07dempsey.html.

    Dempsey, L. & Weibel, S.L., 1996, The Warwick Metadata workshop: a framework for the deployment of resource description, D-Lib Magazine July/August, URL: http://www.ukoln.ac.uk/dlib/dlib/july96/07weibel.html.

    FGDC, 1994, Content standards for digital geospatial metadata, Federal Geographic Data Committee, 8 June.

    Global Change Master Directory, 1996, Directory Interchange Format (DIF) Writer's Guide, version 5, URL: http://gcmd.gsfc.nasa.gov/difguide/difman.html.

    Greenstein, D., 1996, AHDS: Arts & Humanities Data Service, Ariadne 4, URL: http://www.ukoln.ac.uk/ariadne/issue4/ahds/.

    Heery, R., forthcoming, Review of metadata formats, Program 30/4.

    Howe, D., 1996, Free on-line Dictionary of Computing (FOLDOC), URL: http://wombat.doc.ic.ac.uk/.

    IFLA, 1996, Metadata Resources, URL: http://www.nlc-bnc.ca/ifla/II/metadata.htm.

    Knight, J., 1996, MIME implementation for the Warwick Framework, URL: http://www.roads.lut.ac.uk/MIME-WF.html.

    Knight, J. & Hamilton, M., 1996, Dublin Core sub-elements, URL: http://www.roads.lut.ac.uk/Metadata/DC-SubElements.html.

    Lagoze, C., Lynch, C.A. & Daniel Jnr., R., 1996, The Warwick Framework: a container architecture for aggregating sets of metadata, URL: http://cs-tr.cs.cornell.edu:80/Dienst/UI/2.0/Describe/ncstrl.cornell%2fTR96-1593?abstract=warwick.

    Library of Congress, 1994, USMARC format for bibliographic data including guidelines for content designation, Network Development and MARC Standards Office, Library of Congress: Washington DC.

    Lock, G. & Stancic, Z., (Eds.), 1995, Archaeology and Geographical Information Systems: A European Perspective, Taylor & Francis: London.

    McDonald, T., 1996, Welcome to the metatest site, URL: http://www.netskills.ac.uk/staff/mcdonald/metadata/index.html.

    Medyckyj-Scott, D., Newman, I., Ruggles, C. & Walker, D., 1991, Metadata in the Geosciences, Group D Publications, Ltd: Loughborough.

    Miller, A.P., 1995, How to look good and influence people: thoughts on the design and interpretation of an archaeological GIS, in Lock & Stancic, (Eds.), pp. 319-333.

    Miller, A.P., 1996a, The York Archaeological Assessment: an investigation of techniques for urban deposit modelling utilising Geographic Information Systems, unpublished doctoral thesis, Department of Archaeology, University of York.

    Miller, A.P., 1996b, An application of Dublin Core from the Archaeology Data Service, URL: http://www.ncl.ac.uk/~napm1/ads/metadata.html.

    Miller, E.J., 1996a, An approach for packaging Dublin Core metadata in HTML 2.0, URL: http://www.oclc.org:5046/~emiller/publications/metadata/minimal.html.

    Miller, E.J., 1996b, Issues of document description in HTML, URL: http://www.oclc.org:5046/~emiller/publications/metadata/issues.html.

    Nanson, B., Smith, N. & Davey, A., 1995, What is the British National Geospatial Database?, AGI'95 Conference Proceedings, pp. 1.4.1-1.4.5. Reproduced on the WWW at URL: http://www.ordsvy.gov.uk/osinfo/general/agi95/nansmit.html.

    Richards, J.D., 1996, The Archaeology Data Service, URL: http://intarch.ac.uk/ahds/welcome.html.

    Schwartz, M., 1996, Report of the W3C Distributed Indexing/Searching workshop, URL: http://www.w3.org/pub/WWW/Search/9605-Indexing-Workshop/.

    Weibel, S., 1996, A proposed convention for embedding metadata in HTML, URL: http://www.oclc.org:5046/~weibel/html-meta.html.

    Weibel, S., Godby, J., Miller, E. & Daniel, R., 1995, OCLC/NCSA Metadata Workshop Report, URL: http://www.oclc.org:5047/oclc/research/publications/weibel/metadata/dublin_core_report.html.

    Weibel, S. & Miller, E., 1996, Dublin Core Metadata Element Set WWW homepage, URL: http://purl.org/metadata/dublin_core.


    Acknowledgements

    Thanks are due to more people than I can sensibly mention here, so I'll just settle for thanking the global metadata community (!) for their continuing hard work in this field, and hope that I haven't managed to misrepresent too many of the ideas currently being discussed.

    A special mention is also due to Tony Gill of eLib's ADAM project, who is responsible for designing the informal Dublin Core logo that appears below. Maybe it's time we formally adopted it...?

    And finally thanks to the AHDS' Archaeology Data Service (ADS), my involvement in which finally gave me the necessary kick (or excuse?) to make me take a good look at issues which I'd always sort of thought about, but rarely elucidated...

    Date published: 19 September 1996

    This article has been published under copyright; please see our access terms and copyright guidance regarding use of content from this article. See also our explanations of how to cite Ariadne articles for examples of bibliographic format.

    How to cite this article

    Paul Miller. "Metadata for the Masses". September 1996, Ariadne Issue 5 http://www.ariadne.ac.uk/issue5/metadata-masses/

