Ian Peacock describes what a URI is. This article appears in the Web, and not the print, version of Ariadne.
Users of the Web are familiar with URLs, the Uniform Resource Locators. A URL is a locator for a network-accessible resource. Such a locator can be considered an identifier for the resource that it refers to. Depending on how identification is interpreted, various attributes of a resource could serve as an identifier for it. However, what makes a functional resource identifier depends upon the context in which that identifier will be used. For example, in a group of five people, identifying individuals by weight is unlikely to be practical. In many situations, we assign a name to an object and use this attribute as the object identifier. Such names also have to be chosen with regard to the context in which they will be used in order to be functional. To return to the example of a group of people, we may refer to a particular person by a combination of their forename and surname. This name label would probably be adequate to identify a particular person in a group of five.
The Uniform Resource Identifiers (URIs) are a set of character strings, defined by a generic URI syntax, that are used for identifying resources. A URI provides a simple and extensible means for identifying a resource that can then be used within applications. The URI specification follows the advice of several functional recommendation documents (see further information below).
URIs form a superset of three distinct groups of identifiers, which will be described further on. They are URLs, URNs and URCs.
These identifiers, and the generic URI, are formally specified in various IETF Working Drafts and RFCs.
We can consider that a resource is anything to which we can attach identity. A resource arises through a conceptual mapping to an identified entity. Such identity does not necessarily imply network (or other) accessibility. Since the mapping is conceptual, the entity itself may not be constant (e.g. a book being written changes over time) or even instantiated at any given time (e.g. the contents of a noticeboard could be empty).
Some examples of resources are listed below:
An identifier is an object that acts as a reference to something that has identity (i.e. a resource). The identifier may be used to dereference the resource if the resource is accessible. Note that at this level we have not specified that an identifier be unique.
Some examples of identifiers are listed below:
Confusingly, we sometimes identify an entity by a name that is also another attribute of the entity. For example, we could label a particular person as "Jim with brown hair".
In the context of URIs, the identifier is a set of characters conforming to the URI syntax. Restrictions on the syntax may further classify the URI as a URL, URN or URC, i.e. as a member of a particular class of identifier. Very broadly, a URN is a name (one that should be globally unique), a URC is a resource description, and a URL specifies the resource location.
The uniformity of the URI is inherited by URLs, URNs and URCs. Uniformity refers to the strict syntax to which the URI must conform. URLs, URNs and URCs must each follow a more class-specific syntax, designed to best facilitate the purpose of the class.
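As a rough illustration of what a uniform syntax allows, the sketch below assumes only the most general form, a scheme followed by a colon and a scheme-specific part, and applies one function to identifiers of quite different classes. The function name `split_uri` and the example identifiers are invented for illustration and are not taken from any specification.

```python
def split_uri(uri):
    """Split a URI into (scheme, scheme-specific part).

    Assumes the general form <scheme>:<scheme-specific-part>;
    no further validation is attempted.
    """
    scheme, colon, rest = uri.partition(":")
    if not colon or not scheme:
        raise ValueError("no scheme found in %r" % uri)
    # Scheme names are treated as case-insensitive here, so normalise them.
    return scheme.lower(), rest


# Hypothetical identifiers, chosen only to show different schemes.
examples = [
    "http://www.example.org/docs/uri.html",
    "ftp://ftp.example.org/pub/readme.txt",
    "mailto:someone@example.org",
    "urn:example:a-named-resource",
]

for uri in examples:
    print(split_uri(uri))
```

The same few lines handle every identifier in the list; anything specific to a class (such as a machine name or a path) would be parsed from the second element according to the rules of that scheme.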
Uniformity provides a number of benefits:
The best known identifier is probably the URL. A URL identifies network accessible resources by a scheme (that conventionally represents the primary access mechanism), a machine name and a "path". The path is interpreted in a manner depending on the scheme.
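The three parts described above can be picked out with Python's standard `urllib.parse` module. This is only a small sketch; the URL itself is a made-up example.

```python
from urllib.parse import urlsplit

# Hypothetical URL used only for illustration.
url = "http://www.example.org/ariadne/issue/what-is.html"

parts = urlsplit(url)

scheme = parts.scheme      # conventionally the primary access mechanism
machine = parts.hostname   # the machine name
path = parts.path          # interpreted in a manner depending on the scheme

print(scheme, machine, path)
```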
URLs have the most varied use of the URI syntax and often have a hierarchical namespace (e.g. in specifying a directory path in an HTTP scheme URL). Currently, URLs are often treated as both a name and a location for a resource. This is bad practice, since URLs may be transient and a URL defines exactly one location even though a resource may exist in multiple locations. In the larger Internet information architecture, URLs will act only as locators.
Whereas a URL identifies the location or container for an instance of a resource, a URN, in the Internet architecture, identifies the resource. The resource identified by a URN may reside in one or more locations, may move, or may not be available at a given time.
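The separation of name and location can be sketched with a toy lookup table. This is not how any real URN resolution service works; the table `LOCATIONS`, the functions `resolve` and `move`, and all of the URNs and URLs are assumptions made for illustration.

```python
# Hypothetical resolution table: URN -> currently known locations (URLs).
LOCATIONS = {
    "urn:example:annual-report": [
        "http://www.example.org/reports/annual.html",
        "http://mirror.example.net/reports/annual.html",
    ],
    # A resource that has an identity but no location at the moment.
    "urn:example:draft-in-progress": [],
}


def resolve(urn):
    """Return the URLs currently known for a URN (possibly an empty list)."""
    return list(LOCATIONS.get(urn, []))


def move(urn, old_url, new_url):
    """Record that one copy of the resource has moved; the URN is unchanged."""
    urls = LOCATIONS[urn]
    urls[urls.index(old_url)] = new_url


print(resolve("urn:example:annual-report"))
move(
    "urn:example:annual-report",
    "http://www.example.org/reports/annual.html",
    "http://www.example.org/archive/annual.html",
)
print(resolve("urn:example:annual-report"))
```

The name stays fixed while the list of locations behind it grows, shrinks or changes, which is the property a URL used as a name cannot offer.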
The URN has two practical interpretations, both for network-accessible resources. The first is as a globally unique and persistent identifier for a resource, achieved through institutional commitment. The second interpretation is as the specific "urn" scheme, which will embody the requirements for a standardised URN namespace. Such a scheme will resolve names that have a greater persistence than that currently associated with URLs. Work is still in progress on standardising this scheme.
A functional requirements standard for URNs (RFC 1737) lays down a number of properties that URNs should embody. These include global scope, global uniqueness and persistence.
A number of URN resolving services currently exist; see the applications mentioned below.
The Internet draft "URC Scenarios and Requirements" defines the URC:
The purpose or function of a URC is to provide a vehicle or structure for the representation of URIs and their associated meta-information.
Initially, URCs were envisioned as the intermediary that associated a URN with a set of URLs that could then be used to obtain a resource. Later it was decided that metadata should also be included, so that resources could be obtained conforming to a set of requirements. URCs are essentially descriptions of resources available via a network.
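Since URCs were never standardised, there is no real format to show. The sketch below is purely hypothetical: it models the idea described above as a record that ties a URN to a set of URLs, each carrying metadata, so that a location can be chosen to match a set of requirements. The class names, field names and metadata keys are all invented.

```python
from dataclasses import dataclass, field


@dataclass
class Location:
    """One URL for the resource, with metadata describing that copy."""
    url: str
    metadata: dict = field(default_factory=dict)


@dataclass
class URCRecord:
    """Hypothetical URC-like record: a URN plus its described locations."""
    urn: str
    locations: list = field(default_factory=list)

    def select(self, **requirements):
        """Return the URLs whose metadata matches every requirement."""
        return [
            loc.url
            for loc in self.locations
            if all(loc.metadata.get(k) == v for k, v in requirements.items())
        ]


record = URCRecord(
    urn="urn:example:annual-report",
    locations=[
        Location("http://www.example.org/report.html",
                 {"format": "text/html", "language": "en"}),
        Location("http://www.example.org/report.ps",
                 {"format": "application/postscript", "language": "en"}),
    ],
)

print(record.select(format="text/html"))
```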
Although work has been carried out by the IETF URC working group, URCs are still not in existence. It seems unlikely at present that URCs will become standardised.
The problems of using a locator (i.e. a URL) as a name have already been mentioned. URIs tackle addressing for the future Internet architecture. Resources will be identified by a URN, which will be resolved via a URN resolution service. Currently, it looks unlikely that URCs will have a large part to play in the process.
Official URI-related standardisation has been slow, though a number of URN resolution services now exist adhering to accepted conventions. For more details, see the TURNIP [9] pages.
A number of applications have been built around the URN concept, including: