The World Wide Web Consortium (W3C) was set up by Tim Berners-Lee in 1994 to preserve and enhance the public utility of the Web for everyone, to "lead the Web to its full potential". It is a consortium of industrial and institutional members (around 450 at the time of writing) who pay on a sliding scale proportional to size. It produces Recommendations which are widely recognised as de facto standards. The actual work of writing those standards is carried out by Working Groups, mostly made up of representatives of members and aided by a permanent staff. At the moment there are over fifty active Working Groups, with over 700 members, working on around 100 documents at various stages of their progress towards Recommendation status. The permanent staff numbers around 60, each attached to one of the three host institutions: the Massachusetts Institute of Technology, in Cambridge, MA, USA; the European Research Consortium for Informatics and Mathematics, in Sophia Antipolis, France; and Keio University, in Tokyo, Japan.
The W3C manages its work according to a formal Process, with an emphasis on consensus and community review. The Process specifies a progression from Working Draft through Candidate Recommendation to Proposed Recommendation. The Director (currently Tim Berners-Lee) then seeks formal reviews from the membership and either approves publication as an official W3C Recommendation or returns the document to the Working Group for further work.
One of the responsibilities of the Director is to consider the architectural impact of Working Groups' output, particularly of Proposed Recommendations. As the consortium grew, and the scope of its work expanded, it became increasingly difficult for one person to bear the responsibility for articulating 'the architecture'. Working Groups needed a concrete expression of what came to be called "Web Architecture", to which they and others could refer as the basis for planning and decision making. In 2001 the membership agreed to create a Technical Architecture Group (TAG), to take on the task of identifying and documenting the architecture of the World Wide Web.
The TAG has nine members: the Director ex officio and eight others who serve two-year terms. Of these eight, three are appointed by the Director and five are elected by the W3C membership (although they need not be associated with the W3C themselves). Although the Director is nominally the Chair, in practice he delegates this responsibility to one of the appointees.
The following photograph shows the current TAG membership, with the exception of Dave Orchard of BEA Systems:
"[T]he mission of the TAG is stewardship of the Web architecture. There are three aspects to this mission:
In practice this has meant that a lot of the TAG's work has been a kind of industrial archaeology: exploring and analysing the ways in which the technologies which comprise the World Wide Web are used and abused, to try to articulate what is important and what is not, what really underpins the success of the Web so far, what is incidental and what actually threatens the success of the Web going forward.
The primary focus of the first three years of the TAG was on documenting in a clear and easily understood manner the architectural foundations of the Web. The result was published at the end of 2004 as Architecture of the World Wide Web, Volume One, often referred to as 'WebArch'. It is written in a relatively informal style, with illustrations, and many of its conclusions are expressed in succinct 'principles', 'constraints' and 'good practice notes', such as:
Principle (Global Identifiers): Global naming leads to global network effects.
Good practice (Identify with URIs): To benefit from and increase the value of the World Wide Web, agents should provide URIs as identifiers for resources.
Constraint (URIs Identify a Single Resource): Assign distinct URIs to distinct resources.
As these examples show, WebArch tries hard to address the basic issues of web architecture clearly and straightforwardly, and as a result it has proved useful not just for the Working Groups of the W3C, but for teachers, students and the general public.
A short note on terminology: The TAG distinguishes three crucial participants in the thing at the heart of the Web, that is, links:
URI: The starting point. The TAG focuses on http: URIs, for example http://weather.example.com/oaxaca.
Resource: The end point, which we say is identified by a URI. It can be anything at all.
Representation: Something that can be sent in a message, typically from a server to a client, in response to a request.
WebArch includes the following picture of the relationship between these three:
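The three-way relationship can also be sketched in a few lines of Python. This is a minimal in-memory model, not anything defined by WebArch: the class names `Resource` and `Representation`, the `WEB` registry, the `dereference` helper and the weather-report content are all hypothetical, chosen to mirror the example URI above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Representation:
    """Something that can be sent in a message: a media type plus bytes."""
    media_type: str
    data: bytes


@dataclass
class Resource:
    """The thing a URI identifies; it may have several representations."""
    description: str
    representations: dict = field(default_factory=dict)  # media type -> Representation

    def add(self, rep: Representation) -> None:
        self.representations[rep.media_type] = rep


# Hypothetical registry standing in for "the Web": URI -> Resource.
WEB = {}

oaxaca = Resource("Weather report for Oaxaca")
oaxaca.add(Representation("text/html", b"<p>Sunny, 25 C</p>"))
oaxaca.add(Representation("text/plain", b"Sunny, 25 C"))
WEB["http://weather.example.com/oaxaca"] = oaxaca


def dereference(uri: str, accept: str = "text/html") -> Representation:
    """Follow a link: look up the resource by URI, return one representation."""
    resource = WEB[uri]  # each URI identifies a single resource
    return resource.representations[accept]


rep = dereference("http://weather.example.com/oaxaca", accept="text/plain")
print(rep.media_type, rep.data.decode())
```

The point of the model is only that the URI names the resource, while what actually travels over the wire is one of possibly several representations of it.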
WebArch also distinguishes an important subclass of resources, called information resources, as those resources 'all of [whose] essential characteristics can be conveyed in a message.' Most of the URIs we browse, search for and author, identify information resources: Web pages, images, product catalogues, etc., but URIs can also be created for non-information resources, such as:
typically in the context of the Semantic Web.
Since the publication of WebArch, the TAG has been in more reactive mode, responding to requests from within and outside W3C to address issues and reconcile competing practices. Some of the issues which have been raised and addressed, usually by publishing short documents known as 'findings', are listed below:
The TAG is currently engaged with a number of issues. In some cases draft findings are available, in others things are still at the preliminary fact-finding and discussion stage. The following sections give brief summaries of these issues and where the TAG is in its consideration of them.
The TAG has been working on a number of aspects of the complex problem of versioning and extensibility for formally defined languages in general, and XML languages in particular, for over three years. The work has been both analytical -- trying to pin down what the language-evolution aspects of HTML have been and to give clear and well-grounded definitions for the relevant terminology -- and proactive, trying to identify and recommend good practice both for the schema languages which are used to define languages, and for the languages themselves.
The work is currently expressed in two draft findings:
The TAG's analysis of how the Web works, building on previous work, has identified a few key properties of the way http: URIs and HTTP combine, properties which are hugely powerful and beneficial. The TAG is therefore concerned by the number of new URI schemes (for example info:, xri:, doi:) and URN (sub-)namespaces (for example urn:nzl, urn:ietf:params:xml, urn:oasis:names:tc:ubl) being promoted for use in identifying resources on the Web, because they not only threaten to dilute that value for others, but also fail to deliver the intended benefits to their users.
Accordingly the TAG has undertaken to analyse the technical arguments most often advanced in support of new approaches to naming things on the Web, and, wherever possible, identify the ways in which these arguments misunderstand or misrepresent the properties of http:-based naming. This work, along with several extended examples, is available as a draft finding: URNs, Namespaces and Registries.
That user agents should not send passwords over the Internet in the clear, or trivially encoded, seems obvious; but formulating guidelines for user agents on when and how to warn users that they are at risk of doing so has proved surprisingly difficult. The current state of the TAG's efforts to express this can be found in a draft finding: Passwords in the Clear.
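To make 'trivially encoded' concrete: HTTP Basic authentication sends the user name and password Base64-encoded, which anyone who sees the message can reverse. The sketch below shows this, together with a hypothetical `should_warn` check of the kind a user agent might apply. The function names, the sample credentials and the warning rule itself are illustrative assumptions, not taken from the draft finding.

```python
import base64
from urllib.parse import urlsplit


def basic_auth_header(user: str, password: str) -> str:
    """Build an HTTP Basic Authorization header value (Base64, not encryption)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def recover_credentials(header_value: str) -> tuple:
    """Reverse the encoding, as any eavesdropper could."""
    token = header_value.split(" ", 1)[1]
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password


def should_warn(target_url: str) -> bool:
    """Hypothetical rule: warn unless the password would travel over https."""
    return urlsplit(target_url).scheme.lower() != "https"


header = basic_auth_header("alice", "s3cret")
print(recover_credentials(header))
print(should_warn("http://example.org/login"))
```

A real user agent has to weigh many more cases than the URI scheme alone, which is part of why the guidelines have proved hard to formulate.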
The TAG has recently begun discussions on the potential architectural impact of a proposal to introduce a new means of abbreviation for URIs, known as Compact URIs (or CURIEs, for short). No conclusions have been reached so far.
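As a rough illustration of the abbreviation mechanism: a CURIE has the form `prefix:reference`, and is expanded by concatenating the IRI bound to the prefix with the reference. The sketch below assumes a simple dictionary of prefix bindings. The `expand_curie` function and the bindings shown are hypothetical, and the proposal itself covers details (such as default prefixes and the bracketed 'safe CURIE' form) that are omitted here.

```python
def expand_curie(curie: str, prefixes: dict) -> str:
    """Expand 'prefix:reference' by concatenating the prefix's IRI and the reference."""
    prefix, sep, reference = curie.partition(":")
    if not sep:
        raise ValueError(f"not a prefixed CURIE: {curie!r}")
    if prefix not in prefixes:
        raise KeyError(f"no binding for prefix {prefix!r}")
    return prefixes[prefix] + reference


# Hypothetical prefix bindings, for illustration only.
PREFIXES = {
    "wx": "http://weather.example.com/",
    "ex": "http://example.org/terms#",
}

print(expand_curie("wx:oaxaca", PREFIXES))
print(expand_curie("ex:Person", PREFIXES))
```

Note that `wx:oaxaca` looks syntactically like a URI in a scheme called `wx:`; ambiguity of this kind is one reason an abbreviation mechanism can have architectural impact.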
One of the key aspects of the Web as the TAG understands it lies in the extent to which it supports, to put it informally, 'following your nose' to find things out. This is not just a matter of the way the Web allows a user to click a link to go from one Web page to another, but also of the way Web-accessible resources carry with them a kind of audit trail concerning their own interpretation, via for example media types and namespaces. The phrase 'self-describing Web' refers to this part of the Web's value proposition. One draft finding has been published about one aspect of this, namely the question of what 'best practice' should be with respect to XML namespace documents. By this is meant the information resource, if any, whose representation can be retrieved from an XML namespace URI. The TAG is still working to identify the best way of combining current practice, particularly the use of RDDL, with the evident potential of the Semantic Web in this area. The most recent draft finding, Associating Resources with Namespaces, is somewhat out of date; more recent discussion can be found on the www-tag mailing list (namespaceDocument-8 background).
Another area where the TAG is in exploratory mode, where no draft finding has been issued, concerns the architectural background of the recent restructuring of the W3C's HTML work. The message announcing the TAG's interest in this area introduced it as follows:
"Is the indefinite persistence of 'tag soup' HTML consistent with a sound architecture for the Web? If so, (and the starting assumption is that it is so), what changes, if any, to fundamental Web technologies are necessary to integrate 'tag soup' with SGML-valid HTML and well-formed XML?"
(By way of explanation: 'By "'tag soup' HTML" is meant documents which are not well-formed XHTML, or even SGML-valid HTML, but which none-the-less are more-or-less successfully and consistently rendered by some HTML browsers.')
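The distinction can be seen with Python's standard library, used here purely as an illustration: a strict XML parser rejects a typical fragment of tag soup outright, while a lenient HTML tokenizer carries on. The sample markup and helper names are made up for this sketch, and `html.parser` is only a tokenizer, not a model of how browsers recover from errors.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def is_well_formed_xml(markup: str) -> bool:
    """Return True only if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Record every start tag seen, without requiring matching end tags."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def lenient_start_tags(markup: str) -> list:
    """Tokenize the markup leniently and return the start tags encountered."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags


soup = "<p>one<br>two<p>three"   # unclosed elements: not well-formed XML
clean = "<p>one<br/>two</p>"     # the same idea as well-formed XML

print(is_well_formed_xml(soup), is_well_formed_xml(clean))
print(lenient_start_tags(soup))
```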
As well as driving the issues summarised in the preceding section to a resolution, what else is the TAG looking forward to considering? Some topics which we hope to consider in the near future are:
The TAG carries out all its work in public. In particular, there are two public mailing lists where TAG business can be observed:
High bandwidth. Discussion of open and potential TAG issues. Announcements of draft findings. Agendas and minutes for weekly teleconferences and quarterly face-to-face meetings. Open to anyone to subscribe (send 'subscribe' to firstname.lastname@example.org) and to post. Publicly readable archives.
Low bandwidth. Announcements of findings, quarterly summaries of work undertaken. Open to anyone to subscribe (send 'subscribe' to email@example.com), but closed to public posting. Publicly readable archives.
The TAG depends on the whole Web community to review its work and point it in new and fruitful directions -- please help!