Web Magazine for Information Professionals

Web Focus: The 7th World Wide Web Conference

Brian Kelly is interviewed about the 7th World Wide Web Conference upon his return from Brisbane.

Australia is a long way to go for a conference. What were you doing there?

I attended the conference in my role as UK Web Focus and the JISC representative on the World Wide Web Consortium. Attendance at the World Wide Web conference provides me with an opportunity to monitor the latest Web developments and keep the community informed.

What were the highlights of the conference?

In a three-letter acronym - RDF! RDF, the Resource Description Framework, was the highlight of Tim Berners-Lee’s keynote talk (which, of course, is available on the Web [1]) as well as being covered in a panel session and on Developers’ Day.

In slightly more than three letters can you describe what RDF is and why it’s important?

My colleague Rachel Heery gave a summary of RDF in Ariadne recently [2]. Metadata has been described as the missing architectural component of the Web. We’ve got transport (HTTP), addressing (URLs) and a data format (initially HTML and now XML). But, so far, we’ve only had a very limited mechanism for providing “data about data”. RDF is intended to provide that infrastructure.

RDF is a framework for managing a variety of types of metadata?

That’s right. We’re currently running up against the limits imposed by the lack of extensibility in Web protocols. PEP (Protocol Extension Protocol) is being developed to provide an extension mechanism for HTTP, just as XML provides an extensible markup language that is not restricted to the fixed element set of HTML. RDF provides a similarly extensible framework for metadata.

So RDF will enable me to define my Dublin Core metadata and my MARC metadata. I can understand how this will help with finding and managing resources. But can RDF do anything else?

Sure. RDF is a general metadata framework. RDF applications being developed by the W3C include digital signatures (so you have a verifiable and potentially legally-binding signature giving not only the owner of a resource but also assertions about the resource), rating schemes (based on the PICS work) and privacy applications. RDF can also be used by various communities to develop their own applications, without the need for centralised agreement. We are already seeing the development of a sitemapping schema which has been implemented in Mozilla, the source release of the Netscape browser.
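
To give a flavour of what this looks like in practice, here is a rough sketch in Python (not any particular RDF syntax or toolkit, and the property names and resource are invented for the illustration). The essential idea is that metadata boils down to simple statements of the form (resource, property, value), and that properties drawn from different vocabularies - Dublin Core alongside a community-specific schema - can be mixed freely:

    # A rough illustration of the RDF model: metadata as simple
    # (resource, property, value) statements. The property names are
    # invented for this example; real schemas define their own.
    statements = [
        ("http://www.example.org/report", "dc:title", "Annual Report 1997"),
        ("http://www.example.org/report", "dc:creator", "Example Organisation"),
        # A community-defined vocabulary can be mixed in without any
        # central agreement - here an imaginary 'site:section' property.
        ("http://www.example.org/report", "site:section", "Corporate information"),
    ]

    # An application which understands a given vocabulary picks out the
    # properties it recognises and simply ignores the rest.
    for resource, prop, value in statements:
        if prop.startswith("dc:"):
            print(resource, prop, value)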

I’ve heard the term knowledge representation used. What does this mean and how does it relate to RDF?

The term knowledge representation has been used by Tim Berners-Lee for some time. At present we have marked-up information stored on the Web. Once XML becomes more widely deployed, information will be marked up in a richer way. However, we still have no “knowledge” about the information. RDF is based on a mathematical model in which relationships can be defined. As Tim Berners-Lee described in his talk, this will enable questions such as “Is there a green car for sale for around $15,000 in Queensland?” and “What was the average temperature in Brisbane in 1997?” to be answered - and not simply produce a list of resources which contain the keywords.
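
To give a feel for the difference between keyword matching and querying over assertions, here is a minimal Python sketch; the data, property names and matching rule are all invented for the purposes of illustration:

    # Invented assertions about cars offered for sale, again expressed
    # as (subject, property, value) statements.
    statements = [
        ("car1", "colour", "green"),
        ("car1", "price", 14500),
        ("car1", "location", "Queensland"),
        ("car2", "colour", "red"),
        ("car2", "price", 15200),
        ("car2", "location", "Queensland"),
    ]

    def value_of(subject, prop):
        """Return the value asserted for (subject, prop), or None."""
        for s, p, v in statements:
            if s == subject and p == prop:
                return v
        return None

    # "Is there a green car for sale for around $15,000 in Queensland?"
    subjects = {s for s, p, v in statements}
    matches = [s for s in subjects
               if value_of(s, "colour") == "green"
               and value_of(s, "location") == "Queensland"
               and abs(value_of(s, "price") - 15000) <= 1000]
    print(matches)   # -> ['car1'], an answer rather than a list of pages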

In issue 9 of Ariadne [3] you reported on the last WWW conference. You stated that “The major new talking point was XML, the Extensible Markup Language.” Has XML taken off yet? Has XML developed since last year?

There are three things to consider: (a) XML applications, (b) XML protocols and (c) XML tools. Over the past year or so a number of XML applications have been developed or proposed, including the Mathematical Markup Language (MathML), which became a W3C Recommendation in April 1998, the Chemical Markup Language (CML), the Channel Definition Format (CDF), etc. Some of these have been developed by companies such as Netscape and Microsoft - which is, of course, significant.

We’ve also seen many developments to the XML protocols. Although XML itself hasn’t changed much, there have been significant changes to the hypertext linking proposals, with the XLink working draft (previously known as XML-LINK) and the XPointer working draft being released in March 1998. XLink will provide a rich form of hyperlinking, including support for external link databases and links which can replace the existing document (like HTML’s <A> element), be embedded in the document (like the <IMG> element) or create a new window. XPointer aims to provide a mechanism for addressing the internal structures of XML documents.

However, although the demonstration of an application which made use of external link databases was impressive, such new features are only likely to be widely deployed once they are incorporated by the major browser vendors. Microsoft announced their support for XML a year ago, and provided limited support for XML with the CDF application in Internet Explorer 4.0. Although Netscape have not been renowned for their support for SGML (to put it mildly), Mozilla does appear to support XML. Version 5 of the Netscape browser is eagerly awaited.

Where does this leave backwards compatibility? Does the Web research community care about this? Analysis of many server log files shows that many people are still using very old browsers.

This is a question I regularly ask Web developers. Some argue that the new generation of browsers will be so clearly superior that everyone will willingly upgrade. This is, of course, what happened with Gopher.

Since Web protocols are designed to be backwards compatible, it should be possible to introduce new technologies without breaking existing browsers. For example, cascading style sheets can be deployed without harming browsers which are not aware of style sheets. Sadly, however, commercial pressures have forced browser vendors to release software with badly implemented features, which will cause interoperability problems.

It’s perhaps worth mentioning, however, that there are developments taking place which will help organisations which have large numbers of old PCs or would find upgrading their browsers a major task. In his keynote talk James Gosling, the father of Java, described Sun’s Activator technology, which aims to provide a Java Virtual Machine (JVM) for 486 PCs. This should make the functionality provided by Java applications accessible to 486 computers. As an aside, it’s worth mentioning that many of the ideas described in papers at the conference were implemented in Java. Of relevance to the backwards compatibility issue was the paper “An Extensible Rendering Engine for XML and HTML” [4], which described how support for new XML elements could be provided by downloadable Java applets (known as Displets), rather than by upgrading to a new version of a browser.
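
The Displet approach can be illustrated schematically: a rendering engine keeps a table mapping element names to rendering code, and support for a new element is added by plugging a new renderer into the table rather than by installing a new browser. The sketch below is only an illustration of that idea in Python, not the authors’ implementation (which used downloadable Java applets):

    # Schematic sketch of the displet idea: renderers for new element
    # types are registered at run time instead of being hard-wired
    # into the browser.
    renderers = {
        "p": lambda text: print(text),
        "h1": lambda text: print(text.upper()),
    }

    def default_renderer(text):
        # Fallback for elements the engine does not understand.
        print("[unsupported element]", text)

    def render(element, text):
        renderers.get(element, default_renderer)(text)

    # A downloaded "displet" adds support for an imaginary <formula> element.
    renderers["formula"] = lambda text: print("FORMULA:", text)

    render("h1", "An example document")
    render("formula", "E = mc^2")
    render("blink", "Still unsupported")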

We should also see developments in the marketplace - there’s potentially a vast market for Web tools which will run on 486 PCs. The next release of the Opera browser (which is designed to run on 486 PCs), for example, will provide support for style sheets. Dave Raggett, a W3C staff member and one of the editors of the HTML recommendation, gave a talk on mobile computing. The Japanese appear to be leading the world in developing networked mobile devices, including mobile phones, PDAs and laptops which can access Web resources. The Dearing report, of course, mentioned the importance of portable computers for students.

Finally there is the role of intermediaries - ways of providing new functionality by adding new computational elements between the client and the server. This is often achieved by making use of a proxy. For example, support for URNs (Uniform Resource Names) could be deployed in a proxy cache until native support is provided in browsers. A paper at the conference, “Intermediaries: New Places For Producing and Manipulating Web Content” [5], introduced the term “intermediary” and gave a number of examples.
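
As a sketch of the intermediary idea, a proxy sitting between an old browser and the network could resolve URN requests itself and pass ordinary URLs straight through. The Python fragment below is only an illustration; the lookup table and identifiers are invented, since in practice a URN resolution service would be consulted rather than a hard-coded dictionary:

    from urllib.request import urlopen

    # Invented URN-to-URL table standing in for a real resolution service.
    URN_TABLE = {
        "urn:example:annual-report": "http://www.example.org/reports/1997.html",
    }

    def fetch_via_intermediary(identifier):
        """Resolve URNs to URLs before fetching; pass URLs straight through."""
        if identifier.startswith("urn:"):
            identifier = URN_TABLE[identifier]   # the resolution step
        return urlopen(identifier).read()

    # A browser with no URN support could sit behind such a proxy and
    # never notice that resolution is happening on its behalf.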

Was there much of interest to the UK HE community at the conference?

A number of papers analysed the major search engines. “What is a Tall Poppy Among Web Pages?” [6] used a variety of statistical techniques to reverse engineer Lycos, Infoseek, AltaVista and Excite. The authors built decision trees for the search engines based on a number of factors, such as the number of times the keyword occurs in the resource, in the title, in meta fields and in headings, the length of the document, and so on. They concluded that Excite had the most complex decision tree. This was potentially a dangerous paper to publish, as the findings will no doubt be of interest to “index spammers”!
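
The factors involved are straightforward to compute from a page, as the rough Python sketch below shows. The features listed are the ones mentioned above, but the extraction details are my own illustration rather than anything taken from the paper:

    import re

    def ranking_features(html, keyword):
        """Extract the sort of per-page features the decision trees used."""
        text = html.lower()
        keyword = keyword.lower()
        title = re.search(r"<title>(.*?)</title>", text, re.S)
        headings = " ".join(re.findall(r"<h[1-6][^>]*>(.*?)</h[1-6]>", text, re.S))
        meta = " ".join(re.findall(r'<meta[^>]*content="([^"]*)"', text))
        return {
            "occurrences": text.count(keyword),
            "in_title": keyword in (title.group(1) if title else ""),
            "in_headings": keyword in headings,
            "in_meta": keyword in meta,
            "document_length": len(text),
        }

    page = ('<html><head><title>Tall poppies</title></head>'
            '<body><h1>Tall poppy syndrome</h1><p>poppy poppy</p></body></html>')
    print(ranking_features(page, "poppy"))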

“A Technique For Measuring the Relative Size and Overlap of Public Web Search Engines” [7] also used statistical techniques to measure the coverage of the main search engines. The authors concluded that AltaVista had the largest coverage (160 million resources), followed by HotBot (100 million), Excite (32 million) and Infoseek (27 million). Based on their analysis they estimated that the static Web contained 200 million resources in November 1997.
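
The arithmetic behind such relative size estimates is essentially a capture-recapture argument, sketched below with invented figures (the paper’s actual sampling and checking procedures are considerably more involved): if each engine’s share of the other’s pages can be estimated, the ratio of those two overlap fractions estimates the ratio of the index sizes.

    def relative_size(frac_of_a_found_in_b, frac_of_b_found_in_a):
        """Capture-recapture estimate of size(A) / size(B).

        Both fractions approximate the overlap |A and B| divided by |A|
        and by |B| respectively, so their ratio estimates |A| / |B|.
        """
        return frac_of_b_found_in_a / frac_of_a_found_in_b

    # Invented figures: 35% of a sample of A's pages are also indexed by B,
    # while 55% of a sample of B's pages are also indexed by A.
    print(relative_size(0.35, 0.55))   # A is roughly 1.6 times the size of B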

“The Anatomy of a Large-Scale Hypertextual Web Search Engine” [8] described a prototype large-scale search engine known as Google which makes use of the Web’s hypertext structure to improve search results. The system is available at <URL: http://google.stanford.edu/>.
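
The central idea, roughly, is that a page matters if pages which themselves matter link to it. The Python fragment below is a much-simplified sketch of that kind of link-based scoring on an invented three-page graph; it leaves out the real system’s handling of dangling links, normalisation and sheer scale:

    # Invented toy link graph: page -> pages it links to.
    links = {
        "A": ["C"],
        "B": ["C"],
        "C": ["A"],
    }

    def link_scores(links, damping=0.85, iterations=50):
        """Iteratively score pages: a page scores highly when
        highly-scored pages link to it."""
        pages = list(links)
        score = {p: 1.0 / len(pages) for p in pages}
        for _ in range(iterations):
            new = {p: (1 - damping) / len(pages) for p in pages}
            for p, outgoing in links.items():
                for q in outgoing:
                    new[q] += damping * score[p] / len(outgoing)
            score = new
        return score

    print(link_scores(links))   # C, which both other pages link to, scores highest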

There were a number of papers of interest to the teaching and learning community, including “Web-based Education For All: A Tool for Developing Adaptive Courseware” [9], which described a toolkit known as InterBook for developing courseware, and “Delivering Computer Assisted Learning Across the WWW” [10], which described the development of multimedia medical teaching applications at Aberdeen University.

Several short papers and posters also covered teaching and learning on the Web. I was particularly interested in “Gateway To Educational Resources (GEM): Metadata for Networked Information Discovery and Retrieval” [11], which made use of Dublin Core metadata with educational resources. Another paper from the UK HE community described how “Hypermedia Learning Environments Limit Access to Information” [12].

Were there many papers from the UK?

As well as the ones I’ve mentioned, there were two short papers, and contributions to two panel sessions, from Southampton University. The Arjuna research group from the Computer Science department at Newcastle University were another group who have given papers at several WWW conferences. Finally, Dave Whittington from the Computer Science Department should be mentioned by name. Not only did Dave run a workshop on Teaching and Learning on the Web, but his short paper on “Task-based Learning Environments in a Virtual University” [13] won a prize for the best poster. Congratulations Dave.

Any final comments?

WWW8 will be held in Toronto (see <URL: http://www8.org/announce.html> for further details). It would be good to see some more contributions to the conference from the UK. The submission date for papers will probably be 1 December 1998.

I should also add that a more detailed report on the conference is available at <URL: http://www.ukoln.ac.uk/web-focus/events/conferences/www7/>.

Thank you

My pleasure.

References

  1. Evolvability, Tim Berners-Lee’s Keynote Talk at WWW7
    http://www.w3.org/Talks/1998/0415-Evolvability/overview.htm
  2. What Is RDF?, Ariadne issue 14
    http://www.ariadne.ac.uk/issue14/what-is/
  3. Web Focus Article, Ariadne issue 9
    http://www.ariadne.ac.uk/issue9/web-focus/
  4. An Extensible Rendering Engine for XML and HTML, WWW 7 Conference
    http://www7.conf.au/programme/fullpapers/1926/com1926.htm
  5. Intermediaries: New Places For Producing and Manipulating Web Content
    http://www7.conf.au/programme/fullpapers/1895/com1895.htm
  6. What is a Tall Poppy Among Web Pages?
    http://www7.conf.au/programme/fullpapers/1895/com1895.htm
  7. A Technique For Measuring the Relative Size and Overlap of Public Web Search Engines
    http://www7.conf.au/programme/fullpapers/1937/com1937.htm
  8. The Anatomy of a Large-Scale Hypertextual Web Search Engine
    http://www7.conf.au/programme/fullpapers/1921/com1921.htm
  9. Web-based Education For All: A Tool for Developing Adaptive Courseware
    http://www7.conf.au/programme/fullpapers/1893/com1893.htm
  10. Delivering Computer Assisted Learning Across the WWW
    http://www7.conf.au/programme/fullpapers/1876/com1876.htm
  11. Gateway To Educational Resources (GEM): Metadata for Networked Information Discovery and Retrieval
    http://www7.conf.au/programme/posters/1897/com1897.htm
  12. Hypermedia Learning Environments Limit Access to Information
    http://www7.conf.au/programme/docpapers/1941/com1941.htm
  13. Task-based Learning Environments in a Virtual University
    http://www7.conf.au/programme/posters/1848/com1848.htm

Author Details

Brian Kelly
UK Web Focus,
Email: B.Kelly@ukoln.ac.uk
UKOLN Web Site: http://www.ukoln.ac.uk/
Tel: 01225 323943
Address: UKOLN, University of Bath, Bath, BA2 7AY