Web Magazine for Information Professionals

Web Focus: Extending Your Browser

Brian Kelly on techniques for extending the capabilities of your browser.

The WebWatch project [1], which was based at UKOLN, involved the development and use of a variety of tools to analyse web resources and web servers. During the early development of the software, individual summaries (particularly of outliers in the statistical data) were often required in order to check that the software was working correctly. Initially summaries were obtained using simple Unix scripts. For example the urlget script displayed the HTTP headers for a resource as illustrated below:

% urlget www.ukoln.ac.uk
HTTP/1.1 200 OK
Date: Tue, 16 Feb 1999 09:46:43 GMT
Server: Apache/1.2b8
Last-Modified: Tue, 09 Feb 1999 12:24:10 GMT
ETag: "3fd0-28ba-36c028ea"
Connection: close
Content-Type: text/html

<HTML>

Figure 1: Use of the urlget Script

The urlget script is a simple wrapper around the telnet command.
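
The behaviour of urlget can be approximated as follows. This is a minimal sketch in JavaScript for Node.js, not the original Unix script, whose source is not reproduced in this article. The function names buildRequest, parseHeaders and urlget are assumptions made for illustration, as is the choice of a HEAD request to fetch the headers only.

```javascript
// Sketch of a urlget-like tool for Node.js (illustrative names, not the original script).
const net = require('net');

// Build the request text that would otherwise be typed by hand into a telnet session.
function buildRequest(host, path = '/') {
  return `HEAD ${path} HTTP/1.0\r\nHost: ${host}\r\n\r\n`;
}

// Split a raw response into its status line and a map of header fields.
function parseHeaders(raw) {
  const head = raw.split('\r\n\r\n')[0];
  const [statusLine, ...lines] = head.split('\r\n');
  const headers = {};
  for (const line of lines) {
    const i = line.indexOf(':');
    if (i > 0) headers[line.slice(0, i).trim().toLowerCase()] = line.slice(i + 1).trim();
  }
  return { statusLine, headers };
}

// Connect to port 80, send the request and pass the parsed response to the callback.
function urlget(host, path, callback) {
  let raw = '';
  const socket = net.connect(80, host, () => socket.write(buildRequest(host, path)));
  socket.setEncoding('latin1');
  socket.on('data', (chunk) => { raw += chunk; });
  socket.on('end', () => callback(null, parseHeaders(raw)));
  socket.on('error', (err) => callback(err));
}

// Example (requires network access):
// urlget('www.ukoln.ac.uk', '/', (err, res) => console.log(err || res));
```

Keeping the request building and header parsing separate from the network call means that both can be checked without contacting a server.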

As more sophisticated scripts were needed, the Unix command-line interface became more of a barrier, especially for the occasional Unix user. In order to provide an improved user interface, and to develop a service which could be used outside UKOLN, it was decided to provide a web interface to a number of the WebWatch tools.

WebWatch Web-Based Services

Two web-based services have been produced by the WebWatch project.

http-info
http-info provides information on the HTTP headers associated with a web resource.
doc-info
doc-info provides information on the components of a web resource, including the size of the resource (the size of the HTML file, embedded objects such as images and the size of the HTTP headers), details of the server software used to serve each object, a summary of metadata contained in the HTML resource, a listing of hyperlinks from the resource (together with details of broken links), a summary of the HTML elements found, the time taken to access the resource and embedded objects, details of the cachability of the resource and embedded objects and details of any redirects.

The http-info service is available at URL: <http://www.ukoln.ac.uk./web-focus/webwatch/services/http-info/> and the doc-info service is available at URL: <http://www.ukoln.ac.uk./web-focus/webwatch/services/doc-info/>

Use of doc-info is illustrated in Figure 2.

Figure 2: Use of the doc-info Service

Limitations of these Services

Although these web-based services provide a graphical interface to obscure command-line tools, and the web interface enables the services to be used by all web users, they still have a number of disadvantages:

  • Using the service means “leaving” the page you are viewing.
  • Accessing the services requires the address of the service to be provided, typically using a bookmark or by typing in the URL of the service.
  • Once the service has been accessed, the address of the page to be analysed has to be supplied, either by typing it in or by copying it from the clipboard (assuming you remembered to copy the address in the first place!).

We have recently addressed these limitations.

The Netscape Personal Toolbar

The Netscape Personal Toolbar is a little-known feature of the Netscape browser. As illustrated in Figure 3 the personal toolbar enables the end user to add their own links to the top of the browser.

Figure 3: The Netscape Personal Toolbar

In some respects, the personal toolbar provides a simple bookmark facility. However the personal toolbar enables the Netscape browser to be extended through the use of simple JavaScript. If the following code is added to the personal toolbar:

javascript:void(window.open('http://www.ukoln.ac.uk/web-focus/webwatch/services/http-info/httpinfo.cgi?everything=1&mode=auto&display_format=rdf&url='+escape(window.location)))

clicking on the icon in the personal toolbar causes the http-info service (at the URL given in the call to window.open) to run in a new window. The service is passed the address of the page currently being viewed, together with the pre-defined parameters (everything, mode and display_format).
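
The construction of such a toolbar link can be sketched as a small JavaScript function. This is an illustrative sketch and not code from the WebWatch project. The function name makeBookmarklet and its arguments are assumptions, and the generated link uses encodeURIComponent in place of escape, which current JavaScript engines treat as a legacy function.

```javascript
// Build a javascript: link that opens a web-based service on the page being viewed.
// serviceUrl: address of the CGI script; params: fixed parameters sent on every call;
// urlParam: name of the parameter that carries the address of the current page.
function makeBookmarklet(serviceUrl, params, urlParam = 'url') {
  const fixed = Object.entries(params)
    .map(([name, value]) => `${encodeURIComponent(name)}=${encodeURIComponent(value)}`)
    .join('&');
  const prefix = `${serviceUrl}?${fixed}${fixed ? '&' : ''}${urlParam}=`;
  // JSON.stringify yields a safely quoted JavaScript string literal for the prefix.
  return `javascript:void(window.open(${JSON.stringify(prefix)}+encodeURIComponent(window.location.href)))`;
}

const link = makeBookmarklet(
  'http://www.ukoln.ac.uk/web-focus/webwatch/services/http-info/httpinfo.cgi',
  { everything: 1, mode: 'auto', display_format: 'rdf' }
);
console.log(link);
```

The generated link quotes the service address with double quotes, so it would need further escaping before being placed inside a double-quoted HTML attribute.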

As an example click on the following link:
javascript:void(window.open('http://www.ukoln.ac.uk/web-focus/webwatch/services/doc-info/info.cgi?everything=1&mode=auto&display_format=rdf&url='+escape(window.location)))

A new window should be created, containing details of this page, as illustrated in Figure 4.

Figure 4: The doc-info Service

The WebWatch Services page [2] provides access to the http-info and doc-info services. In addition it enables the services to be added easily to the personal toolbar. You simply have to click on the link and drag it to the personal toolbar.

Figure 5: Adding A Service to the Toolbar

Once the service has been added to the toolbar, you can obtain details on any page you are viewing by simply clicking on the icon in the toolbar - there is no longer any need to leave the page and go elsewhere, or copy and paste URLs.

Futures

This article has described how a number of the WebWatch tools have been ported to the web and how use of the Netscape personal toolbar enables the tools to be used more easily. It is possible to imagine a range of web-based services which could be integrated with a web browser as described in this article, such as:

HTML Validation
Enables the HTML of the page being viewed to be validated.
Spell Checker
Runs an HTML-aware spell-checker on the page being viewed.
Translate
Translates the page into a predefined language.
Accessibility
Reports on any accessibility issues for the current page.
Validate XML
Checks the syntax of the XML page being viewed.
Validate CSS
Checks the syntax of the CSS (cascading style sheet) file linked to or embedded in the current document.

In fact many of these services are currently available, and can be integrated with your web browser. The list below includes links to a variety of services: HTML analysis services at DrHTML and WebSiteGarage, a spell checker at WebSter, HTML validation services at HENSA and W3C, a CSS2 validator at W3C, an XML validator at XML.COM, CAST’s Bobby service for checking accessibility and the Babelfish translation service at AltaVista.

To test these links, simply click on them - a new window should be created containing the results of an analysis of this page.

If you find the service useful, then simply drag the link to your personal toolbar. The name and other properties can be edited using the bookmark editor, as illustrated below.

Figure 6: Editing Bookmarked Services

Note that some of the services mentioned above, such as the Babelfish translator, load an intermediate page with the specified input data. An extra step is needed to launch the service, as shown below.

Figure 7: Using the Babelfish Translator

Issues

Although it would appear desirable for web service providers to include pages containing links to web services which could be easily integrated with a browser, as described, in practice there are several issues which should be considered:

  1. Can examination of the underlying HTML code and reusing links to CGI scripts constitute a breach of copyright?
  2. Can linking directly to a CGI script result in loss of visibility to the service?
  3. What are the maintenance issues if the format of the script processing the CGI form changes?
  4. If services wish to encourage use of their underlying service (and not just the front end to the service provided by the web interface), how should they announce this?

It is to be hoped that services will address such issues at the design stage. For example, naming conventions for input parameters should be adopted so that new versions of services can be deployed in a backwards-compatible manner. In passing, it should be noted that the HTML validation service mirrored at HENSA has moved to a new service (following the release of HTML 4.0 and the withdrawal of HalSoft’s mirroring service), but links to the original validation script continue to work. This is illustrated below, which shows use of the validation service for Jon Knight’s article in issue 1 back in January 1996 [3].
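
One way of keeping old links working when parameter names change is to map legacy names onto their current equivalents before a request is processed. The sketch below illustrates the idea in JavaScript. The parameter names (uri, url, format, display_format) and the function normaliseParams are hypothetical, and the sketch does not describe how any of the services named in this article are implemented.

```javascript
// Hypothetical mapping from parameter names used by an older release to current names.
const LEGACY_NAMES = new Map([
  ['uri', 'url'],
  ['format', 'display_format'],
]);

// Return the query parameters of a request keyed by their current names.
// A current name always wins if both the old and the new form are supplied.
function normaliseParams(queryString) {
  const incoming = new URLSearchParams(queryString);
  const result = {};
  for (const [name, value] of incoming) {
    const current = LEGACY_NAMES.get(name) || name;
    if (current === name || !(current in result)) result[current] = value;
  }
  return result;
}

// A link saved against the older release still resolves to the current parameters.
console.log(normaliseParams('uri=http%3A%2F%2Fwww.ariadne.ac.uk%2F&format=rdf'));
```

With a mapping of this kind, a link dragged to a toolbar against an earlier version of a script keeps working after the script's parameters have been renamed.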

Figure 8: Use of the HTML Validation Service in Ariadne Issue 1

In addition to designing the software to facilitate version control, thought should be given to the output from the service. For example, output from the script should provide a link to the service’s home page, so that the user is provided with information on the service provider and can easily access other services on offer.
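
As a rough illustration of this point, a script could wrap every report in a page that ends with a link back to the service's home page. The sketch below is in JavaScript. The function names escapeHtml and renderReport and the page layout are assumptions, not a description of the output of any existing service.

```javascript
// Escape text for safe inclusion in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Wrap a service's results in a page that always links back to the service's home page.
function renderReport(title, bodyHtml, service) {
  return [
    '<html><head><title>' + escapeHtml(title) + '</title></head><body>',
    bodyHtml,
    '<hr><p>Report produced by <a href="' + escapeHtml(service.homeUrl) + '">' +
      escapeHtml(service.name) + '</a></p>',
    '</body></html>'
  ].join('\n');
}

console.log(renderReport('HTTP headers', '<p>Results go here.</p>', {
  name: 'WebWatch Services',
  homeUrl: 'http://www.ukoln.ac.uk/web-focus/webwatch/services/'
}));
```

A user who reaches the report directly from a toolbar link, without ever seeing the service's own pages, is then still told who provides the service.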

Seeking Permission To Use The Services

The providers of the services mentioned in this article were asked whether they would be willing for their services to be used as described here. A summary of the responses is given below.

Responses

Service | Comments
HTTP-info and Doc-info (UKOLN) | Agree to make the services available. Have documented arguments to CGI script.
Bobby (CAST) | “It’s a really good idea, and we’re happy to see you supporting the use of Bobby via this procedure. … We have no current plans to modify the way Bobby handles URL-passed page check requests, and if we do we will make sure it continues to support the old way, so that should not be a concern for the foreseeable future either.” - Michael Cooper, CAST
CSS Validator (W3C) | “I plan to make no change on the format of the script processing the CGI form and the current way will be always supported. I encourage to link directly the CGI script result. Your idea seems to be really good.” - Philippe Le Hégaret, W3C
HTML Validator (W3C) | “Regarding version control, I intend to support any URIs used by the service forever. … This is a great idea, by the way; I did something like this years ago with XMosaic’s “user-defined menu” using a .mosaic-user-defs file (an example of which is http://www.ncsa.uiuc.edu/SDG/Software/XMosaic/ex.txt).” - Gerald Oskoboiny, W3C
HTML Validation Service (HENSA mirror) | Gave permission to use service.
XML.COM | Awaiting response to message sent on 17 Feb 1999.
WebSter’s spell checker | Link to copyright holder not available.
WebSiteGarage | Awaiting response to message sent on 16 Feb 1999.

References

  1. WebWatch, web page
    <URL: http://www.ukoln.ac.uk/web-focus/webwatch/>
  2. WebWatch Services, web page
    <URL: http://www.ukoln.ac.uk/web-focus/webwatch/services/>
  3. HTML: Which Version?, Jon Knight, Ariadne issue 1
    <URL: http://www.ariadne.ac.uk/issue1/knight/>

Author Details

Brian Kelly
UKOLN
University of Bath
Bath
BA2 7AY

Email: b.kelly@ukoln.ac.uk

Brian Kelly is UK Web Focus. He works for UKOLN, which is based at the University of Bath.