Many readers of this article will be involved in setting up new web sites, possibly for European or nationally-funded projects, for internal, institutional projects or perhaps for community projects. As the size of the web grows there is an increasing awareness of the need to be proactive in promoting web sites: we can no longer simply sit back and expect visitors to arrive at our new site. This article describes a variety of approaches which can be taken to the promotion of a web site. The article is based on a presentation on "Promoting Your Project Web Site" given at the "Consolidating The European Library Space" conference.
Many visitors will find a web site through use of a search engine. Although search engines can find new web sites automatically as they become linked into the web from existing web sites, the growth in the size of the web is making it increasingly difficult for indexing robots to keep up. It is probably desirable to be proactive and submit resources to search engines when a web site is launched.
Many of the main search engines provide an option to "Submit a Resource". Figure 1 illustrates the interface for submitting a resource to AltaVista.
Figure 1: Submitting a Resource to AltaVista
Since there are a number of popular search engines, and the search engines may limit the number of URLs which can be submitted, it may be desirable to make use of a submission application or web service.
An illustration of one of these products (Web Position) is shown in Figure 2.
The products for submitting resources to multiple search engines typically provide other functions as well, such as analysing your pages, reporting on your position in search engines, creating metadata, etc.
Web directories such as Yahoo! are an alternative to search engines. They also provide a popular location for searching for resources. Unlike search engines, web directories are compiled manually. Web directories also provide an interface for submitting resources, as illustrated in Figure 3.
Figure 3: Submitting a Resource to Yahoo!
A number of the submission programs will automate the submission of resources to web directories as well as search engines.
Can we solve the promotion of our web site by simply purchasing a submission program? Unfortunately not. Due to the sheer size of the web, search engines and directory services do not attempt to index all the resources they find.
Some possible solutions to the challenges listed above follow.
If a project has its own domain name it is more likely to be catalogued by a directory service such as Yahoo!. In addition it is more likely to be fully indexed by a search engine than if it were part of a large web site.
Since search engines are likely to index only a small part of a web site it may be desirable to control the areas of the web site which are indexed. For example you may wish to exclude personal information, draft resources or experimental work from being indexed.
The Robot Exclusion Protocol (REP) enables a web site administrator to specify areas of the web site which should not be indexed. The REP makes use of a robots.txt file located in the root of the web server. A typical robots.txt file is shown in Figure 4.
User-agent: *          # Following apply to all robots
Disallow: /cgi-bin/    # Don't index /cgi-bin directory
Disallow: /tmp/        # Don't index /tmp directory
Figure 4: A Typical robots.txt File
The robots.txt file has a simple format and can be managed by hand. However a number of tools are also available to help you manage this file, such as RoboGen.
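Before publishing a robots.txt file it can be useful to check that it excludes what you intend. The following is a minimal sketch using the robots.txt parser in the Python standard library, with the rules from Figure 4. The robot name "ExampleBot" and the three paths being tested are assumptions made purely for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules from Figure 4, supplied as a list of lines.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
""".splitlines()


def allowed_paths(robots_lines, paths, user_agent="ExampleBot"):
    """Return a dict mapping each path to True if the robot may fetch it.

    "ExampleBot" is a hypothetical robot name; any name is matched by the
    "User-agent: *" record.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return {path: parser.can_fetch(user_agent, path) for path in paths}


if __name__ == "__main__":
    # Hypothetical paths, not taken from a real web site.
    sample_paths = ["/index.html", "/cgi-bin/search", "/tmp/draft.html"]
    for path, ok in allowed_paths(ROBOTS_TXT, sample_paths).items():
        print(path, "may be indexed" if ok else "is excluded")
```

Note that the file is only a request to well-behaved robots; it does not protect the excluded areas from being accessed.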
Although the Robot Exclusion Protocol is conceptually very simple, in practice it may be difficult to exploit since updating the robots.txt file is likely to be restricted to the web site administrator. Fortunately there is now an HTML feature which enables authors of HTML pages to control how robots treat their own pages. The following HTML element located in the HTML HEAD:
<meta name="robots" content="noindex, nofollow">
will prevent robots from indexing the resources and following links within the resource.
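If many authors maintain pages on your web site, you may wish to audit which pages carry this element. The sketch below uses the HTML parser in the Python standard library to report the directives found in a page. The class and function names are illustrative only, and the approach is an assumption about how such an audit might be done rather than a description of how any particular robot behaves.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> element."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None when an attribute has no value.
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for item in attributes.get("content", "").split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def robots_directives(html_text):
    """Return the set of robots directives (e.g. 'noindex') in a page."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(sorted(robots_directives(page)))
```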
Further information on the Robot Exclusion Protocol and the Robots META tag has been produced by Martijn Koster.
Avoid use of frames and splash screens in your web site design. As well as making it easier for indexing robots to access resources on your web site, this also has accessibility benefits (visitors with browsers which do not support frames will still be able to access your web site).
Once the key pages in your web site have been indexed by a search engine you might expect a sensible query to retrieve the resources. Unfortunately the resource may fail to be located near the top of the search results. How can you improve the ranking?
Metadata may help to improve the ranking. Simple keywords and description metadata, as illustrated below, are desirable since this metadata is used by a number of search engines, including AltaVista:
<meta name="keywords" content="exploit, web magazine, TAP, telematics">
<meta name="description" content="Exploit Interactive is a ..">
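If these elements are generated by a script rather than typed by hand, the attribute values need to be escaped so that quotation marks or ampersands in a description do not break the HTML. A minimal sketch follows; the function names and the sample values are hypothetical.

```python
from html import escape


def meta_tag(name, content):
    """Build one <meta> element, escaping the attribute values."""
    return '<meta name="{}" content="{}">'.format(
        escape(name, quote=True), escape(content, quote=True)
    )


def discovery_metadata(keywords, description):
    """Return keywords and description <meta> elements, one per line."""
    return "\n".join(
        [
            meta_tag("keywords", ", ".join(keywords)),
            meta_tag("description", description),
        ]
    )


if __name__ == "__main__":
    # Sample values for illustration only.
    print(discovery_metadata(["exploit", "web magazine"], 'A "sample" description'))
```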
Dublin Core metadata provides a more comprehensive and standardised approach to metadata for resource discovery. Unfortunately it is not yet widely supported by the major search engines. It is probably worth implementing Dublin Core metadata if you can make use of it to enhance local searching and you can address the maintenance of the metadata.
An example of the use of metadata to enhance local searching, and of an architecture to manage the metadata, can be seen in the Exploit Interactive web magazine. The search interface is illustrated in Figure 5.
Figure 5: The Exploit Interactive Search Interface
As illustrated in Figure 5 the search facility can be used to search the full text of articles, the author of an article (using the DC.Creator Dublin Core attribute) or the description (using the DC.Description Dublin Core attribute).
The metadata is stored in a neutral format (as variables in an "Active Server Page"). A server side include (SSI) is used to transform the metadata to the appropriate format. Currently the metadata is transformed into <meta name="DC.Creator" ...> and <meta name="DC.Description" ...>. However, providing the metadata in, say, RDF would simply require a single update to the SSI script.
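The principle of holding the metadata once and rendering it in whatever format is needed can be sketched as follows. This is not the Exploit Interactive implementation, which uses Active Server Pages and a server side include; it is an illustration in Python, and the record values, the example URL and the function names are all assumptions.

```python
from html import escape as html_escape
from xml.sax.saxutils import escape as xml_escape, quoteattr

# Hypothetical record held in a neutral form; the values are placeholders.
RECORD = {
    "url": "http://www.example.org/issue1/article/",
    "creator": "A. N. Author",
    "description": "An example article description",
}


def as_meta_tags(record):
    """Render the record as Dublin Core <meta> elements for the HTML HEAD."""
    lines = []
    for field in ("creator", "description"):
        name = "DC." + field.capitalize()
        lines.append(
            '<meta name="{}" content="{}">'.format(
                name, html_escape(record[field], quote=True)
            )
        )
    return "\n".join(lines)


def as_rdf(record):
    """Render the same record as a simple RDF/XML description."""
    return "\n".join(
        [
            '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"',
            '         xmlns:dc="http://purl.org/dc/elements/1.1/">',
            "  <rdf:Description rdf:about={}>".format(quoteattr(record["url"])),
            "    <dc:creator>{}</dc:creator>".format(xml_escape(record["creator"])),
            "    <dc:description>{}</dc:description>".format(
                xml_escape(record["description"])
            ),
            "  </rdf:Description>",
            "</rdf:RDF>",
        ]
    )


if __name__ == "__main__":
    print(as_meta_tags(RECORD))
    print(as_rdf(RECORD))
```

Because only the rendering functions know about the output format, supporting a new format means adding one function and leaving the stored records untouched.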
The approach taken by Exploit Interactive provides enhanced searching for visitors to the web site, Dublin Core metadata which could be used by third party applications and an architecture which helps to minimise ongoing maintenance.
So far we have considered techniques which will ensure that a web site is indexed and ways of improving the ranking. We should also take into account the citation of web sites - for example URLs which are included in articles (both online and print), used in publicity materials or spoken (e.g. when giving talks or presentations or on the phone).
The domain name for the web site can affect promotion of a web site in a number of ways. For example short and memorable domain names:
Use of separate domain names or qualified domain names - sometimes used by departments (such as http://www.scs.leeds.ac.uk/) and sometimes for a particular function (such as Student Home Pages at Loughborough University - see http://www-student.lboro.ac.uk/) - appears to be on the increase. This is probably due to (a) the ease and low cost of obtaining domain names and (b) the increase in expertise and knowledge of running web servers.
As well as having a short, memorable domain name it is also desirable to make use of short URLs. Before releasing your web site it is useful to develop guidelines for URL naming conventions. Some suggestions are given below:
Jakob Nielsen's Alertbox column provides some valuable comments on the "URL as UI".
As well as the various suggestions on ways in which you can enhance the visibility of your web site you may also wish to consider giving the web site away! For example you could:
Figure 6 shows an interface for searching for medical information on the web which is available on the OMNI web site.
Figure 6: The Interface for Searching for Medical Information on the Web at OMNI
This type of interface is probably more likely to generate search requests than a page simply containing links to the remote search interface. There are dangers in encouraging remote web sites to install a search interface to your web site search engine, in particular the problem of change control if you decide to introduce a new or updated search engine. However this is an option you may wish to consider.
You may wish to give your entire web site away. A mirror of your web site may enhance its visibility. If this is an option for your web site you may need to structure your web site so that it can easily be mirrored. This will include using directories to delineate areas of your web site which are to be mirrored, appropriate use of relative URLs and, if possible, ensuring that, if you use server-side scripting for management purposes, you hide (or rewrite) unusual URLs. Although sophisticated mirroring and replication software is now available, it will probably make the mirroring task much easier if the site has been developed with mirroring in mind. This may also help in the digital preservation of a web site.
This article has described tools for submitting resources to search engines and web directories, and web architectures which will help to make web sites more accessible to search engines. It should be noted that articles about your web site can also help in its promotion. Articles in print and web publications should obviously raise the visibility. In addition web magazines may submit their pages to search engines and links in the pages may be harvested. Web magazines may also be made available on CD ROM, in free text systems, citation reports, etc. As an example a number of Ariadne articles have been cited in Current Cites and Ariadne itself features in PubList's Internet Directory of Publications.
If you have followed the various suggestions given in this article how can you evaluate the effectiveness and assess the benefits against the resources used?
One suggestion would be to monitor the number of links pointing to your web site. The LinkPopularity.com web site enables the number of links, as recorded by a number of large search engines, to be measured, as illustrated in Figure 7.
Figure 7: The LinkPopularity.com Web Site
Monitoring the number of links to your web site, and the growth in the number of links, will be useful in evaluating the impact of your web site. It can also be of use if you wish to sell advertising space on your web site. As Roddy McLeod, manager of the EEVL gateway, mentioned in a posting to the lis-elib Mailbase list:
"I tried [LinkPopularity.com], pointing out to a potential advertiser that EEVL had, according to HotBot, 1099 sites linking to it, whilst there were only 18 sites linking to their site, and suggested that what they needed was more exposure. It seems to have worked, as they have agreed to buy an ad on the soon to be released new design EEVL site." .
Analysis of your web statistics can help in measuring the effectiveness of your web promotion strategy. A more thorough report on web statistics will be published at a later date. In this article mention will be made only of analysis of access to web sites by robot software. The BotWatch software can produce reports on access to your web site by robot software, as illustrated in Figure 8.
Figure 8: BotWatch
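To give a feel for this kind of analysis, the sketch below counts requests from robots in a server log. It does not describe how BotWatch works. It assumes that the log is in the widely used Combined Log Format and that any client which requests /robots.txt can be treated as a robot; both are assumptions, and the sample log entries are invented.

```python
import re
from collections import Counter

# Combined Log Format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def robot_hits(lines):
    """Count requests per user agent, for agents that asked for /robots.txt."""
    hits = Counter()
    robots = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines which are not in the expected format
        agent = match.group("agent")
        hits[agent] += 1
        if match.group("path") == "/robots.txt":
            robots.add(agent)
    return {agent: count for agent, count in hits.items() if agent in robots}


if __name__ == "__main__":
    # Invented log entries for illustration only.
    sample = [
        '192.0.2.1 - - [10/Oct/2000:13:55:36 +0000] "GET /robots.txt HTTP/1.0" 200 68 "-" "ExampleBot/1.0"',
        '192.0.2.1 - - [10/Oct/2000:13:55:40 +0000] "GET /index.html HTTP/1.0" 200 2326 "-" "ExampleBot/1.0"',
        '192.0.2.7 - - [10/Oct/2000:13:56:01 +0000] "GET /index.html HTTP/1.0" 200 2326 "-" "Mozilla/4.0"',
    ]
    for agent, count in robot_hits(sample).items():
        print(agent, count)
```

Robots which never request /robots.txt will be missed by this heuristic, so the counts should be treated as a lower bound.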
Ideally you will think about the promotion of your web site before the web site has been launched. A number of technical decisions which can help with web site promotion should be made before the launch as changes to a running service will be difficult to implement. However even if your web site is well-established many of the suggestions in this article will still be relevant.
Many of the suggestions given in this article on web site promotion will have additional benefits in other areas. For example:
Book reviews for "Poor Richard's Internet marketing and promotions: how to promote yourself, your business, your ideas online" and "How to promote your Web site effectively" have been published in the Internet Resources Newsletter.
A checklist of the points mentioned in this article follows.
UK Web Focus
University of Bath