Web Magazine for Information Professionals

Caching In?

George Neisser describes the National JANET Web Caching Service.

The UK JANET Web Caching Service (JWCS), managed jointly by the Universities of Manchester and Loughborough, provides an efficient, state-of-the-art caching facility for the UK Academic and Research community. Its primary objectives are to reduce unnecessary duplication of network traffic between the community and the Web, and to minimise the time taken to download Web pages from remote sites. The former results in more cost-effective use of limited and expensive bandwidth, whilst the latter facilitates more productive use of the Web. To date the JWCS has made its biggest impact in reducing the amount of Web traffic across the Atlantic to the USA, which has delivered significant financial savings: Web traffic amounts to approximately two thirds of the total JANET transatlantic traffic.

Since the inception of the JWCS, a JANET caching infrastructure has been developed. This has been achieved by encouraging all universities and other eligible organisations to establish their own local site caches and to link them to the JWCS. We now have in the UK one of the most developed caching infrastructures in the world, the benefits of which will become increasingly obvious in the coming years as demand and traffic levels continue to rise unabated.

We now describe very briefly the service itself and some of the development work currently being undertaken. Future articles will describe the various components of the JWCS in more detail and report on new developments.

The JWCS

The JWCS is supported by approximately 17 machines located at Manchester and Loughborough. Most of these are Intel-based, with 256MB of memory and approximately 25GB of disk capacity per machine, and run FreeBSD or Linux together with the 'Squid' Web caching software. The service also comprises a Web site and a machine dedicated to the production and analysis of cache statistics and associated reports.

At the last count we had some 150 organisations registered to use the service, a number that is likely to increase significantly as more eligible organisations join. Most of these sites run their own local caches which 'peer' with the JWCS in 'parent' mode, resulting in a two-level cache hierarchy across JANET.
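The two-level arrangement can be pictured with a small sketch. This is a toy in-memory model only, not how Squid implements peering: the `Cache` class, the cache names and the `fake_origin` function are all hypothetical, and real caches also deal with expiry, validation and disk storage.

```python
from typing import Callable, Dict, List, Optional, Tuple


class Cache:
    """A toy in-memory cache with an optional parent cache (illustrative only)."""

    def __init__(self, name: str, parent: Optional["Cache"] = None) -> None:
        self.name = name
        self.parent = parent
        self.store: Dict[str, bytes] = {}

    def get(self, url: str, fetch_origin: Callable[[str], bytes]) -> Tuple[bytes, str]:
        """Return (body, source), where source names where the body was found."""
        if url in self.store:
            # Hit at this level: no traffic leaves this cache.
            return self.store[url], self.name
        if self.parent is not None:
            # Miss: ask the parent cache before going to the remote site.
            body, source = self.parent.get(url, fetch_origin)
        else:
            # Top of the hierarchy: fetch from the remote Web server.
            body, source = fetch_origin(url), "origin"
        self.store[url] = body  # keep a copy for later requests
        return body, source


origin_requests: List[str] = []  # records every fetch that reaches the origin


def fake_origin(url: str) -> bytes:
    """Hypothetical stand-in for a remote Web server."""
    origin_requests.append(url)
    return b"page"


national = Cache("national")               # plays the role of the JWCS parent
site_a = Cache("site-a", parent=national)  # two hypothetical site caches
site_b = Cache("site-b", parent=national)
```

In this model, when `site_a` and then `site_b` request the same URL, only the first request reaches the origin server; the second is answered by the shared parent, which is the duplication of traffic the hierarchy is meant to remove.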

The JWCS currently handles approximately 20 million requests and transfers around 150GB per day, making it one of the largest Web caches in the world. Since the current service began on 1st August, demand has increased tenfold, and we expect a further two- or threefold increase by the summer of 2000.
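To put these daily totals in perspective, the short calculation below derives an average request rate and an average transfer size per request. It is a rough sketch that assumes 1GB means 10^9 bytes and that load is spread evenly over the day, neither of which is stated above; the function name is our own.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_load_summary(requests_per_day: int, bytes_per_day: int) -> dict:
    """Derive average figures from daily totals (assumes an evenly spread load)."""
    return {
        "requests_per_second": requests_per_day / SECONDS_PER_DAY,
        "mean_bytes_per_request": bytes_per_day / requests_per_day,
    }


# Figures quoted above: about 20 million requests and about 150GB per day.
# Assumption: 1GB is taken as 10**9 bytes.
summary = daily_load_summary(20_000_000, 150 * 10**9)
```

On these assumptions the service averages roughly 230 requests per second, with a mean transfer of about 7,500 bytes per request. Peak rates will be higher than the daily average.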

Our 'hit-rates' (the percentage of requests for Web pages that can be served from the cache) fall between 25% and a respectable 40%. This means that if, on a given day, the JWCS transfers 100GB of data to clients with a hit-rate of 40%, the bandwidth saved is 40GB. In a 'charging' environment where institutions have to pay for their use of bandwidth, the potential cost savings are obvious.
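The saving in the example above is a simple product, sketched below. Note the assumption built into it: the hit-rate is applied to bytes transferred, whereas a hit-rate counted per request and one counted per byte generally differ. The function name is hypothetical.

```python
def bandwidth_saved_gb(transferred_gb: float, byte_hit_rate: float) -> float:
    """Estimate the data served from cache rather than fetched remotely.

    Assumes byte_hit_rate is the fraction of *bytes* served from the cache,
    expressed as a number between 0 and 1.
    """
    if not 0.0 <= byte_hit_rate <= 1.0:
        raise ValueError("byte_hit_rate must be between 0 and 1")
    return transferred_gb * byte_hit_rate


# The example from the text: 100GB transferred at the two ends of our range.
saved_at_40_percent = bandwidth_saved_gb(100, 0.40)
saved_at_25_percent = bandwidth_saved_gb(100, 0.25)
```

For the 100GB example this gives about 40GB saved at a 40% hit-rate and 25GB at 25%.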

Development of the JWCS

Part of our remit is to ensure that the JWCS utilises the best technologies and methodologies available to deliver the best possible service to the community. Our overall objectives are to maximise the cost-effectiveness of caching and minimise Web page retrieval times. Areas currently under investigation include:

  1. The improvement of the performance of the individual caching machines by ensuring that, where possible, we use the fastest and most reliable technologies.
  2. The improvement of inter-cache cooperation of the individual machines to minimise page retrieval times and improve hit-rates.
  3. The establishment of caches supporting particular content or domains by utilising 'content smart' switches. For instance, 'domain' caches based on Internet domains such as '.com', '.edu' and so on may be expected to improve hit-rates significantly and reduce retrieval times.
  4. Introduction of very high speed machines capable of replacing several of our individual caches.
  5. How to incorporate Metropolitan Area Networks (MANs) and regional MANs into the national caching infrastructure.
  6. How best to develop the national caching infrastructure to take advantage of the latest developments in technology and methodology.
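The 'domain' cache idea in item 3 can be sketched as a routing decision made on the top-level domain of each requested URL. This is only an illustration of the principle, not a description of how a 'content smart' switch works internally; the cache names and the `pick_cache` function are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from top-level domain to a dedicated cache.
DOMAIN_CACHES = {
    "com": "cache-com",
    "edu": "cache-edu",
}
DEFAULT_CACHE = "cache-general"  # hypothetical catch-all cache


def pick_cache(url: str) -> str:
    """Choose a cache for a URL from the last label of its host name."""
    host = urlsplit(url).hostname or ""
    top_level_domain = host.rsplit(".", 1)[-1].lower()
    return DOMAIN_CACHES.get(top_level_domain, DEFAULT_CACHE)
```

With this mapping, `pick_cache("http://www.example.com/index.html")` selects `cache-com`, while a URL under a domain with no dedicated cache falls through to `cache-general`. Sending all requests for one domain to the same cache means each page is stored once rather than on several machines, which is why an improvement in hit-rates might be expected.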

Conclusion

The JWCS is a rapidly expanding service with an emphasis on the evaluation of new technologies and methodologies to ensure it delivers the best possible service to its users, the UK Academic and Research Community.

Future articles will explore individual areas of caching in more detail. In the meantime we would encourage you to visit the JWCS Web site at: <URL: http://wwwcache.ja.net/>.

If you have any comments or suggestions please email: support@wwwcache.ja.net

Author Details

George Neisser
Manchester Computing
University of Manchester
Manchester
M13 9PL

Email: george.neisser@mcc.ac.uk
Web site: http://wwwcache.ja.net/