The BNBMARC Currency Survey

Ann Chapman describes the BNBMARC Currency Survey, a performance measurement survey on the supply of bibliographic records.

Performance measurement has been used as a research tool for many years, and as such has been used in some of the studies undertaken by UKOLN and its predecessor bodies (the Centre for Catalogue Research - CCR, and the Centre for Bibliographic Management). In the past few years, however, increasing demands for accountability to the public and to local and central government, combined with the requirements of compulsory competitive tendering, have led to performance measurement becoming an integral part of public sector management. This in turn focuses attention on independent performance measurement studies such as the BNBMARC Currency Survey.

The BNBMARC Currency Survey is a performance measurement survey on the supply of bibliographic records (the current term for what were known in pre-automation times as catalogue records, though the older term is still used). Begun in January 1980, this long-running survey still provides the only publicly available, externally measured data (as far as we know) on the performance of a national library. It originated in the period when libraries were beginning to invest in automation and machine-readable bibliographic records, and the British Library started to provide such records. Libraries recognised the cost benefits of purchasing records which were created once, centrally, instead of each library producing its own. However, since those cost benefits would only be realised if enough records were available when a library required them, libraries wanted to know how likely they were to find the records they wanted.

The BNBMARC Currency Survey was set up to investigate this by asking two questions:

To be able to answer these questions, the survey requires a representative sample of titles being acquired by UK libraries to be supplied by a representative population of UK libraries each month.

The title sample

The survey could have looked at an unrestricted sample of titles being acquired by UK libraries but there was good reason for restricting the sample. In 1974, the British National Bibliography Ltd., a non profit consortium of various bodies established in 1949, was absorbed by the British Library, forming the nucleus of the National Bibliographic Service. The British Library therefore was committed to producing bibliographic records for the publishing output of the UK and the Republic of Ireland from 1974. The survey in turn uses a sample of items published in the UK and the Republic of Ireland in and since 1974 to monitor the availability of the records.

The bibliographic records created by the British Library from the UK publishing output since 1974 are held in files with the prefix BNB in the British Library database. They are held in the MAchine Readable Cataloguing (MARC) format devised originally by the Library of Congress. The data they hold is recorded from the book ‘in hand’, as required by the Anglo-American Cataloguing Rules, 2nd edition (AACR2). These records are created either by the British Library or, since 1990, by one of the other five copyright agency libraries (the libraries of the Universities of Cambridge and Oxford and of Trinity College, Dublin, and the National Libraries of Scotland and Wales). CIP records are Cataloguing-In-Publication records created using information supplied pre-publication by the publisher. Over the years these records have been created by a number of different organisations. Initially they were created by British Library staff, but since 1990 they have been created under contract agreements with other bodies - Book Data in 1990-1991, J. Whitaker & Sons 1991-1995, Bibliographic Data Services 1996-. These records are in MARC format but, because of their pre-publication status, the information held may change before publication, or may not be known when the record is created. A further category, ‘formerly CIP’ records, consists of records initially created as CIP records. When the item concerned has been deposited as required by law, the record is checked for accuracy and omitted information and amended, and its status is changed from CIP to formerly CIP.

Apart from date and place of publication, there are a few further restrictions on which titles can form part of the sample. Certain material is excluded from the BNB files by the British Library under its exclusions policy, and is in turn excluded from the sample. Material so excluded comprises:

To measure the level of record provision by the British Library therefore, the title sample is restricted to those titles for which the British Library is committed to providing bibliographic records. Libraries are therefore requested to only send details of titles published in 1974 or later, with a UK or Republic of Ireland imprint or distributor, in monograph form which do not fall into any category of the exclusions policy.

The library sample

By now, most public library authorities and university libraries in the UK will have been asked, at some point since 1980, to participate in the BNBMARC Currency Survey carried out by UKOLN. By August 1996, a total of 294 (118 academic and 176 public) libraries had taken part, 108 of them (65 academic and 43 public) on two occasions. To the credit of the UK library community most of the requests for participation have been accepted. Where libraries have declined the request, they almost always indicate that they would be willing to take part on another occasion. Reasons for non-participation are generally shortage of staff at a particular period, short term heavy workloads due to automation or changes from one automated system to another, refurbishment or extension of library or move to new library buildings, and in the past 18 months, problems caused by local government re-organisation. I would like to record here UKOLN’s thanks to past and present participants in the survey.

In order to ensure a representative population of libraries to provide the title samples, libraries listed in the Library Association’s directory Libraries in the United Kingdom and the Republic of Ireland have been divided into a number of categories. The initial division is into an academic sector and a public sector. The academic (university) sector is subdivided into groupings by age of foundation - the ‘old’ institutions, the ‘redbrick’ institutions, the ‘new’ 1960s institutions and the very recent 1990s institutions (previously designated polytechnics). Oxford and Cambridge universities do not participate as they are legal deposit libraries. Libraries of colleges of further and higher education are not included since it was anticipated that the university sector already gave maximum title coverage for the academic sector. The public sector is likewise divided into groupings, this time by geographical area and library authority type. The groupings comprise: London boroughs, metropolitan authorities, English counties and unitary authorities, Welsh authorities, Scottish authorities and Northern Irish authorities.

Libraries are selected on a random basis, and participate in the survey for six months, after which they are replaced by another library of the same grouping in their sector. So each month, six academic and six public libraries will participate, with one academic and one public library leaving the survey, and one of each type joining.
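The rotation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure UKOLN actually uses: the function name, the library names and the notion of a fixed "place" per grouping are all assumptions made for the example, and the first six panel members serve shortened terms because the sketch starts from an empty survey.

```python
import random


def rotation_schedule(pools, slots, months, seed=0):
    """Return one panel (list of libraries) per month for a single sector.

    pools  -- dict mapping a grouping name to the libraries in that grouping
              (hypothetical names; each pool must hold enough libraries)
    slots  -- list of grouping names, one per place on the panel
    months -- number of months to schedule
    """
    rng = random.Random(seed)
    # Shuffle each grouping once so that pop() yields a random unused library.
    unused = {g: rng.sample(names, len(names)) for g, names in pools.items()}
    panel = [unused[g].pop() for g in slots]
    schedule = []
    for month in range(months):
        schedule.append(list(panel))
        place = month % len(slots)  # the place whose six months are up
        # Replace the leaver with an unused library from the same grouping.
        panel[place] = unused[slots[place]].pop()
    return schedule


if __name__ == "__main__":
    groupings = ["London", "Metropolitan", "English", "Welsh", "Scottish", "N. Irish"]
    pools = {g: [f"{g} library {i}" for i in range(1, 5)] for g in groupings}
    for number, panel in enumerate(rotation_schedule(pools, groupings, 3), start=1):
        print(number, panel)
```

With six places on the panel and one replacement a month, every library that joins after the start-up period takes part for exactly six consecutive months.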

The monthly sample

Originally only one sample was taken, at the stage items were about to be catalogued, since this was the time when libraries would require records if they were to purchase them from outside suppliers. Later, however, increasing use of automation in order/acquisition departments prompted the addition, in February 1988, of a second sampling point at the order authorisation stage.

In order to make a representative reflection of patterns of cataloguing and ordering throughout the year, the survey sample is taken on a randomly selected date each month. Randomly selected dates falling at weekends or bank holidays are rejected and an alternative date is randomly selected for that month.
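As a rough illustration, the date selection can be expressed as a simple rejection loop. The function name and the way bank holidays are supplied (as a set of dates provided by the caller) are assumptions for this sketch; the survey's own procedure is not documented in that detail here.

```python
import calendar
import datetime as dt
import random


def pick_sample_date(year, month, bank_holidays=frozenset(), rng=random):
    """Pick a random working day in the given month.

    bank_holidays -- set of datetime.date objects to reject (supplied by the
                     caller; the sketch does not compute them)
    rng           -- the random module or a random.Random instance
    """
    days_in_month = calendar.monthrange(year, month)[1]
    while True:
        candidate = dt.date(year, month, rng.randint(1, days_in_month))
        if candidate.weekday() >= 5:      # 5 = Saturday, 6 = Sunday
            continue                      # rejected: draw again
        if candidate in bank_holidays:
            continue                      # rejected: draw again
        return candidate


if __name__ == "__main__":
    # Example: August 1996, rejecting the late summer bank holiday in England.
    print(pick_sample_date(1996, 8, {dt.date(1996, 8, 26)}))
```

Drawing again after a rejection, rather than moving to the next working day, keeps every working day in the month equally likely.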

The libraries are issued with guidelines on material that can be included in the sample and a simple procedure to enable them to randomly select their sample titles. Libraries are requested not to include the same title more than once in a month’s sample (ordering three copies of an item counts as one title for the survey). They are also advised that no more than half a month’s sample can be from the same publisher in order not to skew the sample (libraries sometimes order enough titles from a single publisher at one time that the whole of a month’s sample could be from that publisher).

Each month the participating libraries each return a sample sheet recording 10 items they are about to catalogue and 10 items they are about to order. The maximum sample each month therefore comprises 120 items at the cataloguing stage and 120 items at the ordering stage. It is not unusual for the sample to be somewhat smaller than the maximum for a variety of reasons. Firstly and most commonly, there may not be enough items awaiting cataloguing or ordering for a library to record 10 items on their sheet in a particular month. Secondly, the library may temporarily not be ordering items due to book fund limitations. Thirdly, samples may not be returned occasionally because the designated person has left the library, gone on maternity leave, is on sick leave or annual leave and no one else has been requested to deal with the survey. Finally, despite the guidelines issued, libraries do sometimes include in the samples titles which are outside the survey parameters.

Processing the sample

The sample sheets are returned to UKOLN each month. On receipt of the sheets, the titles recorded are checked to ensure that they fall within the survey parameters, and any titles which are outside the parameters are deleted. UKOLN staff then check the samples against the BNBMARC datafiles on the British Library online service Blaise. The initial search for items uses the International Standard Book Numbers (ISBNs) of titles as the search term. If no record is found using this approach, or no ISBN is recorded on the sample sheet, keyword/author searches are made. The number of titles with no ISBN quoted on the sample sheet is usually small, around 1% for the cataloguing sample and around 4% for the ordering sample. A further 1% of items in each sample are likely to have either an incomplete ISBN recorded (e.g. 9 digits) or an incorrectly transcribed ISBN (e.g. because of digit transposition).
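Both kinds of faulty ISBN mentioned above can be detected mechanically, because the last character of a ten-character ISBN is a check digit. The sketch below shows the standard ISBN-10 test (a weighted sum that must be divisible by 11). It is offered as an illustration only: nothing here implies that UKOLN validates ISBNs this way before searching, and the function names are invented for the example.

```python
def clean_isbn(raw):
    """Strip hyphens and spaces and upper-case a trailing 'x'."""
    return raw.replace("-", "").replace(" ", "").upper()


def is_valid_isbn10(raw):
    """Return True if raw is a well-formed ISBN-10 with a correct check digit."""
    isbn = clean_isbn(raw)
    if len(isbn) != 10:
        return False                     # e.g. an incomplete 9-digit number
    digits = "0123456789"
    if any(ch not in digits for ch in isbn[:9]):
        return False
    if isbn[9] not in digits + "X":      # 'X' stands for the value 10
        return False
    total = 0
    for position, ch in enumerate(isbn):
        value = 10 if ch == "X" else int(ch)
        total += (10 - position) * value  # weights run 10, 9, ..., 1
    return total % 11 == 0


if __name__ == "__main__":
    print(is_valid_isbn10("0-306-40615-2"))   # correctly transcribed
    print(is_valid_isbn10("0-306-40651-2"))   # two adjacent digits transposed
    print(is_valid_isbn10("030640615"))       # only 9 digits
```

Because the weights of neighbouring positions differ by one, swapping two different adjacent digits always changes the sum by an amount that is not a multiple of 11, so such transpositions are always caught.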

For each title in the sample, the search result is categorised as (a) full MARC record, (b) formerly CIP record, (c) CIP record or (d) no record found on file. The sample sheets are marked up with the results, and used to compile the datafiles. Since results from a single month would be too small to be statistically valid, and would be biased by seasonal and other variations, each month the results are calculated using a conflation of the previous 12 months of results. The results produced from the analyses are known as the hitrate.
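A minimal sketch of the twelve-month calculation follows. It assumes that "conflation" means pooling the raw counts for the twelve months before dividing, rather than averaging twelve monthly percentages; the text does not say which. The category labels and the function name are likewise assumptions for the example.

```python
from collections import Counter

# Categories that count as a record found: (a), (b) and (c) above.
FOUND = ("full", "formerly_cip", "cip")


def rolling_hitrate(monthly_counts, window=12):
    """Return one pooled hitrate (per cent) for each complete window.

    monthly_counts -- list of Counters, oldest month first, each mapping a
                      category ('full', 'formerly_cip', 'cip', 'none') to the
                      number of sample titles in that category
    """
    rates = []
    for end in range(window, len(monthly_counts) + 1):
        pooled = Counter()
        for month in monthly_counts[end - window:end]:
            pooled.update(month)          # add this month's counts to the pool
        total = sum(pooled.values())
        found = sum(pooled[category] for category in FOUND)
        rates.append(100.0 * found / total if total else None)
    return rates


if __name__ == "__main__":
    # Invented figures: twelve identical months of 100 titles each.
    month = Counter(full=50, formerly_cip=20, cip=10, none=20)
    print(rolling_hitrate([month] * 12))
```

Pooling counts gives months with larger samples proportionally more weight, which seems the natural choice when individual libraries sometimes return fewer than 10 titles.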

The survey search only counts as records found those records which have a date of creation at or before the sample date. Records are added to the BNBMARC files in a weekly update procedure, so it is possible to see at the time of searching that in fact a record was added, say three days after the sample date. So in addition to the main search, it was decided to investigate how many additional records had been added for sample items after six months. The results of this search are known as the recheck hitrate.
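The date rule can be made concrete with a short sketch. It assumes that each search yields the creation date of the record found (or nothing), and it approximates "six months" as 182 days; both are simplifications for illustration, as are the function names. Following the figures reported in this article, the recheck hitrate is treated as cumulative, so it includes the records already counted in the hitrate.

```python
import datetime as dt


def classify(record_created, sample_date, recheck_days=182):
    """Classify one sample title as 'hit', 'recheck_hit' or 'miss'.

    record_created -- creation date of the record found, or None if no record
    sample_date    -- the randomly selected sample date for that month
    recheck_days   -- length of the recheck period (182 days is an assumption)
    """
    if record_created is None:
        return "miss"
    if record_created <= sample_date:
        return "hit"                      # record existed on the sample date
    if record_created <= sample_date + dt.timedelta(days=recheck_days):
        return "recheck_hit"              # record added during the recheck period
    return "miss"


def hitrates(outcomes):
    """Return (hitrate, recheck hitrate) in per cent for a list of outcomes."""
    total = len(outcomes)
    hits = outcomes.count("hit")
    late = outcomes.count("recheck_hit")
    return 100.0 * hits / total, 100.0 * (hits + late) / total


if __name__ == "__main__":
    sample = dt.date(1996, 9, 10)
    # A record created three days after the sample date is not a hit,
    # but is picked up by the recheck.
    print(classify(dt.date(1996, 9, 13), sample))
```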

Results of the survey

Analysis of the data obtained from the samples produces an index of currency known as the hitrate. Each month the hitrate for the previous twelve months is calculated for each of the two samples - cataloguing and ordering. For each sample the hitrate is broken down in two ways. Firstly, the hitrate is divided by the type of library providing the sample. Thus in addition to the overall hitrate for the whole sample, there is one hitrate for the samples provided by academic libraries and another for the samples provided by public libraries. Secondly, the hitrate is divided by type of record found and this division is also available for the academic sample, the public sample, and the whole sample. As noted above, records found are divided into (a) full MARC records, (b) formerly CIP records, and (c) CIP records.
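The two-way breakdown amounts to a small cross-tabulation. The sketch below shows one way to produce it from a list of (library type, record category) pairs; the data layout, labels and function name are assumptions for illustration, and the figures in the usage example are invented.

```python
from collections import Counter

FOUND = ("full", "formerly_cip", "cip")   # record types counted as found


def breakdown(results):
    """Return hitrates by library type and record type, in per cent.

    results -- iterable of (sector, category) pairs, where sector is
               'academic' or 'public' and category is one of FOUND or 'none'
    """
    counts = Counter(results)
    table = {}
    for sector in ("academic", "public", "all"):
        per_category = Counter()
        for (library_sector, category), n in counts.items():
            if sector == "all" or library_sector == sector:
                per_category[category] += n
        total = sum(per_category.values())
        if total == 0:
            continue                      # no titles from this sector
        row = {cat: 100.0 * per_category[cat] / total for cat in FOUND}
        row["hitrate"] = sum(row.values())  # overall hitrate for the sector
        table[sector] = row
    return table


if __name__ == "__main__":
    sample = ([("academic", "full")] * 6 + [("academic", "none")] * 4
              + [("public", "cip")] * 5 + [("public", "none")] * 5)
    for sector, row in breakdown(sample).items():
        print(sector, row)
```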

The long term accumulation of data means that it is possible to look at how currency of records has changed over the period the survey has been running. It also enables the investigation of possible correlations between changes in the hitrate and British Library policy and procedures. Over the years a number of additional analyses have been made and possible correlations investigated.

The inclusion of a second search six months after the sample date has provided data on how many more records from the sample have been added to the BNB files since the sample date. In 1980, the cataloguing hitrate was 63%. Over the next six months records were added at the rate of 1% per week up to 14 weeks after the sample; the rate of addition then dropped. The final recheck hitrate was 82%. Ten years later in 1990, the hitrate for both cataloguing and ordering samples was 75%, with recheck hitrates of 80% for cataloguing and 79% for ordering. Thus the gap between hitrate and recheck hitrate has lessened, as has the rate of addition of extra records - around 1% every 6 weeks.

In 1983 the British Library revised its Cataloguing-In-Publication (CIP) programme in an attempt to improve the currency of records. New procedures were put in place, the major change being the decision to no longer routinely recatalogue deposited items for which a CIP record existed. Recataloguing would only take place where the published item differed from the CIP to such an extent that it was better to start a new record from scratch. Otherwise, CIP records could be accepted with no change (then noted on the record as CIP confirmed) or accepted with some changes to the CIP record (then noted on the record as CIP revised). Data from the currency survey was analysed to see what evidence there was for the success of the revised programme.

Prior to 1983, CIP entries accounted for one third of the records found in the cataloguing sample. By 1985, this had risen to nearly one half of the records found, and 1986 showed a continuation of this trend with CIP entries accounting for three quarters of the records found. (CIP entries in this analysis include the CIP revised and confirmed entries.) Despite the growth in the proportion of CIP records in the hitrate, the overall increase in hitrate was small at only 1-2%. On investigation, two likely causes emerged for this apparent failure of the project. Firstly, the British Library was over-optimistic about the possible increase in hitrate that might be achieved, especially since at the time there was a backlog of around 40,000 titles awaiting cataloguing. Secondly, between 1981 and 1987 there was a 34% increase in the annual output of new titles published in the UK. Given these factors, it is possible that the CIP programme revision enabled the British Library to maintain the hitrate level at this time.

With an increasing processing backlog and having set a strategic goal of an 85% hitrate by 1990, the British Library then introduced its Currency with Coverage programme. The main proposals here were (a) cataloguing to AACR2 level 1 for around half the items added to BNB files, and (b) omitting Library of Congress Subject Headings (LCSH). The programme was implemented in 1988, and the currency survey hitrates soon showed a trend upwards. Thus the cataloguing hitrate, which had been at 62%, rose fairly steadily over time to reach an all-time high of 87% in 1994; a subsequent trend downwards to 79% during 1995 was followed by another increase in 1996, and it now stands at 84%. Some of the short term decreases in hitrate were probably due in part to factors such as the move of British Library Bibliographic Services from London to Boston Spa, which involved items being sent from one site to the other and back at some stages when not all sections had moved, and staff turnover since not all staff wished to relocate. Pressure was also put on the British Library to reverse some of its decisions, and from 1993 subtitle and place of publication were included in level 1 cataloguing, while LCSH were reinstated in 1995. From 1996, all records have been created to AACR2 level 2. The reversal of these decisions does not seem to have affected the hitrate so far.

While a sample has been collected at the cataloguing stage since 1980, it was not until 1988 that the survey was extended to the ordering stage. More libraries were using automated ordering and acquisitions systems at this point, and therefore requiring machine readable records. It was anticipated that the hitrate for the ordering sample would be lower than that for the cataloguing sample. Initial comparisons, however, found that there was little difference in the figures though those for the ordering stage were a little lower than those for the cataloguing stage. Over the years the gap has varied from 1% to 8% but is most typically 5%.

A cooperative project between the British Library and the five copyright agency libraries, known as the Copyright Libraries Shared Cataloguing Programme (CLSCP), began in 1990. The aim was to extend the coverage of legal deposit material, with each library undertaking to be responsible for a particular section of UK publishing output. The British Library contributes 70% of current catalogue records and the other five libraries contribute the remaining 30% between them. Since the aim was to increase coverage, it was not anticipated that there would be much effect on the hitrates. Between 1991 and 1993 the programme accounted for around 1% of the cataloguing hitrate and 2% of the ordering hitrate. Since then the proportion of the hitrate contributed by these records has increased. The proportion of the cataloguing sample hitrate contributed by libraries other than the British Library was 3% in 1994 and the first half of 1995, rising to 8% at the end of 1995 and remaining at 8-9% in 1996. The corresponding proportion of the ordering sample hitrate was 2% in 1994, rose to 3% in mid 1995 and to 6% at the end of 1995, and during 1996 has remained at 8%. Since records contributed by the agency libraries are later enhanced with additional data by the British Library and lose their identifying source note, it is not possible to estimate the number of such records as a proportion of the BNB file. To do this would require the records to be annotated as ‘formerly CLSCP’ in the same way that CIP records become ‘formerly CIP’ records.

Publication of results

The hitrates produced from the sample data have been made public from the start of the survey. For some years until 1993, they were published in the Library Association Record, but will now appear in the British Library newsletter Select. They are available on request from UKOLN, and can be found on the UKOLN web site at http://www.ukoln.ac.uk/bib-man-archive/surveys/bnbmarc/summary.html.

The future of the survey

UKOLN will continue to carry out the Currency Survey, and at intervals to look at the data in a variety of ways to increase our knowledge of this area of bibliographic management. At present there are plans to move the data from an in-house suite of programs and files onto a Microsoft Access database, which should make it easier to add each month’s data and increase the number of reports and analyses generated by the system.

In the past, the survey has always been carried out on the British Library BNB files and this will continue. During its lifetime, however, the sample has been used for short periods to monitor the currency of records from other sources of bibliographic records, usually under contract to the source concerned. Results of this type of work have, of course, been confidential. In the past year UKOLN has negotiated participation of eight sources of records other than the British Library in a multi-source study. For this, the ordering sample is used to search Book Data and Whitaker CD-ROMs, and the databases of Bibliographic Data Services, BLCMP, CURL, LASER, OCLC and SLS. After pilot tests, the first sample was that for September 1996. Results will not be available until twelve months’ time, when enough data has accumulated for analysis. In addition to results in the form of hitrates and other numerical analyses, it is hoped that any reports produced will also include information on each source, the types of record they can provide, and what options are available for access.

