The purpose of this article is to introduce The Question Bank contents and situate the resource in the context of its Information Space, that is its relationship to other projects that aim to make social surveys more accessible.
I have the subsidiary aim of using this text to present the choices and decisions that need to be identified, preferably before undertaking the introduction of a medium-sized web-based information resource. I aim to be decidedly non-technical; however, many of the problems the Question Bank team has overcome have been solved because of the increasing flexibility that newer software offers. This article itself breaks the golden rules of writing for the web: it is not concise, scannable or objective. But the recommendations it makes are all in line with current entreaties to keep the web simple, fast and user-oriented.
Running throughout this piece is a not-so-heavily-veiled critique of overly complicated resources that claim for themselves the right to define rules of data management, but rarely deliver consistent or integral data storage, or reliable and standardised rules of access. In effect this is a claim for the legitimacy of smaller, more focussed information sources, complementary to the larger, more resource-intensive operations; for resources that aim at being visible and accessible rather than making any great claim to comprehensiveness or authority.
Ideally such niche-driven operations should be flexible and responsive, with the ability to alter course faster than the supertankers. This depends on the navigation skills of the crew, but one also needs a chart of the ocean. Adequate and fully functional user appraisal and feedback is the crucial element, and this is a hugely problematic concept in web resource provision, where the user is predominantly a taker and not a participant. The opportunity to write this slightly contentious introduction for a new audience is very welcome. I should state that the Question Bank is a resource in development and that any criticism of its structure or content, however harsh, is of course welcome.
Social surveys and data archives
Whether or not one agrees that quantitative methods are the best way to measure social features of the population, it is a fact that large-scale social surveys have become the major tool for gathering such data and are extremely influential upon government and social policy. Not only has the process of survey construction become highly professionalised, but the techniques for data gathering, to say nothing of analysis and interpretation, storage and dissemination, are becoming increasingly complex. A perceptible gap is appearing between sociologists and survey professionals, and between students of sociology and social policy and those who create, interpret and act upon survey findings.
The theorist-practitioner gulf is exacerbated as much by organisational as by technological change. Downsizing, outsourcing, competitive tendering and the fragmentation of the large government agencies into autonomous business units have meant that independent survey organisations, such as The National Centre for Social Research in the UK, now compete favourably with state departments such as The Office for National Statistics (ONS) for data collection contracts, and even for the design of the major longitudinal national surveys. In such a situation documentation becomes a serious issue. What exactly was asked, and in what context? From a more critical sociological perspective, as well as a conscientious professional one, there is often a need to know just what was the context of a given question, or set of questions, that led to a certain set of data and their various interpretations.
In addition to organisational Balkanisation there is, as there always was, the structural complexity of the survey. With the inclusion of multi-part and sub-group-targeted aspects to nearly all the major data gathering exercises there arises a plethora of paper and electronic objects, be they questionnaires, special-interest group supplements, interviewer manuals and notes, or diverse visual material such as showcards or pictures. All the above applies without even considering the complexities introduced with non-linear telephone and computer-based interviewing techniques. For those interested in survey construction, secondary analysis or even survey interpretation to any degree of accuracy, wading through this mire of detail is truly mind-boggling, if and when they actually manage to find any of it.
A requirement of the Economic and Social Research Council (ESRC) is that data and documentation generated by any of the projects or surveys that it funds should be lodged at the Data Archive, at the University of Essex. The Government Statistical Service (GSS) and The Office for National Statistics (ONS), along with other survey-generating government agencies, including those responsible for the Census of Population every 10 years, tend to deposit their data with the Data Archive too, although they have their own data service, StatBase.
Although the scale and breadth of such archives is impressive, users must often register or even pay for access to information; the search interfaces are often complex, with steep and idiosyncratic learning curves; online documents and datasets are large, with long download times; and documentation is often patchy and poorly hierarchised or explained. As a consequence of this general inaccessibility, third-party data providers, such as Manchester Information and Associated Services (MIMAS), have been funded to provide data subsets and selections, together with associated documentation, aimed specifically at academics and sociologists.
Whatever advances the above resources have made available to analysts, they all operate to data-driven agendas. Before 1996 scant and uneven attention was paid to the documentation of the questionnaires themselves. In recognition of this, the ESRC in 1995 funded the establishment of The Centre for Applied Social Surveys (CASS). A large component of this virtual organisation was to be an online questionnaire resource, The Question Bank (Qb).
The Centre for Applied Social Surveys was set up as a virtual organisation run jointly by three host organisations: The National Centre for Social Research (then SCPR), The University of Southampton and The University of Surrey. CASS is responsible for a popular series of courses on survey methodology, and for the Question Bank.
The Question Bank was introduced expressly to deal with questionnaires and questions, and not data or datasets. It is an online resource that is freely available over the World Wide Web without registration or payment. First available in 1996, the resource has over the last six months undergone a radical overhaul of its interface and structure, and there is almost daily change and improvement to the content of the site. Development has not been untroubled, and though this article gives an overview of the scope and use of the Question Bank, I shall make continual reference to the hard choices necessary in undertaking a project of this nature.
Given the brief to make surveys available online, something which in 1995 no other agency was doing in any systematic way, the directors of the Qb project, Roger Thomas and Martin Bulmer, had to make decisions about the project's scope based on resources. Essentially the balance in such an endeavour boils down to expertise, delivery of material and available labour time, though there are subsidiary concerns that I shall touch upon.
The main imperative was that the resource should consist of whole questionnaires. The reasons for this were threefold:
Document and Resource Format and Structure
The Question Bank is a website. This means it sees itself as part of the Internet, rather than simply using the Internet as a delivery system. The team tries to be conscious of this at all times, and it was to a web constituency, using standardised web technology and coping with familiar web-based problems, that the initial decisions about document format were directed.
Essentially all material in the Question Bank falls into one of two categories:
Using Adobe Acrobat Distiller most electronic documents can be converted to PDF with their initial format, layout, fonts and graphic elements (relatively) intact, and often at one-tenth the size of the original document. Such documents load faster, and are more cross-platform, than HTML equivalents or even than the originals. For consistency and reliability, PDF is the format the Question Bank team chose for nearly all questionnaire material. It is possible to use simple, non-labour-intensive methods to produce PDFs from electronic documents, such as the PDF Writer plug-in for MS Word; Acrobat Distiller, however, is the only way to ensure that the contents are faithfully reproduced at maximum compression (smallest size) and with full searchability. The Question Bank optimises AND splits the finished PDF file so that few documents should take longer than 15 seconds (at 28.8 kbit/s) to download. Splitting and cross-linking documents is a MANUAL task which few archives spare the time to perform.
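The 15-second target at 28.8 kbit/s implies a budget of roughly 54 KB per file. The arithmetic behind splitting can be sketched as follows; the function names and the per-page size estimate are my own illustrative assumptions, not the Qb's actual tooling:

```python
# Sketch of the download-budget arithmetic behind splitting large PDFs.
# The 28.8 kbit/s modem rate and 15-second target come from the article;
# everything else here is a hypothetical illustration.

def max_bytes(link_kbits_per_sec: float = 28.8, seconds: float = 15.0) -> int:
    """Largest file (in bytes) that downloads within the time budget."""
    return int(link_kbits_per_sec * 1000 / 8 * seconds)

def split_points(page_bytes: list, budget: int) -> list:
    """Group consecutive page sizes into chunks that each fit the budget."""
    chunks, current, size = [], [], 0
    for page, b in enumerate(page_bytes):
        if current and size + b > budget:
            chunks.append(current)
            current, size = [], 0
        current.append(page)
        size += b
    if current:
        chunks.append(current)
    return chunks

budget = max_bytes()          # 28.8 kbit/s for 15 s = 54,000 bytes
pages = [9000] * 20           # a hypothetical 20-page, 180 KB questionnaire
print(budget)                 # 54000
print(len(split_points(pages, budget)))  # 4 chunks: 6, 6, 6 and 2 pages
```

The cross-linking between the resulting chunks, of course, remains the manual part.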
Links to the Question Bank Information Space
Supposedly the confidence of a website is measured by its willingness to send viewers off to another place. This is the raison d'être of well-conceived gateways.
There are moves to try to delineate structure from the morass of the WWW, even attempts to build intelligent search agents able to rank the worth of a resource automatically, using a mechanical process analogous to that used in human-mediated ranking (e.g. Yahoo! and AltaVista are really sets of lists collected and grouped by their content and utility). Such schemas would divide useful resources into one of two types:
If link density is to be an important criterion for visibility on the Internet the designer must decide how many, AND EXACTLY WHICH, links to collect in their website. Will users click and rush off, never to return? Will they understand that they have left at all?
The Question Bank team does not pretend to be producing a Gateway, but we repeatedly discover some new resource or centre that we knew nothing about. It is not uncommon, whilst teaching or demonstrating, to find that the structure of the survey Information Space (meaning a kind of sub-domain of the WWW) is poorly understood. Furthermore, we are repeatedly asked if we archive datasets.
Clearly the aim of the Question Bank is to become an 'authority', but since so many users ask to be redirected there is an inescapable trend toward becoming a 'hub' as well. The decision was taken early that this is almost a separate aspect of web resource provision, and the Question Bank employs a part-time researcher at the University of Surrey to collate hyperlinks for us. Currently the results of her work are seen in the links and bibliography sections of each of the Qb topic regions.
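The 'authority'/'hub' distinction echoes link-analysis schemes such as Kleinberg's HITS algorithm, in which the two scores reinforce each other: a page is a good authority if good hubs point at it, and a good hub if it points at good authorities. A toy sketch over an invented three-page link graph (not, of course, part of the Qb itself):

```python
# Toy sketch of hub and authority scoring over a small, made-up link
# graph, in the spirit of Kleinberg's HITS algorithm.

def hits(links: dict, iterations: int = 20):
    """links maps each page to the list of pages it points at."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A page is a good authority if good hubs point at it...
        auth = {p: sum(hub[q] for q in links if p in links[q]) for p in pages}
        # ...and a good hub if it points at good authorities.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalise so the scores do not blow up.
        for scores in (auth, hub):
            total = sum(scores.values()) or 1.0
            for p in scores:
                scores[p] /= total
    return hub, auth

graph = {"gateway": ["archive", "qb"], "qb": ["archive"], "archive": []}
hub, auth = hits(graph)
print(max(auth, key=auth.get))  # archive: the page everyone links to
print(max(hub, key=hub.get))    # gateway: the page linking to good pages
```

A resource like the Qb scores on both counts: it holds material of its own and points outward.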
We are also constructing an InfoSpace section in the Qb site. This is a diagrammatic representation of the data archives, government agencies, survey-associated sites (some surveys have their own website) and related research and co-ordination bodies that concern themselves with survey design and construction. The Question Bank is very easy to step out of.
On a technical level, if you run a framed site, do you allow a hyperlink to replace your site in the browser window, or do you launch the remote website in a separate, new window? Current feeling concurs that confidence will out, and that you should always launch into THE SAME window, allowing users to carry on using the back button, which gives them control over their mental map of their individual session history. This is generally the course chosen in the Question Bank.
Finding stuff is what it's all about on the web. Other than making as much material as possible visible to external robots (general search engines), any site of any size really needs its own search engine. Important too is a structure and site design of minimum necessary complexity. Although many websites are very small, their interfaces are often very busy. With a larger resource a more sedate approach is essential, and here the accepted standards of the web, the general user paradigm, are best followed.
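What a site search engine does at heart can be sketched as an inverted index mapping words to the documents that contain them. The filenames and text below are invented for illustration; the Qb's actual search engine is a separate piece of software:

```python
# Minimal sketch of the inverted index at the heart of a site search
# engine. The document names and contents are hypothetical examples.
import re
from collections import defaultdict

def build_index(docs: dict) -> dict:
    """Map each lower-cased word to the set of documents containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(name)
    return index

def search(index: dict, query: str) -> set:
    """Return documents containing every word of the query (AND search)."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

docs = {
    "lfs.html": "Labour Force Survey questionnaire 1998",
    "hse.html": "Health Survey for England questionnaire",
    "help.html": "How to search the Question Bank",
}
index = build_index(docs)
print(sorted(search(index, "survey questionnaire")))
# ['hse.html', 'lfs.html']
```

Real engines add ranking, stemming and phrase matching, but the index structure is the same.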
In the Qb technological frills have been kept to a minimum, and where clever stuff does exist, it is generally completely invisible to the user. As mentioned, the largest single aid in finding material is actually the frameset, which contains buttons linking to all main areas of the site, visible at all times. A drawback is that no individual document can simply be bookmarked (but try IE5, it does it!).
The structure of the site is arranged around the unit of interest, The Survey, and the unit of description, The Topic, with subsidiary or complementary areas such as the CAPI section, the author list, help pages, contact forms, etc. clearly indicated as autonomous units. Beneath this apparent structuration, however, there is multiple cross-linking and as much sharing of material as possible.
The Question Bank has been designed to help users by offering multiple routes and methods to find relevant material. There are however three main search strategies:
As stated earlier, feedback is the single most crucial tool in website development. This has been the second least impressive area of performance in our record (the worst being the commissioning of topic material for the Question Bank; see later in this article). Although there are multiple methods for online feedback from the Qb site, response is very poor, even though the access logs indicate an impressive hit rate. It would seem that the web user is essentially a downloading machine, that invitations to contribute are almost always ignored, and that the provider of any resource would do well to build face-to-face feedback opportunities into any development programme.
The Question Bank resource is under continuous and intensive development. We invite queries, complaints, suggestions for inclusion, or any other form of feedback, positive or negative. We have constructed several ways for users to comment on the site or its content, from quick observations to a considered critique:
The popular image of the social survey interviewer is that of someone carrying a clipboard or folder with a paper and pencil (PAPI) questionnaire that they complete in writing. For most of the surveys listed in the Survey section of the Question Bank, however, this has been replaced during the last decade by the interviewer carrying a portable computer on which the questionnaire resides as a program for Computer Assisted Personal Interviewing (CAPI). This represents a major technical advance in the survey process, but also poses a challenge to the professional survey researcher to make the CAPI interview intelligible to the lay person.
The Question Bank makes available on our site the version of the survey questionnaire published by the survey organisation producing the survey. Sometimes this is quite similar in appearance to a paper questionnaire. In the case of more complex surveys, for example the Family Resources Survey or the Health Survey for England, it bears less resemblance to a paper questionnaire, and has alternating sections, showing respectively the actual question wording and the routing followed through the questionnaire for different respondents according to the way in which they have answered earlier questions.
Both the National Centre, with support from the ESRC Research Programme into the Analysis of Large and Complex Datasets (ALCDS), and ONS, with the TADEQ project, are attempting to build software tools that generate and present CAPI documentation in ways that users of different experience levels and with different requirements can use.
The Question Bank team is grappling with problems of making the way CAPI surveys work in the field as transparent as possible to Question Bank users, and is writing material to explain some of the features of these new styles of questionnaire for our site. This will become an increasingly important issue for the survey researcher as we move into the twenty-first century.
Metadata - Data about data
The proliferation of Internet repositories, each with its own navigational, location, storage and indexing systems means that the user often has to learn new skills each time a resource is discovered. Finding relevant material can become extremely time-consuming, institutional resources are wasted in duplication and crucial data is often hidden within a plethora of proprietary databases, accessible only through idiosyncratic gateways and invisible to general web searching tools.
In recognition of this problem many initiatives are under way to make data and documents visible across a variety of processes, gateways and search tools. Metadata, data about data, are tags, descriptions or indexes attached to resource elements, intended to unify or standardise the key attributes of information objects. Metadata conventions will allow researchers using diverse tools to classify, store, access and retrieve key information using the web as the common platform. Competition is intense, and competing claims to have produced the standard system are many.
Of current interest are the NESSTAR project, which aims to enable users to search for relevant data across several countries in one action, and the Dublin Core Metadata Initiative, which recommends a 15-element metadata set for describing Web resources. The World Wide Web Consortium (W3C) is promoting the adoption of Extensible Markup Language (XML) to enable archivists to 'wrap' documents in metadata envelopes, thus improving their visibility to the next wave of XML-capable search engines. The W3C is also involved in the development of the Resource Description Framework (RDF), a foundation for processing metadata that will allow machines (worms and intelligent agents) to 'understand' rather than just 'read' documents.
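To make the idea of a metadata 'envelope' concrete, here is a sketch of a Dublin Core description for a hypothetical Question Bank document, emitted as XML. The element names (title, creator, date, format, language and so on) are Dublin Core's own; the values and the surrounding markup are illustrative assumptions, not the Qb's actual scheme:

```python
# Sketch of a Dublin Core metadata record emitted as XML. The element
# set is Dublin Core's; the record contents are hypothetical.
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)

def dc_record(fields: dict) -> str:
    """Wrap a dictionary of Dublin Core fields in a small XML envelope."""
    root = ET.Element("metadata")
    for element, value in fields.items():
        child = ET.SubElement(root, f"{{{DC}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")

record = dc_record({
    "title": "Labour Force Survey questionnaire",   # hypothetical entry
    "creator": "Office for National Statistics",
    "date": "1998",
    "format": "application/pdf",
    "language": "en",
})
print(record)
```

A record like this is exactly the kind of label that an XML-capable search engine or RDF agent could harvest without needing to understand the document inside.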
The Question Bank team is actively researching these developments and is in contact with those involved in key initiatives in the fields of social surveys and sociological resource provision. Documents held in the Question Bank are consistently labelled, and since the resource is semi-structured and visible on the web, rather than hidden within a database architecture, it could easily be assimilated by any, or several, of these schemes.
In developing the Question Bank we primarily aim to help:
I have already said a lot about how indexing can lead to an inflexible, brittle and user-unfriendly organisational schema. It is also a load of work. Users want lists, though, and in recognition of this we have begun to develop keyword indexes and to extract question examples for the topic and other subject-led pages. This is a slow process and essentially relies upon the contribution and time of experts, a rare commodity.
Research teams drawn from professional survey organisations and survey-sponsoring organisations have created the questionnaire facsimiles available in the Question Bank. Wherever possible, the details of the responsible survey organisation are given. These organisations support the idea that other researchers should be able to copy questions and use them in their own surveys. Material in the Question Bank is reproduced by special permission of the copyright holders of the published documents in which the material appeared.
Question Bank staff have taken great care to reproduce the questionnaire instruments accurately, but the originators of the questionnaires are the authoritative source of knowledge about the questions and their development. Neither the originators of the questions nor the Centre for Applied Social Surveys can take any responsibility for use of the questions by others, or for providing advice to individuals on the design and use of questions.
The Question Bank contains questionnaire examples from the following surveys or survey groups:
Coming soon are:
For continuous surveys the aim is to hold copies of versions of the survey fielded since 1991 and, for both continuous and one-off surveys, to display all questionnaires (e.g. household, individual and proxy schedules) for the appropriate year. Surveys are being added to the Question Bank continuously. A number of academic surveys with national coverage on particular topics will be added during 1999.
If you would like to nominate a survey to add, please email the Question Bank at:
The criteria we have used in selecting surveys are:
For some questions special tests have been done, over and above standard piloting, to check that different respondents understand them in the same way and that the answers obtained are sufficiently valid, accurate and statistically reliable for their purpose. Such tests may involve, for example, cognitive testing of the way in which the questions are answered and controlled empirical comparison of the results obtained by different question forms or special validity checks. These methods require extra time, trouble and expense and are the exception, rather than the rule. Nevertheless the questions used in the Census of Population, for example, have been subject to very extensive formal testing of this kind and there have been similar question testing and evaluation programmes in certain other areas. The Question Bank contains some references to the results of question testing, where available.
Questions contained in the Question Bank are likely, on the whole, to perform better as means of collecting quantitative information for particular purposes than questions which someone coming fresh to a survey topic, without previous question drafting experience, might invent for themselves. However, there can be no such thing as the ideal question on a given topic for every application, only questions which are good relative to the purpose for which they were intended and within the constraints of a particular data collection situation. Users should read the commentary on the various approaches to questionnaire design illustrated in the example questionnaires.
The area in which the Question Bank, or any medium-sized resource, should be strongest is its meta-narrative. Material that elucidates the purpose and theoretical underpinning of a resource, including extensive illustration by example, is the bedrock of good website design. It is precisely to this type of online writing that the web, with its ability to hyperlink, most lends itself. Yet the topic area has been the hardest section of the Question Bank to populate.
The reasons are threefold:
Those planning on creating a web resource should think carefully about where material will come from and even more critically about WHO will do the work to supply copy with which to populate the architectures in their minds.
In principle, the Question Bank aims to cover all topics of interest to social science that can be studied using the standardised quantitative social survey method. The potential range is very wide. In order to give some structure and make it easier for Question Bank users to find what they are looking for, we have provided a broad listing of 21 topic areas.
Currently 14 areas contain diverse material, often links to related websites outside the Qb, essays on key variable definition, or bibliographic lists:
Soon to be linked are 7 more topics:
In addition to the questionnaire material and the excerpts from other published documents bearing on particular topics, the Question Bank aims to contain specially written critical and explanatory commentary. The commentary is intended to help users to understand the conceptual structure of each topic area and the way it is reflected in the structuring of questionnaires. It may also make users aware of other concepts and questioning approaches that may be closely related to the ones that they had in mind when accessing the Question Bank. The commentary includes discussion of any available objective evidence on the validity, reliability etc of the measures produced by questions used.
For each topic area, we aim to provide a summary account of the main concepts involved and of current approaches to measuring them using survey questionnaire methods. We aim, where possible, to directly link the commentary to relevant examples of sections of questionnaires in the Question Bank surveys. In addition, we provide bibliographic references to relevant social science research literature on the topic, and to other Internet sites containing salient information.
An editorial board has been set up to monitor the quality of material in Question Bank, and to commission commentary from experts in their subject. The members of the Editorial Board are:
Experts in particular topics are currently writing and reviewing material for the Question Bank. A new policy decision is that all material written for the ‘topics and areas’ will be attributed to a named individual, so that the source of material may be clearly identified. This will further the aim of creating within the Question Bank site an electronic encyclopaedia about survey research methodology.
Authors can be found through a list on the website, and their articles accessed directly from there.
I don't intend to go into long descriptions of how to actually use the Qb site in this article; there are plenty of help pages on the site itself. I would, however, like to point out the growing teaching area, which makes available presentations that we have used to explain the purpose and use of the site, and contains downloadable full-colour help-sheets and exercises. The aim is that teachers of research methods or survey methodology could integrate use of the Question Bank into their course schedule. As usual, feedback would be nice.
Roger Thomas is the Director of CASS. He is a senior member of the Survey Methods Centre at the National Centre for Social Research.
Martin Bulmer is the Academic Director of the Question Bank. He is Foundation Fund Professor of Sociology at the University of Surrey and Associate Director of its Institute of Social Research.
Adam Guy is Manager of the CASS Question Bank in the Survey Methods Centre, The National Centre for Social Research.
Tom Johnson is responsible for the quality of material in the Question Bank and he also works in the Survey Methods Centre, The National Centre for Social Research.
Christina Silver is a researcher in the Department of Sociology at the University of Surrey and has been responsible for finding material related to Qb topics on the Internet.
Stuart Peters, based in the Department of Sociology at the University of Surrey, produced Sociological Research Online for three years and is now involved in EPRESS, a project aimed at helping others to publish journals online. He helps maintain the Question Bank search engine.
Before working on the Question Bank Adam managed the database, project led the website team and introduced an Intranet for a management consultancy. He holds an MSc in Social Anthropology from University College London. He has one daughter and another on the way.
The CASS Web address (URL):