'Does He Take Sugar?': The Risks of Standardising Easy-to-read Language
Brian Kelly, Dominik Lukeš and Alistair McNaught highlight the risks of attempting to standardise easy-to-read language for online resources.
The idea that there would be less misunderstanding among people if only we could improve how we communicate is as old as the hills. Historically, this notion has been expressed through school reform, spelling reform, the publication of communication manuals, and similar efforts. The most radical expression of the desire for better understanding is the invention of a whole new artificial language intended to serve as a universal language for humanity. This has a long tradition but seemed to gain most traction towards the end of the nineteenth century with the introduction and relative success, at that time, of Esperanto.
But artificial languages have been a failure as a vehicle of global understanding. Instead, in about the last 50 years, the movement for plain English has been taking the place of constructed languages as something on which people pinned their hopes for clear communication.
Most recently there have been proposals suggesting that “simple” language should become a part of a standard for accessibility of Web pages alongside other accessibility standards issued by the W3C/WAI (Web Accessibility Initiative). The W3C/WAI Research and Development Working Group (RDWG) is hosting an online symposium on “Easy-to-Read” (e2r) language in Web Pages/Applications (e2r Web). This article highlights the risks of seeking to develop standards when the complexities of language and understanding are not fully understood.
WAI’s Easy to Read Activity
The WAI’s Easy to Read activity page provides an introduction to their work:
Providing information in a way that can be understood by the majority of users is an essential aspect of accessibility for people with disabilities. This includes rules, guidelines, and recommendations for authoring text, structuring information, enriching content with images and multimedia and designing layout to meet these requirements.
and goes on to describe how:
Easy to Read today is first of all driven by day to day practice of translating information (on demand). More research is needed to better understand the needs of the users, to analyze and compare the different approaches, to come to a common definition, and to propose a way forward in providing more comprehensive access to language on the Web.
It provides a list of potentially useful tools and methods for measuring readability: Flesch Reading Ease; Flesch-Kincaid Grade Level; Gunning Fog Index; Wiener Sachtextformel and Simple Measure Of Gobbledygook (SMOG).
The aim of this work is to address the needs of people with disabilities such as:
- People with cognitive disabilities related to functionalities such as:
  - Problem solving (conceptualising, planning, sequencing, reasoning and judging thoughts and actions)
  - Attention (e.g. Attention deficit hyperactivity disorder – ADHD) and awareness
  - Reading, linguistic, and verbal comprehension (e.g. Dyslexia)
  - Visual comprehension
- Mental health disabilities
- People with low language skills including people who are not fluent in a language
- Hearing Impaired and Deaf People for whom “incidental accretion” of vocabulary via overheard conversations and background media is much more limited. Sign language users may regard written English as a second language with an entirely different syntax and grammar.
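The readability measures listed on the WAI page are surface formulas over sentence, word and syllable counts. As an illustration only, the following minimal Python sketch computes two of them, Flesch Reading Ease and Flesch-Kincaid Grade Level, using their published coefficients. The function names and the vowel-group syllable counter are assumptions made for this sketch (real tools typically rely on pronunciation dictionaries), so the scores it produces are approximate.

```python
import re
from typing import Tuple


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of vowels (a heuristic assumed for this sketch)."""
    lower = word.lower()
    count = len(re.findall(r"[aeiouy]+", lower))
    # Treat a trailing silent "e" as non-syllabic (e.g. "take"), but keep "-le" (e.g. "table").
    if lower.endswith("e") and not lower.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> Tuple[int, int, int]:
    """Return (sentence count, word count, syllable count) for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate text that is easier to read."""
    sentences, words, syllables = text_stats(text)
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level: an approximate US school grade."""
    sentences, words, syllables = text_stats(text)
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
```

With this sketch, “John hit the ball.” scores slightly higher on Flesch Reading Ease than “The ball was hit by John.”, purely because the passive sentence contains more words. The formulas see only counts; they know nothing of who the reader is or what the context requires.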
Previous Work in This Area
The W3C WAI have previously sought to address this area. In March 2004 a draft of the WCAG 2.0 guidelines for Web accessibility provided the following guideline:
Guideline 3.1 Ensure that the meaning of content can be determined.
and went on to describe level 3 success criteria which could demonstrate that this guideline had been achieved:
- Using the simplest sentence forms consistent with the purpose of the content
- For example, the simplest sentence-form for English consists of Subject-Verb-Object, as in John hit the ball or The Web site conforms to WCAG 2.0.
- Using bulleted or numbered lists instead of paragraphs that contain long series of words or phrases separated by commas.
Nouns, noun-phrases, and pronouns
- Using single nouns or short noun-phrases.
- Making clear pronoun references and references to earlier points in the document
If that version of the WCAG guidelines had been implemented, and had you required your Web site to conform with WCAG Level 3, you would have had to ensure that you avoided complex sentences, such as those with a subordinate clause.
Conformance with Level 3 guidelines was intended to ensure Web resources are “accessible to more people with all or particular types of disability”. The guidelines explained how “A conformance claim of ‘WCAG 2.0 AAA’ can be made if all level 1, level 2, and all level 3 success criteria for all guidelines have been met.”
Such guidelines would be helpful for people with cognitive disabilities: those with Asperger’s syndrome, for example, may find it difficult to understand metaphors such as “It’s raining cats and dogs”. The guidelines seem to have been developed by those who wished to implement the vision of “universal accessibility”. But we can see that seeking to address accessibility in this fashion is flawed for reasons explained below.
Accessible or Patronising?
The early draft of WCAG 2.0 guidelines suggested that “John hit the ball” conformed with the goal of ensuring that the meaning of the content can be determined. Would WCAG 2.0 checking tools flag the passive formulation of “the ball was hit by John” as an accessibility error, meaning that the Web page could not achieve the highest accessibility rating? And what about the well-known UK sports headline: “Super Caley Go Ballistic Celtic Are Atrocious” – a headline which brings a smile if Mary Poppins was part of your cultural background and you recognise Celtic as a football team, but which is clearly not universally accessible.
Mandating reading levels as an accessibility requirement could be regarded as patronising, since people with disabilities are no more coherent a group than people without disabilities. Matching the level, style and character of information to the intended user is simply good practice. Understanding your audience and recognising the range of people with different disabilities who are likely users of your Web site is also basic good practice. However, conflating ‘disabled people’ with ‘won’t understand very much’ is wrong, as has been highlighted in the past by the title of the former BBC Radio 4 programme “Does He Take Sugar?”. Equally inadvisable is applying the same rules to all contexts.
The key to the Easy Read guidance is in the phrase “Providing information in a way that can be understood by the majority of users”. Shopping, ticketing and core government services are contexts where it is entirely appropriate to make the reading level as accessible as possible to everyone. But there are many other contexts with very different user profiles. It is vital to distinguish between content, culture and context – nuances that can be easily perceived by a human reader but much less reliably tested by an algorithm. This is particularly so in the context of education where clear communication is only one of several desirable outcomes.
Other, potentially contradictory, outcomes might include:
- Linguistic precision: BBC Bitesize has terms that are meaningless to many people (preposition, perfect continuous etc.) but essential in the context of language learning. This site has a far higher reading age than most of the UK daily newspapers. That does not make it inaccessible or inappropriate.
- Technical terms: this overlaps with linguistic precision but applies more widely. Here the accessibility argument can be reversed. Confident use of technical terms streamlines communication and enhances meaning. “Convection currents in the asthenosphere move tectonic plates together at subduction zones” means little if you are not studying geology but if you are then it is a very concise way of saying some quite complicated things. Many users benefit from shorter sentences – especially dyslexic users and screenreader users.
- Metaphor, simile and cultural references: communication is not primarily about words but about ideas. There are contexts where ideas can be more powerfully expressed by metaphors than by mechanical statements of fact or opinion. Hildegard’s “Thus am I a feather on the breath of God” could be expressed in more functional, dispassionate prose, but it would be neither Hildegard, nor poetry, nor very memorable.
- Resources aimed at particular subcultures need to adopt appropriate language if they are to communicate with the target group.
As highlighted by JISC TechDis, accessibility in education does not lend itself to a blanket solution but to a more holistic approach: “the only way to judge the accessibility .. is to assess it holistically and not judge it by a single method of delivery”. Accessibility compliance will become meaningless if it does not reflect the real needs of real people in real contexts.
Why Is Language Inaccessible?
The problem is that most proponents of plain language (as so many would-be reformers of human communication) seem to be ignorant of the wider context in which language functions. There is much that has been revealed by linguistic research in the last century or so and in particular since the 1960s that we need to pay attention to (to avoid confusion, this does not refer to the work of Noam Chomsky and his followers but rather to the work of people like William Labov, Michael Halliday, and many others).
Languages are not a simple matter of grammar. Any proposal for content accessibility must consider what is known about language from the fields of pragmatics, sociolinguistics, and cognitive linguistics. These are the key aspects of what we know about language collected from across many fields of linguistic inquiry:
- Every sentence communicates much more than just its basic content (propositional meaning). We also communicate our desires and beliefs: for example, “It’s cold here” may communicate “Close the window”, and “John denied that he cheats on his taxes” communicates that somebody accused John of cheating on his taxes. Similarly, choosing a particular form of speech, like slang or jargon, communicates belonging to a community of practice.
- The understanding of any utterance is always dependent on a complex network of knowledge about language, about the world, as well as about the context of the utterance. “China denied involvement” requires the understanding of the context in which countries operate, as well as metonymy, as well as the grammar and vocabulary. Consider the knowledge we need to possess to interpret “In 1939, the world exploded” vs. “In Star Wars, a world exploded”.
- There is no such thing as purely literal language. All language is to some degree figurative. “Between 3 and 4pm”, “Out of sight”, “In deep trouble”, “An argument flared up”, “Deliver a service”, “You are my rock”, “Access for all” are all figurative to different degrees.
- We all speak more than one variety of our language: formal/informal, school/friends/family, written/spoken, etc. Each of these varieties has its own code. For instance, “she wanted to learn” vs. “her desire to learn” demonstrates a common difference between spoken and written English where written English often uses clauses built around nouns.
- We constantly switch between different codes (sometimes even within a single utterance).
- Bilingualism is the norm in language knowledge, not the exception. About half the world’s population regularly speaks more than one language but everybody is “bi-lingual” in the sense that they deal with multiple codes.
- The “standard” or “correct” English is just one of the many dialects, not English itself.
- The difference between a language and a dialect is just as much political as linguistic. An old joke in linguistics goes: “A language is a dialect with an army and a navy”.
- Language prescription and requirements of language purity (incl. simple language) are as much political statements as linguistic or cognitive ones. All language use is related to power relationships.
- Simplified languages develop their own complexities if used by a real community through a process known as creolization. (This process is well described for pidgins but not as well for artificial languages.)
- All languages are full of redundancy, polysemy and homonymy. It is the context and our knowledge of what is to be expected that makes it easy to figure out the right meaning.
- There is no straightforward relationship between grammatical features and language obfuscation or lack of clarity (e.g. it is just as easy to hide things using the active voice as the passive, or using a Subject-Verb-Object sentence as an Object-Subject-Verb one).
- It is difficult to call any one feature of a language universally simple (for instance, SVO word order or no morphology) because many other languages use what we call complex as the default without any increase in difficulty for the native speakers (e.g. use of verb prefixes/particles in English and German).
- Language is not really organised into sentences but into texts. Texts have internal organisation to hang together formally (“John likes coffee. He likes it a lot.”) and semantically (“As I said about John. He likes coffee.”) Texts also relate to external contexts (cross reference) and their situations. This relationship is both implicit and explicit in the text. The shorter the text, the more context it needs for interpretation. For instance, if all we see is “He likes it” written on a piece of paper, we do not have enough context to interpret the meaning.
- Language is not used uniformly. Some parts of language are used more frequently than others. But it is not enough to understand frequency. Some parts of language are used more frequently together than others. The frequent co-occurrence of some words with other words is called “collocation”. This means that when we say “bread and …”, we can predict that the next word will be “butter”. You can check this with a linguistic tool like a corpus, or even by using Google’s predictions in the search. Some words are so strongly collocated with other words that their meaning is “tinged” by those other words (this is called semantic prosody). For example, “set in” has a negative connotation because of its collocation with “rot”.
- All language is idiomatic to some degree. You cannot determine the meaning of all sentences just by understanding the meanings of all their component parts and the rules for putting them together. And vice versa, you cannot just take all the words and rules in a language, apply them and get meaningful sentences. Consider “I will not put the picture up with John”, “I will not put up the picture with John”, “I will not put up John” and “I will not put up with John”.
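The point about collocation in the list above can be made concrete with a very small corpus. The following Python sketch is an illustration under stated assumptions, not a description of any particular corpus tool: the toy corpus, the function names, and the use of pointwise mutual information (PMI) as the association measure are all choices made for this example. For simplicity it also ignores sentence boundaries when pairing adjacent words.

```python
import math
import re
from collections import Counter
from typing import List, Sequence

# Toy corpus invented for this sketch; a real study would use a large reference corpus.
CORPUS = (
    "She ate bread and butter. He bought bread and butter. "
    "They sold bread and jam. The rot set in. Panic set in."
)


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def next_word_counts(tokens: Sequence[str], context: Sequence[str]) -> Counter:
    """Count the words that immediately follow the given context words."""
    n = len(context)
    following = Counter()
    for i in range(len(tokens) - n):
        if tuple(tokens[i:i + n]) == tuple(context):
            following[tokens[i + n]] += 1
    return following


def pmi(tokens: Sequence[str], first: str, second: str) -> float:
    """Pointwise mutual information of the adjacent pair (first, second).

    Positive values mean the pair occurs together more often than the
    individual word frequencies alone would predict.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    joint = bigrams[(first, second)]
    if joint == 0:
        return float("-inf")  # the pair never occurs in this corpus
    p_joint = joint / max(len(tokens) - 1, 1)
    p_first = unigrams[first] / len(tokens)
    p_second = unigrams[second] / len(tokens)
    return math.log2(p_joint / (p_first * p_second))


TOKENS = tokenize(CORPUS)
```

In this toy corpus, `next_word_counts(TOKENS, ("bread", "and"))` ranks “butter” ahead of “jam”, and `pmi(TOKENS, "set", "in")` is positive. Neither fact is visible to measures based only on sentence and word length.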
It seems that many advocates of plain language do not take most of these factors into account.
An Alternative Approach
The Need for Other Approaches
Do the concerns highlighted above mean that we should give up on trying to make communication more accessible? Definitely not! Rather, as outlined above, there is a need to understand the complexities of language and the limitations of the WAI model which attempts to shoe-horn best practice into a single set of universal guidelines. The challenge lies in providing the contextualisation needed to be able to respond to a diverse range of requirements.
BS 8878 Can Provide the Required Contextualisation
The need for contextual approaches which enable the complexities of Web accessibility to be addressed was highlighted by Sloan et al. A follow-up paper emphasised the importance of policies and processes and coined the term “Accessibility 2.0” to describe such approaches. The paper defined the characteristics of Accessibility 2.0 as:
User-focussed: As with Web 2.0, the emphasis is on the needs of the user. Accessibility 2.0 aims to address the needs of the user rather than compliance with guidelines.
Widening participation rather than universal accessibility: The approach taken to Web accessibility is based on widening participation rather than on a belief in universal access.
Rich set of stakeholders: In contrast with traditional approaches to Web accessibility, which place an emphasis on the author of Web resources and, to a lesser extent, the end-user, Accessibility 2.0 explicitly acknowledges the necessity of engaging with a wider range of stakeholders.
Sustainability: Accessibility 2.0 emphasises the need for the sustainability of accessible services.
Always beta: There is an awareness that a finished perfect solution is not available; and that, rather, the process will be one of ongoing refinement and development.
Flexibility: A good-enough solution will be preferred to the vision of a perfect technical solution.
Diversity: Recognition that there can be a diversity of solutions to the problem of providing accessible services.
Social model for accessibility: Rather than regarding Web accessibility based on a medical model, Accessibility 2.0 adopts a social model.
Devolved, not hierarchical: Solutions to Web accessibility should be determined within the specific context of use, rather than advocating a global solution.
Emphasis on policy, rather than technical, solutions: Although there are technical aspects related to Web accessibility, Accessibility 2.0 tends to focus on the policy aspects.
Blended, aggregated solutions: Users want solutions and services, but these need not necessarily be a single solution; nor need the solution be purely an IT solution.
Accessibility as a bazaar, not a cathedral: The Cathedral and the Bazaar analogy can be used to compare Accessibility 1.0 and 2.0. The WAI approach is based on complex sets of guidelines which are difficult to understand. This results in developments which are slow-moving in responding to rapid technological change.
Accessibility as a journey, rather than a destination: Rather than regarding Web accessibility as something that is solved by providing AAA compliance, Accessibility 2.0 regards accessibility as a never-ending journey, in which richer solutions could always be provided.
Decision-making by broad consensus: Decisions on the extent to which accessibility is supported are determined by a broad consensus as to what is reasonable, rather than by WAI’s definitions.
As summarised by Hassell, the UK’s BS 8878 Code of Practice subsequently provided a framework which addressed these approaches. The heart of the BS 8878 document is a 16-step plan:
1. Define the purpose
2. Define the target audience
3. Analyse the needs of the target audience (note this was not covered in PAS 78)
4. Note any platform or technology preferences
5. Define the relationship the product will have with its target audience
6. Define the user goals and tasks
7. Consider the degree of user experience the Web product will aim to provide
8. Consider inclusive design and user-personalised approaches to accessibility
9. Choose the delivery platform to support
10. Choose the target browsers, operating systems and assistive technologies to support
11. Choose whether to create or procure the Web product
12. Define the Web technologies to be used in the Web product
13. Use Web guidelines to direct accessibility Web production (i.e. to guide the production of accessible Web content)
14. Assure the Web product’s accessibility through production (i.e. at all stages)
15. Communicate the Web product’s accessibility decisions at launch
16. Plan to assure accessibility in all post-launch updates to the product
It should be noted that step 13 is the only one which is directly relevant to WCAG guidelines. The BS 8878 standard provides a more comprehensive framework which enables the complexities of Web accessibility to be addressed. We recommend use of BS 8878 by those working in the UK and suggest that BS 8878 would be an appropriate building block for the development of an international standard.
This article summarises previous work in seeking to provide guidelines on the writing style for Web content. We have identified dangers of providing recommendations on writing style in future versions of WCAG guidelines. The article offers an alternative route to the adoption of emerging practice, in which content providers can select appropriate guidelines based on processes described in BS 8878. The relevance of BS 8878 in enhancing accessibility of Web products has been described elsewhere and reflects approaches provided in the TechDis Accessibility Passport.
In addition to understanding the relevance of BS 8878 in the context of readability of text, we feel that current research in this area focuses too much on comprehension at the level of clause and sentence. Further research, we suggest, should be carried out to assist comprehension at the level of text, including the following areas.
How collocability influences understanding: How word and phrase frequency influences understanding with particular focus on collocations. The assumption behind software like TextHelp is that this is very important. Much research is available on the importance of these patterns from corpus linguistics but we need to know the practical implications of these properties of language both for text creators and consumers. For instance, should text creators use measures of collocability to judge the ease of reading and comprehension in addition to or instead of arbitrary measures like sentence and word lengths?
Specific ways in which cohesion and coherence affect understanding: We need to find the strategies challenged readers use to make sense of larger chunks of text. How they understand the text as a whole, how they find specific information in the text, how they link individual portions of the text to the whole, and how they infer overall meaning from the significance of the components. We then need to see what text creators can do to assist with these processes. We already have some intuitive tools: bullets, highlighting of important passages, text insets, text structure, etc. But we do not know how they help people with different difficulties and whether they can ever become a hindrance rather than a benefit.
The benefits and downsides of elegant variation for comprehension, enjoyment and memorability: We know that repetition is an important tool for establishing the cohesion of text in English. We also know that repetition is discouraged for stylistic reasons. Repetition is also known to be a feature of immature narratives (children under the age of about 10) and more “sophisticated” ways of constructing texts develop later. However, it is also more powerful in spoken narrative (e.g. folk stories). Research is needed on how challenged readers process repetition and elegant variation and what text creators can do to support any naturally developing meta textual strategies.
The benefits and downsides of figurative language for comprehension by people with different cognitive profiles: There is basic research available from which we know that some cognitive deficits lead to reduced understanding of non-literal language. There is also ample research showing how crucial figurative language is to language in general. However, there seems to be little understanding of how and why different deficits lead to problems with processing figurative language, what kind of figurative language causes difficulties. It is also not clear what types of figurative language are particularly helpful for challenged readers with different cognitive profiles. Work is needed on typology of figurative language and a typology of figurative language deficits.
The processes of code switching during writing and reading: Written and spoken English employ very different codes, in some ways even reminiscent of different language types. This includes much more than just the choice of words. Sentence structure, clauses, grammatical constructions, all of these differ. However, this difference is not just a consequence of the medium of writing. Different genres (styles) within a language may be just as different from one another as writing and speaking. Each of these comes with a special code (or subset of grammar and vocabulary). Few native speakers ever completely acquire the full range of codes available in a language with extensive literacy practices, particularly a language that spans as many speech communities as English. But all speakers acquire several different codes and can switch between them. However, many challenged writers and readers struggle because they cannot switch between the spoken codes they are exposed to through daily interactions and the written codes to which they are often denied access because of a print impairment. Another way of describing this is multiple literacies. How do challenged readers and writers deal with acquiring written codes and how do they deal with code switching?
How do new conventions emerge in the use of simple language? Using and accessing simple language can only be successful if it becomes a separate literacy practice. However, the dissemination and embedding of such practices into daily usage are often accompanied by the establishment of new codes and conventions of communication. These codes can then become typical of a genre of documents. An example of this is Biblish. A sentence such as “Fred spoke unto Joan and Karen” is easily identified as referring to a mode of expression associated with the translation of the Bible. Will similar conventions develop around “plain English” and how? At the same time, it is clear that within each genre or code, there are speakers and writers who can express themselves more clearly than others. Research is needed to establish whether there are common characteristics to be found in these “clear” texts, as opposed to those inherent in “difficult” texts, across genres.
All in all, introducing simple language as a universal accessibility standard is still too far from a realistic prospect. Our intuitive impression based on documents received from different bureaucracies is that the “plain English” campaign has made a difference in how many official documents are presented. But a lot more research (ethnographic as well as cognitive) is necessary before we properly understand the process and its impact.
- Research and Development Working Group (RDWG), W3C Web Accessibility Initiative
- Easy-to-Read on the Web, Online Symposium 3 December 2012, W3C Web Accessibility Initiative http://www.w3.org/WAI/RD/2012/easy-to-read/
- Easy to Read, W3C Web Accessibility Initiative http://www.w3.org/WAI/RD/wiki/Easy_to_Read
- WCAG 2.0, Working Draft 11 March 2004, W3C http://www.w3.org/TR/2004/WD-WCAG20-20040311/
- Bitesize, BBC http://www.bbc.co.uk/bitesize/
- Holistic Approach, JISC TechDis, (no date)
- Kelly, B., Phipps, L. and Howell, C. Implementing a Holistic Approach to E-Learning Accessibility, ALT-C 2005 12th International Conference Research Proceedings http://opus.bath.ac.uk/441/
- Sloan D., Kelly B., Heath A., Petrie H., Hamilton F. & Phipps L. Contextual Accessibility: Maximizing the Benefit of Accessibility Guidelines. Proceedings of the 2006 International Cross-Disciplinary Workshop on Web Accessibility (W4A) Edinburgh, Scotland, 23 May 2006. New York: ACM Press, pp. 121-131 http://opus.bath.ac.uk/402/
- Kelly, B., Sloan, D., Brown, S., Seale, J., Lauke, P., Ball, S. & Smith, S. Accessibility 2.0: Next Steps For Web Accessibility, Journal of Access Services, 6 (1 & 2). DOI: 10.1080/15367960802301028
- Raymond, E.S. The Cathedral and the Bazaar, essay, September 2000, v.3.0
- Hassell, J., BS 8878 web accessibility standards (supersedes PAS 78) – all you need to know http://www.hassellinclusion.com/bs8878
- Cooper, M., Sloan, D., Kelly, B. & Lewthwaite, S. A Challenge to Web Accessibility Metrics and Guidelines: Putting People and Processes First. In: W4A 2012: 9th International Cross-Disciplinary Conference on Web Accessibility, 16-18 April 2012, Lyon. DOI: 10.1145/2207016.2207028 http://opus.bath.ac.uk/29190/
- JISC TechDis Accessibility Passport: Building a Culture of Accessibility, (no date)
Web site: http://www.ukoln.ac.uk/
Accessibility papers: http://www.ukoln.ac.uk/web-focus/papers/#accessibility
Brian Kelly is UK Web Focus at UKOLN, University of Bath. Brian has published a wide range of peer-reviewed papers on Web accessibility since 2004.
Education and Technology Specialist
Web site: http://www.dyslexiaaction.org.uk/
Dominik Lukeš, Education and Technology Specialist at Dyslexia Action, has published research in the areas of language and education policy. He has also worked as editor, translator and Web developer.
Higher Education Academy
Web site: http://www.jisctechdis.ac.uk/
Alistair McNaught is a Senior Advisor at JISC TechDis with considerable experience in using technology for teaching and learning.