Web Magazine for Information Professionals

Book Review: Making Software - What Really Works, and Why We Believe It

While acknowledging the genuine usefulness of much of its content, Emma Tonkin provides helpful pointers towards a second edition.

Published by O'Reilly, as part of the Theory In Practice series, this book is essentially academic in focus. It takes the form of thirty chapters. The first eight of these aim to provide an introduction to the area of software engineering, or more specifically, the collection and use of evidence to support software engineering practices. These initial chapters are satisfyingly broad in scope, covering topics from human factors and personality to complexity metrics and the process of authoring a systematic review.

Evidence-based Software Engineering

Software engineering is a relatively young domain, and an affluent one; IT is big business, as are the various sub-domains that surround it. For some, IT management, development processes, the development (and marketing) of technology and business frameworks intended to support the process or improve its results are areas that represent research opportunities; for others, the potential is more effectively captured in a business plan. Furthermore, software development is a complex process: difficult, expensive, context-sensitive and fragile.

There is no shortage of available advice, and a wealth of (often mutually contradictory) suggestions to sort through. But that advice is seldom backed up with clear and directly applicable evidence. As a consequence, commentators in this work proposed 'evidence-based software engineering …' to '… provide the means by which current best evidence from research can be integrated with practical experience and human values in the decision making process regarding the development and maintenance of software' [1]. This would provide:

Part 1: General Principles of Searching for and Using Evidence

The first part of the book begins with two chapters on the theme of evidence about software engineering. “The Quest for Convincing Evidence” discusses the collection and aggregation of evidence about software engineering. The second chapter, “Credibility, or Why Should I Insist on Being Convinced?”, discusses what makes evidence convincing, and sketches out factors that make some people harder to convince than others. This second chapter sets the scene for much of the book, arguing that the problem of convincing an audience is context-specific: the evidence presented should be tuned to the interests and needs of the audience. Less theory is required, say Menzies and Shull [2]; instead, they call for more repositories of evidence, observations and laws.

Chapter Three discusses the process and practice of completing a systematic review (systematic literature review). The systematic literature review is likely to be more familiar to researchers than the majority of topics in this book, but the process is clearly explained and its strengths and weaknesses identified. This chapter does not appear to represent a core topic for Part 1 of the book.

Chapter Four is an introduction to the use of qualitative methods in software engineering, useful background for the reader.

On reaching Chapter Five, the title alone suggests that the reader is likely to find it harder going, as it contains no fewer than two acronyms that are explained only part way through the body of the piece: “Learning Through Application: The Maturing of the QIP in the SEL”. The QIP is a 'Quality Improvement Paradigm', described in the text as 'a version of the scientific method' (characterise your project; figure out what you want to know/accomplish; figure out what process might work; do it, and see what happens; analyse the data you collected, and store what you learned during the process). The SEL is NASA's Software Engineering Laboratory. This chapter, again, could as easily be placed in Part 2 of the book.

Chapter Six, “Personality, Intelligence and Expertise: Impacts on Software Development”, returns to the goal of providing a good theoretical grounding to the reader. This time, it's all about the personalities: how to recognise a good programmer, and what role personality and intelligence, amongst other factors, play in software development.

“Why Is It So Hard to Learn to Program?”, Chapter Seven, presents available evidence about the challenges faced in learning to program, possible strategies, such as visual programming, that facilitate the process, and the circumstances in which such strategies are most likely to improve matters.

The final chapter of Part 1, “Beyond Lines of Code: Do We Need More Complexity Metrics?” serves as a (detailed) introduction to complexity metrics, their purpose and application, concluding that lines of code correlate closely with the more complex metrics available – but that alternatives to the old metrics can improve the situation further.

Part 2: Specific Topics in Software Engineering

In the introduction to the second, longer section of the book, the editors explain the structure as follows: it is intended to 'apply the ideas in the previous part to several pressing topics in the field of programming and software developments' (p.143). Controversial topics in software development and engineering are to be held to the light, allowing the reader to explore evidence and the extent to which traditional ideas may be trusted.

A bewildering diversity of topics is covered during this part of the book - twenty-two chapters in all. These vary in length from four to over twenty pages. Some chapters are written in the style of scientific papers, presenting complex arguments and evidence in a manner that does little to make the information accessible to the novice reader. Other authors have approached the challenge differently, using accessible structure and familiar examples. If there is ever a chance to develop a second edition, it would be well worth performing a few readability tests, chapter-by-chapter, and revising the more tedious examples of the researcher's art into a more engaging form. This was not intended to be a textbook, but it's a shame that it isn't, because the information is all here – one simply needs to get past the academic structure to find it.
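As an illustration of the chapter-by-chapter readability test suggested above, here is a minimal sketch in Python. It assumes the Flesch Reading Ease formula as the test, and it estimates syllables with a crude vowel-group heuristic; both choices, along with the function names and the sample texts, are assumptions made for this sketch rather than anything taken from the book. Lower scores indicate harder prose, so the chapters most in need of revision appear first.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count runs of vowels (a heuristic, not a dictionary lookup)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores mean easier reading."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def rank_chapters(chapters):
    """Return (title, score) pairs, hardest-to-read chapter first.

    `chapters` is a hypothetical mapping of chapter title to chapter text.
    """
    scores = [(title, flesch_reading_ease(body)) for title, body in chapters.items()]
    return sorted(scores, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Invented sample texts, standing in for real chapter contents.
    sample = {
        "Plain chapter": "The cat sat on the mat. It was warm.",
        "Dense chapter": (
            "Methodological heterogeneity complicates systematic "
            "aggregation of empirical observations."
        ),
    }
    for title, score in rank_chapters(sample):
        print(f"{score:7.1f}  {title}")
```

A score of this kind only measures sentence and word length, so it would be a prompt for editorial attention rather than a verdict on any chapter.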

The second section of the book could benefit from a clearer structure. The chapters do not seem to be placed in any particular order. A more cohesive grouping of topics might make it easier for readers who prefer to make their way linearly through the book. As it stands, readers find themselves drifting between topics as they progress: from automated validation methodologies to the question of upfront architecture versus iterative development, then on to human factors, and from there to a comparison between implementations of a given software specification in various programming languages. Yet as the reader works through the chapters, several topic areas emerge: human factors and the workplace environment, evidence supporting software engineering best practices, benchmarking programmer performance, and so forth.

Topics in This Work

The origins of software flaws are explored in Chapter 25, entitled “Where Do Most Software Flaws Come From?”. Other chapters, which paradoxically appear earlier in the book, explore systems intended to pick out likely locations for faults: “An Automated Fault Prediction System”, and “Evidence-Based Failure Prediction”.

Some topics are deeply tangential to the primary subject matter of the book; one such is the chapter by Whitecraft and Williams (p.221), entitled “Why Aren't More Women in Computer Science?” It is an intriguing question, but nonetheless remains an unusual discussion to feature in a book on software engineering methods.

More obviously relevant are “Code Talkers”, Chapter 16, a study of communication between programmers, and “A Communal Workshop or Doors That Close”, discussing the ideal office layout for developers, working either alone or in a team. Chapter 20, “Identifying and Managing Dependencies in Global Software Development”, discusses the problems inherent in 'Global Software Development', apparently a variation on the familiar terms 'outsourcing' or 'offshoring'.

Several chapters involve direct discussion of software development methodologies, such as Chapter 10, “Architecting: How Much and When?”, Chapter 17, “Pair Programming”, Chapter 18, “Modern Code Review”, Chapter 21, “How Effective Is Modularization?”, Chapter 22, “The Evidence for Design Patterns” and Chapter 12, “How Effective Is Test-Driven Development?”

Chapter 15, “Quality Wars: Open Source Versus Proprietary Software”, seeks differences in quality between open source and proprietary software, but finds very little. Chapter 14, “Two Comparisons of Programming Languages”, performs a similar task by comparing implementations of given algorithms.

There are also chapters on “The Art of Collecting Bug Reports” (Chapter 24), the strengths and weaknesses of novice programmers, “Novice Professionals: Recent Graduates in a First Software Engineering Job”, and data mining - “Mining Your Own Evidence”, Chapter 27. Chapter 28 is a discussion of “Copy-Paste as a Principled Engineering Tool”. Chapter 29, “How Usable Are Your APIs?”, asks how, and to what extent, APIs themselves may sometimes be an obstacle to programmers, whilst the shortest and final chapter, Chapter 30, is a discussion of “Measuring Variations in Programmer Productivity”.


This is a genuinely good book, despite the inconsistency of tone and audience focus. Given the sum of its parts, however, it is not an easy book. It is not an easy linear read, due to the variation in chapter styles, content and difficulty, so it works more effectively as a sourcebook to be dipped into when the need or interest arises. No matter how smoothly written the chapters are, the subjects themselves are often challenging and complex, not least because the chapters are intended to challenge the reader's assumptions. If you have ever wondered about questions like the following:

then this book may be of interest to you. Not every developer myth comes under attack – the reader is spared the vi versus Emacs versus Eclipse debate, for example – but most developers, or people who work with them, will find that some of the points covered in this book are relevant to them.

Be warned, however: the book contains more observations than answers. On the above subjects and many others, it provides enough information to seed a very fruitful discussion. Nevertheless, the conclusions of the vast majority of chapters repeat very similar observations: not enough is known; there is much work to be done; the results are contextually bound and dependent on the audience, development subject, and context of use; and so forth. To return to an observation made in the very earliest chapters of the book: 'convincing evidence motivates change[...]. Generating truly convincing bodies of evidence would […] require a shift in the way results are disseminated. It may well be that scientific publications are not the only technology required to convey evidence'.

As detailed as this book is, it could not hope to contain evidence enough to help all readers to answer their questions. It is a starting point, and a valuable one. The book provides the reader with information such as concepts, resources, and dramatis personae, offering an entry point to the theoretical and practical world of software engineering in its broadest sense.


  1. Barbara A. Kitchenham, Tore Dybå and Magne Jørgensen (2004). Evidence-Based Software Engineering. ICSE 2004, pp. 273-281.
  2. Andy Oram and Greg Wilson (2010). Making Software: What Really Works, and Why We Believe It. p. 11.

Author Details

Emma Tonkin
Research Officer
University of Bath

Email: e.tonkin@ukoln.ac.uk
Web site: http://www.ukoln.ac.uk/