I was recently invited to attend a 'Software Sustainability Workshop', organised by the National Science Foundation (NSF) and hosted by Indiana University at its University Place Conference Center in Indianapolis. The invitation, which included a call for position papers, described the event as follows:
The workshop will focus on identifying strategies to create sustainable models for use, support, maintenance, and long-term sustainability of cyberinfrastructure software that is developed and used by research communities working in areas related to the NSF mission. The implications are expected to be interesting and useful to the science and engineering community. Workshop goals include:
- Examination of current software evaluation and adoption models by labs and virtual organizations.
- Examination of long-term sustainability models; and mechanisms for supporting sustainability via funding organizations, open source, and commercialization.
The event attracted around seventy people, predominantly from the United States with a few international delegates such as myself, and began informally with an evening reception, which I unfortunately missed after a rather problematic journey from the UK.
Brad Wheeler, Vice President for IT, Dean, & Professor, Indiana University suggested that the purpose of the workshop was to look for a "mutual understanding" of the problem outlined in the call for papers. He urged delegates to be "outcome-oriented", recommending that the workshop aim towards "swiftly producing a paper": as he put it, "we know we have a problem, let's start from this point". Brad also invited us to consider what we might be prepared to change, suggesting that we look at this aspect from the points of view of several stakeholders such as funders, developers, Principal Investigators and Virtual Organisations. With, perhaps, a little wryness, Brad insisted that the subject of "tenure" was out of scope of this latter exercise. As a visitor from outside the US I had a vague understanding of the US university system of tenure, but fellow (US) delegates kindly explained it in detail for me over lunch. I sensed that they saw the wisdom of ruling this out of scope as a candidate for change in the proposed discussions.
The first plenary presentation of the day was delivered by Jennifer Schopf of the Office of Cyberinfrastructure (NSF) and was called Sustainable Software as Cyberinfrastructure: Experience and Perspective from the NSF. Jennifer gave a commendably clear introduction to the problem space, offering useful definitions and scope. She defined "sustainability" as the "ability to maintain a certain process or state". In the narrower context of the workshop, she identified two aspects of sustainable software: that it can be reused in "broad contexts", and that it is supported with funding models which "encourage long-term support". If software was to be deployed and supported as infrastructure, Jennifer argued, then the NSF needed to start treating it as it did other infrastructure such as hardware. This would necessitate something of a "culture shift", requiring changes to how software was procured, budgeted for, quality-controlled, maintained, replaced, etc. Jennifer also indicated some fundamental differences between software and other infrastructure, pointing out that software can be deployed for very long periods of time - longer than the average hardware "refresh" cycle of approximately three years. Furthermore, during its life-span a given software deployment will have a tendency to grow in size and complexity.
Jennifer went on to propose that if the NSF were to consider software as infrastructure, then those funded "principal investigators" (PIs) working in research which developed software as an output would also need to treat it as such. Currently, it would seem that development teams in those research domains funded by the NSF do not generally treat the development of software in such terms. Jennifer suggested a number of aspects which might need to be considered, such as reliability, proven quality, security and formal planning. She suggested that development teams should have a degree of "professional engineering" expertise and experience. She concluded this section of the presentation by underlining two points: improving software sustainability was going to require a culture change by both funders and research/development teams. Moreover, it would not be achieved by simply spending more money.
In the next portion of her presentation Jennifer changed tack to consider features and characteristics of successful infrastructure software and to examine in a little more detail "something that worked". Specifically, Jennifer singled out MyProxy, which provides, in her words, "an online credential repository for Grid portals and Globus Toolkit". Pointing to the lasting success of this software, she identified a series of aspects which she considered to have been important, citing the project's close alignment with user requirements and feedback, a commitment to stability at every stage (including backwards-compatibility), a simple, coherent and open design, and substantial long-term support from the National Center for Supercomputing Applications (NCSA).
Drilling down into the issue of close alignment with users and their requirements, Jennifer identified that this is difficult - and that having a member of the user community working closely with the developers is crucial. On the subject of the avoidance of unnecessary complexity, she observed that if developers only manage to attract funding for adding features, then software will tend to become large and increasingly unsupportable. Software which is difficult to understand or support does not thrive. However, if funders should avoid merely funding new features, then they are still confronted with the difficult issues of deciding what actually to fund and, having so decided, devising an effective exit strategy which ensures that successful infrastructure software is sustained.
Introducing the Software Development for Cyberinfrastructure (SDCI) Programme, Jennifer outlined how the NSF planned to tackle the issues of education and provenance in encouraging the development of sustainable software through a "community approach" and the formation of a series of task forces.
A simple survey of US university-level software engineering courses revealed that some tended to be weak in certain areas fundamental to working in a serious production environment, such as team-based collaborative development, working with users, managing releases and issues, as well as operational issues, for example, performance and security. Where other courses of this type were better in these respects, Jennifer nonetheless wondered how many computational scientists would actually take such a course.
On the subject of provenance, Jennifer made the point that the ability to re-execute computational science software code is crucial in providing the basic scientific requirement of reproducibility, implying that software in this area must be carefully curated. However, she claimed that currently, "the majority of computational science applications cannot be run by another researcher, and results cannot be reproduced".
One particular item stood out for me: NASA provide Reuse Readiness Levels (RRLs), which are used to indicate the "readiness" of software products for potential reuse. These, and the framework from which they come, looked to be a very useful resource.
Next up was Brad Wheeler, Vice President for IT, Dean, & Professor, Indiana University. Brad has close involvement with community development initiatives in the Higher Education sector such as Sakai and the Kuali Foundation. He began by asserting that software is essential to cyberinfrastructure - and that in this context the software product is a "troublesome artefact". Through a series of revealing schematics, Brad displayed different views of cyberinfrastructure, outlining the components of such infrastructure from the points of view of both the scholarly community and the campus IT provider. In both cases, a picture emerged of a stack of components, ranging from generic infrastructure through to domain-specific systems, and in each Brad asked where the boundary between generic and domain-specific should lie. Or, to put it another way, at what level in the stack should components be considered part of shared cyberinfrastructure?
Pointing to a "software sustainability chasm", Brad indicated that software development in the context of the workshop is typically funded with research grants, allowing software products to get to the first couple of releases, before funding dries up and development slows to a halt. As an advocate of the community-based open-source approach to software development and sustainability, Brad suggested that the open-source approach is gradually "moving up the stack" from its early successes with relatively low-level infrastructure components, through the development of operating systems, to server software. He pointed to the development of applications as the next layer in this stack, and as a future objective for open source development.
The next part of Brad's talk was centred upon a model for sustainability for software development based on "Code, Coordination and Community". Invoking Eric Raymond's The Cathedral and the Bazaar, a treatise on the contrast between the (then) typical mode of commercial development and the open-source approach, he hinted that there might be a middle way - a compromise between these two radically different approaches.
Beginning with "code" and using the examples of Linux and Apache, Brad explained how it had been demonstrated that source code could be made open and be sustained, giving users the confidence that they would not be "locked into" a closed commercial product for which support could be removed at any point. With open-source code, he argued, a real issue lay in how the code evolved. While the risk of a commercial provider deciding, unilaterally, to remove its support was not present in open-source developments, a new risk of the code "forking" in different directions was introduced. How could one identify the "canonical" release of a code-base with various "branches", for example? Brad suggested that these were questions of "coordination" and "community".
Moving on to "coordination", Brad again raised the examples of Linux and Apache, both of which have shown effective models of coordination. One of the important aspects of these open coordination approaches was that intellectual property (IP) was "unbundled" from support. In some cases, where the IP is owned by the community, coordination is sustained by partnering organisations, with options for additional commercial support for users.
Lastly, Brad pointed to the sustained and global communities which have been successfully cultivated by the Linux and Apache efforts. Claiming that such communities are shaped by the license they use to distribute their software products, he suggested that the "bazaar" was shaped around the GPL licence, which restricts how derivative products can be licensed and used, and that the "cathedral" was shaped by commercial licensing, which protects commercial investment. Somewhere between these two, Brad argued, was a space for what he called "open-open" licensing - a model which limited restrictions on the licensing and use of derivatives.
Neatly returning to his earlier hint of a "middle way" between the "cathedral" and the "bazaar", Brad suggested that a hybrid model, which he termed "community source", was needed. Quoting Wikipedia, he identified the distinguishing characteristic of community source as the fact that "...many of the investments of developers' time, design, and project governance come from institutional contributions by colleges, universities, and some commercial firms rather than from individuals." In terms of the cathedral and bazaar analogy, Brad proposed "The Pub…the Place Between the Cathedral and the Bazaar".
Brad concluded his presentation with "an industry view of the sustainability challenge", in which he iterated through five models of software development, identifying stakeholders and important issues in each. They were laid out, one to a slide, in a very accessible manner - I recommend them if you have a particular interest in this area.
Dennis Gannon, Cloud Computing Futures Centre, Microsoft Research, gave a brief talk about Microsoft's plans in this area. Microsoft favours taking an established "framework", such as Eclipse, and building upon it - this is, in fact, what IBM have done with their WebSphere products. He pointed out that there is still confusion in the research community about what is meant by the term "infrastructure" in this context - with a failure to distinguish between software as infrastructure and software as a research artefact.
Responding to a question about how Microsoft resources projects internally, he characterised a typical project as having 20% of its resources devoted to management, 40% towards development with the remaining 40% reserved for testing.
Next up was Neil Chue Hong, Director, OMII-UK. Neil gave an engaging presentation about the activities of OMII-UK, an open-source distributed organisation based in the Universities of Southampton, Manchester and Edinburgh in the UK.
Neil began by giving a quick overview of the investment in eScience in the UK. Quoting John Taylor, Director General of Research Councils, Office of Science and Technology, Neil pointed out that the development of infrastructure is a core component of the UK's funded eScience initiative. Neil offered a useful breakdown of the sorts of activities which fit into a spectrum of levels of maturity of software development in research, ranging from funding and "heroic research", through "everyday research" and production use, to commercial exploitation.
Detailing some of the software development which OMII-UK has supported, Neil singled out products like Taverna, a widely deployed workflow/job control system, and PAG:AG, "video-conferencing for anyone".
OMII-UK maintains an online support and help-desk service, and actively engages with the research community to solicit requirements and feedback using structured interviewing techniques.
Neil indicated that different parties view the issue of sustainability of software differently. He suggested that computer science researchers do not want to have to deal with sustainability - they want someone else to take over any software they produce - where intellectual property arrangements allow this. Yet for those using software, sustainability is a big issue - such users are uncertain about the future of some of the systems they use, and therefore have reduced confidence in their ability to rely on such software. From Neil's perspective, the issue is about helping software "survive the transition" through stages of development and uptake. He identified these stages, in the context of software developed to support research, using the labels, "idea, prototype, research, supported, product".
To try to address some of these issues, OMII-UK has a Commissioned Software Programme (CSP), which is designed to identify and help manage the development of appropriate software by engaging at a community level - scaling up from one set of users and developers to many sets of each. Neil described this as being "somewhere between venture capital and a foundation". He outlined these criteria for judging progress in CSP-funded work: that the demand for the software is understood; that the number of potential users has been increased by the work done; that the use of the software has contributed to a measurable increase in research outputs; and that the community participation around the software has increased.
As an example of how the CSP works, Neil presented some timelines portraying the progress over four years of the GridSAM Project. As projects such as GridSAM progress through the various stages of development, the CSP makes interventions at appropriate junctures. Neil was able to overlay related timelines such as the history of publications arising from this work, and the progression of deployments and "value-added" developments.
Neil was careful to explain that "there isn't a single best model for sustainability". Invoking Chris Anderson's The Long Tail, he pointed out that the "long tail" of software deployment requires investment to prevent decay over time. Citing a workshop held at the UK eScience All Hands Meeting in 2008, he described how the delegates identified a number of issues which needed to be addressed as part of the process of embedding e-Infrastructure into the research process. The first of them was the recognition that there is "no single common e-Infrastructure".
Moving on to sustainability models, Neil made reference to what appears to be a comprehensive classification of open-source business models developed by the "451 Group" before outlining a set of sustainability models for research software. The latter ranged from the variously grant-funded, through institutional and "mixed enterprise", to the foundation and, incongruously, the "T-shirt" model.
Neil concluded his presentation with the assertion that "increasing participation is the key to long-term sustainability". The Engage Project, which aims to engage researchers with e-Infrastructure, has suggested that people will tend to "prioritise ease of use, support and continued development over a complete feature set". In order to meet this priority, Neil suggested that a sustainable community and trust between the users and the infrastructure providers are the two things which need to be developed. A sustainable community, Neil argued, demonstrates four key factors: cohesion and identity; tolerance of diversity; efficient use of resources; and adaptability to change.
I participated in a few breakout sessions, which mostly tried to unpick some of the broader issues raised in the presentations. Overall, however, I think the presentations provided the real value of this workshop: the discussions in those breakout sessions in which I participated became quite polarised, with one or two individuals with strongly held points of view dominating the discussion. However, those sessions did also serve to corroborate some of the points made in the main presentations about the community, its requirements, and its views on infrastructure. For example, there seemed to be a strongly held belief that the management and deployment of software as infrastructure could learn something from how hardware has been handled in this context.
My impression was that this community was rather more comfortable and experienced in managing hardware than software in this sense. I also heard the dichotomy between science and engineering raised several times: most people present at the workshop were scientists, not engineers. It might be fair to say that they think as scientists, but are becoming aware that they, or somebody at least, needs to apply some engineering expertise to the issue of developing software in a sustainable way. One aspect of several discussions which threw this dichotomy into sharp relief was the recurring confusion between the notion of the sustainability of software as a research output - to enable the reproducibility of scientific research - and the concept of the sustainability of software to provide stable infrastructure. This important distinction was identified more than once, but tended to get lost in the discussions quite easily: I think this community needs to do more work to be clearer on this.
My final observation was that, despite the idea of community engagement being raised by all of the main speakers (and emphasised by two of them), this aspect did not really figure in the free-flowing discussions of the breakout sessions I attended. Rather, those participants were more concerned with issues such as version control, quality assurance and certification of software products.
Overall, I found this to be a fascinating workshop. My role was a more passive one than I am used to, as I am not from a science community, but I was able to contribute to the discussions around software development. Many of the issues around sustainability are not new - but I think that there are some interesting and particular issues in terms of infrastructure to support the sciences, as some of these disciplines seem to be heading into a future where a major part of research is based on computation. I'm not entirely sure that the community-source approach, as particularly espoused by Brad Wheeler, is going to work for this community as well as it has worked in some other sectors such as eLearning. However, Neil Chue Hong was able to point to examples of community engagement which seem to be heading successfully in this direction.
I learned a great deal from this workshop and would like to thank the organisers and some of the delegates (notably Jennifer Schopf, Neil Chue Hong, Elliot Metsger of Johns Hopkins University and MacKenzie Smith of MIT Libraries) who helped explain some aspects of practice in the US or in science research with which I was not familiar.
Some of the presentations are available in MS PowerPoint format.