Web Magazine for Information Professionals

UK Digital Preservation Needs Assessment: Where We Go from Here

Najla Semple and Maggie Jones outline the background and findings of the Digital Preservation Coalition's UK Needs Assessment and the Mind the Gap report.

The Digital Preservation Coalition (DPC) was formed in the belief that no single organisation can hope to address single-handedly all the challenges and issues associated with digital preservation. It was launched in February 2002 with an initial membership of 19, and has grown to 30 members as of June 2006. Its underlying principle is that intense collaboration and co-operation across and between sectors is essential as there is a far wider range of key players who need to be involved at various different stages in the life cycle of digital resources. The pressing and increasingly urgent need for advocacy and awareness-raising and to raise the profile of digital preservation among a much broader community was a further impetus for the formation of the DPC and these activities have formed a major part of the DPC work programme.

The UK Needs Assessment

A key priority for the DPC has been the gathering of reliable facts and figures which would enable informed planning for a national digital preservation infrastructure. This infrastructure needs to be able to manage and maintain the rapidly increasing quantities of important digital materials being created by a wide variety of constituents for a broad range of purposes. The UK Needs Assessment, undertaken by the DPC, aimed to gather and collate a picture of the status quo in the UK with regard to digital preservation; to assess and develop a clear way forward; and, last but not least, to communicate that message to a wide and diverse audience, including those who may not have considered that digital preservation was part of their remit. This needs assessment culminated three years later in a report, Mind the Gap: assessing digital preservation needs in the UK [1]. The report was launched at the Houses of Parliament on 15 February 2006 by Andrew Stott, Deputy Chief Information Officer & Head of Service Transformation, E-Government Unit. This article reports on the results of these studies, the implications of these findings for the digital preservation community, and how the DPC plans to take forward the recommendations from the report.

The UK Needs Assessment comprised the following three stages:

Stage 1: DPC Members' Survey

The DPC is a membership organisation, relying primarily on membership subscriptions to fund its activities, and resources are modest. To accomplish such an ambitious undertaking as the UK Needs Assessment called for more resources than we could gather in a single year and the DPC Board agreed that it should proceed in stages. The initial stage was to conduct a survey of DPC members in 2003. This provided us with some concrete information on what DPC members were doing in terms of managing their own digital collections, and in some cases, managing digital resources on behalf of others. It also drew attention to what DPC members regarded as key issues and a number of spin-off activities subsequently arose from this survey. These included gathering some facts and figures relating to loss (or as it turned out in most cases to be, potential loss) of digital information, and the development of digital preservation training. The training programme, with JISC funding, has now evolved into the successful Digital Preservation Training Programme (DPTP) led by University of London Computer Centre (ULCC) in partnership with Cornell University.

Outcomes from the DPC Members' Survey

Case Studies

A series of digital preservation case studies was compiled by Duncan Simpson from examples provided by DPC members. The study, Risk of Loss of Digital Data, is available from the 'Members only' part of the DPC Web site [2]. This study compiled scenarios of risk and loss based on real-life practical experiences. It highlighted some costly data recovery procedures which could serve as a warning of the consequences of not managing data effectively from as early in the life cycle as possible. It revealed that many problems occur as a result of inadequate management. These might be a failure to curate the digital resources by the creators, or an interruption in the chain of responsibility. The study also illustrated that dramatic cases of outright loss of a resource are relatively rare but highlighted more often a complex chain of events which led to the unacceptable deterioration of resources. It further revealed that in many cases it had been possible to rescue data, but only after the expenditure of considerable time and effort in tracking down the missing parts of the puzzle. It need hardly be said that this will lead to extremely inefficient management of digital resources given the scale at which they are now being produced.

Digital Preservation Training Programme

Training had been flagged as a high priority for many DPC members. There was a pressing need for practical training which would equip participants with the skills and confidence to apply practical strategies to their own organisational requirements. Cornell had developed a week-long training programme which offered an excellent model for us to use in the UK, structuring the course around three building blocks of technical, organisational, and economic factors. A mutually beneficial partnership with Cornell has been made possible with JISC funding, and the residential Digital Preservation Training Programme (DPTP) [3] was successfully piloted at Warwick in October 2005 and received excellent feedback. To date, a further DPTP has been run in Birmingham (March 2006) and another is planned for York in July 2006. We believe the programmes fill a significant gap in digital preservation training, offering in-depth but accessible training and helping to demystify standards and tools such as the Open Archival Information System (OAIS), preservation metadata, and life cycle management.

Stage 2: Sample Survey of Regional Organisations

The DPC Members' survey also highlighted a concern among several DPC members at the level of awareness and development of digital preservation activity within smaller local and regional organisations. The second stage of the UK Needs Assessment was therefore to commission a sample survey of just such organisations. The Museums, Libraries and Archives Council (MLA) agreed to fund a sample survey of two of their regional organisations, the Northeast MLA (NEMLAC) and the West Midlands [4]. With the completion of this survey in 2005, a subtle picture was beginning to emerge which indicated that, while there appeared to be no immediate cause for alarm in terms of imminent catastrophic loss of digital resources, there was no coherent plan in place, especially with regard to roles and responsibilities, both internally within individual organisations and externally. As with the earlier project to gather facts and figures on loss of digital resources, there were no dramatic results likely either to cause panic (which was good), or to engage the attention of senior decision makers and funders (which was not so good).

Stage 3: Completing the UK Needs Assessment

At a DPC Planning Day held in February 2005, it was decided to complete the UK Needs Assessment from DPC resources, building and extending on the data we already had. We had gathered a useful body of information but we were conscious that it needed to be condensed and presented in such a way that it would be read by a far larger audience than those already converted. We also felt that more information needed to be gathered, including from commercial companies, to complement and enhance what we already had. This led to the final stage of the project, which after a competitive tendering process was awarded to Tessella Support Services plc. The company has over 20 years of proven expertise in the area of reliable and authentic long-term preservation of electronic records. Tessella were charged with conducting desktop research, a third survey, and then (most challenging of all!) synthesising the results of all three surveys and desktop research into a report which would simultaneously depict the complexities and nuances of the current situation, while also ensuring that the report was accessible to as wide an audience as possible, including those who might not otherwise be interested in digital preservation or have specialist expertise. This was no mean feat and required Tessella to work closely with the Steering Group for the UKNA (UK Digital Preservation Needs Assessment) Project, which was drawn from representatives of the diverse organisations with membership of the DPC.

The principal aim of the report was to provide a detailed analysis of priorities for action in the UK. Other aims were to raise the profile of digital preservation in the UK and to establish what is required in terms of infrastructure to support preservation activities. A key issue was to raise awareness of the risks associated with the failure to address digital preservation challenges. An important aspect was to engage senior policy and decision makers who are involved in awarding funding for preservation activities. Ultimately it was considered fundamental to identify a list of key recommendations for further digital preservation action and to identify who might take these forward.

In addition to assessing and synthesising existing data, a further online survey was carried out by Tessella. Further qualitative information was gathered through interviews; the respondents represented a range of interests and organisations. Over 900 individuals were sent the questionnaire and more than 10% responded. While this might be seen as a low response rate, when the results were combined with the earlier survey results and desktop research, it allowed the development of a detailed picture with many common themes as well as some sectoral differences. We wanted more than a survey telling us that there were a lot of digital resources being produced but little happening to maintain them for the future. If the report was to act as a catalyst to accelerate progress, we needed to understand what specific needs and drivers would lead to further action.

Key Results from the Overall UK Needs Assessment Process

Given that respondents to all three surveys are those who, by definition, are interested in and sufficiently aware of the issue to bother to participate in the survey, then it can safely be assumed that the reality is worse than the results suggest. While there were no huge surprises revealed by the report, there was nevertheless a disturbing trend which, if unchecked, would inevitably lead to a gradual trickling away of valuable digital resources, rather than one single catastrophe.

A high level of awareness was revealed: for example, 87% of survey respondents in the latest survey recognised that a failure to address the issue of digital preservation would lead to loss of corporate memory or of key cultural documents [5]; over 60% felt their organisation could lose out financially. However, this awareness did not often translate into concrete action, with only 20% indicating any kind of digital preservation strategy in place. Evidence was found in all three surveys that many organisations have not yet assessed the volumes of material they need to preserve. This was true of over 33% of respondents to the DPC Members' survey and 55% of respondents to the latest survey. A telling quote from the Mind the Gap report is:

'Given that many organisations do not know the extent of the problem they face, it is not surprising that the loss of digital data is commonplace, and in some circumstances seems to be accepted as an inevitable hazard'. [6]

These results seem reminiscent of 'Boiled Frog Syndrome' in which a frog will sit in a pot of water which is heated very slowly and remain immobile until it is too late, whereas it will immediately leap out of boiling water. Because there is no evidence of dramatic losses of digital information, it may be tempting to feel complacent. It is however unsustainable to continue an approach in which most digital resources are retained in an ad hoc manner, with some material of little consequence being retained and more valuable resources being lost in what can only be described as the luck of the draw.

If those organisations which receive digital resources from a variety of sources continually need to look backwards to figure out how to manage them, it will become a practical necessity to ring-fence resources required for this kind of digital archaeology on a larger scale than need be the case. It may still be technically possible to rescue digital resources (as evidenced by the DPC commissioned study, Risk of Loss of Digital Data), but it will not be practically feasible to do so if it becomes the norm to leave the task of their salvage to others at a later date.

The Current State of Play

The following areas were identified in the Mind the Gap report as reflecting the current state of digital preservation practice in the UK.

Commitment to Digital Preservation

The report revealed that only 18% of those responding to the 2005 survey had digital preservation strategies in place for their organisations. The earlier 2003 survey also indicated that digital preservation policies were not evident in the majority of formal business plans.

Table 1: Commitment to digital preservation (source: 2005 DPC survey).

Responsibility

Lack of clarity as to who is responsible for digital preservation is a clear impediment to progress. Over half of respondents to the 2005 survey indicated that they were not clear about responsibility and that this hampered any preservation programme. This was also picked up in the two earlier surveys. This is characterised by the fact that many skills are required for preservation activities and the boundaries between these roles may be blurred in contrast to traditional roles. For example, it may be assumed that this is an IT problem, but the professional expertise of archivists and librarians in selecting and organising material for preservation is still valuable. Digital preservation needs to be cross-disciplinary and may require a change in the structure of some organisations unless, and until, 'hybrid skills', combining both subject and discipline specialisation with IT skills, are developed in staff. Digital preservation confounds any attempts to segregate it neatly and tends to permeate many areas of an organisation which are likely to be structurally separate.

Volume of Data

The report revealed a lack of knowledge about the digital data that organisations hold. Only half of the respondents to the 2005 survey had assessed the volume of material they needed to preserve; however, many reported that they needed to retain the data for over 50 years, a period in which a digital preservation strategy is crucial.

Data Loss

Loss of digital materials appeared as a frequent occurrence: 38% of respondents indicated that they had inaccessible and obsolete data. More worryingly perhaps, one third stated that they did not know if they had lost data or not. Knowing what is held is a very basic need: it is impossible to develop a sensible digital preservation strategy without knowing what material is held.
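As an illustration of how modest a first step towards knowing what material is held can be, the following is a minimal Python sketch that walks a directory tree and reports, for each file extension, how many files there are and how many bytes they occupy. It is not drawn from the Mind the Gap report or from any DPC tool: the function name inventory and the '(no extension)' label are assumptions made for this example, and a file extension is only a rough proxy for the true file format.

```python
import os
from collections import Counter
from pathlib import Path


def inventory(root):
    """Return {extension: (file_count, total_bytes)} for all files under root.

    Hypothetical helper for illustration only. Extensions are lower-cased,
    and files without one are grouped under "(no extension)".
    """
    counts = Counter()
    sizes = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                size = path.stat().st_size
            except OSError:
                # Skip files that cannot be read (e.g. broken links).
                continue
            ext = path.suffix.lower() or "(no extension)"
            counts[ext] += 1
            sizes[ext] += size
    return {ext: (counts[ext], sizes[ext]) for ext in counts}
```

Calling inventory on the top-level folder of a shared drive would give a rough picture of the volume held and of the formats in use; a fuller assessment would also need dates, ownership and proper format identification.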

Data Source

Creators may also contribute to the digital preservation problem. Coping with unexpected file formats created outside the holding organisation is a significant problem as it is often the case that holding organisations can do little to persuade data creators to use file formats which are digital preservation-friendly. As a basic rule of thumb, the more control there is over creation of digital resources, the easier they are to manage over the longer term, which is why creators are so key to successful and cost-effective digital preservation strategies.

Figure 2: Influence over formats of externally sourced data (Source: 2005 DPC survey).

Solutions

Few organisations are implementing digital preservation strategies. It is some cause for concern that almost half of respondents claim to use portable media both to store their data and as a means of long-term storage. A large number of organisations still print out digital data, which is not a recognised digital preservation strategy but is apparently still seen by some as the only pragmatic short-term solution. It was heartening to see, however, that almost 70% do use off-line back-up as part of a combination of solutions.

Metadata

The results revealed that only 31% of organisations felt that they were creating sufficient metadata for their digital objects and 41% of respondents claimed that they would have to add metadata to a digital object at the point of archiving which would be a considerable overhead for the organisation.

Needs and Recommendations

An interesting feature of the Mind the Gap report is its analysis of how different sectors approach digital preservation. For example, gaining funding for digital preservation for the commercial sector appears not to be a major issue providing there is a sufficiently strong business case. In contrast, the public sector often struggles for funding, even where there is a business case. Another area of interest is the regulated industries sector. Due to regulatory inspection, companies in the financial, pharmaceutical and food sectors have had to ensure long-term preservation of digital data by creating effective methodologies. For competition reasons many of these solutions are confidential and consequently not in the public domain.

The following needs and recommendations were highlighted in the report:

Conclusion

The DPC and its members have worked hard to raise the profile of digital preservation from that of a highly specialised activity affecting relatively few organisations to something that is relevant to all who create and acquire digital data. The report Mind the Gap is intended to accelerate the process of making digital preservation a standard activity for all. The study presents a detailed analysis of the status quo revealing the extent of the risk of loss or degradation to digital material held in the UK's public and private sectors. The report provides ample ammunition for the digital preservation community to use in strengthening its case, whether it is at organisational, local, or national level.

Not everyone needs to establish complex infrastructures capable of retaining digital data forever, but everyone does need to take responsibility at least for material they create themselves, at least for a period until others can take it over (this may be a few years or perhaps decades). The report highlights the importance of incorporating activities naturally at each stage of the life cycle rather than regarding them as something distinct and which can only be undertaken by specialists.

Mind the Gap is intended to win the hearts and minds of a much wider audience and to encourage all who are involved in creating and acquiring digital information, to seize the opportunity to develop a practical, coherent strategy for responsible management of the vast quantities of digital information being created and used in the UK today. As the DPC Chair Lynne Brindley noted at the launch of Mind the Gap ' ... many of the needs identified in the report are not "rocket science", they rely on little more than common sense and good management to implement, but there is nevertheless a significant gap between where we are now and where we need to be...'. Mind the Gap poses a challenge to all organisations to take steps to manage their digital materials according to good practice so that it will be simpler and more cost-effective for others to continue to manage them for the future.

References

  1. Mind the Gap: Assessing digital preservation needs in the UK. Digital Preservation Coalition, 2006.
    http://www.dpconline.org
  2. Risk of Loss of Digital Data: case studies and analysis. Digital Preservation Coalition, 2004. Available from the Members only area of the DPC Web site at:
    http://www.dpconline.org/members/main/ukneeds.html
  3. Further information on the DPTP can be found at: http://www.ulcc.ac.uk/dptp/
  4. Simpson, D. Digital preservation in the regions. Museums, Libraries and Archives Council, 2005.
    http://www.mla.gov.uk/resources/assets//M/mla_dpc_survey_pdf_6636.pdf
  5. Mind the Gap: Assessing digital preservation needs in the UK. Digital Preservation Coalition, 2006. Section 5.5 p. 18.
    http://www.dpconline.org
  6. Mind the Gap: Assessing digital preservation needs in the UK. Digital Preservation Coalition, 2006. Section 5.5 p. 18.
    http://www.dpconline.org

Author Details

Maggie Jones
DPC Executive Secretary, April 2003-February 2006

Email: maggie.jones@talk21.com

Najla Semple
Executive Secretary
Digital Preservation Coalition

Email: najla@dpconline.org
Web site: http://www.dpconline.org
