A Pragmatic Approach to Preferred File Formats for Acquisition http://www.ariadne.ac.uk/issue63/thompson <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> <p><a href="/issue63/thompson#author1">Dave Thompson</a> sets out the pragmatic approach to preferred file formats for long-term preservation used at the Wellcome Library.</p> </div> </div> </div> <p>This article sets out the Wellcome Library's decision not to specify preferred file formats for long-term preservation explicitly. It discusses a pragmatic approach in which technical appraisal of the material is used to assess the Library's likelihood of preserving one format over another. The Library takes as its starting point work done by the Florida Digital Archive in setting a level of 'confidence' in its preferred formats. The Library's approach sets out nine principles to consider as part of appraisal. These principles balance economically sustainable preservation and intellectual 'value' with the practicalities of working with specific, and especially proprietary, file formats. Scenarios show how the principles are applied (see <a href="#annex">Annex</a> below).</p> <p>This article takes a technical perspective on assessing material for acquisition by the Library. In reality, technical factors are only part of the assessment of material for inclusion in the Library's collections. Other factors such as intellectual content, significance of the material, significance of the donor/creator and any relationship to material already in the Library also play a part. 
On this basis, the article considers 'original' formats accepted for long-term preservation, and does not consider formats appropriate for dissemination.</p> <p>This reflects the Library's overall approach to working with born digital archival material. Born digital material is treated similarly to other, analogue archival materials. The Library expects archivists to apply their professional skills regardless of the format of any material, to make choices and decisions about material based on a range of factors and not to see the technical issues surrounding born digital archival material as in any way limiting.</p> <h2 id="Why_Worry_about_Formats">Why Worry about Formats?</h2> <p>Institutions looking to preserve born digital material permanently, the Wellcome Library included, may have little control over the formats in which material is transferred or deposited. The ideal intervention point from a preservation perspective is the point at which digital material is first created. However, this may be unrealistic. Many people working within organisations have no choice in the applications they use; the cost of applications may be an issue; or there may simply be a limited number of applications available with which to perform specialist tasks. Material donated after an individual retires or dies can prove especially problematic. It may be obsolete, in obscure formats, on obsolete media and without any metadata describing its context, creation or rendering environment.</p> <p>Computer applications 'save' their data in formats, each application typically having its own file format. The Web site filext [<a href="#1">1</a>] lists some 25,000 file extensions in its database.</p> <p>The long-term preservation of any format depends on the type of format, issues of obsolescence, and availability of hardware and/or software, resources, experience and expertise. 
Any archive looking to preserve born digital archival material needs the means and confidence to move material across the 'gap' between material 'in the wild' and material held securely in an archive.</p> <p>This presents a number of problems: first, the proliferation of file formats; second, the use of proprietary file formats; and third, formats becoming obsolete, either by being incompatible with later versions of the applications that created them, or by those applications no longer existing. This assumes that proprietary formats are more problematic to preserve because their structure and composition are not known, which hinders preservation intervention by requiring specialist expertise. Moreover, as new software is created, new file formats proliferate and exacerbate the problem.</p> Thu, 29 Apr 2010 23:00:00 +0000