Overview of content related to 'big data' http://www.ariadne.ac.uk/taxonomy/term/15895/all?article-type=&term=&organisation=&project=&author=&issue= RSS feed with Ariadne content related to specified tag en Editorial Introduction to Issue 71 http://www.ariadne.ac.uk/issue71/editorial2 <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> <p><a href="/issue71/editorial2#author1">The editor</a> introduces readers to the content of <em>Ariadne</em> Issue 71.</p> </div> </div> </div> <p>As I depart this chair after the preparation of what I thought would be the last issue of <em>Ariadne</em> [<a href="#1">1</a>], I make no apology for the fact that I did my best to include as much material in her ‘swan song’ as possible. With the instruction to produce only one more issue this year, I felt it was important to publish as much of the content in the pipeline as I could.</p> <p><a href="http://www.ariadne.ac.uk/issue71/editorial2" target="_blank">read more</a></p> issue71 editorial richard waller amazon birmingham city university digital repository federation jisc loughborough university oclc oregon state university ukoln university for the creative arts university of huddersfield university of oxford university of sussex wellcome library jusp kaptur scarlet accessibility agile development api archives augmented reality authentication big data blog bs8878 cataloguing content management controlled vocabularies curation data data management data set database digital library digitisation diigo ebook educational data mining framework google docs higher education html html5 infrastructure jquery learning analytics metadata mets mobile native apps open access open source portal preservation preservation metadata repositories research search technology software solr standardisation standards sushi tagging twitter url video wcag web 2.0 web app widget xml schema Wed, 17 Jul 2013 19:01:02 +0000 lisrw 2493 at 
http://www.ariadne.ac.uk The Potential of Learning Analytics and Big Data http://www.ariadne.ac.uk/issue71/charlton-et-al <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> <p><a href="/issue71/charlton-et-al#author1">Patricia Charlton</a>, <a href="/issue71/charlton-et-al#author2">Manolis Mavrikis</a> and <a href="/issue71/charlton-et-al#author3">Demetra Katsifli</a> discuss how the emerging trend of learning analytics and big data can support and empower learning and teaching.</p> </div> </div> </div> <blockquote><p style="margin-left:18.0pt;">&nbsp;‘<em>Not everything that can be counted counts, and not everything that counts can be counted</em>.’ Attributed to Albert Einstein</p> </blockquote><p><a href="http://www.ariadne.ac.uk/issue71/charlton-et-al" target="_blank">read more</a></p> issue71 feature article demetra katsifli manolis mavrikis patricia charlton bbc google ieee jenzabar london knowledge lab algorithm big data browser cybernetics data data mining database doi e-learning educational data mining framework hadoop higher education identifier learning analytics learning design modelling mooc personalisation research search technology semantic web social networks software streaming video visualisation Mon, 08 Jul 2013 20:07:14 +0000 lisrw 2476 at http://www.ariadne.ac.uk eMargin: A Collaborative Textual Annotation Tool http://www.ariadne.ac.uk/issue71/kehoe-gee <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> <p><a href="/issue71/kehoe-gee#author1">Andrew Kehoe</a> and <a href="/issue71/kehoe-gee#author2">Matt Gee</a> describe their Jisc-funded eMargin collaborative textual annotation tool, showing how it has widened its focus through integration with Virtual Learning Environments.</p> </div> </div> </div> <p>In the Research and Development Unit for English Studies (RDUES) at Birmingham City University, our main 
research field is Corpus Linguistics: the compilation and analysis of large text collections in order to extract new knowledge about language. We have previously developed the WebCorp [<a href="#1">1</a>] suite of software tools, designed to extract language examples from the Web and to uncover frequent and changing usage patterns automatically. eMargin, with its emphasis on <em>manual</em> annotation and analysis, was therefore somewhat of a departure for us.</p> <p>The eMargin Project came about in 2007 when we attempted to apply our automated Corpus Linguistic analysis techniques to the study of English Literature. To do this, we built collections of works by particular authors and made these available through our WebCorp software, allowing other researchers to examine, for example, how Dickens uses the word ‘woman’, how usage varies across his novels, and which other words are associated with ‘woman’ in Dickens’ works.</p> <p>What we found was that, although our tools were generally well received, there was some resistance amongst literary scholars to this large-scale automated analysis of literary texts. Our top-down approach, relying on frequency counts and statistical analyses, was contrary to the traditional bottom-up approach employed in the discipline, relying on the intuition of literary scholars. 
In order to develop new software to meet the requirements of this new audience, we needed to gain a deeper understanding of the traditional approach and its limitations.</p> <p style="text-align: center; "><img alt="logo: eMargin logo" src="http://ariadne-media.ukoln.info/grfx/img/issue71-kehoe-gee/emargin-logo.png" style="width: 250px; height: 63px;" title="logo: eMargin logo" /></p> <h2 id="The_Traditional_Approach">The Traditional Approach</h2> <p>A long-standing problem in the study of English Literature is that the material being studied – the literary text – is often many hundreds of pages in length, yet the teacher must encourage class discussion and focus this on particular themes and passages. Compounding the problem is the fact that, often, not all students in the class have read the text in its entirety.</p> <p>The traditional mode of study in the discipline is ‘close reading’: the detailed examination and interpretation of short text extracts down to individual word level. This variety of ‘practical criticism’ was greatly influenced by the work of I.A. Richards in the 1920s [<a href="#2">2</a>] but can actually be traced back to the 11<sup>th</sup> Century [<a href="#3">3</a>]. What this approach usually involves in practice in the modern study of English Literature is that the teacher will specify a passage for analysis, often photocopying this and distributing it to the students. Students will then read the passage several times, underlining words or phrases which seem important, writing notes in the margin, and making links between different parts of the passage, drawing out themes and motifs. On each re-reading, the students’ analysis gradually takes shape (see Figure 1). 
Close reading takes place either in preparation for seminars or in small groups during seminars, and the teacher will then draw together the individual analyses during a plenary session in the classroom.</p> <p></p><p><a href="http://www.ariadne.ac.uk/issue71/kehoe-gee" target="_blank">read more</a></p> issue71 tooled up andrew kehoe matt gee ahrc amazon birmingham city university blackboard british library cetis d-lib magazine google ims global ims global learning consortium jisc niso university of leicester university of oxford wikipedia accessibility aggregation ajax api big data blog browser data database digital library ebook free software html interoperability intranet java javascript jquery metadata moodle plain text repositories research search technology software standards tag cloud tagging tei url vle web browser wiki windows xml Thu, 04 Jul 2013 17:20:45 +0000 lisrw 2467 at http://www.ariadne.ac.uk Mining the Archive: The Development of Electronic Journals http://www.ariadne.ac.uk/issue70/white <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> <p><a href="/issue70/white#author1">Martin White</a> looks through the <em>Ariadne</em> archive to trace the development of e-journals as a particular aspect of electronic service delivery and highlights material he considers as significant.</p> </div> </div> </div> <p>My career has spanned 42 years in the information business. It has encompassed 10,000-hole optical coincidence cards, online database services, videotext, laser discs, and CD-ROMs, the World Wide Web, mobile services and big data solutions. I find the historical development of information resource management absolutely fascinating, yet feel that in general it is poorly documented from an analytical perspective even though there are some excellent archives.</p> <p>These archives include the back issues of <em>Ariadne</em> from January 1996. 
<em>Ariadne</em> has always been one of my must-reads as a way of keeping in touch with issues and developments in e-delivery of information. The recently launched new <em>Ariadne</em> platform [<a href="#1">1</a>] has provided easier access to these archives. Looking through its content has reminded me of the skills and vision of the UK information profession as it sought to meet emerging user requirements with very limited resources. The archives have always been available on the <em>Ariadne</em> site but the recent update to the site and the availability of good tags on the archive content have made it much easier to mine through the archive issues.</p> <p>The <em>Ariadne</em> team, in particular Richard Waller, has given me the opportunity to mine those archives [<a href="#2">2</a>] and trace some of the developments in electronic service delivery in the UK.</p> <p>Indeed, working through the archives is now probably too easy, as in the preparation of this column I have found myself moving sideways from many of the feature articles to revel in the other columns that have been a feature of <em>Ariadne</em>. This article is a personal view of some of these developments and is in no way intended to be a definitive account. 
Its main purpose is to encourage others to look into the archive and learn from the experiences of the many innovators that have patiently coped with the challenges of emerging technology, resource limitations and often a distinct lack of strategy and policy at both an institutional and government level.</p> <p style="text-align: center; "><img alt="Figure 1: Optical coincidence card, circa 1970" src="http://ariadne-media.ukoln.info/grfx/img/issue70-white/image1-optical-coincidence-card.jpg" style="width: 171px; height: 289px;" title="Figure 1: Optical coincidence card, circa 1970" /></p> <p style="text-align: center; "><strong>Figure 1: Optical coincidence card, circa 1970</strong></p> <h2 id="e-Journal_Development">e-Journal Development</h2> <p>When I arrived at the University of Southampton in 1967, my main surprise was not the standard of the laboratories but the quality and scale of the Chemistry Department library. School does not prepare you for reading primary journals and how best to make use of Chemical Abstracts, but I quickly found that working in the library was much more fun than in a laboratory. I obtained an excellent result in one vacation project on physical chemistry problems by reverse engineering the problems through Chemical Abstracts! Therefore, as it turned out, I had started my career as an information scientist before I even graduated. By 1977 I was working with The Chemical Society on the micropublishing of journals and taking part in a British Library project on the future of chemical information. Re-reading the outcomes of that project makes me realise how difficult it is to forecast the future. 
Now my past has re-asserted itself to good effect as I have both the honour and excitement of being Chair of the eContent Committee of the Royal Society of Chemistry.</p> <p style="text-align: center; "><img alt="Figure 2: Laser disc, circa 1980" src="http://ariadne-media.ukoln.info/grfx/img/issue70-white/image2-laserdiscs.jpg" style="width: 336px; height: 312px;" title="Figure 2: Laser disc, circa 1980" /></p> <p style="text-align: center; "><strong>Figure 2: Laser disc, circa 1980</strong></p> <p>So from my standpoint, in seeking to identify distinct themes in the development of information resource management in <em>Ariadne</em>, a good place to start is with the e-markup of chemical journals. In Issue 1 Dr Henry Rzepa wrote about the potential benefits of the semantic markup of primary journals to provide chemists with access to the content of the journal article and not just to a contents page and title [<a href="#4">4</a>]. The immediate problem you face reading this admirable summary of the potential benefits of markup is that many of the hyperlinks have disappeared. History has been technologically terminated. Almost 15 years passed by before the Royal Society of Chemistry set up Project Prospect and turned semantic markup into a production process [4]. Dr Rzepa is now Professor of Computational Chemistry at Imperial College, London.</p> <p>By the mid-1990s good progress had been made in e-journal production technologies and the first e-only journals were beginning to appear. Among them was <em>Glacial Geology and Geomorphology</em> (GGG) which existed in a printed version only in as far as readers could print out a selection from it. One aim of GGG is therefore to provide the benefits of electronic transfer as well as other value added products in an accepted academic, peer-reviewed system. The author of the article describing the project [<a href="#5">5</a>] was Dr. 
Brian Whalley, who went on to become a Professor in the Geomaterials Research Group, Queens University of Belfast. As you will discover from <a href="../author/brian-whalley-author-profile">his author profile</a> (another <em>Ariadne</em> innovation), Brian is still active though retired from formal education. What struck me about this article was the author’s vision in January 1996 of how e-journals could be of benefit in university teaching.</p> <p></p><p><a href="http://www.ariadne.ac.uk/issue70/white" target="_blank">read more</a></p> issue70 feature article martin white andrew w mellon foundation british library hefce imperial college london institute of physics intranet focus ltd jisc mimas portico stm ukoln university of glasgow university of manchester university of sheffield university of southampton jisc information environment accessibility archives big data blog content management copyright database ebook ejournal higher education intellectual property jstor licence mobile open access research resource management search technology standards Thu, 06 Dec 2012 15:50:18 +0000 lisrw 2401 at http://www.ariadne.ac.uk 23rd International CODATA Conference http://www.ariadne.ac.uk/issue70/codata-2012-rpt <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> <p><a href="/issue70/codata-2012-rpt#author1">Alex Ball</a> reports on a conference on ‘Open Data and Information for a Changing Planet’ held by the International Council for Science’s Committee on Data for Science and Technology (CODATA) at Academia Sinica, Taipei, Taiwan on 28–31 October 2012.</p> </div> </div> </div> <p>CODATA was formed by the International Council for Science (ICSU) in 1966 to co-ordinate and harmonise the use of data in science and technology. One of its very earliest decisions was to hold a conference every two years at which new developments could be reported. 
The first conference was held in Germany in 1968, and over the following years it would be held in 15 different countries across 4 continents.</p> <p><a href="http://www.ariadne.ac.uk/issue70/codata-2012-rpt" target="_blank">read more</a></p> issue70 event report alex ball codata datacite dcc elsevier icsu jisc library of congress national academy of sciences niso oais orcid royal meteorological society sheffield hallam university stm ukoln university college london university of bath university of edinburgh university of queensland university of washington dealing with data europeana ojims accessibility algorithm api archives bibliographic data big data blog cataloguing cloud computing creative commons crm curation data data citation data management data mining data model data set data visualisation database digital archive digital curation digitisation dissemination doi dvd e-learning facebook framework geospatial data gis google maps handle system identifier infrastructure intellectual property interoperability java knowledge base knowledge management licence linux lod metadata mobile moodle oer ontologies open access open data open source operating system optical character recognition portfolio preservation privacy provenance repositories research restful search technology sharepoint smartphone software standardisation standards tagging usb video visualisation vocabularies web resources web services widget wiki xml xmpp Sat, 15 Dec 2012 12:41:16 +0000 lisrw 2430 at http://www.ariadne.ac.uk Online Information 2012 http://www.ariadne.ac.uk/issue70/online-2012-rpt <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> <p><a href="/issue70/online-2012-rpt#author1">Marieke Guy</a> reports on the largest gathering of information professionals in Europe.</p> </div> </div> </div> <p>Online Information [<a href="#1">1</a>] is an interesting conference as it brings together information professionals 
from both the public and the private sector. The opportunity to share experiences from these differing perspectives doesn’t happen that often and brings real benefits, such as highly productive networking. This year’s Online Information, held on 20–21 November, felt like a slightly different event to previous years.</p> <p><a href="http://www.ariadne.ac.uk/issue70/online-2012-rpt" target="_blank">read more</a></p> issue70 event report marieke guy amazon dcc google jisc microsoft mimas oclc ukoln university of bath university of dundee university of edinburgh university of manchester university of sheffield university of sussex datashare dmponline rdmrose scarlet schema.org wikipedia worldcat algorithm augmented reality bibliographic data big data blog cataloguing cloud computing copyright data data management data set database digital curation digital library digital repositories facebook flickr framework higher education identifier interoperability junaio library data licence linked data marc metadata mobile oer open data open source operating system privacy qr code rdfa remote working repositories research search technology software streaming twitter uri video vocabularies youtube Sun, 16 Dec 2012 17:10:56 +0000 lisrw 2437 at http://www.ariadne.ac.uk Book Review: Information 2.0 http://www.ariadne.ac.uk/issue70/dobreva-rvw <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> <p><a href="/issue70/dobreva-rvw#author1">Milena Dobreva</a> reviews the newly published book of Martin de Saulles which looks at the new models of information production, distribution and consumption.</p> </div> </div> </div> <p>Writing about information and the changes in the models of its production, distribution and consumption is no simple task. 
Besides the long-standing debate on what information and knowledge really mean, the world of current technologies is changing at a pace which inevitably influences all spheres of human activity. But the first of those spheres to tackle is perhaps that of information – how we create, disseminate, and use it.</p> <p><a href="http://www.ariadne.ac.uk/issue70/dobreva-rvw" target="_blank">read more</a></p> issue70 review milena dobreva amazon jisc university of brighton university of malta archives big data blog cloud computing data data mining digital library digital preservation digitisation google search information society institutional repository mobile podcast research search technology video wiki Thu, 13 Dec 2012 22:49:00 +0000 lisrw 2414 at http://www.ariadne.ac.uk Eduserv Symposium 2012: Big Data, Big Deal? http://www.ariadne.ac.uk/issue69/eduserv-2012-rpt <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> <p><a href="/issue69/eduserv-2012-rpt#author1">Marieke Guy</a> attended the annual Eduserv Symposium on 10 May 2012 at the Royal College of Physicians, London to find out what the implications of big data are for Higher Education Institutions.</p> </div> </div> </div> <p>The annual Eduserv Symposium [<a href="#1">1</a>] was billed as a ‘must-attend event for IT professionals in Higher Education’; the choice of topical subject matter being one of the biggest crowd-drawers (the other being the amazing venue: the Royal College of Physicians). 
The past few years have seen coverage of highly topical areas such as virtualisation and the cloud, the mobile university and access management.</p> <p><a href="http://www.ariadne.ac.uk/issue69/eduserv-2012-rpt" target="_blank">read more</a></p> issue69 event report marieke guy amazon cetis dcc eduserv google jisc orcid oreilly oxford internet institute ukoln university of bath university of bristol university of california berkeley university of leicester university of oxford webtrends wellcome trust dealing with data impact project accessibility algorithm big data blog cloud computing curation data data management data set database digitisation gis google analytics google trends hadoop higher education infrastructure intellectual property internet explorer irods learning analytics mobile nosql oer open data open source remote working research twitter usb Mon, 30 Jul 2012 17:48:45 +0000 lisrw 2370 at http://www.ariadne.ac.uk The Future of the Past of the Web http://www.ariadne.ac.uk/issue68/fpw11-rpt <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> <p><a href="/issue68/fpw11-rpt#author1">Matthew Brack</a> reports on the one-day international workshop 'The Future of the Past of the Web' held at the British Library Conference Centre, London on 7 October, 2011.</p> </div> </div> </div> <p>We have all heard at least some of the extraordinary statistics that attempt to capture the sheer size and ephemeral nature of the Web. According to the Digital Preservation Coalition (DPC), more than 70 new domains are registered and more than 500,000 documents are added to the Web every minute [<a href="#1">1</a>]. 
This scale, coupled with its ever-evolving use, presents significant challenges to those concerned with preserving both the content and context of the Web.</p> <p><a href="http://www.ariadne.ac.uk/issue68/fpw11-rpt" target="_blank">read more</a></p> issue68 event report matthew brack bbc british library bsi dcc digital preservation coalition google hanzo archives institute of historical research iso jisc kings college london library of congress nhs oxford internet institute the national archives university of oxford university of sheffield wellcome library arcomem internet archive memento uk government web archive aggregation algorithm api archives big data blog browser cache curation data data mining data model digital asset management digital curation digital library digital preservation digitisation dissemination doi flickr identifier interoperability library data lod metadata preservation repositories research search technology social web software tag cloud twitter ulcc uri url visualisation warc wayback machine web resources wordpress youtube Mon, 27 Feb 2012 12:06:52 +0000 lisrw 2236 at http://www.ariadne.ac.uk eSciDoc Days 2011: The Challenges for Collaborative eResearch Environments http://www.ariadne.ac.uk/issue68/escidoc-rpt <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> <p><a href="/issue68/escidoc-rpt#author1">Ute Rusnak</a> reports on the fourth in a series of two-day conferences called eSciDoc Days, organised by FIZ Karlsruhe and the Max Planck Digital Library in Berlin over 26-27 October 2011.</p> </div> </div> </div> <p>eSciDoc is a well-known open source platform for creating eResearch environments using generic services and tools based on a shared infrastructure. This concept allows for managing research and publication data together with related metadata, internal and/or external links and access rights. 
Development of eSciDoc was initiated by a collaborative venture between FIZ Karlsruhe – Leibniz Institute for Information Infrastructure and the Max Planck Digital Library (MPDL) and was funded by the German Federal Ministry of Education and Research.</p> <p><a href="http://www.ariadne.ac.uk/issue68/escidoc-rpt" target="_blank">read more</a></p> issue68 event report ute rusnak fiz karlsruhe jisc archives authentication big data browser copyright curation data data management data set database digital library digital preservation digital repositories digitisation dissemination e-research ebook ejournal fedora commons framework higher education infrastructure internet explorer interoperability knowledge management licence metadata open source preservation provenance repositories research rich internet application soa software virtual research environment visualisation web services Mon, 27 Feb 2012 20:20:52 +0000 lisrw 2239 at http://www.ariadne.ac.uk Book Review: Innovations in Information Retrieval http://www.ariadne.ac.uk/issue68/white-rvw <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> <p><a href="/issue68/white-rvw#author1">Martin White</a> reviews a collection of essays on a wide range of current topics and challenges in information retrieval.</p> </div> </div> </div> <h2 id="Information_Retrieval_and_Enterprise_Search">Information Retrieval and Enterprise Search</h2> <p>For much of 2011 I worked on a project commissioned by the Institute for Prospective Technological Studies, Joint Research Centre, European Commission, on a techno-economic study of enterprise search in Europe. There is no dispute that the volume of information inside organisations is growing very rapidly, though much of this growth is the result of never discarding any digital information. The scale of the problem is well documented by the McKinsey Global Institute (MGI) in its report on 'Big Data' [<a 
href="#1">1</a>].</p> <p><a href="http://www.ariadne.ac.uk/issue68/white-rvw" target="_blank">read more</a></p> issue68 review martin white intranet focus ltd university of sheffield aida big data data document management higher education information retrieval internet explorer intranet research search technology tagging video Thu, 23 Feb 2012 17:11:39 +0000 lisrw 2232 at http://www.ariadne.ac.uk Editorial Introduction to Issue 64: Supporting the Power of Research Data http://www.ariadne.ac.uk/issue64/editorial <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> <p><a href="/issue64/editorial#author1">Richard Waller</a> introduces Ariadne issue 64.</p> </div> </div> </div> <p>In these cash-strapped times among all the admonitions to save money here, and resources there, I rather hope to hear much about the necessity of protecting and building the knowledge economy if the UK is to make its way in the globalised world, since we cannot pretend to compete easily in other areas of endeavour. 
Hence research has to be regarded as one of the aces remaining to us, and thus I hope the importance of gathering, managing and preserving for long-term access research outcomes will be widely appreciated and supported.</p> <p><a href="http://www.ariadne.ac.uk/issue64/editorial" target="_blank">read more</a></p> issue64 editorial richard waller bbc cerlim google ifla intute national library of australia rnib automatic metadata generation itunes u archives bibliographic data bibliographic record big data blog cataloguing curation data data management data set database digital curation digital library digital repositories digitisation drupal dspace e-science electronic theses fedora commons framework frbr google scholar higher education infrastructure interoperability ipad iphone itunes metadata mobile national library preservation repositories research search technology social networks software standards twitter vim web 2.0 Thu, 29 Jul 2010 23:00:00 +0000 editor 1559 at http://www.ariadne.ac.uk Retooling Libraries for the Data Challenge http://www.ariadne.ac.uk/issue64/salo <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> <p><a href="/issue64/salo#author1">Dorothea Salo</a> examines how library systems and procedures need to change to accommodate research data.</p> </div> </div> </div> <p>Eager to prove their relevance among scholars leaving print behind, libraries have participated vocally in the last half-decade's conversation about digital research data. On the surface, libraries would seem to have much human and technological infrastructure ready-constructed to repurpose for data: digital library platforms and institutional repositories may appear fit for purpose. 
However, unless libraries understand the salient characteristics of research data, and how they do and do not fit with library processes and infrastructure, they run the risk of embarrassing missteps as they come to grips with the data challenge.</p> <p>Whether managing research data is 'the new special collections,'[<a href="#1">1</a>] a new form of regular academic-library collection development, or a brand-new library specialty, the possibilities have excited a great deal of talk, planning, and educational opportunity in a profession seeking to expand its boundaries.</p> <p>Faced with shrinking budgets and staffs, library administrators may well be tempted to repurpose existing technology infrastructure and staff to address the data curation challenge. Existing digital libraries and institutional repositories seem on the surface to be a natural fit for housing digital research data. Unfortunately, significant mismatches exist between research data and library digital warehouses, as well as the processes and procedures librarians typically use to fill those warehouses. Repurposing warehouses and staff for research data is therefore neither straightforward nor simple; in some cases, it may even prove impossible.</p> <h2 id="Characteristics_of_Research_Data">Characteristics of Research Data</h2> <p>What do we know about research data? What are its salient characteristics with respect to stewardship?</p> <h3 id="Size_and_Scope">Size and Scope</h3> <p>Perhaps the commonest mental image of research data is terabytes of information pouring out of the merest twitch of the Large Hadron Collider Project. So-called 'Big Data' both captures the imagination of and creates sheer terror in the practical librarian or technologist. 'Small data,' however, may prove to be the bigger problem: data emerging from individual researchers and labs, especially those with little or no access to grants, or a hyperlocal research focus. 
Though each small-data producer produces only a trickle of data compared to the likes of the Large Hadron Collider Project, the tens of thousands of small-data producers in aggregate may well produce as much data (or more, measured in bytes) as their Big Data counterparts [<a href="#2">2</a>]. Securely and reliably storing and auditing this amount of data is a serious challenge. The burgeoning 'small data' store means that institutions without local Big Data projects are by no means exempt from large-scale storage considerations.</p> <p>Small data also represents a serious challenge in terms of human resources. Best practices instituted in a Big Data project reach all affected scientists quickly and completely; conversely, a small amount of expert intervention in such a project pays immense dividends. Because of the great numbers of individual scientists and labs producing small data, however, immensely more consultations and consultants are necessary to bring practices and the resulting data to an acceptable standard.</p> <h3 id="Variability">Variability</h3> <p>Digital research data comes in every imaginable shape and form. Even narrowing the universe of research data to 'image' yields everything from scans of historical glass negative photographs to digital microscope images of unicellular organisms taken hundreds at a time at varying depths of field so that the organism can be examined in three dimensions. The tools that researchers use naturally shape the resulting data. When the tool is proprietary, unfortunately, so may be the file format that it produces. When that tool does not include long-term data viability as a development goal, the data it produces are often neither interoperable nor preservable.</p> <p>A major consequence of the diversity of forms and formats of digital research data is a concomitant diversity in desired interactions. 
The biologist with a 3-D stack of microscope images interacts very differently with those images than does a manuscript scholar trying to extract the underlying half-erased text from a palimpsest. These varying affordances <em>must</em> be respected by dissemination platforms if research data are to enjoy continued use.</p> <p>One important set of interactions involves actual changes to data. Many sorts of research data are considerably less usable in their raw state than after they have had filters or algorithms or other processing performed on them. Others welcome correction, or are refined by comparison with other datasets. Two corollaries emerge: first, that planning and acting for data stewardship must take place throughout the research process, rather than being an add-on at the end; and second, that digital preservation systems designed to steward only final, unchanging materials can only fail when faced with real-world datasets and data-use practices.</p> <p><a href="http://www.ariadne.ac.uk/issue64/salo" target="_blank">read more</a></p> issue64 feature article dorothea salo california digital library dcc google oai university of wisconsin hydra algorithm api archives bibliographic data big data blog cookie curation data data management data set database digital curation digital library digital preservation digitisation dissemination drupal dspace dublin core eprints fedora commons file format flickr google docs infrastructure institutional repository interoperability library management systems linked data marc metadata mods oai-pmh open source preservation rdf repositories research search technology software standardisation standards sword protocol wiki xml Thu, 29 Jul 2010 23:00:00 +0000 editor 1566 at http://www.ariadne.ac.uk E-Curator: A 3D Web-based Archive for Conservators and Curators http://www.ariadne.ac.uk/issue60/hess-et-al <div class="field field-type-text field-field-teaser-article"> <div class="field-items"> <div class="field-item odd"> 
<p><a href="/issue60/hess-et-al#author1">Mona Hess</a>, <a href="/issue60/hess-et-al#author2">Graeme Were</a>, <a href="/issue60/hess-et-al#author3">Ian Brown</a>, <a href="/issue60/hess-et-al#author4">Sally MacDonald</a>, <a href="/issue60/hess-et-al#author5">Stuart Robson</a> and <a href="/issue60/hess-et-al#author6">Francesca Simon Millar</a> describe a project which combines 3D colour laser scanning and e-Science technologies for capturing and sharing very large 3D scans and datasets about museum artefacts in a secure computing environment.</p> </div> </div> </div> <h2 id="Introduction:_The_Evolving_Field_of_Artefact_Documentation">Introduction: The Evolving Field of Artefact Documentation</h2> <p>Digital heritage technologies promise a greater understanding of cultural objects cared for by museums. Recent technological advances in digital photography and image processing not only offer a high level of documentation, they also provide powerful analytical tools for conservation monitoring of cultural objects.</p> <p><a href="http://www.ariadne.ac.uk/issue60/hess-et-al" target="_blank">read more</a></p> issue60 feature article francesca simon millar graeme were ian brown mona hess sally macdonald stuart robson ahrc british museum jisc ukoln university college london university of cambridge ahessc e-curator archives big data cataloguing cloud computing curation data data management data set database digitisation dissemination e-science file format gpl graphics identifier infrastructure internet explorer licence metadata multimedia namespace open source preservation provenance rdbms research software standards visualisation Wed, 29 Jul 2009 23:00:00 +0000 editor 1491 at http://www.ariadne.ac.uk