Time to Change Our Thinking: Dismantling the Silo Model of Digital Scholarship
There is no longer anything exotic about digital humanities projects. Almost every humanities faculty has at least one. But like humanities disciplines themselves, digital projects too often exist in lonely splendour, each in its own sub-disciplinary silo. Classicists have their project(s), Middle English scholars post Chaucer and Langland manuscripts, while French medievalists have sites for major genres or authors from the troubadours to Christine de Pizan, and beyond. The situation is not appreciably different for digital humanities projects dealing with modern topics. Even within disciplines, teams pursue their objectives independently more often than not. Neither the scholars nor the sites they create interact with one another in any meaningful way.
Such attitudes survive from the early days of digital scholarship, when each group had to struggle to get its project off the ground. Even if they are not that old, the fact remains that digital humanities are no longer at the pioneering stage. To put it crudely, we have won the battle for access to manuscripts and documents, and for distribution over the World Wide Web. In a major shift in thinking, museums and libraries now accept digitisation as best management practice, even though some repositories remain wary of allowing their material to be posted to scholarly Web sites.
Even here, we have made progress. In recent years, we have seen innovative partnerships between scholars and repositories. Take, for example, the unprecedented agreement between the Bibliothèque nationale de France (BnF) and the Eisenhower Library at Johns Hopkins. By December 2009, the BnF and the digital curators at Johns Hopkins will have completed digitising some 140 manuscripts of the most popular vernacular romance of the Middle Ages (the Roman de la Rose) and integrating them into the project's Web site, www.romandelarose.org.
Over the last decade or so, the Hopkins project has developed tools to allow intensive study of manuscripts, while also encouraging their use in teaching. Even before all of the French manuscripts have been posted, data show that the site, as it has evolved, has radically transformed the teaching and scholarship of this work.
Because it is the largest digital manuscript library of a single work—approximately 100,000 images—the site also offers graphic testimony to the mode of existence of vernacular literature in the centuries before printing. Because so many Rose manuscripts incorporate painted miniatures of scenes from the work on their pages, as well as fascinating images on the bottom of the page—pictures portraying subjects not obviously related to the text (including erotic images)—art historians have also profited from this trove of manuscripts.
While critical editions of the text strive to represent the romance as it might have appeared when completed around 1285 C.E. (though without the all-important illuminations), the manuscripts on the Rose Web site offer dramatic testimony to the ever-evolving styles of book making, manuscript painting, and the demographics of readership during the 250-odd years of this work's pre-modern existence. To study Rose manuscripts from different periods of the late Middle Ages is to understand instinctively how the work could so profoundly influence literary works composed in the following centuries... and why it would be fruitful for the Rose Web site to be able to interact with others devoted to late medieval authors.
The collaboration between Hopkins and the Bibliothèque nationale de France will soon be extended to the museum-going public. This winter and spring will witness exhibitions at the BnF in Paris and at the Walters Art Gallery in Baltimore that feature displays of Rose manuscripts interspersed with computer terminals linked to the Rose Web site. Readers will be able to look at rare manuscripts carefully preserved in display cases, but then, thanks to the interactive nature of the Web site, experience the thrill of seeing for themselves an entire manuscript instead of only the two folios typically viewable in a museum display case.
Before considering why it has been so difficult to promote a culture of interoperability, let us look at one other example of innovative collaboration: 'Parker on the Web.' This project, linking Stanford University Library with Corpus Christi College, Cambridge, has made it possible to digitise the Parker Library, one of the most significant Renaissance collections of knowledge in the world. Matthew Parker (1504-1575) was a powerful figure in the English Reformation: chaplain to Anne Boleyn, Master of Corpus Christi, Vice-Chancellor of Cambridge University, and Archbishop of Canterbury (1559-1575). When Henry VIII dissolved Catholic monasteries during the Reformation, Parker acquired manuscripts from monastic libraries, particularly, though far from exclusively, those of Anglo-Saxon provenance, which he hoped would provide evidence of an early English-speaking Church independent of Rome. The Matthew Parker library contains some 600 manuscripts as well as printed books and documents ranging from the 6th to the 16th century. The Parker Library thus contains examples of many historical, literary, and theological works and commentaries that formed the basis for medieval intellectual life. Not surprisingly, it has manuscripts of works by authors such as Boethius (c. 480-c. 525 C.E.) and Chaucer (c. 1343-1400) that influenced or were influenced by the Romance of the Rose, to name but one example. So why not extend the collaboration between Stanford University Library and Corpus Christi a bit further to include Hopkins's Rose project?
Scholars would find it extremely rewarding while working with the Rose Web site to have access to Parker at Stanford. To compare a late classical reference cited in the Rose with a contemporary manuscript from which the work might have been quoted could tell us much about the logic of inter-textual reference in literature of the period. It also offers an opening for historians and literary scholars to collaborate with their colleagues in theology and philosophy, who can be expected to have at least a passing interest in understanding how their authors were read and used at that time. How is it that works written a thousand years earlier could seem so 'present', so up-to-date to 13th century thinkers? One answer, illustrated by 'Parker-on-the-Web,' is the two-fold status of late antique authors in the medieval period. On the one hand, they had the unmistakable stamp of authority (after all, they had survived as long as the Bible); on the other, like the Bible, they could seem 'modern' since they circulated in manuscripts produced relatively recently and written in Latin, which 13th century authors themselves spoke and wrote fluently. Classical authors were thus ancient and modern at the same time: a double guarantee of status.
Putting the Cogito Back into Digital Humanities
We can all agree, presumably, that Parker-on-the-Web and the Rose digital manuscript library represent a useful model of collaboration. Neither would be nearly so impressive - if indeed they could exist at all - without the work of teams at Stanford and Corpus Christi, or, for the Rose, at Hopkins and at the BnF. So they exist; but do they think, as Descartes might have said? The problem with many digital humanities projects is that they often tend to 'put tools before cognition.' Understandably so, because without appropriate tools, Web sites cannot function. But tools and technical protocols are not the reason why someone originally conceived a need for the data provided by the site. We make information available on the Web for scholarly use. Somewhere between perceiving the need for the data and the complex task of making them available, we may lose sight of the project's 'cogito,' the 'work of thinking' that led us to it in the first place.
That is hardly surprising because digital humanities represent a very different approach to research from analogue scholarship. The difference is as stark as that between the typewriter and the computer. French has no single word for the typewriter, only a descriptive phrase: machine à écrire, 'a machine for writing.' The French expression emphasises writing, which in turn presupposes thinking, and the research that nurtures both thought and writing. What comes out of the typewriter is what the writer mechanically enters... one letter at a time. The typewriter is neither interactive, nor a source of information. We often forget that the pen and typewriter shaped the phenomenology of humanities research. It all stemmed from a dedicated individual willing to immerse him- or herself in the pursuit of information: one document at a time, one archive at a time, following tenuous threads of information from one source to another, often across oceans and continents. The individual researcher first made notes and then formulated conclusions in written arguments. So what has changed?
Well, scale, for one thing, and, for another, the speed and accessibility of masses of data available to the scholar through the collaboration of computer and the World Wide Web. Think of the 100,000 images of the Rose we spoke of a moment ago. Or imagine the even larger database of books and manuscripts rescued from English monasteries in the Parker library. These examples tell us that the Internet allows us to aggregate large amounts of data. One person, working alone in the tradition of analogue scholarship, cannot begin to sort through such masses of information. But if the unfamiliar challenge of working with quantitative data represents a sea change for traditional humanities scholarship - a prospect that gives pause to many humanists (even those who avidly embrace their computers and in theory approve increased accessibility to data) - it is an evolution that goes far beyond scholarly protocols.
Scale transforms content and requires that scholars formulate new questions based on novel assumptions even when dealing with the most familiar objects. But can we speak of objects as 'familiar' once they have been radically scaled? Writing a treatise on optics in 1263, Roger Bacon points to the metamorphosis of objects when viewed from different perspectives. In particular, he notes how proximity and distance alter our perception of the same object. A mountain viewed from afar appears small, easily taken in by the eye. From its base, however, the viewer cannot begin to perceive the mountain as such. It cannot be measured, described, or analysed from a single position or by a single individual standing at the foot of the mountain.
What is true of a mountain viewed from far and near holds true, with due allowance for the differences, for medieval literary works. The modern critical edition is the equivalent of viewing a mountain at a distance. It offers a global view of the work, in a manner that bolsters the sense of authorial identity and control: the editor's assumption is that the text represents as faithfully as possible the text composed by the poet. Unfortunately, the modern critical edition has little to do with the reality of medieval literary practice; it is an artefact of analogue scholarship based on print technology whose only feasible option was to choose a base manuscript, transcribe its text, and make notes of interesting variants from other manuscripts. In the print context, there was simply no way to make the manuscripts themselves available. If truth be told, there was also little desire to do so. Scribes, often regarded as careless or even incapable of reproducing the author's text word for word, were thought to introduce errors, or even their own thoughts, into the work they were copying, thereby, so the belief went, 'corrupting' the author's intention.
The Internet has altered the equation by making possible the study of literary works in their original configurations. We can now understand that manuscripts designed and produced by scribes and artists, often long after the death of the original poet, have a life of their own. It was not that scribes were 'incapable' of copying texts word-for-word, but rather that this was not what their culture demanded of them. This is but one of the reasons why the story of medieval manuscripts is both so fascinating, and so very different from the one we are accustomed to hearing. But it requires rethinking concepts as fundamental as authorship, for example. Confronted with over 150 versions of the work, no two quite alike, what becomes of the concept of authorial control? And how can one assert with certainty which of the 150 or so versions is the 'correct' one, or whether such a concept even makes sense in a pre-print culture?
If scale can change entrenched attitudes about something as fundamental as 'authorship,' it must also affect other traditional views of medieval literature, like language, for example. For well over a century-and-a-half, received opinion has held that medieval French, on the model of Latin, had a 'standard' literary form and a much more varied vernacular stemming from 'Vulgar Latin,' with local dialects making communication difficult from one region to another. On this view, it was the literary language that bound medieval French culture together, creating a courtly language that could be understood throughout the realm. In the last thirty years, classicists working on the so-called 'dark ages' from the 5th to the 10th centuries have sharply questioned the truth of the Vulgar Latin / literary Latin dichotomy. What was thought to be 'Vulgar Latin,' supposedly spoken by the illiterate masses, has now been shown to be simply classical Latin slowly evolving into the various Romance tongues.
What becomes of the concept of a literary language, distinct from spoken dialects, when confronted with a data mass of some 160 manuscripts of a work some 22,000 lines in length? We have never had so extensive a body of linguistic evidence from a single work that will offer linguists, syntacticians, grammarians, and phonologists the chance to test the concept of a koiné that had regional variations but was otherwise conformist. If their research confirms the concept that has ruled medieval studies since the mid-19th century, then it will become unassailable. One suspects, however, that their findings will lead to far more remarkable observations.
Then again, there is the challenge posed by digital datasets to traditional ways of organising scholarly units in the university. Simply by aggregating manuscripts, one realises the geographical and chronological extent of vernacular literature, especially of a work as popular as the Rose. That data in turn leads to the recognition of comparable artefacts elsewhere. Since medieval French romance in general, and the Rose in particular, was translated into other European languages (Middle English, Spanish, Italian, German, Icelandic, and so on), why would scholars not want to collaborate in exploring this extended data? In the case of the Rose project, we have involved scholars trained in medieval Spanish and Middle English. They have extended the range of the project not simply by locating and describing examples of the Rose in Spain and England, but also by identifying its engagement with and influence on other literary works of the period in those countries. Such examples illustrate how manuscript datasets challenge the disciplinary and national boundaries that partition scholarly inquiry in academia.
All of this points to a sea change in humanities scholarship that gives pause to many. This is not surprising since humanists typically have neither training nor experience in confronting large-scale data. Accustomed to working with a discrete corpus, they do not have the same comfort level with data mass as scientists; nor have they typically had occasion to work collaboratively. Indeed, humanities research protocol, based on the model of evaluating the individual researcher, has typically discouraged collaboration as somehow 'infra dig,' beneath one's dignity.
While attitudes more favourable to the needs of digital humanities projects are slowly evolving, we have yet to see a general acceptance of new approaches. Indeed, even where digital projects have been embraced, evidence suggests that attitudes from traditional or analogue scholarship continue to influence the way projects are evaluated, a practice that younger, untenured colleagues often find intimidating. At least as far as the demands of humanities credentialing are concerned, the dominion of the typewriter has yet to give way to that of the computer, metaphorically speaking.
It is not news in 2009 that digital humanities require a wholly new mind-set. That is what is meant by 'the cogito' of digital humanities. We cannot continue to focus simply on digital projects while ignoring the intellectual and social context in which they take place. We must begin by accepting the very different social, intellectual, and institutional context fostered by data-driven research. Digital scholarship creates a potentially productive network at many levels, and it entails significant change at the level of the individual scholar, in terms of operational methods, and in the kind of intra- and extra-institutional partnerships required.
The typical digital project cannot be pursued, much less completed by the proverbial 'solitary scholar' familiar to us from the analogue research model. Because of the way data is acquired and then scaled, digital research rests on a basis of collaboration at many levels:
- First, as a partnership between scholars and IT professionals;
- Second, as a dynamic interaction between scholars of the same and different disciplines, since the data is too large to be handled by a single scholar, and too varied to be encompassed by a single discipline;
- Third, in concert with a team of IT professionals responsible for designing the site, developing functionality as requested by the scholars, posting the data, and, not least of all, assuring access to end-users around the world.
It would be a mistake to ignore the fundamental changes to the professional relationships within institutions implied by such intensive collaboration. Similarly, it would be naïve to imagine that they have no impact on what we might term the 'etiquette' factor of scholarship. Traditionally etiquette was based on social conventions which were, in turn, largely predicated on hierarchical arrangements: e.g., the concept of 'a guest of honour,' alternating 'ladies' and 'gentlemen' at table, the precedence of women in social situations, but not necessarily in other real-life contexts. Analogue scholarship had its own set of conventions, which shared with etiquette the core principle of 'decorum,' which translated to a rigid code of conventions for research practices, scholarly methods, and above all the authority of the individual scholar.
Not all of this makes sense, or is even possible, in the digital context. Since no single individual can create the conditions or manage the data that constitute a digital environment, the Web is thoroughly democratic. If the scholarly end-user is the one who conceives and publishes articles about the content of a Web site, such publications are not the only ones to emerge from the collaboration of scholars and IT professionals. The latter, too, produce innovative articles about technical innovations they made in creating the site. Indeed, it is often the case that technical staff achieve breakthroughs that end-users have neither the expertise nor, frankly, the interest to fathom. The scholarly end-user must understand, however, that since his or her research depends upon the responsiveness of the IT professionals to the users' needs, both are enmeshed in a social network configured, if not quite as a matrix, then certainly as a flattened hierarchy.
Since the quasi-matrix model of the scholarly digital world is obvious to anyone who stops to think about it, it ought to serve as a guide to the design and use of Web sites themselves. But there is something of a contradiction in the way Web sites and their functions are often conceived. Frequently, they find themselves at cross-purposes with the inherent collaborative potential of the Internet by opting to use dedicated tools limited to the particular needs of the project. Such proprietary thinking may be appropriate for commercial sites, but does it make sense in the context of the intellectual world where interoperability and co-operation can do so much more to advance knowledge, not to mention extending the range of projects?
Is it not the case that the 'cogito' of scholarly Web sites begins with making them 'smarter' by designing them to facilitate collaboration from the ground up, as it were? One way to achieve this goal is to make projects 'tool-agnostic.' Rather than creating tools specifically for a given set of material, one designs the platform to accommodate varied content. The capacity of a site to host multiple projects invites collaboration among scholarly groups that would otherwise each be putting up their own separate site. This in turn will promote scholarly communication and collaboration: in short, true interoperability. Technically such a model is not difficult to achieve; the problem lies elsewhere, in convincing scholars and IT professionals to think imaginatively and proactively by creating an 'ecumenical' platform for their original content, i.e. one that is general in its extent and application. This is precisely what we have done at Johns Hopkins for the Rose project. As a result, we are ready to incorporate material from other sources, existing or planned, that will transform what is currently a single-focus site into a full-fledged Digital Library of Manuscripts and Incunabula.
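For readers who think more readily in code, the principle can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how the Rose site is actually built: the names Item, Platform, and total_images, the collection labels, and the sample figures are all invented for the purpose. The point is only that the platform and its tools refer to generic fields and never to the particulars of one project.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Item:
    """One digitised object, described only by generic fields (hypothetical schema)."""
    collection: str   # project label, e.g. "rose" or "parker" (invented labels)
    identifier: str   # shelfmark or other local identifier
    image_count: int  # number of page images held for this object


# A "tool" is any function over a list of items. It knows nothing
# about which project the items came from.
Tool = Callable[[List[Item]], object]


@dataclass
class Platform:
    """Hypothetical content-neutral host shared by several projects."""
    items: List[Item] = field(default_factory=list)
    tools: Dict[str, Tool] = field(default_factory=dict)

    def add_items(self, new_items: List[Item]) -> None:
        """Accept material from any project without special handling."""
        self.items.extend(new_items)

    def register_tool(self, name: str, tool: Tool) -> None:
        """Make a tool available to every hosted project."""
        self.tools[name] = tool

    def run(self, tool_name: str, collection: Optional[str] = None) -> object:
        """Apply a tool to one collection, or to all of them when none is named."""
        selected = [item for item in self.items
                    if collection is None or item.collection == collection]
        return self.tools[tool_name](selected)


def total_images(items: List[Item]) -> int:
    """Example tool: count page images across whatever items it is given."""
    return sum(item.image_count for item in items)


if __name__ == "__main__":
    platform = Platform()
    platform.register_tool("total_images", total_images)
    # Placeholder records; identifiers and counts are invented.
    platform.add_items([Item("rose", "MS A", 300), Item("parker", "MS B", 120)])
    print(platform.run("total_images"))
    print(platform.run("total_images", "rose"))
```

Because the tool is registered once and applied to whatever the platform holds, a second project gains it simply by adding its items; nothing has to be rewritten for the newcomer.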
Making Web sites 'ecumenical' in this way sets conditions where interoperability is feasible. This step alone, however, cannot assure a truly collaborative environment. If digital projects have not been able to interact more effectively, it is in large measure because they remain proprietary to a sub-discipline. To encourage other projects and their organisers to see the advantage of collaboration, scholars should look beyond the specificity of their topic to the larger rubric under which both may be subsumed. In short, we need to change our thinking when conceiving scholarly projects. Simple awareness that the utility of any Web site can be enhanced by its ability to interact with related data can alter the approach to digital projects. As with any co-operative endeavour, the same spirit should motivate each component. In this case, it is the spirit of the scholarly method that is at issue. If the scholars who establish a project foresee the benefits of interacting with related projects (such as Parker-on-the-Web and Rose, or medieval and Renaissance projects in general where applicable), the expectation is that they will conceptualise their endeavour from a comparative and collaborative viewpoint from the outset. To maintain the parallelism with the technical side, we can call such a comparative approach 'use-agnostic,' in the same way that the site platform is 'tool-agnostic.'
What are the implications of a use-agnostic approach to digital projects? To begin with, it brings the scholarship in line with the digital environment by recognising the social network of collaboration between scholars, students, IT professionals, and even the general public. It recognises that the concept of scholarly 'ownership' is an artefact of analogue protocol. Use-agnostic projects involve scholars from other disciplines who become involved through the data which contribute to the design of the site, thereby assuring that all the expertise needed to parse the breadth and depth of the data will be brought into play. A further benefit of the 'use-agnostic' approach is that it will inhibit sub-disciplinary prejudices from 'capturing' or unduly dominating the site, thereby decreasing its usefulness to others with a legitimate interest in the material.
Lest this proposal be misunderstood, 'use-agnostic' should not be construed as offering licence to engage in irresponsible exploitation of site materials. It does encourage scholars, students, teachers, and interested members of the general public to use them productively. The fact that a site is use-agnostic by no means implies that project participants cannot derive discrete personal gain – individual publications, etc. – from the project. One imagines, however, that such publications will be enriched by access to the comparative data, which the site makes possible. Similarly, a site so designed can enhance potential collaborative publications by scholars working on related data from several projects. That is certainly a strong rationale for interoperability. Data-level interoperability is a major area of computer science research and for scholars can and should form the basis for the most productive kinds of comparative critical study.
No matter how compelling the rationale for making digital projects interoperable, good reasons do not in themselves produce change. We must go further to reflect on the means of encouraging new ways of thinking. What factors might we consider likely to support this aim? To begin with, we must ask: what motivates scholarship? What do scholars hope to achieve? If one can then show how the change we are talking about can help to realise those goals more effectively, then incentive can be matched to motivation. This may sound relatively straightforward, but one should never underestimate the resistance of many colleagues, even relatively young ones, to digital practices. For digital sceptics, what they refer to as 'technology' seems orthogonal to 'scholarship,' as they define it.
So, there is no question but that we need to educate colleagues. And even before we begin to illustrate how collaborative digital projects can produce innovative scholarship, we will have to prove to their satisfaction that 'digital' and 'scholarship' are not contradictory concepts. Nothing succeeds like a detailed practical demonstration: showing by doing, if you will. This is one case where interactive workshops, a lot of them, at strategic locations cannot but advance such education. Such workshops would require meticulous planning to assure participation by key scholars from related disciplines. The demographics of such workshops will be crucial to their success. The mix should include senior scholars and junior scholars who have compelling projects, and who can demonstrate their advantages persuasively. More difficult to manage, and perhaps delicate to arrange, will be assuring the presence of a target audience of sceptical colleagues, i.e. those for whom 'technology' and scholarship represent antithetical states.
But energy should not be expended simply on 'proselytising.' Equally urgent is the need to organise workshops to demonstrate to colleagues who already have digital projects why it might be to their advantage to join forces with other teams. Defining and demonstrating the advantages of inter-site collaboration will help the community to understand better and think more creatively (and productively) about these issues. Once again, it is important for workshops to include a mix of senior scholars with their younger peers and recent PhDs. Bearing in mind the flattened nature of digital scholarship hierarchically-speaking, one can imagine the younger participants contributing as much to the education of their older colleagues as the other way around.
Finally, there is the problem of proliferating projects with little or no communication with other ventures, even those falling within the same or related domains. It is a fact that growing numbers of sites compete for the same funding resources. We are at or near a zero-sum game. One might envisage solving the problem of unco-ordinated proliferation and diminishing funding by aggregating projects that fall within related spheres. As we have seen, the advantage of tool-agnostic and use-agnostic approaches is their potential to encourage collaboration.
Few digital projects require dedicated servers or platforms to carry out their purpose. Just as 'cloud computing' has increasingly become the choice for storing one's documents, so data curation centres specialising in related projects might be designated to consolidate projects based at a wide variety of locations. If the Medieval Digital Library at Hopkins, for example, were to become the curation site for a number of programmes dealing with medieval manuscripts and incunabula from around the U.S. and even other countries, scholars working on these projects would have the same level of access and control over them as if they were on servers at their own institution. But they would enjoy a further distinct advantage in avoiding the expense and logistical problems of organising and maintaining a 'back office' (i.e., all the technical aspects of their project). They could concentrate on the scholarly enterprise itself.
Consolidation would also offer the advantage of true interoperability with or without collaboration. Given a number of authors, manuscript types and genres preserved on the same platform, the teams of scholars responsible for the different projects could work comparatively with each other's resources. Whole new aspects of medieval culture and intellectual life might be discovered serendipitously, even accidentally. And what is true for medieval and Renaissance projects can also be envisioned in the case of modern digital enterprises.
One other advantage of aggregating curation centres comes to mind. A finite number of centres would prove easier to fund aggressively at levels that would allow them to offer a menu of technical services, and even to provide custom functionality. Current models of funding just do not allow individual projects to envisage such beneficial features. In essence, these consolidated curation centres would operate as consortia, with a steering committee or board drawn from contributing projects or institutions. Were one to begin to realise this concept, many more advantages would undoubtedly emerge. Other research disciplines have reached similar conclusions. None, perhaps, articulates the need for and benefits of cross-disciplinary research and collaborative data analysis better than does an ARL report published in 2006: Long-term Stewardship of Digital Data Sets in Science and Engineering.
These proposals do not begin to respond comprehensively to the questions posed. They do, however, point to problems that need to be faced sooner or later. I would contend that we can ill afford to delay their implementation for much longer.
- Roman de la Rose Digital Library http://www.romandelarose.org/
- In this connection, it is instructive to note the demographics of the Rose community as analysed at recent scholarly meetings in the United States in 2008. Countries represented are: Italy, Spain, Germany, the Netherlands, France, Canada, Australia, New Zealand, Japan, the U.K., and the U.S.
- Analysis of the Rose community conducted in 2008 demonstrates that the site has attracted a broad constituency consisting of literary scholars (36%), Art historians (16%), students of medieval costume design (9%), medieval cultural historians (8%), as well as medieval re-enactors (7%).
- To Stand the Test of Time - Long-term Stewardship of Digital Data Sets in Science and Engineering - A Report to the National Science Foundation from the ARL Workshop on New Collaborative Relationships: The Role of Academic Libraries in the Digital Data Universe, 26–27 September, 2006, Arlington, VA http://www.arl.org/bm~doc/digdatarpt.pdf
Stephen G. Nichols is the James M. Beall Professor of French and Humanities in the Department of German and Romance Languages at Johns Hopkins University. He is also Chairman of the Board of Directors of the Council on Library and Information Resources (CLIR). He specialises in medieval literature in its relations with history, philosophy, and history of art. One of his books, Romanesque Signs: Early Medieval Narrative and Iconography, received the Modern Language Association's James Russell Lowell Prize for an outstanding book by an MLA author in 1984. Another, The New Philology, was honoured by the Council of Editors of Learned Journals in 1991. Author, editor, and co-editor of 24 books, Nichols conceived and is co-director of the Digital Library of Medieval Manuscripts and Incunabula at the Milton S. Eisenhower Library of Johns Hopkins. He has lectured and written on digital scholarship in the Humanities, e.g. "From Parchment to Cyberspace," "Digital Scholarship, What's all the Fuss?" "'Born Medieval:' Manuscripts in the Digital Scriptorium," "Manuscripts and Digital Surrogates: Sibling or Counterfeit?", "There's an Elephant in the Room: Digital Scholarship and Scholarly Prejudice."