Following the success of the inaugural event last year, the Mashed Museum day was again held the day before the Museums Computer Group UK Museums on the Web Conference. The theme of the conference was 'connecting collections online', and the Mashed Museum day was a chance for museum ICT staff to put this into practice.
Earlier this year I received an email that read:
You are invited to a day of coding, thinking and idea sharing with a select group of museum colleagues. Mashed Museum 2008 will be a day of free-form thinking and doing with only enough structure to make sure we actually get something out of the (considerable) collective brainpower in the room.
A few weeks later, a small group of people who work in museum or cultural heritage ICT met at the University of Leicester to spend a day prototyping and using lightweight application development tools and Web services to experiment with museum data in innovative ways.
Delegates included 'Web and database programmers, academics and digital media mavens' * from a range of museums, universities, and from the commercial sector (with the occasional contribution from international colleagues who were watching the live video feed, or following the Twitter conversations).
As many of us work in relative isolation, the company of our peers can be as inspiring as it is informative. We work closely with cultural heritage data and often have a good understanding of those who use our content, but we rarely have time to experiment, to update our skills, or to try new tools, programming frameworks or development methodologies. The Mashed Museum day was an invaluable chance to work on interesting projects in the company of our peers.
We experimented with a variety of technologies and data sources, including MIT Simile, Yahoo! Pipes, Freebase, Wikipedia, IBM's Many Eyes, Yahoo!'s term extraction service, Poly9, OpenCalais, FireEagle, Twitter, PicLens, RSS ('really simple syndication' or 'RDF site summary'), XML, unofficial or internal museum APIs (application programming interfaces) and an aggregated feed of screen-scraped museum collection items.
These tools were variously used to create maps, timelines, geo-located RSS feeds, games, several new ways of browsing and visualising museum data and the links between individual records, and to add meaning and layers of metadata to existing data. These were quite impressive results, but perhaps the most important outcome of the day was the range of experimentation, and the conversations and cooperation between participants.
As with last year's 'mashed museum' event, the lack of reusable data with clear use and rights statements was an issue. However, progress has been made since last year, though some supplied data sources could not be used outside the experimental day itself.
Mike Ellis, the organiser of the day in conjunction with Ross Parry, has made a ten-minute video in which some participants demonstrate and explain their experiments. You can also find blog posts, images and videos from various attendees by searching the Web for the tag 'ukmw08'.
This year's UK Museums on the Web conference was concerned with questions of 'how (and why) should museums connect their online collections'. The issue was tackled by speakers from within and outside the museum sector, by practitioners and theorists and by representatives of large- and small-scale projects.
Tom Loosemore of Ofcom gave the keynote speech. It was bound to be of interest, as Ofcom, the 'independent regulator and competition authority for the UK communications industries', had recently caused a stir in the digital cultural heritage world for its assessment of the extent to which public sector Web sites delivered on 'public service purposes and characteristics' in its review of public service broadcasting, entitled The Digital Opportunity.
He began by asking, 'how many of you are on the main board of your institution?'. He sees the Internet as a platform for public service and enlightenment and would like to see museums contribute more. However, the crucial missing link between being able to realise the potential of museums and the cultural sector in public service broadcasting is the lack of leadership and vision, and the lack of recognition of the potential of the Web by organisational heads. Later, in the discussion after the first session, he stated that, 'letting go [of data] is how you win, but it's a profound challenge to institutions and their desire to maintain authority'.
He discussed the BBC's 15 Web Principles, and urged delegates to think about the 'native opportunities' that could only happen on the Internet. How could they help their institution achieve its purpose? He talked about the importance of 'measuring enlightenment' and of providing real evidence for funders of the value of online material. If value is defined as 'reach multiplied by quality', how do you measure quality? He discussed the use of the 'net promoter' method at the BBC.
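For readers unfamiliar with the 'net promoter' method, the following is a minimal sketch of how such a score is commonly calculated: respondents rate from 0 to 10 how likely they are to recommend something, and the score is the percentage of promoters (9 or 10) minus the percentage of detractors (0 to 6). The function names and the sample ratings are hypothetical, and combining the score with reach is only one reading of 'reach multiplied by quality'; it is not a description of how the BBC applied the method.

```python
def net_promoter_score(ratings):
    """Return a net promoter score between -100 and 100.

    `ratings` is a list of 0-10 answers to the question
    'how likely are you to recommend this to someone else?'.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)   # ratings of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # ratings of 0 to 6
    return 100.0 * (promoters - detractors) / len(ratings)


def value(reach, quality):
    """Hypothetical illustration of 'value = reach multiplied by quality'."""
    return reach * quality


# Invented sample ratings, for illustration only.
SAMPLE_RATINGS = [10, 9, 8, 7, 6, 3, 10, 9, 5, 10]

if __name__ == "__main__":
    score = net_promoter_score(SAMPLE_RATINGS)
    print("Net promoter score:", score)
    # Assumption: the score stands in for 'quality' and reach is a visit count.
    print("Value:", value(reach=5000, quality=score))
```

Because the score depends only on what audiences say they would recommend, it places the judgement of quality outside the institution, and it needs enough responses to be meaningful.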
Overall, his keynote speech was a great start to the conference and provided an interesting perspective from outside the museum sector. One might question the applicability of a broadcast model in a 'participatory' online environment, and look for a version of the 'net promoter' model that works with extremely niche audiences, but he left the audience with some important challenges: 'how do you take the opportunity to digitise your collections and reach a whole new audience?' 'How can you make better use of cultural objects that were previously constrained by physicality?'.
Lee Iverson of the University of British Columbia framed his talk as 'semantic pragmatics' - the what, how and why of semantic technologies for museums. He said that museums have a significant opportunity to push things forward, but they must understand the potential and limitations of the environment. Museums can be an order of magnitude better if they can federate and aggregate content. He discussed the benefits of connecting between museums, and from museums to outside the sector.
His steps for becoming connected were summarised as: expose your own data from behind presentation layers; find other data; integrate with that other data, and engage with users.
He provided an overview of 'the pragmatics of standards' that could be summarised as 'just do it' - make agreements, get the project to work, and then engage in the standardisation process. He also discussed the practical differences between XML, RDF (Resource Description Framework), and RDFa (Resource Description Framework attributes). His advice, 'ignore the term "ontology" - it's just a way of talking about a vocabulary' is extremely useful because connected collection projects often struggle with the issue of ontologies.
Paul Marty of Florida State University asked, 'What does it mean to say x% of your collection is online? For whom is it useful?'. He said that research shows audiences want engagement, and that there is a virtuous circle between physical museum visits and Web site visits. He believes museums should not just give the general public a list of collection items but rather 'give them a way to engage'. It's not without challenges - 'engaging a community around a collection is harder than providing access to data about a collection'. Paul expanded on some of his points in the discussion, and some of the audience may have been relieved to hear that the US and UK museum sectors have made some of the same mistakes with the digitisation of collections and the production of Web sites that do not have reusable data.
Bridget McKenzie of Flow Associates discussed the need for gentleness when working with 'bottom-up' or experimental approaches in the cultural heritage sector, the role of institutional constraints, and the value in building on the emerging framework of the Web. She said that people are more likely to be engaged if they feel they can shape something or make a personal connection through it.
She suggested that 'critical mass' in digital collections really means 'contextual mass' - the potential to create new contexts for interpretation by combining collections. In an interesting concurrence with Ross Parry's earlier musings about whether museums would be different if they were built today, she asked whether the frame of 'the museum' makes sense anymore, particularly on the Web. What are our responsibilities when we collaborate? If the sector does not provide these contextual spaces or masses, are we missing the chance to share expertise in meaningful ways?
She pointed out that it is easy to revert to the ways in which previous projects have been delivered, especially in a sector where funding plans do not allow for iterative, new and emergent technologies. She concluded that hearing from two emergent projects in the presentations that followed is a step towards sharing learning and resolving current questions.
Carolyn introduced the National Museums Online Learning Project (NMOLP), a partnership designed to increase use of the digital collections of nine museums. She focused on the partnership issues involved, describing the reality as 'like herding cats'. She found that partnership issues are not necessarily built into the project plan, but they had to be addressed in order to avoid problems later in the project. The project focussed on developing a common vision and a set of principles for working together. The project also worked to identify the things uniquely achievable through partnership, the barriers to success, and the types of content or functionality that added value for users.
She identified three levels of 'barriers to success': the new undertaking of working in a collaborative inter-museum way (a first for those partners); the organisational issues - working inter-departmentally with learning, Web, IT, etc. specialists who were not used to working together; and the personal issues around challenges to people's identities as specialists with particular roles.
The challenges specific to the project deliverables included: creating meaningful collection links; sending people to collections sites while knowing that content they would find there was not written for those audiences; providing support for pupils when searching collections; and creating sustainable content authoring tools and processes.
Other issues included the question of how individual museums and the partnership would build and sustain the Creative Journeys (user-generated content) communities; the challenge to curatorial authority and reputation; working models to deal with the messiness and complexity around new ways of communicating and using collections; and copyright and moderation issues.
Richard discussed the technologies used in the project and the ways in which they could be deployed to maximise flexibility and encourage reuse. A federated search that could be deployed across the partner sites was required and is being implemented as part of the project, though not originally part of the project plan. The project has been designed so that the back-end is decoupled from the front-end applications. RSS feeds are used to syndicate user interactions with the collections. They have used lightweight solutions for 'rapid (and reliable) application development'.
Jeremy Ottevanger of the Museum of London provided some background on his involvement with the European Digital Library (also called 'Europeana') and his hopes and concerns for the project. The aim of the EDL is to provide 'cross-domain access to Europe's cultural heritage', as 'our content is more valuable together than scattered around'.
The prototype will launch in November 2008, but the project still does not have enough members from the UK, and very few museums; Jeremy was interested to know why that is. Unfortunately, the project will not provide an API or integrate user-generated content in the first iteration; however, the interface may appear in three or four major languages and some item metadata may be multilingual.
The Europeana site will direct visitors to the original records on the contributing institutions' sites, which should assuage fears about reduced visitor numbers, and there are many other benefits for participating institutions.
Some participants in the discussion raised concerns about the requirement to implement an OAI repository to contribute to sector-wide projects, suggesting there is a need for more information about the ability of existing OAI projects to ingest data in a variety of common formats. The phrase 'aggregation fatigue' was another response to the presentations.
It was suggested that museums could build APIs into their collections so their data could be used in projects without placing any further requirements on the museum when engaging in new partnerships. This may not work for smaller museums, but a combination of museum APIs and the ability of museums without APIs to contribute records to shared repositories might provide a workable solution allowing museum data to be reused in any number of projects or initiatives.
George Oates of Flickr introduced the Flickr Commons Project. The project began when the Library of Congress, which has over 1 million digitised photos, was thinking about how to engage with Web 2.0 and approached Flickr.
What are the advantages of Flickr? It's 'a great place to be a photo'. Flickr is designed specifically to search and browse photographs, supports interfaces in eight languages, has a huge infrastructure to support 2.4 billion photos and 40 million unique visits a month - but more importantly, said George, 'it's made of people'.
As a collecting institution, the Library of Congress does not necessarily own the copyright to its images, or know who the copyright holder might be. It had to devise a new statement, 'no known copyright restrictions', to provide a way to use the Library of Congress content in cases when the institution was not able to trace the copyright holders.
It has had impressive results for viewing figures, relationships with audiences and user-contributed content. The Powerhouse Museum had more views of the collection it had put on the Commons in one month than in the whole previous year it was online on its own site. Users have identified places and people, transcribed signs and provided information about the history behind photos. People have used the comments functionality in Flickr to link to their recent photos of a location.
The information that the community provides is proving useful for the institutions involved. The Library of Congress has updated 176 records in its catalogue, recording that each update is based on 'information provided by Flickr Commons Project 2008', and expects to update more as its staff research the leads provided in notes and comments. However, the project can be challenging for museums, and they should try to 'grow gently' to ensure that the institution can handle the changes and respond to the interactions.
The projects of Frankie Roberto (Science Museum) came out of last year's mashed museum day where the lack of public, reusable cultural heritage data online was a real issue. Discussion after the 2007 mashed museum event eventually turned to extra-institutional methods of obtaining data - screen scraping and Freedom of Information (FoI) requests were suggested. As a result, Frankie sent FoI requests to national museums for their collections data, asking for it in any electronic format as long as it had some kind of structure. He has previously presented the results of this process and will release them on a Web site soon.
He had concerns about big top-down projects in the museum sector and so he suggested five small or niche projects. In order to devise the projects he asked himself, 'how do people relate to objects?'. He is willing to give the domain names to anyone who is interested in taking the projects further. His five suggested projects and their rationales:
Fiona Romeo of the National Maritime Museum presented a quick case study undertaken to find out if she and her colleagues could make more of their collections datasets with information visualisation. They had a set of data about memorials around the world that included the text on the memorials. It was quite rich content and they felt that a catalogue was probably not the best way to display it. They commissioned Stamen Design for the project and sent them CSV (comma-separated values) files for each table in the database without any further documentation.
The direct outcome was a beautiful and meaningful visualisation of the memorial data, and the process provided them with a better understanding of their data. The project also shows that giving your data out for creative reuse can be as easy as providing a CSV file, though ideally every collection Web site should provide an API or feed of the data.
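As a rough illustration of how small the step from a CSV export to a reusable feed can be, the sketch below reads CSV text with a header row and wraps the rows in a simple JSON document. It is not the National Maritime Museum's or Stamen's code; the column names, sample rows and the shape of the JSON output are all invented for the example.

```python
import csv
import io
import json


def csv_to_records(csv_text):
    """Parse CSV text with a header row into a list of dicts, one per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


def records_to_json_feed(records, title):
    """Wrap the records in a small JSON document that other tools could read."""
    feed = {"title": title, "count": len(records), "items": records}
    return json.dumps(feed, indent=2)


# Hypothetical sample export; these columns and values are invented.
SAMPLE = """id,place,inscription
1,Greenwich,In memory of the crew
2,Portsmouth,Lost at sea
"""

if __name__ == "__main__":
    records = csv_to_records(SAMPLE)
    print(records_to_json_feed(records, "Memorials (sample)"))
```

A real export would still need a clear rights statement and some documentation of the columns before others could reuse it with confidence.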
Fiona also spoke about her experiments at the mashed museum day; she cut and pasted transcript data into IBM's free Many Eyes tool. Her experiment shows that really good tools are available, even if you do not have the financial resources to work with a company like Stamen. In the discussion that followed, Fiona said that her personal measure of success was creating a culture of innovation and engagement, and creating a vibrant environment, and this small case study was a good step towards that end.
At the end of this session, Mike Ellis presented a summary of the 'mashed museum' day held the day before.
The final open discussion included the idea that the provision of an API for any collections Web site should be a given, fears about putting content online reducing use of the physical collections (neatly rebutted by Paul Marty with 'since the State of Florida put pictures of their beaches on their Web site, no one goes to the beach anymore'), EXIF (exchangeable image file format) metadata on Flickr, and the need for more meaningful metrics. It was suggested that the sector 'push data to DCMS instead of expecting them to know what they could ask for' and that we should use the opportunity to change the way success is measured.
The debate engaged with the idea of watermarking or micro-marketing metadata, of sending it out in a wrapper and making it embeddable. The importance of wrappers and metadata for curators was described with the statement, 'they're more willing to let things go if people can get back to the original source'.
The discussion returned to the use of 'net promoter', with some saying it was a flawed metric because people do not recommend new or difficult knowledge or something they disagree with; 'what gets recommended is a video of a cute 8 year-old playing Guitar Hero really well. People avoid things that challenge them'. Others said that the advantage of the 'net promoter' is it takes the judgement of quality outside the originating institution.
Ross Parry summarised the 'take-home' ideas for the conference, pointing out that the discussions conflated many definitions of 'collections' including items, images, records, and Web pages about collections. Ross reminded us that technology is not the problem: it is the cultural and human factors, and that 'we need to talk about where the tensions are, we've been papering over the cracks'. The sector is changing, with a realignment of the 'axis of powers' creating a vacuum that might be filled by the Collections Trust or the National Museum Directors' Conference. In that context, what's the role of the Museums Computer Group? What should it do, and how?
Finally, he noted that the language has changed: previously it was about 'digitisation, accessibility, funding'. Three words he heard at the conference today were 'beauty, poetry, life'. As he said, we're entering an exciting moment.
In many ways, this conference provided an overview or review of the discussions on the Museums Computer Group email list, and of the progress (or lack of it) towards the ideals proposed by many working in ICT in the cultural heritage sector over the past years. It also allowed the sector to consolidate and share the learning from previous digitisation projects at a time when we have many exciting and challenging opportunities.
New opportunities have been opened up by lightweight technologies and development frameworks. New tools are emerging to enable audiences to engage with our collections and each other in personal ways; and just as importantly, tools that allow us to engage directly with our audiences.
One lesson from the Mashed Museum day was that in a sector where innovation is often hampered by a lack of financial resources, time is a valuable commodity. A day away from the normal concerns of the office in 'an environment free from political or monetary constraints' is valuable and achievable without the framework of an organised event. An experimental day could also be run with ICT and curatorial or audience-facing staff experimenting with collections data together.
Taking a wider view, what social, technical, financial and legal issues need to be resolved so we can take full advantage of these opportunities? Copyright and metrics are areas where progress is clearly required. What institutional and personal fears do we need to address? Loss of control, authority, relevance, revenue? To return to Tom Loosemore's opening question, with whom do we need to work to make sure all these ideas for new approaches to data, to aggregation and federation, and for new types of experiences of cultural heritage data actually go somewhere?
*Editor's note: definitions of 'maven'