
Proceedings of the International Conference (available in English), Florence, 15-16 December 2009
Abstract:
The second edition of the conference Cultural Heritage on line. Empowering users: an active role for user communities confirmed the interest in the proposed themes, as well as the reliability and the solid network of collaborations that the Fondazione Rinascimento Digitale has earned in the field of new technologies applied to the enhancement and conservation of cultural heritage.
Document type: Conference proceedings
Event: Conference
Topic: Digital preservation
Maurizio LUNGHI
The second edition of the conference, Cultural Heritage online – Empowering users: an active role for user communities, was successfully organised in Florence on 15-16 December 2009 by the Fondazione Rinascimento Digitale – FRD in cooperation with the Italian Ministry for Cultural Heritage and Activities (General Direction for Archives and General Direction for Libraries, Cultural Institutes and Authors’ Rights), and with the Library of Congress, which mobilised the whole network of National Digital Information Infrastructure and Preservation Program - NDIIPP partners.
We warmly thank our promoters, the local and regional authorities, and a fantastic group of thirty-five supporters who really were the engine and petrol in spreading information about the event within their user communities, involving cultural heritage institutions and professional associations, technology providers and research centres. Special thanks also go to the Director of the Teatro della Pergola for his marvellous hospitality and technical support, and finally to the Palazzo Borghese, the incredible venue for the gala dinner. Much of the success was made possible by all the speakers and chairpersons, as well as by the FRD staff, very limited in number but determined and motivated.
The FRD is a structural foundation promoted by the Ente Cassa di Risparmio di Firenze to support the best adoption of ICT and international standards, with special attention to Internet technologies and digital preservation. The Foundation operates in all of the sectors connected with the production, preservation and diffusion of digital memories by means of research projects and test-beds, management and preservation of digital memories, implementation of innovative applications, and dissemination of this knowledge, with tutorials or on-line training courses.
The FRD invests its annual budget in research projects or prototypes in synergy with the main cultural actors, and promotes awareness and training about digital libraries and digital preservation, with special attention to the younger generations.
The main interests and activities of the FRD are in two fields: digital libraries for the cultural and humanities sector, and all the issues related to digital preservation. In the first field, the Foundation has promoted a project to set up a user community on humanities & computing, and has developed "Pinakes", open-source software for managing digital objects in a very advanced way. Concerning digital preservation, the Foundation is very active and participates in international arenas, for example with the Digital Preservation Europe – DPE coordination action, and with the "Digital Stacks" project, coordinated by the Central National Library of Florence - BNCF, on reliable digital repositories, paving the way to the first test-bed for the national legal deposit of digital content on behalf of the Ministry of Culture in Italy. The Foundation has also promoted an innovative architecture for National Bibliographic Number - NBN persistent identifiers for digital content and, together with the BNCF and the Consiglio Nazionale delle Ricerche – CNR, has developed jNBN, open-source software already requested by other countries for evaluation. Finally, the Foundation participates with an expert in the PREMIS Editorial Committee, which manages the development of the standard under the coordination of the Library of Congress. Through these three projects, the Foundation attempts to face the challenge of digital preservation with a global and complete strategy.
This year, the conference topics were again related to digital libraries, digital preservation and how the Internet is changing scenarios and paradigms, but with a focus on user needs and points of view. So instead of investigating the technology on offer, we favoured work on organisational issues, new Internet schemas and roles, user requirements and constraints, and also the cultural and economic limits that can be seen as obstacles to fully adopting ICT in the cultural heritage sector.
In particular, the first day of the conference offered eight invited lectures that investigated user needs and expectations, analysing how to better involve users and the cultural heritage community in creating and sharing digital resources. Other topics evaluated were the current trends and use of interactive Web 2.0 tools by cultural institutions, the benefits and future opportunities but also the limits and constraints for users creating new content, and the possible risks of trusting too much any information available on the Internet. Some basic concepts underpinning the dematerialisation of traditional archives were presented. The need to choose which digital content to select for long-term preservation also involves some ethical issues. One important message was that the challenge of the future information society hinges on cooperation among all the sectors, and in particular the current developments for Archives, Libraries and Museums - ALM sound promising.
The plenary session of the second day of the conference started with the presentation of national and international scenarios, followed by two thematic sessions with scientific papers selected through a Call for Papers, which examined the progress of research on the user-institution relationship towards the development of cooperative Web 2.0 tools, and on sustainable digital preservation policies. The scientific program was enriched by two important training opportunities the day before and the day following the conference: the Tutorial "Long Term Preservation of digital assets: basics, concepts and practices" and the Tutorial "Dublin Core - Building blocks for interoperability".
The conference attracted great interest not only among specialists in the cultural heritage field: there was also a large participation of Italian and foreign students, several European projects, representatives of local administrations, research centres and the private and corporate sector. There were about 400 people at the conference, a quarter of them students. As for nationality, three-quarters of the participants were Italian, followed by Americans, Germans, Dutch, Estonians, and a varied representation of almost all of the European countries.
In conclusion, many activities have been carried out since the first edition of the Cultural Heritage on line conference in 2006. The Fondazione Rinascimento Digitale has developed contacts and cooperation with cultural institutions worldwide, and we feel confident that we will be able to work successfully with you all before the next conference in 2012. We'll keep in contact with you before that.
Thanks again to everybody!
Bernard SMITH
Conclusions and report from the parallel sections
Five years ago the Fondazione Rinascimento Digitale (http://www.rinascimento-digitale.it/) came into existence with the explicit task to promote the application of information and communication technologies in the field of cultural heritage. The young Foundation decided almost immediately to hold a conference. In this very same wonderful Teatro della Pergola (http://www.teatrodellapergola.com/) the Foundation organised a conference on 14-16 Dec. 2006 on the topics of access and preservation (http://www.rinascimentodigitale.it/conference2006.phtml). The conference looked at how new technologies were transforming knowledge and imposing new organisational requirements on our cultural institutions. Most of the papers were loosely classified as either on "digital libraries" or "digital preservation".
After the success of its first conference the Foundation has continued to work on projects in the broad fields of digital preservation, digital repositories, digital libraries and archives, and persistent identifiers.
Some 18 months ago it was decided to organise this second conference. Over the last 2-3 years much has changed. We saw IFLA (http://www.ifla.org/) in their 2009 conference in Milan focus on the future evolution of the library - and the way "libraries would drive access to knowledge". We saw ICA (http://www.ica.org/) becoming increasingly preoccupied with the challenges in exploiting new technologies to preserve "born digital" material. And we saw in the 2009 conference "Museums and the Web" (http://www.archimuse.com/mw2009/) major presentations on the institutional changes brought on by social media, on the creation of wiki communities, on digital asset management and digital preservation, on museum Web 2.0 sites, and on young audiences and creators.
So it was against this background that our conference title "CULTURAL HERITAGE: an active role for user communities" was conceived. We felt that the twin topics of access and preservation were just as valid as 3 years ago. However today it seems that users are not only able to adapt to technological changes faster than cultural institutions, but they are also driving innovation, becoming content producers and pushing institutions towards a new user-institution relationship. The Foundation was very fortunate to find support from the Italian Ministero per i Beni e le Attività Culturali and the US Library of Congress, and this produced a great cooperative effort in creating the sessions format, in providing speakers, in attracting high-quality papers and posters, and in promoting the event.
In addition to the support of many prestigious authorities and institutions, around 400 people attended the conference (including the pre- and post-conference tutorials). On the first day we were welcomed by representatives from the Comune and Provincia of Firenze, the Regione Toscana, the Ente Cassa di Risparmio di Firenze (the parent organisation of the Foundation), the US Library of Congress, the Italian Ministero per i Beni e le Attività Culturali, and the European Commission.
We had 12 substantial invited talks on the state-of-art and state-of-practice in access and preservation. We had 24 papers presented in 2 parallel sessions: Digital library applications & interactive Web, and Sustainable policies for digital cultural preservation. And we had a poster session with another 11 papers. So a total of 47 speakers and presenters came together from 14 different countries. We saw speakers from major institutions such as the US Library of Congress, Istituto Centrale per il Catalogo Unico delle Biblioteche Italiane, the Italian Archivio di Stato, The British Library, the French Institut National de l’Audiovisuel, the Austrian National Library, the Estonian National Archive, the European Digital Library Foundation and the European Commission. We also saw speakers from a multitude of prestigious academic institutions (North Carolina, Bath, Pisa, British Columbia, Helsinki, and Barcelona come to mind), and from major research labs such as CNR, IRCAM, IBM and Max Planck.
So all the building blocks for success were present - a good cross section of high-quality cultural institutions and academic-research organisations presenting their activities and latest results.
Let us look more closely at the actual content. I want to retain our 2 traditional topics: access and preservation. My comments are largely based on the presentations and posters from the 2nd day. Many of the invited talks are in themselves masterful overviews of the state-of-art and/or state-of-play in specific fields of relevance to cultural institutions - and I would be doing the authors an injustice to try to summarise their contributions in a few lines. As such my comments should be read along with the collection of invited papers (and a longer version of these comments is available on the conference Web pages).
Equally I will not try to summarise all the papers presented. I have decided to pick out some points that I felt most relevant at the time. These are my personal comments and conclusions, and in no way do I intend to reflect negatively on the quality of the papers not mentioned. And naturally I hope I have understood and captured the key messages of the authors - and I here present my apologies if I mis-quote or mis-represent someone's work.
Firstly Access
In this conference we saw two distinct trends concerning access. The first trend was towards very practical, large-scale Digital Libraries with an abundance of high-quality content being digitised and put online, and the second trend concerned a multitude of experiments with Web 2.0 technologies and social networks. Sitting between these two trends were a series of papers looking at the risks and benefits in adopting Web 2.0 technologies and social networking - and there were some concrete suggestions as to how to exploit the opportunities and manage the risks.
Large-scale digital libraries
How could I not start with Jill Cousins of the European Digital Library Foundation who presented Europeana (http://www.europeana.eu) and Max Kaiser of the Austrian National Library who presented EuropeanaConnect (http://www.europeanaconnect.eu). Already today the prototype portal links to more than 5 million objects from more than 1,000 European institutions and collections. And they promise 10 million items by 2010 and 25 million items by 2012. Europeana now has to integrate differing vocabularies, resource discovery tools, harvesters, metadata registries, and a multitude of licence agreements - then they want to add semantic processing, GIS- and time-related query options, etc. - and provide mobile access and on-demand ebooks. I admired the courage and optimism of Europeana - two essential qualities when trying to build a sustainable, large-scale pan-European Digital Library infrastructure and services.
Silvia Gstrein & Günter Mühlberger from the library of the University of Innsbruck described a trans-European ebooks on-demand network bringing together 20 libraries from 10 countries, including 6 national libraries (http://books2ebooks.eu). The approach taken gets users to co-fund the initial digitisation of rare or out-of-copyright material. Once the initial user demand has been met, the digitised book is made freely available to the public. What interested me was that a user survey indicated that around 70% of users were prepared to pay 50€ to get a copy of an out-of-print book.
And this is not all - over the 2 days we learned about a real abundance of high-quality content being put on line:
We heard Laura Campbell of the US Library of Congress mention the American Memory Website http://memory.loc.gov/ammem/index.html, which makes freely available more than 14 million historical primary source materials. In addition we also have the 200,000 documents in the Global Gateway (http://international.loc.gov/intldl/intldlhome.html) and the 100,000 newspaper pages in the US National Digital Newspaper Program (http://chroniclingamerica.loc.gov).
On the 1st day Daniel Teruggi mentioned that the French Institut National de l’Audiovisuel provides access to more than 100,000 documents and more than 5,000 hours of audio-visual material.
Thomas Kirchhoff and his co-authors from the museum information systems group in Konstanz University presented BAM, the German cultural heritage portal for libraries, archives and museums (http://www.bamportal.de). This is another major portal effort to provide online access to 41 million records in the form of catalogues, repertories and inventories.
Lauri Leht from the National Archives of Estonia (http://www.ra.ee) talked about digitising around 5 million images, most of them from church books, and also putting online all 8 million archival headings, thus allowing users to avoid searching through paper records.
Aly Conteh from the British Library & Asaf Tzadok from IBM in Haifa used the example of the National Library of Australia’s newspaper Website (http://newspapers.nla.gov.au/ndp/del/home), which today has put 104,000 articles online.
Andrea Fojtu of the Czech National Digital Library (http://www.ndk.cz/project/view?set_language=en) discussed their plans to digitise 1.2 million documents or 350 million pages over the next 20 years.
Christoph Müller from the Ibero-American Institute in Berlin (http://www.iai.spk-berlin.de/) talked about their plans to put online around 1 million items (plus 900,000 press clippings, 200,000 microforms, more than 70,000 maps, etc.).
Friederike Kleinfercher & Kristina Koller from Max Planck Digital Library looked at a development within the eSciDoc portal (https://www.escidoc.org). They mentioned that their ViRR project (http://testvirr.mpdl.mpg.de:8080/virr/) contains about 20,000 scans of legal artefacts from the Holy Roman Empire.
Portals, quality content, usage and sustainability
In looking at the emergence of these large-scale digital libraries we can see a move by cultural institutions towards services based upon portals. They want to present their content in a more user-friendly way, to offer new levels of interactivity, and to introduce user-oriented services sometimes working together with specific communities of interest.
The list of digital library initiatives given above is just the tip of the iceberg - there are hundreds of other large- and small-scale projects being planned and implemented around the world. Yet in talking with authors most felt that authentic high-quality content was still lacking on the Web. This means that institutions still saw a need to continue to digitise and expand their online digital collections.
Yet I was worried by the absence of real data on the actual usage of the services already available. And I also missed a real discussion on the sustainability of the investments already made.
But what about standards?
Standards were not a specific topic in many of the papers presented at this conference, nevertheless the impression was that we have moved from a situation of uncertainty (as seen in the 1st conference in 2006) to one of relative clarity and even apparent abundance - few authors mentioned the lack of, or complexity of, today's standards as a barrier or risk.
In my opinion this may not be the true situation. Firstly this conference was associated with pre- and post-event workshops on long-term preservation and Dublin Core, both offering ample opportunities to focus on standards such as RDF and metadata, digital formats, as well as tools, practices, and approaches to risk management, etc. More importantly some authors hinted at the fact that smaller institutions still appear to have a very naive approach to standards for both digitisation and long-term preservation. I still think that there is a place for standards development and promotion through very practical guidelines, trials and experiments. Equally the success of the pre- and post-conference workshops shows, if anything, an increasing need for tutorials and training courses.
On the other hand we heard from several authors that one of the main advantages of Web 2.0 as a technology platform is that it is an existing (at least semi-standardised) infrastructure and is cost-effective for institutions, even if the different social networks are not interoperable (Kelly & Oppenheim). I might add that the novel integration of RFID tags, GPS and semantic Web by Kauppinen (representing the consortium behind the SMARTMUSEUM project http://smartmuseum.eu) also has the advantage of adopting what are becoming cheap and well-defined industrial standard components.
Web 2.0 technologies: risks & benefits
Looking beyond these large digital library efforts we can see more experimental projects exploiting Web 2.0 technologies and social networking as a way to involve users in the creation and maintenance of distributed collections of cultural material.
Smiliana Antonijevic of the Royal Netherlands Academy of Arts and Sciences working with Laura Gurak from the University of Minnesota looked at trust in online interaction. They noted that modern-day users want fast, accurate systems that have some embedded intelligence and can be customised. However they also noted (and I think rightly so) that above all users want systems that are trustworthy. Some of today's digital repositories are certainly moving in that direction - becoming reliable and persistent over time and engendering trust with their users. On the positive side, involving users can transform a static and historical content authority into a dynamic, multi-faceted and evolving body of localised knowledge. The down side is that information provided by users can be incorrect, incomplete, misleading, corrupt or highly biased. The authors called for cultural institutions to understand how to protect their status as trusted authorities and to learn how to transfer trust to external sources of information. However the authors also noted that the main socio-cultural features of trust have remained stable over the years and much can be learned by harvesting results published over the past 30 years.
Brian Kelly from UKOLN & Charles Oppenheim from Loughborough University echoed the point of view expressed by Antonijevic & Gurak, but they went one step further. On the one hand they recognised that Web 2.0 concepts were moving into the cultural institutions (they mentioned Library 2.0 as being an accepted expression - it even has its own wiki entry at http://en.wikipedia.org/wiki/Library_2.0, and Archive 2.0 and Museum 2.0 being not far behind). On the other hand they also extended the list of concerns and risks: services may not be secure, reliable or interoperable, they are open to misuse, and there are still open legal issues concerning the relationship between cultural institutions and users as content providers (and there are also outstanding copyright issues, the risk of misleading or inaccurate information, a failure to respect data protection laws and personal privacy, and the ever-present risk of posting illegal content). On the positive side the authors noted that social networks are popular and easy to use, they can engage with new user communities, and are cost effective because they exploit an existing infrastructure. So the real issue is to help cultural institutions learn how to manage the risks involved in building and exploiting social Webs. The authors went on to propose a risk/benefits framework where institutions should be explicit about intended use, benefits, risks, missed opportunities when not adopting a new technology as well as the costs when adopting it, how to minimise risk, and the need to clearly document the evidence used in the analysis.
There were three further papers that highlighted the difficulties in understanding and meeting user needs. The first paper was by Alida Isolani from Scuola Normale Superiore di Pisa looking at empowering users without weakening the digital resources. The authors provide a collection of Renaissance texts for humanities scholars (http://bivio.signum.sns.it). The contents are valuable and regularly accessed by academics, but the advanced retrieval tools available are not well used - and users tend to access the site in a "traditional way". The authors concluded that the tools need to be simplified - by making them more complex! More services have to be offered (analysis, note, mark, correct, edit, etc.) and more formats have to be supported. It will be interesting to see if this approach really increases user demand.
The second paper was from Fred Stielow of the American Public University system, who looked at community building in an online university context. The author extended the list of very practical risks/problems he faces daily - ranging from the problem of price negotiations in today's chaotic rights marketplace, through the need to keep costs down when creating metadata and catalogues, to ways to improve tailoring for individual students.
The third paper in this group was from Jeremy Hunsinger of Virginia Tech., who looked at the problem of usage of an event-driven memory-bank (http://www.april16archive.org). The author reminded us that the memory bank is a collection (or memorial) of digital artefacts contributed after the April 16 tragedy at Virginia Tech. (where 32 people were killed). He mentioned that today his real problem is a lack of visitors or users, and he asks "what happened to the users?" without really being able to find an answer!
Looking beyond the risks: practical experiences
So a risk-benefit analysis is an absolute must, but there are still many ways to exploit Web 2.0 and social networking, keep the risks low, and obtain some valuable and practical results. Let's look rapidly at some practical experiences.
Cèsar Carreras & Frederica Mancini of the Universitat Oberta de Catalunya looked at Web production by small-sized institutions. The authors discussed the aims and fears of institutions when faced with the Web 2.0 and the development of social networks. Users can express preferences and opinions, and this virtual community can represent a new life for a small institution (encouraging physical visits, promoting daily discussions, creating empathy with "friends of the museum", stimulating user content production and commentaries). But how to do this properly? There are risks: sterile tools that create nothing new, alienation of the users through abusive advertising, deforming the institutional identity, etc. In concluding the authors discussed the different approaches. The key for a small institution appears to be to create a local community of interest that supports not only content creation but also has reliable elements of content quality checking and validation (through local professionals, teachers, etc.). The authors stressed that quality checking is an expensive process for a small museum, and yet poor quality content can undo all the benefits that an institution creates in its local community.
Lauri Leht of the Estonian National Archives (http://www.ra.ee) looked at involving users in enriching digitised archival material. They have digitised around 5 million images, and put online all 8 million archival headings. Archive volunteers have been employed doing quality checking, helping to understand the content and describe the content in a structured way, and in collecting similar data from different archival sources (remembering that documents are in Estonian, German and Russian with differing alphabets and full of errors).
Aly Conteh from the British Library & Asaf Tzadok from IBM in Haifa talked about ways to foster user collaboration during mass digitisation - using as an example the National Library of Australia’s newspaper Website (http://newspapers.nla.gov.au/ndp/del/home) which supports collaborative correction of Optical Character Recognition (OCR) output. Generally speaking over 20% of the text of an early 19th century newspaper will not be correctly recognised (and this is equally true of many types of historical text). In-house checking and re-keying is not a valid option when digitising millions of pages. The authors argued for collaborative user correction and validation to improve the accuracy of OCR results. And improved OCR means better text mining, resource discovery, and overall accessibility. Already this newspaper project, in its first 6 months, brought together nearly 3,000 people to correct 104,000 articles.
The authors concluded by suggesting a hybrid approach: improved OCR technologies for automated text recognition linked with collaborative correction (not just correcting errors but also helping to train and enhance the OCR’s engine vocabulary and language analysis features).
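To make the hybrid idea a little more concrete, here is a minimal sketch of how user corrections might be consolidated and fed back into a word list. It is only an illustration: the majority-vote rule, the `min_agreement` threshold and the function names are assumptions of this sketch, and are not taken from the paper or from the National Library of Australia service.

```python
from collections import Counter


def consolidate_corrections(ocr_text, user_corrections, min_agreement=2):
    """Return the most common user correction if enough users agree on it.

    Falls back to the raw OCR text otherwise. The agreement threshold is a
    hypothetical validation rule chosen for this sketch.
    """
    if not user_corrections:
        return ocr_text
    candidate, votes = Counter(user_corrections).most_common(1)[0]
    return candidate if votes >= min_agreement else ocr_text


def update_vocabulary(vocabulary, accepted_text):
    """Add the words of an accepted correction to a shared word list.

    Stands in for the 'training' step: a real OCR engine would consume
    such a list in its own way.
    """
    vocabulary.update(word.lower() for word in accepted_text.split())
    return vocabulary


ocr_line = "Tbe Sydney Gazctte"
corrections = ["The Sydney Gazette", "The Sydney Gazette", "The Sidney Gazette"]

accepted = consolidate_corrections(ocr_line, corrections)
vocabulary = update_vocabulary(set(), accepted)
print(accepted)
print(sorted(vocabulary))
```

A single, unconfirmed correction leaves the OCR text unchanged here, which is one simple way to combine user input with some validation.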
Top problems: cost, expertise, and information management
Before moving on to the topic of preservation, I would like to close this section on access by referring to the paper of Wendy Duff and co-authors from Toronto University. They looked at the impact of new technology on the museum environment in the US - and based their discussion on semi-structured interviews with 16 US-based senior museum professionals. The 3 most common challenges facing those interviewed were: cost of designing, implementing and maintaining technology, the lack of in-house expertise, and information management.
The starting point is a series of bold quotes saying that "a museum without a collections database and a Web presence is hardly considered professional" and that museums are moving from "object-centred to experience-centred design". However on the down side "not all institutions are using online access equally well" and many funding agencies and museum professionals don’t fully understand the challenges IT poses for museums. Findings from the interviews included:
There was no consensus about the extent to which the core-mission of museums has been impacted by new technologies (have they changed the core mission of the museum, does it help to attract a broader audience, or connect better with the local community, or change the way the museum works, or has it altered the way a museum sees itself, ...).
However most agreed that museums have been physically transformed by the proliferation of new technologies (multi-media installations have changed the way exhibitions are held, changes have been made in dissemination and collections management, for some interviewees 3D imaging is becoming an increasingly important tool, technology also helps professionals remain curious and creative in developing their expertise and plans for the future, and technology is now an essential tool in linking objects with the information about them, even if the management of legacy data remains a challenge).
For the majority of interviewees the major challenges are: cost of designing, implementing and maintaining technology, the lack of in-house expertise, and information management. Databases need to be created, data needs to be migrated and cleaned, metadata created and maintained, vocabularies need to be agreed upon and shared, ..., and all this takes time and expertise, and is expensive. Despite some people being technologically savvy, the majority of people were seen as not computer literate, so high-tech services might simply be overkill. Some people noted that poor quality or out-of-date information distributed over the Web can reflect negatively on the reputation of the museum and its staff. And many museums don’t understand fully the cost/benefit of introducing new technologies.
An important point made by the authors was that many museums don’t appear to be dealing in the most efficient and cost effective manner with long-term digital preservation, e.g. digital photos are just being dumped on CD-ROMs and stored on shelves.
And now Preservation:
In this conference our aim was also to review progress in digital preservation technologies and applications (and we should not forget that there was a pre-conference event dedicated to the basic concepts and practices of long-term digital preservation).
Sven Schlarb from the Austrian National Library (and his co-authors from The British Library and ARC - the Austrian Research Centers) looked at the Planets Testbed (http://www.planets-project.eu), which is a Web-based application that provides a controlled collaborative environment for scientific experimentation in digital preservation. The authors outlined how the testbed was used, how a tool was tested and assessed, and how the results were analysed. Tools can be compared, preserved objects can be validated, emulation experiments performed, and a preservation plan created with recommendations. There is already a community of users sharing the experiments and a lot has been done to provide access to results (preservation services are offered, annotated datasets are available, validation services can check for valid and invalid document types, etc.). The authors closed by noting that the testbed will soon be a freely available public service.
Sam Coppens and co-authors from Ghent University looked at digital preservation using a semantic metadata schema of the PREMIS 2.0 preservation standard (http://www.loc.gov/standards/premis/). The authors kicked off with an impressive list of all the different types of metadata that are needed: descriptive for search and general archive management, binary to describe the bitstreams, technical describing the files, structural for the representation information, preservation indicating provenance, context, etc., and finally rights metadata. The authors have extended PREMIS 2.0 to include the legal roles that people, organisations or software applications can have. In concluding, the authors stated that employing a 2-layer model allows the upper level with descriptive metadata to be made public (rights permitting), whilst the lower level with the legal roles remains in the hands of the institution.
Christoph Müller from the Ibero-American Institute (and his co-authors including from IPK-Fraunhofer) looked at user demands and preservation requirements for digitisation. The Institute in Berlin (http://www.iai.spk-berlin.de) holds Europe’s largest special collection on Latin America, Portugal, Spain and the Caribbean. The paper looks at the differing, often conflicting, requirements of scholars, librarians, and users. For example, scholars want digitised copies to be as authentic as possible and tend to focus on making rare and unpublished material available. Librarians want digitisation to integrate well into their workflow and enable automated quality controls and indexing during scanning. Users (students, public, etc.) want content and context, want full-text search, and want fast and easy access (in particular for exam preparation).
The authors now have a "wish list" for features of a future digitisation system, starting with flexible automated digitisation, then interactive quality control, excellent picture quality, easily generated metadata, etc.
Andrea Fojtu and co-authors looked at long-term preservation in the Czech National Digital Library (http://www.ndk.cz/project/view?set_language=en). The authors discussed their digitisation and long-term preservation objectives (e.g. digitisation of 1.2 million documents or 350 million pages over the next 20 years using robot scanners). They rightly identify the organisational challenges as being as important as the technical issues (a nice expression: "institutions must be ready for a business change, well before the scanners produce the first pages"). The authors provided a long list of practical suggestions, ranging from the creation of a digital preservation department to the changes needed in existing in-house workflows and the relocation and retraining of staff.
More on metadata!
Thomas Risse from the L3S research centre (and co-authors from the European Archive, the Hungarian Academy of Science, and the Max-Planck-Institut für Informatik) looked at how to turn stored Webpages into a living Web archive. The authors started by noting that Web archival has value (for scholarly studies, market analyses, IPR disputes, etc.), and there are now emerging industrial services in addition to the usual library and archival organisations. However, Web content is highly dynamic, volatile, and in many formats. In addition, physical media decay, technologies become obsolete, encoding standards change, authenticity and integrity are difficult to maintain, etc. To go beyond just "freezing" Web pages, the authors looked at archival fidelity (capturing also the hidden and social Web, but not spam), coherence (identifying, analysing and repairing temporal gaps), and interpretability (ensuring accessibility and usability of the archive including the evolution of terminology, etc.). The authors discussed 2 applications: a "social Web archive" (for dynamic and varied user interactions) and a "streaming archive" (for audio-visual content) - all within an EU-funded project called LiWA (http://www.liwa-project.eu).
Felix Engel from FernUniversität Hagen (and his co-authors from Deutsche Nationalbibliothek and the company GLOBIT) looked at context-oriented scientific information retrieval with the specific aim of enabling reuse of scientific publications, data and multimedia objects. This requires the capture and storage of additional metadata during all life-phases of the digital object: before, during and after archival. As noted by the authors, this supports the goal of digital preservation by enabling reuse even when the object creators cannot be contacted. Thus born-digital objects are defined not only as themselves, but also by life-cycle processes such as creation, appraisal, archival and adoption (unpacking, ingestion, adaption, transformation, display, emulation, access, aggregation, contextualisation, etc.) and reuse (including updates to the metadata).
Maristella Agosti and her co-authors from the University of Padua looked at cross-language access to archival metadata. The authors argued for an approach that would allow archival metadata to be both easily machine processable and permit cross-language solutions developed in the library community to be easily adopted by archivists.
Going beyond "conventional" digital preservation
Jerome Barthelemy from IRCAM in France (and his co-authors, including from McGill University) looked at real-time audio processing and a notation for contemporary music. The authors want not only to preserve music but also to preserve the ability to re-perform the works live, e.g. for modern interactive works that are today completely dependent on a specific hardware and software implementation. They claimed that it is necessary but not sufficient to simply record and preserve outputs. The actual hardware and software used (called a patch) to process the input (from the performer or pre-recorded) must also be preserved. An alternative might be to develop an emulator, but this looks to be fraught with difficulties and uncertainties. Migration, moving from one technical environment to another, has its place. However, the authors put forward the idea of virtualisation, or describing the electronic modules employed using abstractions. So a representation of signal processing modules can be found to describe, say, a violin played together with live electronics, and this can be scored alongside the instrumental parts. Now comes the issue of musical notation.
Dennis Moser from the University of Wyoming looked at conserving digital ecologies such as Second Life. The author's premise was that our libraries, archives and museums will need to preserve complex environments such as Second Life (http://secondlife.com/ a user-generated and community-driven "experience"). He argued that it is inappropriate to simply store files, losing the relationships that existed between aggregates of data. Massively multiplayer online games can pose problems when trying to capture the stories, structures, rules, etc., and the complexity is increased by the closed proprietary environment used in Second Life and by the user-generated content that has separate ownership. Moser argued for a more “ethnographic” approach when dealing with worlds such as Second Life, i.e. preserving an ecology rather than a data set. He suggests that producing video documentaries, with for example machinima (http://www.machinima.com), can go some way to capturing what actually happens inside Second Life.
A need to combat fragmentation in long-term digital preservation work
More generally, the different papers and presentations on digital preservation highlighted the complex nature of the problem, in particular when dealing with environments that change and evolve in a disordered and quite rapid way.
We saw in some papers tools being developed that look to be based upon self-defined principles and methods, but which are specific to individual sectors.
The risk is fragmentation. Different ideas and approaches can rapidly lead to isolation and dead-ends when set against a rapidly changing technological and organisational landscape - even more so when users look to be driving innovations.
What we need is to share results and experiences in a cross-domain comparison. We need to promote a common understanding of the different scenarios and frameworks that underpin efficient digital preservation policies. We need a set of networks (regional, national, European):
To avoid useless duplication and foster a single-minded concentration on the long-term sustainability of approaches;
To test research results (to breaking point) across a set of complex, cross-disciplinary tasks;
To offer high-quality training/educational events with a focus on real-world problems and using real content.
From "what might be" to "what is"
At the start of these conclusions I mentioned the objectives of the Fondazione Rinascimento Digitale, but today the real question concerns what we can expect from the Foundation in the coming years. I personally think that the key will be to make its research, training courses, workshops, and above all its results as relevant as possible to cultural heritage professionals and academics. But to do so it will need feedback - positive and negative. Please go to the Foundation's Website - look at the results, use them, adopt them, and tell the Foundation what you think. It needs constructive criticism in order to progress. And suggest to the Foundation what you think it should be doing next.
But criticism is not enough; the Foundation also needs to be congratulated when it has done something positive. And I think this conference is a positive result. What we have seen over the last 2 days has been less to do with "what might be" and more to do with "what is" - that is, real-world considerations on building large digital collections, the practical reality of working with Web 2.0 technologies, the risks and benefits of working with users within large social networks, and the state-of-play in long-term digital preservation.
For making this conference happen our thanks must go to the Fondazione Rinascimento Digitale, and to the Italian Ministero per i Beni e le Attività Culturali and the US Library of Congress for the support they provided. In addition we had an impressive list of sponsors: the Ente Cassa di Risparmio di Firenze (the parent organisation of the Foundation), the Comune and Provincia of Firenze, the Regione Toscana, and UNESCO, and an equally impressive list of supporters: CNR, W3C, Liber, IFLA-PAC, European University Institute, europeana, Planets, CIVITA, and many more.
Our thanks must also go to the authors, speakers, and session chairs for providing the content of our conference and for making it such an intellectually stimulating event. Equally, our thanks go to all the participants who attended the sessions, asked questions, created debate, and who made this conference so dynamic and, in many ways, a real, tangible, albeit "old-fashioned" social network.
As a final comment, and with the desire to build on the embryonic community created over the 2-day conference, I ask the organisers to:
Post on the conference Webpage a simple link-page listing all the links mentioned in all the different papers, posters, and presentations (pointers to collections, tools, projects, etc.);
Send out a questionnaire to all attendees asking for comments concerning the conference content and organisation;
Consider ways to build on the community spirit established over the 2 days through a short regular newsletter or even a dedicated Facebook page (the approach must be validated with the user community).
Download the Proceedings:
Conference 2009. Proceedings: Cultural heritage on line. Empowering users: an active role for user communities [PDF file - 7.5 MB]