Friday, July 24, 2009

The Source: news about digital libraries and library innovations from around the web



21st Century Shipping: Network Data Transfer to the Library of Congress

From the D-Lib Magazine website


Between 2008 and 2009 the Library of Congress added approximately 100 TB of data to its digital collections, transferred from universities, publishers, web archivists and other organisations. The data comprised a broad range of content from photos to video, from books and periodicals to websites. Most of the data was transferred over the Internet rather than by hardware media, and for good reason: 650 GB of data shipped on a drive can take weeks from the time the content owner sends it to the time it is finally loaded onto a Library server. The same quantity takes just hours to transfer over the network.
While network data transfer has not completely replaced shipping data on hardware storage media to the Library, it is gradually becoming the preferred method. The Library's transfer process is relatively simple and quick, and it utilizes open-source software. This article describes the Library's network data-transfer process.
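To put "just hours" in perspective, here is a rough back-of-the-envelope sketch of how long 650 GB takes at a few sustained link rates. The rates (100, 500 and 1000 Mb/s) are illustrative assumptions, not figures from the article, and the calculation ignores protocol overhead, retries and checksum verification.

```python
def transfer_hours(size_gb: float, rate_mbps: float) -> float:
    """Hours needed to move size_gb (decimal gigabytes, 10**9 bytes)
    at a sustained rate given in megabits per second."""
    bits = size_gb * 1e9 * 8            # gigabytes -> bits
    seconds = bits / (rate_mbps * 1e6)  # megabits/s -> bits/s
    return seconds / 3600


# Assumed link rates, chosen only for illustration.
for rate in (100, 500, 1000):
    print(f"{rate:>5} Mb/s: {transfer_hours(650, rate):.1f} h")
```

Even at the slowest assumed rate the transfer finishes in well under a day, which is consistent with the article's contrast between hours over the network and weeks for a shipped drive.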


Australia's digital economy: future directions (Note: PDF)

From the Department of Broadband, Communications and the Digital Economy website


This paper discusses the key initiatives being undertaken by government, industry and the community to develop the digital economy, along with case studies of successful individuals and industries engaged in the area.
This paper explains:
· why the digital economy is important for Australia
· the current state of digital economy engagement in Australia and why current metrics point to a need for strategic action
· the elements of a successful digital economy
· the role for the Government in developing Australia's digital economy


Measuring Mass Text Digitization Quality and Usefulness: Lessons Learned from Assessing the OCR Accuracy of the British Library's 19th Century Online Newspaper Archive

From the D-Lib Magazine website

This article will discuss how to measure the accuracy of Optical Character Recognition (OCR) output in a way that is relevant to the needs of the end users of digital resources. A case study measuring the OCR accuracy of the British Library's 19th Century Newspapers Database provides a clear example of the benefits to be gained from measuring not just character accuracy but also significant word accuracy. After briefly discussing the role of OCR in the text capture process and how OCR works, we give a detailed description of the methodology, statistical data gathering techniques and analysis used in this study. Our conclusions point the way forward with suggested actions to assist other mass digitisation projects in applying these techniques.
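The difference between character accuracy and significant word accuracy can be illustrated with a small sketch. This is not the methodology used in the study: it assumes accuracy is defined as one minus the edit distance divided by the length of the ground truth, and it treats "significant words" as words outside a tiny hypothetical stopword list. The sample sentence is made up.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def accuracy(truth, ocr):
    """1 - (edit distance / ground-truth length), floored at 0."""
    if not truth:
        return 1.0 if not ocr else 0.0
    return max(0.0, 1 - edit_distance(truth, ocr) / len(truth))


# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "of", "in", "and", "to"}


def significant_words(text):
    """Lower-cased words with the assumed stopwords removed."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


truth = "the fire in the mill destroyed the roof"  # made-up ground truth
ocr = "the fire in the mi11 destroyed the roof"    # made-up OCR output

char_acc = accuracy(truth, ocr)
word_acc = accuracy(truth.split(), ocr.split())
sig_acc = accuracy(significant_words(truth), significant_words(ocr))

print(f"character accuracy:        {char_acc:.3f}")
print(f"word accuracy:             {word_acc:.3f}")
print(f"significant word accuracy: {sig_acc:.3f}")
```

In this example two wrong characters barely dent the character score, but they spoil one of only four significant words, which is the kind of gap that matters to someone searching the text.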


Training in Electronic Records Management (TERM)

From the International Records Management Trust website

The Training in Electronic Records Management (TERM) programme consists of a series of five training modules on electronic records management plus related resource materials (Glossary, Route Maps, Best Practice Indicators). Modules include:
1. Understanding the context of electronic records management
2. Planning and managing an electronic records management programme
3. Managing the creation, use and disposition of electronic records
4. Preserving Electronic Records
5. Personnel Records as the information base for human resources and payroll management


100G Ethernet and beyond: preparing for the exabyte Internet (Note: PDF)

From the Joint Information Systems Committee (JISC) website

In the nearly 30 years since the Ethernet standard was first published, it has become the dominant mechanism for communication between devices at the data link layer of the OSI networking model, increasing in speed from the initial 10 megabits per second standard through 100 megabits and 1 gigabit per second, to 10 gigabits per second (Gb/s) today.
Over the last decade, the availability of, and the demand for, information has increased at an unprecedented rate and this is driving the need for increases in the access speed across networks, and between networks and servers. Network managers are seeing a new scenario materialise, one that has moved away from a period of predictable traffic growth where capacity planning could be applied to both the Ethernet network and the separate networks used for voice, video, and storage.
This report explains why there does not appear to be a consensus for a single target and looks at the implications that may have for network managers in HE. It will review some of the technical implications of a move to 40 Gb/s or 100 Gb/s and make recommendations for how to maximise purchasing decisions at a time of flux in the industry. Finally, it will look ahead to the development of terabit Ethernet in order to put the continuing evolution of Ethernet into a longer-term context.


Libraries of the Future

From the Joint Information Systems Committee (JISC) website

JISC's 'Libraries of the Future' debate has gone digital, with a specially commissioned documentary. The ten-minute video marks the culmination of a year-long campaign, which stimulated debate among librarians, information professionals and academics on the issues surrounding technology's impact on the emerging role of the academic library in the 21st century through a series of events, printed resources and podcast interviews.
The documentary showcases interviews with leaders from JISC, Oxford University and LSE as well as students and academics who discuss what the library of the future will look like.


Semantic Integration of Collection Description: Combining CIDOC/CRM and Dublin Core Collections Application Profile

From the D-Lib Magazine website


This article is motivated by the demand for unified access to the wealth of distributed digital cultural collections, allowing users to make queries and discover information about them through integrated processes. Our effort originates from the semantic interoperability perspective and considers CIDOC/CRM as the mediating schema, which integrates in an optimal way the semantics of the collection-level metadata schemas and application profiles. The research reveals the complexity of mapping metadata schemas to ontologies and resolves particular difficulties by presenting the crosswalk between Dublin Core Collections Application Profile and CIDOC/CRM.
