Monday, November 30, 2009

Web 2.Nautical

At the LIANZA conference in Christchurch in October, we presented a session entitled Web 2.OhMyGod to Web 2.OhNo. We used the metaphor of colonial exploration to tell a story about the National Library’s experience voyaging into the deep dark world of Web 2.0.


The slides from our presentation are over on SlideShare, and we’ve adapted the script from the performance presentation to fit a blog post. And here's the handout we provided that was folded up like an origami ship!

During the presentation, we talked about our experiences in social networking and hoped that by sharing these reflections, the audience would take away something they could apply to their own Web 2.0 adventures.

I don’t know for sure what people took away from the session, but I do know they took away with them lasting memories of seeing two grown adults dressed like ship captains and pirates.

The National Library has spent the last few years exploring the Web 2.0 world. We’ve been getting our feet wet in places like MySpace, Flickr and Twitter. We’ll highlight five of our voyages, tell you why we went there and what we were trying to achieve. We'll analyse whether it was a successful voyage, and give you some insight into what went well and what didn’t go so well.

MySpace
In May 2007, we set sail to MySpace. A number of our governors were despatched to trade with musicians to encourage them to deposit their CDs with the National Library as part of the legal deposit process. We were having trouble communicating the legal requirement to musicians because most musicians don’t open formal envelopes that look like they’re from the bank.

When we looked back and analysed our experience, we put ourselves in our customers’ shoes and asked:

What were we offering?
Preservation of NZ music

Who was it for?
NZ musicians

Did we stay?
Not really – we didn’t have time to converse and Chelsea’s moved to a new job

There were two things we did well in our journey to MySpace:

  • We made sure the message was crystal clear: Give us your CDs. It's the law.
  • By going into MySpace we were not duplicating effort at the National Library. The only method for contacting musicians prior to MySpace was formal letters sent through the post; MySpace created one more channel for communicating with musicians.


But there were some things we didn't do so well:

  • We didn't actively engage with the community of musicians on MySpace. We didn't leave comments or try to "get to know" them.
  • We were overly cautious and did not have a good idea of where our boundaries were as a government department.
  • We also didn't have a transition strategy. Chelsea started and managed the Be Heard. Forever MySpace profile. But then she left and moved to another role within the Library.


The blogs
In June 2007 the Library boldly entered the land of blogging. This time we were better prepared for the trip. We thought about the kinds of things we could blog about, and who might be interested in them.

We settled on three ideas:

  1. Create Readers - book reviews and news and ideas about literacy initiatives.
  2. LibraryTechNZ - news about digital products and opinions from our digital teams.
  3. Collections blog - news from our curators about National Library collections.


The Collections blog never happened. This was the first, and biggest, piece of bad news on this trip. When we talked to our collections staff, we realised that for them to feel happy blogging, each post could take several hours of research. We felt this wasn't a fair thing to ask of a small group, so this idea didn't go ahead (although it hasn't gone away).

Once we set up the blogs, we sorted out a bunch of guidelines, trained staff in how to use the tool if they didn't already know, and briefed our Senior Leadership Team on what we wanted to do and how we wanted to do it.

Again, we asked ourselves:

Create Readers
What were we offering?
Children’s book reviews

Who was it for?
School teachers and school librarians

Did we stay?
Yes – we have a group of staff running the blog so the distributed effort increases sustainability

LibraryTechNZ
What were we offering?
Technical knowledge

Who was it for?
People with an interest in library technology

Did we stay?
Yes – The Source is a regular entry on the blog and is part of someone’s job

Collections
What were we offering?
Curatorial knowledge

Who was it for?
Cultural heritage enthusiasts

Did we stay?
No – not part of role

Some things went well with the blogs:

  • They're still there, and staff are still writing new pieces!
  • We have steady readership numbers, and a surprising and pleasing amount of interaction via comments and follow-up offline
  • Both blogs have stayed true to the intended topics, and have become rich sources of information
  • The writers did not have to get managerial or communications team sign-off. Instead, all blog posts are reviewed by colleagues. This makes posting faster, and gives staff ownership over the blogs.
  • We've done some good things with them, like using the Create Readers blog to promote the NZ Post Children's Book Awards competition, and using the LibraryTechNZ blog instead of the corporate website to talk to disgruntled web citizens during the 2008 Web Harvest.


Some things didn’t go so well:

  • Blogging is still an extra thing for staff to do, not part of their jobs. We also rely heavily on a small number of staff for the bulk of the content produced.
  • We haven't managed to get support from our comms team for the Create Readers blog because the team lacked resourcing and experience in social media, even though we think that with some extra promotion we could reach more people.
  • We think the blogs would be missed if they weren't there, but we're not sure at the moment exactly how to make them truly permanent.


Flickr
We went to Flickr because we believe in the idea that it's good to take your content to the market, instead of waiting for people to find your far-flung location. Our first visit to Flickr was a general tiki-tour, beginning in mid-2007.

Like many other GLAM organisations around the world, we loaded up photographs from our collections to test the social waters. Our first visit to Flickr gave us a chance to learn what's involved in being in a really active community space. We got some practice in reviewing contact requests and responding to comments, and found out what kind of content people were interested in seeing.

More troubling for us were the rights issues. When we loaded photos up to Flickr, we had to use the All Rights Reserved option. We couldn't apply a Creative Commons licence, either because the work was out of copyright, or because the copyright was held by the creator, not us. As a makeshift measure we put a note on all our photos, encouraging people to use them in certain ways.

Then we joined Flickr Commons, which lets us use a No Known Copyright Restrictions licence. This means that we have to be super-careful when picking the photos we put up there, but lets us experiment with what happens when you set your content totally free.

Once again we asked ourselves:

What were we offering?
Heritage photographs

Who was it for?
Cultural heritage enthusiasts

Did we stay?
Passively – it takes little effort to add photos occasionally and respond to comments, but there is no active involvement (e.g. discussions, joining groups)

Our trip to Flickr was generally a success:

  • We learned how to take risks
  • We took our digitised photo collections out to people where they were; we didn’t force them to come to our site
  • We were clear with people about how they could use the images
  • By joining the Commons, we established relationships with other heritage institutions

But we lacked the resourcing to make it really awesome. Like the blogs, managing and growing our Flickr presence isn’t a major part of someone’s job.

Web Harvest
In October 2008, we ran a web harvest of all websites in the .nz domain as part of our legislative mandate. We were attacked by pirates – webmasters annoyed with how we did it. They took to the web to voice their anger in blogs and on Twitter. We talked about the experience in this post.

What went well:

  • We were in the social media space so were alerted to the growing frustration
  • We responded very quickly. Luckily, we already had social channels available, such as blogs, Twitter and listservs, to engage with webmasters


Twitter
Twitter is our latest venture, and it works because we applied what we learned from earlier ventures. The presence is deliberately low effort because we had low resourcing, but we were quick to identify an opportunity to promote content.

Twitter seems to satisfy two strong human needs: gossip and voyeurism.

Twitter lends itself so well to a short comment and a hyperlink that it became obvious really fast that posting links to items in digital collections such as Papers Past and Manuscripts & Pictorial was a natural use for the @nlnz account.

To make it easier for ourselves, we made up some rules:

  • We post twice a day (that's why they're called #tbreaktweets: we try to time our posts with the Library's traditional morning and afternoon tea times)
  • We restrict the tweeting to the #tbreaktweets; we don't do events or systems outages or media releases. Hopefully this means we're predictable, in a good way.
  • We try to make sure we're at our desks for 30 minutes after the tweet goes out, in case anyone writes back. If we're not open to conversation, what's the point of being there?


We don't measure the success of our Twitter stream by the number of followers. Instead, we use a URL shortening tool called bit.ly, which records how many clicks the links get, and we aim for conversations with our followers.

You know what’s next! The three big questions:

What were we offering?
Heritage curiosities

Who was it for?
Bored Twitterers/Cultural heritage enthusiasts/Weirdos

Did we stay?
Yes – low effort once a day

What went well:

  • We applied what we learned from previous adventures in social networking.
  • We knew exactly what we were offering, who we were offering to, and how much effort it would take to sustain it.
  • It was really important to us that we put our names on the account. Institutional accounts without any real names attached are a big no-no in our books.
  • We increased our audience. We knew there were cultural enthusiasts on Twitter that might enjoy our tweets, but we were surprised at the range of people we attracted. Our followers include cultural institutions, friends and acquaintances, art lovers, history lovers, library lovers, information lovers, New Zealand lovers, humour lovers, John Key lovers. It's an eclectic mix.


What didn’t go well:

  • Nothing….yet.


We think Twitter works because we have something good to offer, because there's a group of people who are interested in hearing from us, and because it's mercifully lightweight and doesn't interfere much with our working days.


Fleeting visits

We should mention we’ve made other journeys to places like Wikipedia, Delicious, SlideShare and YouTube, but these were fleeting visits. We dabbled in these platforms in the early days but haven’t really found a good fit between them and what we have to offer. To be honest, if we had known back then what we know now, we might never have set up a presence on those sites.

Each of these voyages failed because it did not meet at least one of the criteria for a successful social media presence: what’s on offer, who’s the audience, and how will it be sustained.


Places we didn’t explore

There are two major players in the social networking world that we have deliberately avoided: Facebook and Bebo. You can’t deny that these two sites are massive and very popular, but we didn’t want to make clowns out of ourselves.



In Facebook, we don’t have any content to offer. Facebook is a great platform for creating discussion among a particular group around a particular topic (like the group “It’s Kiwifruit not Kiwi”, for all those people who understand the difference between kiwifruit and kiwi). It’s also great for posting events. We haven’t yet identified anything from the National Library that would fit in those categories.



In Bebo, we don’t have anything for the audience. Bebo is used primarily by teens, and right now, we don’t have a lot to offer this audience. No-one in our Services to Young New Zealanders team has indicated an interest in having a presence here.



Overall lessons learned

When Web 2.0 began 4 years ago, it was all about “go out and give it a whirl”, but Web 2.0 has grown up fast and people expect us to behave in a certain way. You have to know the game before you start playing it. Know the platform before you represent your organisation – try it out on your personal account first. You also have to truly engage. There’s a big difference between moderation, which may take 2 minutes, and engagement, which can take much longer.



We also learned that a key ingredient to success is making sure the people that run the thing really enjoy what they’re doing.



But the three most important lessons we learned are: Know thyself. Know thy audience. Know thy limits. You have to have all three ingredients: content, audience and resource for it to work.



We hope you’ve enjoyed our journey into Web 2.0 and although it hasn’t always been smooth sailing, we’ve learned a lot and hope you have too.


Friday, November 27, 2009

The Source: news about digital libraries and library innovations from around the web

Introducing The Source


Beyond 1923: Characteristics of Potentially In-copyright Print Books in Library Collections

From the D-Lib Magazine website

Issues of copyright and permissible use have swirled around efforts to digitise print book collections. Sharp debate has ensued over the circumstances in which creating a digital surrogate and making it accessible online runs afoul of copyright protections, and what remedies might be appropriate to compensate rights holders. Some digitisation efforts, such as the Open Content Alliance, have restricted themselves to public domain materials; Google Books, on the other hand, has sought to reach agreement with copyright holders represented by the Authors Guild and the Association of American Publishers.
The Google book settlement provoked spirited discussion of its potential ramifications, mimicking the commotion that followed the announcement of the original Google Print for Libraries (later re-named Google Books) project in December 2004. Using data from the WorldCat bibliographic database, OCLC Research published an article in 2005 aimed at illuminating issues surrounding Google's plan to digitise the print book collections of five major research libraries. The present article is motivated by a similar purpose: to provide empirical context for the many discussions surrounding the digitisation of in-copyright print books. The settlement has raised challenging questions regarding permissible use of print book titles published after 1923; many of these titles may eventually form a significant part of the Google book database should it come to pass.


International Copyright: Why It Matters to Libraries (Note: PDF)

From the Library Copyright Alliance website

Because libraries share a unique social responsibility for preservation of, and access to, the world’s intellectual heritage, they have an interest in promoting copyright laws that provide the broadest possible use of information for creativity, research and education. The Library Copyright Alliance (LCA) is working to address an increasing number of international legal and policy issues that affect libraries and the public, because of the many unresolved aspects of intellectual property rights in information in the digital age. The library community has long been engaged in responding to and developing proposals to amend international copyright law. In 2007 LCA gained accreditation as a non-governmental organisation with observer status at the World Intellectual Property Organization (WIPO). This has enabled even more direct involvement. LCA represents the U.S. library community at WIPO at the Standing Committee on Copyright and Related Rights (SCCR), Committee on Development and Intellectual Property (CDIP), and Intergovernmental Committee on Intellectual Property and Genetic Resources, Traditional Knowledge and Folklore (IGC), and in other international fora.


A Low Cost, Low Memory Footprint, SQL and Servlet-based Solution for Searching Archived Images and Documents in Digital Collections

From the D-Lib Magazine website

Easy online access to digital documents in special collections is a must for any library. Many of the resources in special collections are unique and irreplaceable. Because of their singular characteristics, their preservation, digitisation and availability online are of high priority for the library, and in many cases it is part of the strategic plan of the institution. Vendor products, as well as open source options, are available: as a commercial example, CONTENTdm® is being used at the University of Southern Mississippi Libraries, at the University of Washington, at IUPUI Libraries, and at many other places. On the other hand, Fedora is an open-source digital repository software which supports the University of Maryland Libraries Digital Collections, as well as others. Because of their richness of features, these software options were too complex to implement at the institution featured in this article: dedicated personnel is necessary in both cases, and financial support is required for the purchase and maintenance of any commercial product; neither of these requirements could be met.
In this article, we demonstrate a simple, elegant solution created in-house, with no additional monetary commitment that meets the needs of the institution.


Digital Economy Bill [HL] 2009-10

From the United Kingdom Parliament website

Summary of the Bill:
  • make provision about the functions of the Office of Communications
  • make provision about the online infringement of copyright, about licensing of copyright and performers’ rights and about penalties for infringement
  • make provision about internet domain registries
  • make provision about the functions of the Channel Four Television Corporation
  • make provision about the regulation of television and radio services
  • make provision about the regulation of the use of the electromagnetic spectrum
  • amend the Video Recordings Act 1984
  • make provision about public lending right in relation to electronic publications

From TIFF to JPEG 2000? Preservation Planning at the Bavarian State Library Using a Collection of Digitized 16th Century Printings

From the D-Lib Magazine website

Studies and user reports claim JPEG 2000 to be – or at least will become – the next archiving format for digital images. The format offers new possibilities, such as streaming, and reduces storage consumption through lossless and lossy compression. Another often claimed advantage of JPEG 2000 is that the master image can possibly serve as the access copy as well, and thus replace derived, compressed, low resolution access copies. The National Library of the Netherlands (KB-NL) evaluated the suitability of alternative file formats such as JPEG 2000 to their currently used format uncompressed TIFF.
Having the advantages of JPEG 2000 in mind, the Bavarian State Library (Bayerische Staatsbibliothek, BSB) also considered the option of migrating from TIFF to JPEG 2000 as the archive format for digitised images of rare books. BSB aims at digitising its complete collection of manuscripts and rare books, applying high standards and policies that result in considerable image sizes of the TIFF-master copies.
In order to find out whether TIFF or JPEG 2000 would be a more suitable archival master format, the BSB, together with the Vienna University of Technology, created a preservation plan for a representative collection of digitised 16th century printings. The goal of the project was to evaluate possible strategies for migration from TIFF to JPEG 2000 using lossless compression, including the alternative of keeping the status quo. The current preservation plan documents the resulting decision, taking into consideration the institution's preservation policies, legal obligations, organizational and technical constraints, requirements, and preservation goals, as well as the capabilities of the tested tools.


Knowledge as a Public Good

From the Scholarly Publishing & Academic Resources Coalition (SPARC) website


One of the most durable arguments for OA is that knowledge is and ought to be a public good. What is a public good? In the technical sense used by economists, a public good is non-rivalrous and non-excludable. A good is non-rivalrous when it’s undiminished by consumption. We can all consume it without depleting it or becoming “rivals”.
Knowledge is non-rivalrous. Your knowledge of a fact or idea does not block mine, and mine does not block yours. Knowledge is also non-excludable. We can burn books, but not all knowledge is from books. We can raise the barriers to knowledge, through prices or punishments, but that only creates local exceptions for some people or some knowledge. When knowledge is available to people able to learn it, from books, nature, friends, teachers, or their own senses and experience, attempts to stop them from learning it are generally unavailing.


What Do Teens Want?

From the Publishers Weekly website

In an industry without a lot of good news to report, the one consistent bright spot has been publishing for teens. While adult trade sales are expected to fall 4% this year, juvenile and young adult sales are expected to increase 5.1%. Although it's impossible to completely break out juvenile from young adult (YA), it is possible to look at expected growth rates for different categories. In the fiction/fantasy/sci-fi segment, where most sales in the YA category fall, we expect nearly 13% growth in 2009, reaching $744 million. By 2013, sales in this segment are anticipated to hit $861 million, a 30.6% increase over 2008.
Sure, lots of the growth in the teen category can be attributed to some phenomenally successful, blood-sucking bestsellers. And there is no doubt that there is a great deal of crossover readership from adult buyers. Nevertheless, this buying bubble is being fuelled by a teen demographic about which we know very little.

Monday, November 23, 2009

DDI: How beta is a beta release? Quality versus Delivery

This week our DDI development programme has been brought to you by the letters P and M - of Project Management.

So the big drama this last week has been deadlines and prioritisation - the stuff that gives project managers job security. We really want to get the upgraded Timeframes preview out ASAP, but we've found there's more work to be done than there is time available - I'm sure we're the first people in the world to strike this problem ;-)

But is the problem that we're slow developers or that we're bad at estimating how long the work will take? From my observations, at the end of the day we're basically good old-fashioned, optimistic Kiwis. There's that software development saying of "Take your estimate [of the time it will take to do the work], then double it, then double it again, then you're starting to get close to knowing how long it will take". Well, we apply that rule, then say "nah, surely it can't take thaaaat long, it'll only be half that... she'll be right".
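For the curious, the arithmetic of that saying (and of our optimistic amendment to it) can be written out as a tongue-in-cheek Python sketch. The function names and the ten-day starting estimate are made up for illustration; they aren't taken from any real project plan.

```python
def rule_of_thumb(estimate_days: float) -> float:
    """Double the estimate, then double it again (four times the original)."""
    return estimate_days * 2 * 2


def optimistic_kiwi(estimate_days: float) -> float:
    """Apply the rule, then decide it will surely take only half that."""
    return rule_of_thumb(estimate_days) / 2


if __name__ == "__main__":
    original = 10  # hypothetical estimate, in days
    print(rule_of_thumb(original))    # 40
    print(optimistic_kiwi(original))  # 20.0
```

In other words, "she'll be right" quietly turns a four-fold safety margin into a two-fold one.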

Mind you, estimating accurately is doubly-hard when you're working on a system that you are still learning. We've found Primo is well optimised for university and public libraries, but as a national library we have additional needs - both in the content we have and how we present it. This means we've been sidetracked into developing a number of 'workarounds', though thankfully Primo has a plugin architecture so it hasn't been too mind-stretching.

We've now got something that mostly works, but isn't quite how we'd really like it to be (or how we know it could be). We're running up against our self-imposed deadline but some of the ducks aren't in the row yet. What to do?

It's at this point that all project/product managers hit that age-old dilemma: which would users prefer - almost reasonable today or really good next week? You only get one chance at first impressions. Users usually say they'd prefer 'anything' now, but then grumble when it wasn't what they hoped. I think I'd like to phone a friend on this one.

Given it's only a beta release, how much incompleteness and confusingness in the interface will users put up with? The Google (et al) Labs and 'perpetual beta' phenomena mean people are open to the idea of using a site that is a work-in-progress, but we haven't quite set up our environment that way yet so we're on the back foot. While delivering something early may reduce the development workload over the week, it increases the 'comms' workload - communicating what state it's in, known issues, tips for completing tasks, etc., and handling increased user queries (from users that didn't read all that carefully crafted comms stuff).

Not an easy decision... Mainly because there's no 'right' answer.

Separately we've been working on a service model for managing our digital services (remind me to cover that in a later post). While the customers' needs are paramount, we believe there are three perspectives to balance:

  • Business needs
  • Customer needs
  • Technology needs.

This was a good opportunity to give the model a whirl in a field-test. We brought together representatives from all three perspectives and reviewed and prioritised each of their needs. We gained a shared understanding of each other's needs and came to a mutual decision. Since this is only a preview (running in parallel with the production system) we decided there are really only two 'must-haves', and once those are ready we'll release it. We'll then move on to tidying up the remaining high priority needs before we cut over the old site - after all, the tidy-up won't take long based on our estimates (!!).

Surprisingly we received no feedback (positive, neutral, or negative) for the recent cutovers, so that bodes well for this preview, right??

Stay tuned.

Friday, November 20, 2009

The Source: news about digital libraries and library innovations from around the web

Introducing The Source


Report of the Task Force on (Harvard) University Libraries (Note: PDF)

From the Office of the Provost, Harvard University website

Harvard’s library system now includes 73 separate libraries with 1,200 full-time employees, 16.3 million volumes, 12.8 million digital files, over 100,000 serial titles, and millions of manuscripts, photographs, musical recordings, films, and artefacts of all kinds, making it by far the largest university library in the world.

Statement on the Report of the Task Force on University Libraries (Note: PDF)

The Core Recommendations of the Task Force are:
  • Establish and implement a shared administrative infrastructure
  • Rationalise and enhance information technology systems
  • Revamp the financial model for the Harvard libraries
  • Rationalise system for acquiring, accessing, and developing materials for a “single university” collection
  • Collaborate more ambitiously with peer libraries and other institutions

Making the case for European research libraries (Note: PDF)

From the Ligue des Bibliothèques Européennes de Recherche (LIBER) website

The Ligue des Bibliothèques Européennes de Recherche (LIBER) Strategic Plan 2009-2012 provides a framework for the LIBER Strategy in the coming years. In 2009-2012 LIBER will give priority to the following areas:
  • Scholarly communication
  • Digitisation and resource discovery
  • Heritage collections and preservation
  • Organisation and human resources
  • LIBER Services

Social isolation and new technology: how the internet and mobile phones impact Americans’ social networks (Note: PDF)

From the Pew Internet & American Life Project website

This survey is the first ever that examines the role of the internet and cell phones in the way that people interact with those in their core social network. Key findings challenge previous research and commonplace fears about the harmful social impact of new technology.

Wednesday, November 18, 2009

Papers Fast

I've just finished writing up a project we completed earlier this year: Papers Fast.

Some background

Papers Past was re-launched in 2007 with a new look and new features -- particularly search -- and quickly became the National Library’s most popular website. In the first year the number of visits per month increased 20-fold, and then it kept growing. But even when it was re-launched, Papers Past was not a fast website. And as time passed, and the number of users grew, and the number of pages increased, we noticed it was becoming slower and slower.

To start with we had an easy solution: when we noticed the site was slowing down, we added another web server to share the load. We started with three web servers. By the time we got to eight this approach had stopped working: adding new web servers did not make Papers Past any faster. Worse, we had built up a backlog of almost half a million pages of searchable text that we could not put online because we were worried the whole system would grind to a halt.

Drastic action was necessary.

So the Papers Fast project was launched. Its goal: to make Papers Past fast.

What’s the problem?

After talking to people who might know, we identified four factors that might be causing problems:

  1. Application. As far as we know, Papers Past is the biggest and most-used Greenstone installation in the world. Maybe Greenstone cannot scale up far enough?
  2. CPU. Papers Past was running on old Sun SPARC servers that were due for a refresh. Maybe new servers would do the trick?
  3. NFS. Most of the Papers Past data is served up using the Network File System (NFS) protocol. Is this a good choice for Greenstone?
  4. Network. The Papers Past data is stored on a different part of the network from the web servers, behind a firewall. Is this a problem?

Which was it?

To find out, we borrowed a massive computer with 24 terabytes of disk from GEN-i, copied over all our digitised newspaper data, and asked DL Consulting to install a fresh copy of Greenstone, setting up an entirely separate copy of Papers Past.

Then we built a fake collection with 2.5 million searchable pages, used JMeter and our Apache logs to put the test system under twice as much load as we've ever seen before, and watched to see what would happen.
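To give a flavour of what turning Apache logs into test load can look like, here is a minimal Python sketch. It is not the script used on the project: the function names, the sample log lines and the idea of simply repeating each request to double the load are assumptions made for illustration. It pulls the paths of successful GET requests out of access-log lines so they could be handed to a load-testing tool such as JMeter.

```python
import re

# Matches the request and status fields of an Apache common/combined log line,
# e.g. "GET /some/path HTTP/1.1" 200
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def extract_get_paths(log_lines):
    """Return the paths of successful (status 200) GET requests, in log order."""
    paths = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group("method") == "GET" and match.group("status") == "200":
            paths.append(match.group("path"))
    return paths


def scale_load(paths, factor=2):
    """Repeat each request `factor` times to approximate a heavier load."""
    return [path for path in paths for _ in range(factor)]


if __name__ == "__main__":
    # Hypothetical log lines; the paths are invented, not real Papers Past URLs.
    sample = [
        '192.0.2.1 - - [01/Mar/2009:10:00:00 +1200] "GET /search?q=wellington HTTP/1.1" 200 5120',
        '192.0.2.2 - - [01/Mar/2009:10:00:01 +1200] "POST /feedback HTTP/1.1" 200 310',
        '192.0.2.3 - - [01/Mar/2009:10:00:02 +1200] "GET /missing HTTP/1.1" 404 0',
    ]
    for path in scale_load(extract_get_paths(sample)):
        print(path)
```

Replaying real request paths, rather than made-up ones, keeps the mix of searches, page views and image requests close to what actual users generate.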

We found the problem was... all of the above.

So what to do?


The first fix was to upgrade Papers Past search to use Apache Solr instead of Apache Lucene. The second was to replace our eight aging webservers with two new Sun Blade Servers with AMD CPUs. Third, we switched to local disk for the metadata and indexes (we'll upgrade to a fibre-attached SAN by the end of the year).

Then we built a new fully-searchable collection (including three new titles) and re-launched on 22 June 2009, two days ahead of schedule!

And no technology project would be complete without a little scope creep. In this case, we had to support the METS/ALTO journal profile so we could add Kai Tiaki: the Journal of the Nurses of New Zealand to the collection, and extend the image server to support new titles digitised in greyscale. DL Consulting made these changes, and a few more, along the way.

Did it work?


Yes. We've been serving more traffic, and response times have been faster.

For Papers Past, we track traffic from Google separately from everyone else (it's a long story, but the core problem is that we serve so much data to Google that our aging web statistics package can't crunch the numbers).

So here's the number of hits we served to everyone other than Google for four weeks before and eight weeks after the launch.

And here's the number of pages we serve up to Google (via Google Webmaster Tools).

You can see that requests from everyone are way up -- especially Google, who have slurped up about 700,000 pages per day lately, peaking at over a million. Before the upgrade, we had a lot of trouble getting Papers Past fully indexed in Google News Archive, but now it is pretty much all there.

Despite this increased traffic, Papers Past response times are much improved. We have been monitoring response times since 2007, and set out very clear performance targets before we kicked off Papers Fast. Here are the performance targets, and the times we observed before and since the changes were made. (All times are in milliseconds.)

Performance measure | Target | Before changes | Since changes
Average response time for generated page request, measured by Google Webmaster Tools | 1000 | 3000-5000 | 600-800
Average response time for generated page request, measured by the Library | 1000 | > 3500 | 402
Average response time for search page request, measured by the Library | 1500 | 6639 | 1055
Average response time for image server request, measured by the Library | 6000 | 11158 | 3574
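
To make the comparison concrete, here is a small Python sketch that checks the Library-measured figures above against their targets and works out how much each response time dropped. The dictionary keys and function name are just labels for this example. Note that the "before" figure for generated pages was reported as "> 3500", so 3500 is treated as a lower bound.

```python
# (target, before, since) in milliseconds, copied from the Library-measured rows.
MEASURES = {
    "generated page": (1000, 3500, 402),  # "before" was "> 3500": a lower bound
    "search page": (1500, 6639, 1055),
    "image server": (6000, 11158, 3574),
}


def summarise(target, before, since):
    """Return (meets_target, percent_reduction) for one measure."""
    reduction = 100.0 * (before - since) / before
    return since <= target, round(reduction, 1)


for name, values in MEASURES.items():
    meets, reduction = summarise(*values)
    print(f"{name}: meets target={meets}, response time down {reduction}%")
```

On these numbers all three measures come in under target: search pages are about 84% faster, image requests about 68% faster, and generated pages at least 88% faster.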


Let's take a look at the changes graphically. Here's our internal tracking of response times.

Here's how the response times were tracked by Google.

It's quite a change.

Finally, it has made a big difference for our infrastructure. Here's how the NFS traffic to one of our fileservers changed when we moved the Papers Past metadata and search indexes away. It's also freed up corresponding network capacity.

Summary

On 22 June 2009 Papers Past users not only got half a million more searchable pages, they got a big speed bump. Traffic is up since then, but response times have remained low, and we have a plan to handle more data (the SAN) and more users (extra front-end servers).

Saturday, November 14, 2009

DDI: Three website upgrades

Following Courtney's challenge, I'm gonna take a crack at weekly updates on our current major website developments.

As the Digital Service Manager for Find, which is the poster child for a larger internal programme called DDI (Discover, Deliver, Interact), I'm supposed to hold it all together. We'll see if I can hold it together for a weekly update on progress...

Anyway, that's the end of my intro/disclaimer/apology if these posts peter out. Where are we at?

We've been migrating a lot of our metadata records to the new Primo software platform, and we released our first cut in July as the new Find search service. Our main priority has been migrating services off older software which has reached the end of its life.

Last Monday (the 9th) we cut over three of our websites to their (much faster) upgraded versions:

  • The GLAM organisations who are members of Matapihi have most of their content loaded into the growing giant DigitalNZ, so it made sense to move Matapihi's back end to the DigitalNZ engine. We also conveniently have all the Matapihi content loaded in Find.
  • findNZarticles contributed content is also in Find, so the back end has been migrated to the Primo platform; it continues to have its own website.
  • PublicationsNZ content (a.k.a. the National Bibliography) is also in Find, but it is effectively a subset of our National Library catalogue, so it no longer has its own website; instead there is a PublicationsNZ entry page on Find.
There's still some tidy-up work to do, but these seem to be running reasonably well at their new locations.

It took us quite a while to come to terms with Primo's internal 'PNX' record format and how metadata records are converted during import. It loves MARCXML and simple Dublin Core records, but it coughs loudly when you throw more complex XML (especially with namespaces) at it. We're finally starting to understand how to wrangle it. There's also a hugely complex maze of mapping/lookup tables; slowly we're piecing together the chains of lookup codes and documenting their inter-relationships so they're easier to maintain.
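
For anyone wondering why namespaces make life harder, here is a small Python sketch using the standard library's ElementTree parser. It is not Primo's import pipeline and says nothing about the PNX format itself; the sample record and function names are invented. It just shows that a namespaced element arrives with its full namespace URI attached to the tag name, and one (lossy) way to flatten a record down to simple field names before handing it on.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip ElementTree's '{namespace-uri}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def flatten(xml_text):
    """Collect the text of every leaf element into {local_name: [values]}."""
    fields = {}
    for element in ET.fromstring(xml_text).iter():
        text = (element.text or "").strip()
        if len(element) == 0 and text:
            fields.setdefault(local_name(element.tag), []).append(text)
    return fields


# Hypothetical record, for illustration only.
record = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Kai Tiaki</dc:title>
  <dc:type>Journal</dc:type>
  <dc:subject>Nursing</dc:subject>
  <dc:subject>New Zealand</dc:subject>
</record>"""

# ElementTree reports the title tag as '{http://purl.org/dc/elements/1.1/}title'.
print(flatten(record))
```

Dropping the namespace like this only works when no two namespaces in a record use the same element name; otherwise their values would be merged.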

Our eyes are now focussed on migrating the two remaining services - Timeframes and Discover. We are planning on releasing previews for these before the end of November. You can check the current timetable on our Online Services Changes page.

Friday, November 13, 2009

The Source: news about digital libraries and library innovations from around the web

Introducing The Source


Copycats? Digital consumers in the on-line age (Note: PDF)

From the Strategic Advisory Board for Intellectual Property Policy (SABIP) website


Huge economic losses are being sustained due to large-scale unauthorised downloading, generated by widespread confusion about copyright law in the online world. This UK report examines online consumer behaviour in the UK and its potential impact on business and government policy. It is the first piece of research to look at evidence from across the copyright industries and across all age ranges.
The report has two further objectives:
  • To inform a SABIP workshop at which a selected group of attendees with a direct interest in the issue will consider the implications of consumer behaviour on IP and make recommendations for further areas of SABIP research
  • To highlight any further SABIP research that is required to ensure that all agencies of Government have the fullest understanding of the issues
Key findings:
  • The world of the digital consumer is an environment, indeed a series of ‘eco-systems’, subject to rapid change; change that means many predictions about the future of the Internet and digital convergence (and how these are ‘consumed’) made even two, and certainly five and ten years ago seem quaintly dated – a fact that should be held in mind as predictions are made for the future of not just ‘Digital Britain’, but also the ‘Digital World’
  • Within ten years we have seen the widespread domestic use of high-speed broadband and multichannel (and often High Definition) digital television with the facility to time-shift, copy and view programmes on other devices, and to upload these files to websites such as YouTube; the arrival of wi-fi in the high-street, the library, the office, university and the home; the rapid expansion of open source and Creative Commons publishing; at least four iterations of file-sharing technologies; the birth of mainstream blogging as a broad social phenomenon; the arrival of social media as a significant medium of authorship, sharing, and communication; the shift by the younger digital consumer towards the mobile phone as not just an aural communication tool, but also a medium for text messaging, music and video consumption, and as a gateway to post messages, photographs and other types of content to social media websites
  • Most recently the large expansion in use of ‘microblogging’, to websites such as the text-based Twitter and the image-based Tumblr, has once again surprised many who suspected these services were a fad. Finally, the recent successful launch of the BBC’s authorised programme-streaming service, iPlayer, and the music streaming service, Spotify, has demonstrated that new forms of business models may be possible in the world of ‘free things’. Unsurprisingly, the literature review we undertook does not grasp the enormity and the speed of these changes. Each impacts centrally on intellectual property
  • The challenge for IP policy makers is to judge and, where possible, measure the changing social behaviours and attitudes brought about by the myriad rapidly evolving technologies and networks of the digital revolution, and map this against their economic, political and social objectives

'Authentic' learning experiences: What does this mean and where is the literacy learning? (Note: PDF)

From the aWAy with Words Conference website

Teachers are challenged to adopt practices that facilitate the development of “necessary” skills and strategies for learners. For many, however, what is required in policy and curricula is increasingly obscured and even confusing as teachers are bombarded with jargon prescribing seemingly similar (yet apparently different) approaches such as “rich tasks”, “big questions” and “fertile questions” that are to be "relevant”, “authentic” and “engaging” for the learner. Barton and Hamilton (2000) argue that literacy learning should take the learner beyond the transmission of technical skills in the classroom to an understanding of its role within a community’s cultural practices. These literacy practices are mediated by literacy events and it is engagement with these events and their diverse demands that allows learners to make strong connections to their own literacy practices.
Reported in this paper are the interpretations of four experienced primary school teachers as they plan, programme and facilitate authentic literacy experiences in their classrooms. These are examined within the framework of the principles of authentic learning, which is useful in gaining insight into the ways that experienced teachers make sense of the complex jargon associated with their profession for the development of deep and flexible knowledge that can be applied in a range of community settings. Evident in these teachers’ stories are the understandings, beliefs, contexts and competing tensions that underpin the conceptualisation, design and implementation of these experiences. The teachers’ stories reveal the complexity of teaching as they consider:
  • the individual contexts of their schools
  • their students’ own communities
  • the expectations of stakeholders in a child’s education
  • the availability of resources

Public libraries and the Internet 2008-2009: Issues, implications, and challenges

From the First Monday website

This paper presents an overview of methods, findings, issues, and implications from the 2008 ‘Public Libraries and the Internet’ national survey, including comparisons to data from previous studies. Since 1994, these surveys have chronicled the expansion of the Internet as a primary library service. The 2008 survey includes key data about the many facets of public libraries as community Internet access, training, and service centres, from the number of workstations and connection speeds available to the most common Internet services and training. The findings from the 2008 survey reveal impacts of the global recession on public libraries and their ability to meet the needs and expectations of patrons, communities, and all levels of government.

Wednesday, November 11, 2009

Engage Your Community - Social Media Workshop

On Friday 13 November I'm giving a workshop on social media at the Engage Your Community conference.

Workshop Format

As I'm not sure what the level of experience is across the people in the workshop, I've broken it into five sections. Each of these sections can be expanded or contracted, depending on the level of detail we need to go into. I'm hoping for loads of experience-sharing from the people in the workshop.

Introductions
How do we all use the web? How many of us are running personal social media accounts? How many are running accounts on behalf of their organisation? What happens when personal and professional use start to overlap?

This section is designed to get people talking, and to give me a chance to assess how familiar people are with social media tools. That will help me pitch the following sections at the right level.

Observations from Day 1
A few quick points from the presentations given by Colin Jackson, Nathalie Hofsteede and Chris Brown.

A tour of the social web
What's out there that people could be using?

- Listening in (RSS feeds, Google Alerts)
- Joining in (Twitter, Flickr, blogging)
- Community & collaboration (Facebook, wikis, Ning)
[All with examples from the not-for-profit sector]

The golden rules of social media
Things to ask yourself before embarking on any social media adventure (and certainly before picking a social media tool):

- Why do you want to do this?
- What are you offering?
- Who is this for?
- Who will be doing this?

And (numerous) steps for a successful launch.

Planning exercises
Depending on how much time we have, I've prepared an activity for people to break into small groups and plan a social media 'campaign' for a specific scenario.

Hopefully all this gives a bit more context for my slides.



Resources & examples

I've also prepared a rather lengthy handout which I'm now going to reproduce here for ease of use.

Introductions to different kinds of social media

It’s hard to beat the team at Common Craft (http://www.commoncraft.com), who make short, straightforward videos about all manner of web (and non-web) things.

These are all available on the Common Craft YouTube channel.

Listening in

Twitter search | http://search.twitter.com

Google Alerts | www.google.com/alerts

Google blog search:
  • Google your search terms
  • From options at top left of results page, choose Blogs from the ‘More’ drop-down menu
  • Scroll to the bottom of the search results

Useful reading
- Social media monitoring (State Services Commission)

Joining in

Blogger | http://www.blogger.com

Wordpress
http://wordpress.com (basic account)
http://wordpress.org (to do your own hosting)

Twitter | http://twitter.com

Flickr | http://www.flickr.com

Examples used:
- Whangarei SPCA blog
- Get in on! Twitter
- Rainbow Youth Flickr

Useful reading
- Twitter case study (National Library)
- Mashable’s Twitter Guidebook
- Twitter for non-profits (Mashable)
- Fundraising potential for Twitter (TechCrunch)
- Darren Rowse’s blogging lessons

Community & collaboration

Ning | http://www.ning.com

Wikis
http://www.wetpaint.com
http://pbworks.com
http://www.mediawiki.org

Facebook | http://www.facebook.com

Examples used
- Mt Cook Mobilised wiki
- Museums 3.0 Ning group
- Cancer Society’s Daffodil Day campaign

Useful reading:
- Case study on Daffodil Day campaign (Ideashop)
- Managing Facebook groups (Mashable)
- Wikis when and why (Nina Simon)

Community management

If you’re going to start spending time with your community online, you’re effectively becoming a community manager. This elderly post from Jeremy Owyang is still relevant if you’re trying to figure out if this is your new line of work.

Like any job, there are some personal qualities you’ll need to bring out in yourself, and some tactics you might find useful.

- A case study from the Brooklyn Museum
- A case study from (the early days of) Flickr
- My notes from Heather Champ and Derek Powazek’s 2009 ‘Designing and sustaining creative communities’ workshop

Planning

One of the most important things you need to ask yourself is – how much time do I (or my team of people) have available?
- How much time does Web 2 take (Nina Simon)

You’re likely to need some simple policies around how you/your team use social media sites in a professional capacity or on behalf of your organisation. I’m a big fan of the very simple guidelines from the State Services Commission, which were written for government but translate well to other settings.

The Guardian’s community standards are also helpful if you’re thinking about things like comment moderation

And this page aggregates links to social media policies

One piece of advice: these are your policies. Don’t try to second-guess everything that might go wrong & plan against it all, or you’ll become paralysed. Read some of the material above, write some useful & sensible guidelines (aimed at helping the people doing your social media outreach to understand what’s okay and what’s not so okay, both in terms of their own behaviour and that of others) and then update them as time goes by and circumstances change.

Generally useful, sometimes even inspiring, reading

Beth Kanter’s blog ‘How nonprofits can use social media’ (the title pretty much explains it)
- http://beth.typepad.com/beths_blog

Nina Simon’s Museum 2.0 blog (Nina is interested in people’s participation in museums & galleries, and frequently writes about social media projects)
- http://museumtwo.blogspot.com

The Community section on A List Apart (but don’t stop there, please, this site is full of delicious reading)
- http://www.alistapart.com/topics/content/community

The Pew Internet & American Life Project regularly issues reports on people’s online activities and behaviour
- http://pewinternet.org/Data-Tools.aspx


Friday, November 6, 2009

The Source: news about digital libraries and library innovations from around the web

Introducing The Source

© the way ahead: A Copyright Strategy for the Digital Age

From the Intellectual Property Office (IPO) website

The aim of copyright is to encourage authors’ creativity and make their works available widely. It is a global system that provides incentives for authors and investors, while allowing access to works for educators, researchers, cultural institutions and users of all sorts, both in business and in the home. Copyright engenders strong emotions. It is about authors’ livelihoods and recognition and about financial rewards for rights holders. But it is also about access to the copyright works, which are essential to our values, our cultures and to the way we spend our work and our leisure time.
This work looks ahead to how copyright can tackle the challenges of the digital age, drawing on previous work including Digital Britain and the Gowers Review of Intellectual Property, on international perspectives including the European Commission’s and on discussions and submissions from stakeholders.


Digitisation of special collections: Mapping, assessment, prioritisation (Note: PDF)

From the Joint Information Systems Committee (JISC) website

Traditionally, digitisation has been led by supply rather than demand. While end users are seen as a priority they are not directly consulted about which collections they would like to have made available digitally or why. This can be seen in a wide range of policy documents throughout the cultural heritage sector, where users are positioned as central but where their preferences are assumed rather than solicited. Post-digitisation consultation with end users is equally rare. How are we to know that digitisation is serving the needs of the Higher Education community and is sustainable in the long-term?
Key Findings:
  • The communities of both intermediary and end users are willing to express their view on prioritising digitisation of special collections; the participation in the project was a matter of good will and the good response makes evident that there is definitely interest of the professional communities to express their opinion on the matter of digitisation needs. It should be noted here that the community of intermediaries sees collections on a finer level of granularity; end users often refer to super-collections such as the holdings of an institution
  • The top user-driven priority criteria that emerged from consultation with both intermediaries and end users are: Improve access; Enhance impact on research and/or studies; Enhance impact on teaching; Allow for collaboration; Improve access outside
  • The geographic and institutional boundaries of collections nominated for digitisation are wider – this study was aimed at the higher education institutions in the UK, but 14% of the nominated collections were from institutions outside of the higher education sector, and 6% were from overseas
  • The complementarity of collections is strongly favoured by both users’ communities
  • The criteria for digitisation nominated by intermediary and end users include general criteria but also a number of criteria where metrics can be applied; thus allowing to establish a ranking mechanism

Integrated Library System Platforms on Open Source / Stephen Abram (Note: PDF)

From Stephen's Lighthouse (Stephen Abram) blog


Stephen Abram discusses what he (and SirsiDynix) see happening when libraries get into talks about moving their Integrated Library Systems to open source platforms. What has been found is that they are often not aware of what open source systems cannot offer at this point in time. To help buyers become aware of the limitations of open source, he has set out to clarify what open source is, how it is different from proprietary software platforms, and why Integrated Library Systems (ILS) are not ready for open source at this point.


Testing the accessibility of Web 2.0

From the University of Southampton, School of Electronics and Computer Science website

Dr Mike Wald and E.A. Draffan are leading a project funded by JISC (Joint Information Systems Committee) TechDis which looks at how well people with disabilities can access web services such as blogs, wikis and social networking sites. The team have built an accessibility toolkit, which will enable users to test the accessibility of Web 2.0 services. The accessible pen drive offers freely available assistive technologies that can be used to help with this evaluation.
Web2Access, part of the toolkit, provides an online checking system for any interactive web-based services such as Facebook. “We developed it because nowadays users contribute, as well as read, information and so you cannot just click on a button to see if websites are accessible and easy to use”, said E.A. Draffan.