Digital Archiving: Fun for everyone?

September 23rd, 2011

How did one institution attract 50,000-plus volunteers to help with an archiving project?

The National Library of Finland is digitizing its archives so that they are fully searchable on the Internet. Scanning the centuries-old newspapers, journals, and documents is less of a problem than accurately transcribing the text. OCR (Optical Character Recognition) software can only do so much. Standard fonts are easy enough for a computer to identify, but aging print in ornate scripts is more difficult. Obscure language, proper names, and decaying paper also interfere with OCR’s text recognition. For the materials to be accurately digitized, every document must be double-checked by human eyes.

To help with the process, the National Library of Finland teamed up with Finnish technology company Microtask to come up with an innovative solution: make a game of it. Granted, it’s hard to imagine how checking manuscript text against a computer’s digital interpretation could really be fun. But Microtask saw things differently: instead of assigning pages of repetitive work, they broke the job down into individual word-checks, which they (appropriately) call microtasks.

Two online Flash games were built around these microtasks, each one a tiny action for the player to perform.

Mole Bridge is a fast-paced typing challenge. An image of a digitized word from the original manuscript appears at the top of the screen. The player must type the word, as best he or she can read it, as quickly as possible. Each typed word helps build a “bridge” that ensures the safety of some adorable, but evidently suicidal, moles who tirelessly try to cross the void. The quicker the words are entered, the more moles make it to safety. Words that are indecipherable or too difficult to type (American keyboards lack the å, ö, and ä characters common in Finnish) can be skipped without penalty.

In Mole Hunt, cute critters hold up signs with individual manuscript words and the computer’s interpretation of each word below. Players identify whether or not the computer’s recognition of the word is accurate.
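To make the two kinds of microtask concrete, here is a minimal sketch in Python of how one scanned word might be turned into a task for each game. This is not Microtask’s actual code: the names `WordSnippet`, `mole_bridge_task`, and `mole_hunt_task`, and the dictionary layout, are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordSnippet:
    """One scanned word plus the OCR software's reading of it."""
    snippet_id: str  # hypothetical identifier for the cropped word image
    ocr_guess: str   # what the OCR software thinks the word says


def mole_bridge_task(snippet: WordSnippet) -> dict:
    """Transcription microtask: the player sees only the image and types the word."""
    return {"kind": "transcribe", "snippet_id": snippet.snippet_id}


def mole_hunt_task(snippet: WordSnippet) -> dict:
    """Verification microtask: the player judges whether the OCR reading is right."""
    return {
        "kind": "verify",
        "snippet_id": snippet.snippet_id,
        "ocr_guess": snippet.ocr_guess,
    }
```

The point of the sketch is only that each task covers a single word and can be answered in a second or two, which is what makes it possible to hide the work inside a game.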

In terms of fun, the games are more on par with Mavis Beacon than Super Mario. But like many other online distractions, there’s something strangely addictive about them. Knowing that the games actually contribute to a worthwhile cause, preserving the cultural heritage of Finland, is evidently enough to keep people playing. Since Mole Bridge and Mole Hunt were launched in February, they’ve logged over 50,000 players, hailing mostly from Finland, the UK, and the US. At last count, players had contributed over 3,500 hours of work and completed 4.5 million microtasks.

The real brilliance of these games is that not only do they get the work done, they get it done right. With thousands of people playing, gamers effectively cross-check one another’s work. The software developers at Microtask have even created a defense against the few people who might have malicious intent (apparently, even Finnish libraries occasionally encounter mean-spirited Internet trolls).

When a game starts, the computer shows words that have already been accurately identified. From the answers to these known words, it can determine whether someone is making deliberate mistakes or is simply too poor a typist to benefit the project, and it then disregards every further answer that player submits. Microtask reported that volunteers’ transcriptions were 99% accurate.
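The two safeguards described here, a warm-up round of already-verified words and cross-checking between players, can be sketched roughly as follows. This is an illustration only: the function names, the 80% accuracy threshold, and the two-vote agreement rule are assumptions, since Microtask’s actual thresholds are not given here.

```python
from collections import Counter
from typing import Optional


def is_trusted(warmup_answers: list[str], known_words: list[str],
               min_accuracy: float = 0.8) -> bool:
    """Decide whether to keep a player's later answers.

    warmup_answers are what the player typed for words whose correct
    reading is already known (known_words, in the same order).
    The 0.8 threshold is an assumed value, not Microtask's.
    """
    if not known_words:
        return False
    correct = sum(
        typed.strip() == known
        for typed, known in zip(warmup_answers, known_words)
    )
    return correct / len(known_words) >= min_accuracy


def consensus(answers: list[str], min_votes: int = 2) -> Optional[str]:
    """Accept a word only when enough trusted players typed the same thing.

    Returns the most common answer, or None if agreement is too weak.
    The two-vote rule is an assumed value.
    """
    if not answers:
        return None
    word, votes = Counter(answers).most_common(1)[0]
    return word if votes >= min_votes else None
```

For example, a player who gets only one of three known words right would be dropped, while a word typed as “Suomi” by two trusted players and “Snomi” by a third would be accepted as “Suomi”.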

While the games delegate hours of work to the hands of others, they are in no way a substitute for trained professionals. No matter how many microtasks are achieved, the games only cover a fragment of the digitization process. Archivists, librarians, and other specialists are still necessary to develop and implement every step of digitization projects. Accurately digitizing a nation’s archives is an enormous task. For starters, it would require:

  • Different classification schemes for different types of materials
  • Categorization by subject matter
  • Indexing every document
  • Entering metadata on every file
  • Secure storage space with prevention against digital decay
  • A user-friendly interface that allows for effective browsing and searching

And that’s in addition to a plethora of other issues. How well does centuries-old newsprint fare in a scanner? What do you do with materials still under copyright? How do you anticipate the needs of both serious researchers and more casual users, such as genealogists?

Implementing gaming systems is not a quick solution to digitization projects, but many institutions could benefit from the National Library of Finland’s model. Introducing similar programs would not only ease the workload behind large-scale digitization projects, it would (perhaps most importantly) establish a connection between the public and their libraries. It would raise awareness of libraries’ missions and likely lead more people to support such institutions. Preserving cultural heritage should not just be the work of archivists, and now, it doesn’t have to be.

Play Mole Bridge and Mole Hunt at http://www.digitalkoot.fi/en

Further reading:
Gamification time: What if everything were just a game?
The secrets of Digitalkoot: Lessons learned crowdsourcing data entry to 50,000 people (for free)
Digitalkoot e-programme breaks 25,000 participant mark


Please note that the images are used with permission from Microtask. Several attempts were made to receive written permission from the National Library of Finland, but there was no response. We assume that in the interest of education, this is fair use.

Category: Developing A Digital Collection · New Tools