Crunching Metadata/What Google Print Tells us About Future of Books

By David Weinberger

IN RECENT MONTHS, we've heard that Google is digitizing the libraries of several major universities and making the text searchable through its Google Print search engine, bringing cries of copyright infringement from publishers and author groups. Meanwhile, Microsoft says it will provide online access to 100,000 books in the British Library, and Amazon, which already sells digital versions of books, will soon sell individual chapters, too. But despite the present focus on who owns the digitized content of books, the more critical battle for readers will be over how we manage the information about that content, information that's known technically as metadata.

We've been managing book metadata basically the same way since Callimachus cataloged the 400,000 scrolls in the Alexandrian Library at the turn of the third century BC. Callimachus listed the library's contents on scrolls, medieval librarians used ledgers, and we use card catalogs, now mostly electronic. But until information started moving online, the basic strategy had been the same: Arrange the books one way on the shelves, physically separate the metadata from them, and arrange the metadata in convenient ways.

This technique works so well for organizing physical books that we've long overlooked its basic limitation: Because books and their metadata have, until recently, been physical objects, we've had to pick one, and only one, defined and stable way to order them. When Melvil Dewey introduced the Dewey decimal classification system in 1876, it was an advance because it shelved books by topic, making the library's floor plan into a browsable representation of the order of knowledge itself. But no one classification can represent everyone's way of organizing the world. You may file a field guide to the birds under natural history, while someone else files it under great examples of the illustrative art and I file it under good eating.

The digital world makes it possible for the first time to escape this limitation. Publishers, libraries, even readers can potentially create as many classification schemes as we want. But to do this, we'll need two things.


