Google Books


Google made its reputation by indexing the content of the internet. They’ve done some truly remarkable things with search. Their PageRank system, although far from perfect, does provide some great results. Some time ago they announced a project, in conjunction with five of the largest university libraries, including the University of Michigan, to scan all the books in their collections and index the content. This project has been highly controversial. It has been attacked by publishers because some of the books to be scanned are still under copyright. It has also been attacked by librarians and others because the project is being undertaken by a private company (I know Google is a public company, but I use private here to distinguish it from a public entity like a government).

There is a lot of concern that a project like this will have a negative impact on physical libraries. I don’t think that is a really legitimate concern. There will always be a place for actual libraries with physical books. People like to read books. They like to sit on the beach, or on the deck, or under a tree, or in bed, and hold a book and read. There is also a social aspect to libraries that can never be completely replaced by the virtual world of the internet. Although the net can bring together communities of people who are widely dispersed geographically, people still need physical interaction with other people in the community and with librarians and teachers.

There is also concern about a private entity like Google controlling all this data. This is, I think, a more legitimate concern. If something were to happen to Google, what happens to all the scanned books? I like the idea of this project. Google has evidently developed some amazing scanning and character recognition technology, as can be seen in this image from 1984. They have developed a mechanism that allows them to scan the pages without damaging some of the very old and rare volumes. A digital version of the great library of Alexandria would be a great way of preserving human culture. Still, I don’t agree with the idea of a private entity controlling so much of human culture. Perhaps if Google were to put the raw data into the public domain, allowing anyone to access and index it, this concern could be addressed. Beyond the actual scanning technology, the one unique thing that Google adds is its indexing and searching capability. If the raw scans were available to everyone, then other companies could develop and apply their own search and display engines. There is a very interesting discussion of this whole topic on a recent Open Source that included Siva Vaidhyanathan and a rep from Google.
