Monday, January 25, 2010

Google Books - CPRR Documents


A whole bunch of original CPRR documents from Stanford University are now up on the web.



CPRR Discussion Group said...

The Google Books project, announced in December 2004, has been ongoing for several years, and the CPRR and UPRR books available online have been linked on the CPRR Museum's home page. When the books first became available online, the PDF files could not be searched, so we reprocessed many of the railroad books with optical character recognition (OCR) to add that search capability.

1/25/2010 7:40 PM  
Anonymous document storage said...

These documents are stored neatly in a content-driven DMS.

12/07/2011 5:01 PM  
CPRR Discussion Group said...

Sounds like expensive software that needs lots of fast hardware to meet peak loads.

What we did instead was simply add OCR text to PDFs that previously contained only images of book pages, which makes the books searchable. When someone accesses a book from our server, the file needs no processing, so there is zero overhead. Search is provided by Google, not by our server. As a result, we can serve books with inexpensive hardware, high load is not compute-limited, and hosting the books has zero ongoing additional cost.

12/07/2011 5:45 PM  
