American Textbooks: Example Scans

The first scan is in TIFF format, scanned at a medium 150 dpi. Some text from the opposite side of the page shows through, but the page is still easily readable.

The second scan is in TIFF format, scanned at a higher 300 dpi, and I ran the scanner's background clean-up on it. I don't think the higher dpi makes much of a difference in readability, as the original source material is already uniform printed text. The background clean-up actually made the text lighter and less visible, which would be a problem for performing OCR on a text that would otherwise be a good candidate for it.

I also uploaded a transcript of the first paragraph of this selection. This transcript could go behind the scan for searchability. I no longer think the transcript needs to be visible, as the scan itself is easily readable (though maybe the transcript could be available for download, as it is a much smaller file).

I think the chapters will be available as TIFFs at 150 dpi, with transcripts behind them that would be available to download. Additionally, the entire scan of the book will be accessible for browsing or download, but probably only at 75 dpi, as it is still readable at that resolution. The entire book would not be transcribed.
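
To get a rough feel for what the three resolutions mean in practice, here is a small sketch that works out pixel dimensions and uncompressed size per page. The 6 x 9 inch page size and the 8-bit grayscale depth are my assumptions for illustration only; I have not measured the book, and real TIFF files will differ depending on compression and colour mode.

```python
# Hypothetical page size in inches; the actual book has not been measured.
PAGE_WIDTH_IN = 6.0
PAGE_HEIGHT_IN = 9.0


def scan_dimensions(dpi, width_in=PAGE_WIDTH_IN, height_in=PAGE_HEIGHT_IN):
    """Return (width_px, height_px) for a page scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_bytes(dpi, bytes_per_pixel=1):
    """Estimate raw image size, assuming 8-bit grayscale (1 byte per pixel)."""
    width_px, height_px = scan_dimensions(dpi)
    return width_px * height_px * bytes_per_pixel


for dpi in (75, 150, 300):
    width_px, height_px = scan_dimensions(dpi)
    size_mb = uncompressed_bytes(dpi) / 1_000_000
    print(f"{dpi:>3} dpi: {width_px} x {height_px} px, about {size_mb:.2f} MB uncompressed")
```

Whatever the real page size turns out to be, each doubling of the dpi quadruples the pixel count, which is why offering the whole book at 75 dpi rather than 150 dpi should cut the download size considerably.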

I think the transcripts can mostly be produced via OCR, followed by quality-control checks to correct errors.
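
One way the quality control check might work is to compare the OCR output for a sample passage against a hand-checked transcript (such as the first paragraph I already typed up) and flag the scan for review when the two differ too much. This is only a sketch using Python's standard difflib module; the function names, the 0.98 threshold, and the sample sentences are all made up for illustration, and the OCR step itself is not shown.

```python
import difflib


def normalize(text):
    """Collapse runs of whitespace so line-wrapping differences are ignored."""
    return " ".join(text.split())


def ocr_similarity(ocr_text, reference_text):
    """Return a 0.0 to 1.0 similarity ratio between OCR output and a checked transcript."""
    matcher = difflib.SequenceMatcher(None, normalize(ocr_text), normalize(reference_text))
    return matcher.ratio()


def needs_review(ocr_text, reference_text, threshold=0.98):
    """Flag the passage for manual correction when similarity falls below the threshold.

    The default threshold is an arbitrary placeholder, not a tested value.
    """
    return ocr_similarity(ocr_text, reference_text) < threshold


# Invented sample text, not taken from the scanned book.
reference = "The first lesson teaches the letters of the alphabet."
ocr_output = "The first lesson teaches the 1etters of the alphabet."

print(f"similarity: {ocr_similarity(ocr_output, reference):.3f}")
print(f"needs review: {needs_review(ocr_output, reference)}")
```

A check like this would only cover the passages that have a hand-checked transcript to compare against, so it is a spot check on how well OCR handles a given scan setting rather than a replacement for proofreading.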

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License