Burgum Digitization

The items to be digitized fall into two categories: the trial transcript and its corresponding exhibits. Both start with a similar digitization process, but the final steps differ.

The transcript exists in two analog forms: a bound copy and a set of loose-leaf transparencies held together in a folder by metal slats that pass through holes in the paper. Both are letter-sized and entirely black and white. Because the loose-leaf set divides the transcript into smaller portions, it is the easier of the two to scan. For the sample, the page was saved as a grayscale .TIFF file at 600 dpi. A lower-resolution .JPEG will be the version that loads on the website for reading, with an option to click on the item to view the .TIFF. Files will be named for the transcript page number (e.g. Burgum_62.TIFF).

Trial Transcript Page 61
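
The naming scheme described above can be sketched in a few lines of Python. This is only an illustration: the function name `page_filenames` is hypothetical, and giving the web copy the same stem with a .JPEG extension is an assumption, since only the .TIFF name appears in the example above.

```python
def page_filenames(page: int, prefix: str = "Burgum") -> dict[str, str]:
    """Return the master and web-display file names for one transcript page."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    stem = f"{prefix}_{page}"
    return {
        "master": f"{stem}.TIFF",   # 600 dpi grayscale scan
        "display": f"{stem}.JPEG",  # lower-resolution copy; this name is an assumption
    }


print(page_filenames(62))
# {'master': 'Burgum_62.TIFF', 'display': 'Burgum_62.JPEG'}
```

Generating both names from one page number keeps the web copy and the master scan paired, so the link from the .JPEG to the .TIFF can be built without a lookup table.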

The pages have also been run through OCR to see how well the transcript can be converted into a text-only document. At first the results were less than inspiring, and manual transcription looked to be the only option: because the transcript page was not completely flat and the text on the next page showed through, OmniPage SE had difficulty deciphering it.

Burgumbadocr.png

However, slight adjustments to brightness and contrast fixed most of the problems, and with more experience at the scanning stage (in terms of settings and placement on the scanner), OCR seems like a viable option for the text.

Burgum_61OCR.png
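
To illustrate why a brightness and contrast change helps with show-through, here is a minimal sketch of a linear adjustment on 8-bit grayscale values. The actual settings used in the scanning software are not recorded here, so the function `adjust`, its parameters, and the sample pixel values are all hypothetical.

```python
def adjust(pixels, brightness=0, contrast=1.0):
    """Apply a linear brightness/contrast change to 8-bit grayscale values."""
    out = []
    for p in pixels:
        # Stretch values away from mid-gray (128), then shift them lighter or darker.
        v = (p - 128) * contrast + 128 + brightness
        out.append(max(0, min(255, round(v))))  # clamp to the valid 0-255 range
    return out


# Hypothetical row: dark text (40), smudge (90), show-through (170), paper (215)
row = [40, 90, 170, 215]
print(adjust(row, brightness=20, contrast=1.5))
# [16, 91, 211, 255]
```

In this example the text becomes darker while the faint show-through is pushed toward white, which widens the gap the OCR software has to detect between ink and background.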

NOTE: OCR will not be applicable to a few pages. On these pages someone edited the text in pencil and included a typewritten note explaining that the corrections reflect the words actually spoken aloud at the trial. The pencil markings that fix these typos will not be picked up by OCR, so those pages should be transcribed by hand with the corrections applied. This makes them easier to read and does not change the content in any way.

The exhibits vary in content: books and pamphlets, newspaper articles, and correspondence. None is larger than letter size, and none is in color. The sample included here is a page of a pamphlet that presents many of the challenges these exhibits pose for digitization. Not only does it contain graphics, but the page is a carbon copy rather than an original. Like the transcript, it was scanned as a .TIFF at 600 dpi, and it will likewise be uploaded to the website as both a low-res .JPEG and the .TIFF. This page was also scanned using the "negative" setting so that it appears as the original would have, not as the carbon copy, which makes the exhibit much easier to read.

Dean's Exhibit 15
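
As a rough illustration of the "negative" setting, the sketch below inverts 8-bit grayscale values. It assumes the scanner setting simply reverses light and dark tones, which is not confirmed here. The function name `invert` and the sample values are hypothetical.

```python
def invert(pixels):
    """Return the photographic negative of 8-bit grayscale values."""
    return [255 - p for p in pixels]


# Hypothetical row from a light-on-dark copy: background (30), lettering (220)
row = [30, 30, 220, 30]
print(invert(row))
# [225, 225, 35, 225]
```

Applying the inversion twice returns the original values, so no tonal information is lost by storing the inverted scan.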

Although OCR will not always be applicable to the exhibits, the text on the left side of this page was recognized perfectly by OmniPage SE, which suggests that text in the exhibits can be handled in the same way as the trial transcript itself.

pollockdexhibit15ocr.png
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License