Bertillon Tagging

XML tagging is essential to this project: the information embedded in these sources becomes useful only if it is searchable. The most difficult aspect of tagging is that names are spelled in multiple ways, and the spelling can differ among the contents of a single envelope. Establishing the correct spelling of each name would be difficult and inefficient, so the spelling on the Bertillon card will be considered correct. If an envelope contains only ancillary material, the spelling that appears most frequently will be considered correct. Within the XML tagging, every spelling found in the envelope, along with close variants, will be tagged. This casts a wider search net in case all of the supplied spellings are incorrect. The problem is further compounded by the use of aliases. Although the alias is listed separately on the card, it will be tagged as a name. This helps both people who know only an alias and people who do not know whether the term they are searching for is a proper name.
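The spelling rule and the wider search net described above can be sketched in Python. This is only an illustration under stated assumptions: the field names (card_spelling, ancillary, aliases), the sample names, and the use of difflib for "close" spellings with a 0.8 similarity cutoff are hypothetical choices, not part of the project plan.

```python
from collections import Counter
from difflib import get_close_matches


def canonical_spelling(card_spelling, ancillary_spellings):
    """Pick the spelling treated as correct for one envelope."""
    if card_spelling:
        # The spelling on the Bertillon card always wins.
        return card_spelling
    # No card: fall back to the most frequent spelling in the ancillary material.
    return Counter(ancillary_spellings).most_common(1)[0][0]


def build_name_index(envelopes):
    """Map every lowercased spelling or alias to the canonical names it may refer to."""
    index = {}
    for env in envelopes:
        ancillary = env.get("ancillary", [])
        canonical = canonical_spelling(env.get("card_spelling"), ancillary)
        # Aliases are indexed exactly like names.
        variants = {canonical, *ancillary, *env.get("aliases", [])}
        for variant in variants:
            index.setdefault(variant.lower(), set()).add(canonical)
    return index


def search(index, query, cutoff=0.8):
    """Return canonical names whose tagged spellings are close to the query."""
    hits = get_close_matches(query.lower(), list(index), n=5, cutoff=cutoff)
    return sorted({name for hit in hits for name in index[hit]})


# Invented sample data, for illustration only.
envelopes = [
    {"card_spelling": "Giuseppe Morelli",
     "ancillary": ["Joseph Morelli", "Guiseppe Morelli"],
     "aliases": ["Joe the Barber"]},
    {"card_spelling": None,
     "ancillary": ["Jon Kowalsky", "John Kowalski", "John Kowalski"],
     "aliases": []},
]
index = build_name_index(envelopes)
```

In this sketch a search for an alias or for a misspelling resolves to the same canonical name as the card spelling, which is the behavior the tagging is meant to support. The similarity cutoff would need tuning against real envelope data.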

For the Bertillon cards, images of both the front and back of each card will be provided, plus a transcription of the criminal information. The photographs will not be searchable, as they are only full-face and profile shots of the criminal. The transcription of the back will not reproduce the card's layout: replicating the table would require more work and add no value.

The goal of tagging this information is to allow statistical manipulation of the Bertillon cards.

For newspaper articles found in an envelope, a single image of each article will be included, along with a transcription. This transcription will be encoded using the TEI P5 schema for newspaper articles. Proper names will be hyperlinked to any corresponding Bertillon card and to other digitized envelope material. The goal of these articles is to provide context and to create a more direct connection between the researcher and the complexity of the envelope from which the information came.

Correspondence will be treated like the articles, with a thumbnail image provided along with the transcription. Proper names will link to cards and articles. Abbreviated institution and officer names will be expanded and linked to a description. The contents of any single envelope will be linked linearly so that the envelope can be viewed as a whole.

Below is a table categorizing the most important information.

Bertillon Card Collection

Collection Name: To be determined
Location: New York City Police Museum
Institution Location: 100 Old Slip, New York, New York 10005
Copyright: No issues to date

Individual Card: Front and back of card
  Crime: Crime for which person was arrested
  Name: Including any associated spellings and aliases
  Supporting Material: Any material included in the envelope
  Date: Arrest date
  Birthplace: Home country
  Neighborhood: Neighborhood where criminal was arrested

Article: Article title
  Source: Article metadata
    Publication: Newspaper title
    Date: Of publication
    Title: Supplied
    Copyright: To be determined
  Content: Article content
    Name: Criminal name
    Crime: For which person was arrested
    Neighborhood: Any addresses provided

Correspondence: Correspondence title
  Source: Correspondence metadata
    Author: Of correspondence
    Date: Of composition
    Title: Created document title
    Recipient: If any
    Copyright: To be determined
  Content: Correspondence content
    Name: Including any associated spellings and aliases
    Crime: For which person was arrested
    Place: Names supplied
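
As a rough illustration of how an article transcription might carry a linked proper name, the following Python sketch builds a small TEI-namespaced fragment with the standard library. The element choices (a div of type "article", a persName with a ref attribute), the card identifier "card-0001", and the sample text are assumptions made for this example; the fragment is not a complete or schema-validated TEI P5 document.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace


def tei(tag):
    """Qualify a tag name with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def encode_article(title, before, name, after, card_id):
    """Build an article div whose single paragraph contains one linked name."""
    div = ET.Element(tei("div"), {"type": "article"})
    head = ET.SubElement(div, tei("head"))
    head.text = title
    para = ET.SubElement(div, tei("p"))
    para.text = before  # text preceding the name
    # Hypothetical convention: ref points at the identifier of the matching card.
    pers = ET.SubElement(para, tei("persName"), {"ref": f"#{card_id}"})
    pers.text = name
    pers.tail = after  # text following the name
    return div


# Invented sample content, for illustration only.
article = encode_article(
    title="Arrest on Mulberry Street",
    before="Police yesterday arrested ",
    name="Joseph Morelli",
    after=" on a charge of burglary.",
    card_id="card-0001",
)
xml_text = ET.tostring(article, encoding="unicode")
print(xml_text)
```

A site built on such markup could resolve each ref value to the page for the corresponding Bertillon card, which is one way the name links described earlier might be realized.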

A separate person will be hired specifically for digitization. A trained professional will rekey all of the transcriptions in-house. The digitization is expected to take anywhere from twelve to eighteen months. Images and transcriptions will be published on the site incrementally so that corrections can be made as needed. As they are uploaded, interns will proofread all transcriptions and check links for accuracy.
