Wars of the Roses - Tagging

In this project, both images of manuscripts and their transcriptions (and translations, where applicable) will be presented to the user. Tagging the images and documents will be essential to making the Wars of the Roses material usable by its audience. To make the documents searchable, metadata must capture each document's date, repository location, and content information.

TEI will be used to tag images with title, date, creator, and repository information. Index terms will also be attached to both images and transcriptions to link them, so that an image and its transcription appear together when a relevant search is performed. Library of Congress Subject Headings (LCSH) will supply these index terms: the British Library has used LCSH in the British National Bibliography since 1995, and LCSH will therefore be used in this project. A list of project-specific headings will also be created and attached to the images and documents to provide a deeper level of organization and searchability.
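As a rough illustration of the header tagging described above, the sketch below reads the title, creator, date, repository, and index terms from a simplified TEI header using Python's standard library. The element layout is one plausible arrangement under TEI P5 and has not been validated against the schema. Every value (title, creator, repository, shelfmark, date) is a placeholder, and the `#lcsh` and `#project` scheme pointers and the `read_metadata` function are hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET

# TEI P5 elements live in this namespace.
NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Simplified header. All values are placeholders, not real catalogue data.
SAMPLE_HEADER = """\
<teiHeader xmlns="http://www.tei-c.org/ns/1.0">
  <fileDesc>
    <titleStmt>
      <title>Sample letter (placeholder title)</title>
      <author>Placeholder Creator</author>
    </titleStmt>
    <publicationStmt><p>Placeholder rights statement</p></publicationStmt>
    <sourceDesc>
      <msDesc>
        <msIdentifier>
          <repository>Placeholder Repository</repository>
          <idno>MS-0000</idno>
        </msIdentifier>
      </msDesc>
    </sourceDesc>
  </fileDesc>
  <profileDesc>
    <creation><date when="1461-03-29">29 March 1461</date></creation>
    <textClass>
      <keywords scheme="#lcsh">
        <term>Great Britain--History--Wars of the Roses, 1455-1485</term>
      </keywords>
      <keywords scheme="#project">
        <term>Battles</term>
      </keywords>
    </textClass>
  </profileDesc>
</teiHeader>
"""


def _terms(root, scheme):
    """Return the index terms recorded under one keyword scheme."""
    path = f".//tei:keywords[@scheme='{scheme}']/tei:term"
    return [term.text for term in root.findall(path, NS)]


def read_metadata(header_xml):
    """Collect the searchable header fields into a plain dictionary."""
    root = ET.fromstring(header_xml)
    date = root.find(".//tei:creation/tei:date", NS)
    return {
        "title": root.findtext(".//tei:titleStmt/tei:title", namespaces=NS),
        "creator": root.findtext(".//tei:titleStmt/tei:author", namespaces=NS),
        "repository": root.findtext(".//tei:msIdentifier/tei:repository", namespaces=NS),
        # The standardized date sits in the attribute, not the display text.
        "date": date.get("when") if date is not None else None,
        "lcsh": _terms(root, "#lcsh"),
        "project_terms": _terms(root, "#project"),
    }


metadata = read_metadata(SAMPLE_HEADER)
print(metadata)
```

If the same index terms were recorded in the header of an image record and of its transcription, a search on one term could return both; that pairing is not shown here.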

TEI P5, the newest version of the TEI Guidelines, provides new encoding features, including support for manuscript descriptions and for representing data about people and places. The <choice> element, which pairs alternatives such as <sic> with <corr> or <orig> with <reg>, also provides “methods of encoding textual alternatives”. This version also makes linking to other documents easier. The aim is to enable searches across a variety of categories, such as names, places, battles, and topics or themes. The tagging will allow documents to be grouped by content, name, or chronology, as the user needs.
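The category searches described above could work roughly as follows: a minimal sketch that indexes tagged names, places, and battles across documents so they can be grouped by a shared reference. The document IDs, the `#person-a` style pointers, the sample sentences, and the use of `<name type="battle">` for battles are all assumptions made for illustration, not decisions the project has taken.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented placeholder passages, not quotations from any manuscript.
DOCS = {
    "doc-001": (
        '<p xmlns="http://www.tei-c.org/ns/1.0">'
        '<persName ref="#person-a">Person A</persName> rode to '
        '<placeName ref="#place-x">Place X</placeName>.</p>'
    ),
    "doc-002": (
        '<p xmlns="http://www.tei-c.org/ns/1.0">'
        '<persName ref="#person-a">Persone A</persName> fought at '
        '<name type="battle" ref="#battle-1">Battle One</name>.</p>'
    ),
}


def category_of(element):
    """Map a TEI naming element to a search category, or None."""
    local = element.tag.replace(TEI, "")
    if local == "persName":
        return "name"
    if local == "placeName":
        return "place"
    if local == "name":
        return element.get("type")  # e.g. "battle"
    return None


def build_index(docs):
    """Index documents by (category, regularized reference)."""
    index = defaultdict(set)
    for doc_id, xml_text in docs.items():
        for element in ET.fromstring(xml_text).iter():
            category = category_of(element)
            ref = element.get("ref")
            if category and ref:
                index[(category, ref)].add(doc_id)
    return index


def find(index, category, ref):
    """Return the sorted IDs of documents tagged with this reference."""
    return sorted(index.get((category, ref), ()))


index = build_index(DOCS)
print(find(index, "name", "#person-a"))
```

Because the index is keyed on the `ref` pointer instead of the spelling in the manuscript, the two different spellings of the same person in the sample still group together.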

Transcriptions will be difficult. During this period, documents were written in what is known as “secretary script”. Secretary script is characterized by cursive-like writing and can be difficult to read. Additionally, while English (specifically Middle English) was the language of law and government during the period of the Wars, the use of English as such was a recent development dating from the late fourteenth century. Latin documents will also be digitized, requiring transcription and translation. A further problem arises from the lack of a spelling standard. In order to provide a more authentic experience, documents will be transcribed literally. This also saves time for the project and removes an extra step of further translation into modern English. Documents are not unreadable in Middle English, but do require more attention from the reader. For further help with search functions, however, regularized modern English spellings will be tagged within the document text. If a program can be found to perform lemma searches of the text, it will be purchased and used in the project.
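One way to keep the literal transcription while still supporting modern-spelling searches is to pair each original form with a regularized one. The sketch below assumes the TEI `<choice>`, `<orig>`, and `<reg>` elements are used for this; the sample phrase is invented for illustration, and `render` and `matches` are hypothetical helper names. It does exact word matching only and is not a lemma search.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented phrase for illustration, not a quotation from a manuscript.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">the '
    '<choice><orig>kyng</orig><reg>king</reg></choice> hath '
    '<choice><orig>wryten</orig><reg>written</reg></choice> a '
    '<choice><orig>lettre</orig><reg>letter</reg></choice></p>'
)


def render(element, view="orig"):
    """Flatten an element to text, keeping one side of each <choice>.

    view is "orig" for the literal transcription or "reg" for the
    regularized modern spelling.
    """
    if element.tag == TEI + "choice":
        chosen = element.find(TEI + view)
        return render(chosen, view) if chosen is not None else ""
    parts = [element.text or ""]
    for child in element:
        parts.append(render(child, view))
        parts.append(child.tail or "")
    return "".join(parts)


def matches(xml_text, query):
    """True if the query word occurs in the regularized text."""
    words = render(ET.fromstring(xml_text), "reg").lower().split()
    return query.lower() in words


root = ET.fromstring(SAMPLE)
print(render(root, "orig"))
print(render(root, "reg"))
print(matches(SAMPLE, "letter"))
```

The reader would still see the original spellings on the page, while the search runs against the regularized text.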

Manuscript images and their transcriptions/translations will appear together on the website, with the images shown at reduced size. The user will then be able to enlarge an image and use a zoom function to scrutinize the document.

[INSERT SAMPLE PAGE]

[INCLUDE SAMPLE MOCK-UP]

Manuscripts:

Document Title: One I have created, or as it appears on the document, if applicable
Document Type: Specifies personal letter or government/legal document
Heading
Date: A standardized date will be used in the tag; if no date is present, an approximate date will be given in []
Creator: As written on MS; will be regularized in tag
Recipient (if applicable): As written on MS; will be regularized in tag
Repository: Location of MS
Copyright: Owner of these rights
Keywords: Major topics found in the text; where a keyword overlaps with References in the document, no duplicate tag will be added
Body
Text: Will reflect the original structure of the MS, with or without paragraph format, as applicable
References: Names, locations, and subjects directly mentioned in the text
Names: As written in MS; will be regularized in tag
Locations: As written in MS; will be regularized in tag
Subjects: As written in MS; will be regularized in tag
Gap (unreadable or missing piece of MS): Will be indicated; a guess may be made by the transcriber (an expert) and will be denoted by []
Seal or Stamp (if relevant): Will be described, including its appearance and its owner
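The gap convention in the list above could be displayed along these lines. This sketch assumes that unreadable passages are tagged with the TEI `<gap>` element and that the transcriber's guesses are tagged with `<supplied>`; rendering an unfilled gap as `[...]` is an assumption of this example, since the list only specifies brackets for guesses. The sample sentence is invented, and `to_reading_text` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented sentence for illustration, not a quotation from a manuscript.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">written at '
    '<supplied reason="illegible">London</supplied> the '
    '<gap reason="damage"/> day of May</p>'
)


def to_reading_text(element):
    """Flatten a transcription, bracketing guesses and marking gaps."""
    if element.tag == TEI + "gap":
        return "[...]"  # assumed display form for an unfilled gap
    parts = [element.text or ""]
    for child in element:
        parts.append(to_reading_text(child))
        parts.append(child.tail or "")
    text = "".join(parts)
    if element.tag == TEI + "supplied":
        return "[" + text + "]"  # transcriber's guess, denoted by []
    return text


print(to_reading_text(ET.fromstring(SAMPLE)))
```

Keeping the guess in markup, with the brackets added only at display time, leaves the supplied word itself available to searches.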