edu.northwestern.at.wordhoard.tools.fixers
Class ShaTitle
java.lang.Object
edu.northwestern.at.wordhoard.tools.fixers.Fixer
edu.northwestern.at.wordhoard.tools.fixers.ShaTitle
public class ShaTitle
extends Fixer
Fixes Shakespeare titles ("head" elements).
In the XML, the title is supposed to be specified by a unique
"head" child element of the "div" element for the act or scene.
Sometimes, however, the "head" child is missing or there is more
than one "head" child. If the "head" child is missing, we use the
"type" and "n" attributes of the "div" element to reconstruct the
title and set it as the "head" attribute of the "div" element.
If more than one "head" child is present, we delete all but the
last one.
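The repair rule above can be sketched with the standard org.w3c.dom API. This is only an illustration, not the actual ShaTitle source: the class name HeadRepairSketch, its methods, and the exact form of the reconstructed title (capitalized "type", a space, then "n") are assumptions. Following the wording above, the sketch stores the reconstructed title in a "head" attribute.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper class; not part of WordHoard.
final class HeadRepairSketch {

    /** Collects the direct "head" child elements of a "div", in document order. */
    static List<Element> headChildren(Element div) {
        List<Element> heads = new ArrayList<>();
        for (Node child = div.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE && "head".equals(child.getNodeName())) {
                heads.add((Element) child);
            }
        }
        return heads;
    }

    /** Applies the missing-head and duplicate-head rules to one "div" element. */
    static void repairHead(Element div) {
        List<Element> heads = headChildren(div);
        if (heads.isEmpty()) {
            // Missing head: rebuild a title from "type" and "n".
            // The "Type n" format is an assumption made for this sketch.
            String type = div.getAttribute("type").trim();
            String n = div.getAttribute("n").trim();
            String label = type.isEmpty()
                ? ""
                : Character.toUpperCase(type.charAt(0)) + type.substring(1).toLowerCase();
            div.setAttribute("head", (label + " " + n).trim());
        } else {
            // More than one head: delete all but the last one.
            for (int i = 0; i < heads.size() - 1; i++) {
                div.removeChild(heads.get(i));
            }
        }
    }

    /** Creates an empty DOM document, for trying the sketch out. */
    static Document emptyDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Under these assumptions, a "div" with type="scene" and n="2" and no "head" child ends up with the attribute head="Scene 2", and a "div" with three "head" children keeps only the third.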
We normalize all act and scene titles to use Arabic numerals
instead of Roman numerals, to use mixed case instead of upper
case, and to omit periods at the end of the titles.
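As an illustration of this normalization, here is a minimal sketch. It is not the ShaTitle implementation: the class TitleNormalizerSketch and its methods are hypothetical, and it assumes simple titles such as "ACT IV." where any numeral is the last of several whitespace-separated words.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of WordHoard.
final class TitleNormalizerSketch {

    // Matches well-formed Roman numerals up to 3999 (also matches the empty string).
    private static final Pattern ROMAN =
        Pattern.compile("M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})");

    static boolean isRoman(String token) {
        return !token.isEmpty() && ROMAN.matcher(token).matches();
    }

    private static int romanValue(char c) {
        switch (c) {
            case 'I': return 1;
            case 'V': return 5;
            case 'X': return 10;
            case 'L': return 50;
            case 'C': return 100;
            case 'D': return 500;
            case 'M': return 1000;
            default: throw new IllegalArgumentException("Not a Roman digit: " + c);
        }
    }

    /** Converts a validated, upper-case Roman numeral to its integer value. */
    static int romanToInt(String roman) {
        int total = 0;
        for (int i = 0; i < roman.length(); i++) {
            int v = romanValue(roman.charAt(i));
            // A smaller digit before a larger one is subtracted (IV = 4, IX = 9).
            boolean subtract = i + 1 < roman.length() && v < romanValue(roman.charAt(i + 1));
            total += subtract ? -v : v;
        }
        return total;
    }

    /** Strips trailing periods, applies mixed case, and converts a final Roman numeral. */
    static String normalize(String title) {
        String trimmed = title.trim().replaceAll("[.\\s]+$", "");
        if (trimmed.isEmpty()) {
            return "";
        }
        String[] tokens = trimmed.split("\\s+");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            String upper = tokens[i].toUpperCase(Locale.ROOT);
            // Only the last of several words is treated as a numeral, so a
            // one-word title is never mistaken for a number.
            boolean lastOfSeveral = i > 0 && i == tokens.length - 1;
            String word = (lastOfSeveral && isRoman(upper))
                ? Integer.toString(romanToInt(upper))
                : upper.charAt(0) + tokens[i].substring(1).toLowerCase(Locale.ROOT);
            if (i > 0) {
                out.append(' ');
            }
            out.append(word);
        }
        return out.toString();
    }
}
```

With this sketch, normalize("ACT IV.") returns "Act 4" and normalize("SCENE II") returns "Scene 2". Titles that already use Arabic numerals, such as "Scene 12.", only lose the trailing period.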
For the poem "The Rape of Lucrece" we change the title of the
first part from "Introduction" to "Argument".
Method Summary
void fix(java.lang.String corpusTag, java.lang.String workTag, org.w3c.dom.Document document)
    Fixes an XML DOM tree.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ShaTitle
public ShaTitle()
fix
public void fix(java.lang.String corpusTag,
java.lang.String workTag,
org.w3c.dom.Document document)
throws java.lang.Exception
Fixes an XML DOM tree.
Specified by:
fix in class Fixer
Parameters:
corpusTag - Corpus tag.
workTag - Work tag.
document - XML DOM tree.
Throws:
java.lang.Exception