java.lang.Object edu.northwestern.at.wordhoard.tools.cm.LineGenerator
public class LineGenerator
Generates WordHoard tagged lines (paragraphs).
Constructor Summary | |
---|---|
LineGenerator(XMLWriter out, java.util.Map posToWordClassMap, Rules rules, java.lang.String fullWorkTag) | Creates a new line generator. |
Method Summary | | |
---|---|---|
void | appendPunctuation(java.lang.String str) | Appends punctuation. |
void | appendUntaggedWord(java.lang.String str, boolean isVerse) | Appends an untagged word. |
void | endElement(java.lang.String name) | Emits the end tag for an element. |
static int | getNumBadContractions() | Gets the number of bad contractions. |
static int | getNumWords() | Gets the number of words generated. |
void | incDivCount() | Increments the div count. |
void | lineBreak() | Generates a line break. |
void | normalizedText(java.lang.String str) | Generates normalized plain text. |
void | parBreak() | Generates a paragraph break. |
void | popStyle() | Pops the style stack. |
void | processC(org.w3c.dom.Element el) | Processes a MorphAdorner c element. |
void | processGap(org.w3c.dom.Element el) | Processes a gap element. |
void | processW(org.w3c.dom.Element el) | Processes a MorphAdorner w element. |
void | pushStyle(Style style) | Adds a style and pushes it onto the style stack. |
static void | resetDivCount() | Resets the div count to zero. |
void | startElement(java.lang.String name) | Emits the start tag for an element. |
void | untaggedLine(java.lang.String str) | Generates an untagged line. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public LineGenerator(XMLWriter out, java.util.Map posToWordClassMap, Rules rules, java.lang.String fullWorkTag)
Parameters:
out - WordHoard XML output file writer.
posToWordClassMap - Map from pos tags to word class tags.
rules - Rules.
fullWorkTag - Full work tag for line IDs.

Method Detail |
---|
public void pushStyle(Style style)
The specified style is added to the current style on the top of the style stack, and the result is pushed onto the style stack.
The indentation level and word styles are cumulative. For example, suppose the current top style is indented 10 pixels and is bold, and a style specifying an indentation of 5 pixels and italic is pushed. The new style is indented 15 pixels and is both bold and italic.
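The cumulative behaviour can be sketched as follows. This is a minimal illustration, not the WordHoard implementation: `SketchStyle` and `StyleStackSketch` are hypothetical stand-ins (the fields of the real `Style` class are not shown on this page), and keeping a base style at the bottom of the stack is an assumption.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for WordHoard's Style class (real fields unknown).
final class SketchStyle {
    final int indent;      // indentation in pixels
    final boolean bold;
    final boolean italic;

    SketchStyle(int indent, boolean bold, boolean italic) {
        this.indent = indent;
        this.bold = bold;
        this.italic = italic;
    }

    // Combine two styles: indentation adds up, word styles are OR-ed.
    SketchStyle plus(SketchStyle other) {
        return new SketchStyle(indent + other.indent,
                               bold || other.bold,
                               italic || other.italic);
    }
}

// Hypothetical style stack with cumulative push and plain pop.
final class StyleStackSketch {
    private final Deque<SketchStyle> stack = new ArrayDeque<>();

    StyleStackSketch() {
        // Assumption: the stack starts with an unindented, plain base style.
        stack.push(new SketchStyle(0, false, false));
    }

    // Add the given style to the current top and push the result.
    void pushStyle(SketchStyle style) {
        stack.push(stack.peek().plus(style));
    }

    // Pop the top style, but never remove the base style (an assumption).
    void popStyle() {
        if (stack.size() > 1) {
            stack.pop();
        }
    }

    SketchStyle current() {
        return stack.peek();
    }

    public static void main(String[] args) {
        StyleStackSketch styles = new StyleStackSketch();
        styles.pushStyle(new SketchStyle(10, true, false)); // 10 px, bold
        styles.pushStyle(new SketchStyle(5, false, true));  // 5 px, italic
        SketchStyle top = styles.current();
        // prints "15 true true"
        System.out.println(top.indent + " " + top.bold + " " + top.italic);
    }
}
```

Popping once after the two pushes in `main` returns to the 10-pixel bold style, since each stack entry stores the already-combined result.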
Parameters:
style - Style.

public void popStyle()
public static void resetDivCount()
public void incDivCount()
public void processW(org.w3c.dom.Element el)
MorphAdorner sometimes emits multiple "w" elements with the same id for a single word. This typically happens with words marked up with multiple styles. For WordHoard, we discard all but the first occurrence of words tagged with the same id.
All lemmas are mapped to lower case, to avoid having multiple WordHoard lemmas which are really the same, differing only in case.
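The two rules above (keep only the first word per id, lower-case every lemma) might look roughly like this. The sketch is an assumption-laden illustration: `WordFilterSketch` and its methods are hypothetical, it takes the id and lemma as plain strings rather than reading them from a DOM element, and the use of `Locale.ROOT` for lower-casing is a choice made here, not something this page specifies.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper illustrating duplicate-id filtering and lemma normalization.
final class WordFilterSketch {
    private final Set<String> seenIds = new HashSet<>();
    private final List<String> lemmas = new ArrayList<>();

    // Returns true if the word is kept, false if its id was already seen.
    boolean accept(String id, String lemma) {
        if (!seenIds.add(id)) {
            return false; // repeated id: discard all but the first occurrence
        }
        // Lower-case the lemma so "Love" and "love" become one lemma.
        lemmas.add(lemma.toLowerCase(Locale.ROOT));
        return true;
    }

    List<String> lemmas() {
        return lemmas;
    }

    public static void main(String[] args) {
        WordFilterSketch filter = new WordFilterSketch();
        filter.accept("w1", "Love");
        filter.accept("w1", "Love"); // same id again, ignored
        filter.accept("w2", "be");
        System.out.println(filter.lemmas()); // prints "[love, be]"
    }
}
```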
Parameters:
el - MorphAdorner w element.

public void processC(org.w3c.dom.Element el)
Space characters at the beginning of lines are discarded.
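One way to picture this rule is a line buffer that ignores spaces until something else has been appended. The class below is a hypothetical sketch, not WordHoard code; it treats only the ASCII space character as a "space character", which is an assumption.

```java
// Hypothetical line buffer that drops spaces at the start of a line.
final class LineBufferSketch {
    private final StringBuilder line = new StringBuilder();

    // Append text; if the line is still empty, skip its leading spaces.
    void append(String text) {
        if (line.length() == 0) {
            int i = 0;
            while (i < text.length() && text.charAt(i) == ' ') {
                i++;
            }
            text = text.substring(i);
        }
        line.append(text);
    }

    // Return the finished line and start a new, empty one.
    String lineBreak() {
        String result = line.toString();
        line.setLength(0);
        return result;
    }

    public static void main(String[] args) {
        LineBufferSketch buf = new LineBufferSketch();
        buf.append("  ");    // leading spaces: discarded
        buf.append("Hello");
        buf.append(" ");     // interior space: kept
        buf.append("world");
        System.out.println("[" + buf.lineBreak() + "]"); // prints "[Hello world]"
    }
}
```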
Parameters:
el - MorphAdorner c element.

public void processGap(org.w3c.dom.Element el)
Parameters:
el - Gap element.

public void startElement(java.lang.String name)
Parameters:
name - The element name.

public void endElement(java.lang.String name)
Parameters:
name - The element name.

public void appendUntaggedWord(java.lang.String str, boolean isVerse)
Parameters:
str - Word.
isVerse - True if the word is in verse.

public void appendPunctuation(java.lang.String str)
Space characters at the beginning of lines are discarded.
Parameters:
str - Punctuation.

public void lineBreak()
public void parBreak()
public void untaggedLine(java.lang.String str)
Parameters:
str - Text for line.

public void normalizedText(java.lang.String str)
Parameters:
str - Text to generate.

public static int getNumBadContractions()
public static int getNumWords()