java.lang.Object
  edu.northwestern.at.wordhoard.model.text.CharsetUtils
public class CharsetUtils
Character set utilities.
Method Summary | |
---|---|
static java.lang.String | getBadBetaSeq(): Gets the bad beta code sequence. |
static java.text.Collator | getCollator(byte charset, int strength): Gets a collator. |
static java.lang.String | translateBetaToUni(java.lang.String str): Translates a beta code string to Unicode. |
static java.lang.String | translateToInsensitive(java.lang.String str): Translates a string to a case-insensitive and diacritic-insensitive version. |
static java.lang.String | translateTonosToOxia(java.lang.String str): Translates tonos accents to oxia accents in a string. |
static java.lang.String | translateUniToBeta(java.lang.String str): Translates a Unicode string to beta code. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Method Detail |
---|
public static java.lang.String translateBetaToUni(java.lang.String str)
str - Beta code string.
public static java.lang.String getBadBetaSeq()
public static java.lang.String translateUniToBeta(java.lang.String str)
str - Unicode string.
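To illustrate what the two beta code translations do, here is a minimal sketch that handles only the 24 plain lower case letters. It is not the CharsetUtils implementation: the class and method names are hypothetical, lower case beta code input is an assumption, and diacritics, capitals, and final sigma are deliberately left out.

```java
public class BetaCodeSketch {
    // Beta code letters and the Greek letters they stand for, position by position.
    // Assumption: lower case beta code; final sigma is not handled here.
    private static final String BETA = "abgdezhqiklmncoprstufxyw";
    private static final String GREEK =
        "\u03B1\u03B2\u03B3\u03B4\u03B5\u03B6\u03B7\u03B8\u03B9\u03BA\u03BB\u03BC"
        + "\u03BD\u03BE\u03BF\u03C0\u03C1\u03C3\u03C4\u03C5\u03C6\u03C7\u03C8\u03C9";

    // Replaces each character found in 'from' with the character at the
    // same index in 'to'; all other characters pass through unchanged.
    private static String map(String str, String from, String to) {
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            int k = from.indexOf(c);
            sb.append(k >= 0 ? to.charAt(k) : c);
        }
        return sb.toString();
    }

    // Hypothetical stand-in for translateBetaToUni, plain letters only.
    static String betaToUni(String str) {
        return map(str, BETA, GREEK);
    }

    // Hypothetical stand-in for translateUniToBeta, plain letters only.
    static String uniToBeta(String str) {
        return map(str, GREEK, BETA);
    }

    public static void main(String[] args) {
        String greek = betaToUni("logos");
        System.out.println(greek);
        System.out.println(uniToBeta(greek));
    }
}
```

A full translator would also have to parse the beta code markers for breathings, accents, and capitals, which is presumably where a "bad beta code sequence" can arise.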
public static java.text.Collator getCollator(byte charset, int strength)
The character sets are:
The collation strengths are:
charset - Character set.
strength - Strength.
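The lists of character sets and collation strengths did not survive on this page, so the following sketch does not call getCollator. It only shows how a strength setting changes comparisons, using the standard java.text.Collator constants. The helper name and the use of a Locale in place of the charset byte are assumptions.

```java
import java.text.Collator;
import java.util.Locale;

public class CollatorSketch {
    // Hypothetical helper: a Locale stands in for the charset argument,
    // and strength is one of the java.text.Collator strength constants.
    static Collator collatorFor(Locale locale, int strength) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(strength);
        return collator;
    }

    public static void main(String[] args) {
        Collator loose = collatorFor(Locale.ENGLISH, Collator.PRIMARY);
        Collator strict = collatorFor(Locale.ENGLISH, Collator.TERTIARY);
        // PRIMARY strength ignores case differences; TERTIARY does not.
        System.out.println(loose.compare("abc", "ABC") == 0);
        System.out.println(strict.compare("abc", "ABC") == 0);
    }
}
```

Whether the strength values accepted by getCollator correspond to these constants is not confirmed by this page.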
public static java.lang.String translateToInsensitive(java.lang.String str)
All diacritical marks are removed and all letters are mapped to lower case.
str - String.
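One common way to get the behavior described above is to decompose the string, drop the combining marks, and lower-case the result. The sketch below does this with java.text.Normalizer. It is an assumption about how such a translation could work, not the CharsetUtils source, and the class and method names are hypothetical.

```java
import java.text.Normalizer;
import java.util.Locale;

public class InsensitiveSketch {
    // Hypothetical stand-in for translateToInsensitive.
    static String toInsensitive(String str) {
        // Canonical decomposition splits each accented letter into a
        // base letter followed by separate combining marks.
        String decomposed = Normalizer.normalize(str, Normalizer.Form.NFD);
        // Remove every character in the Unicode Mark category.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // Locale.ROOT keeps the case mapping independent of the default locale.
        return stripped.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Input is "El Nino" written with E-acute and n-tilde.
        System.out.println(toInsensitive("\u00C9l Ni\u00F1o")); // prints "el nino"
    }
}
```

The same steps apply to polytonic Greek, since accented and breathed vowels in the Greek Extended range also decompose to a base vowel plus combining marks.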
public static java.lang.String translateTonosToOxia(java.lang.String str)
We use oxia accents on lower case vowels in the Greek Extended Unicode range in the Early Greek Epic text and lemma spellings. The tonos accents in the Greek and Coptic range are nearly indistinguishable visually from oxia accents, and users may type them instead. For example, the Mac OS X Polytonic Greek input method produces tonos accents.
To prevent confusion, we convert tonos to oxia accents in all strings typed by users before we attempt to do searches for the strings in the WordHoard database.
str - String with tonos accents.
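The conversion described above can be sketched as a character-for-character replacement. The sketch covers the seven lower case vowels with tonos in the Greek and Coptic range and their oxia counterparts in the Greek Extended range. The class and method names are hypothetical, and the real method may cover more characters.

```java
public class TonosToOxiaSketch {
    // Lower case alpha, epsilon, eta, iota, omicron, upsilon, omega with tonos
    // (Greek and Coptic range), in that order.
    private static final String TONOS =
        "\u03AC\u03AD\u03AE\u03AF\u03CC\u03CD\u03CE";
    // The same seven vowels with oxia (Greek Extended range), same order.
    private static final String OXIA =
        "\u1F71\u1F73\u1F75\u1F77\u1F79\u1F7B\u1F7D";

    // Hypothetical stand-in for translateTonosToOxia.
    static String tonosToOxia(String str) {
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            int k = TONOS.indexOf(c);
            // Swap a tonos vowel for its oxia counterpart; keep anything else.
            sb.append(k >= 0 ? OXIA.charAt(k) : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // theta, epsilon, alpha with tonos
        String typed = "\u03B8\u03B5\u03AC";
        String searchable = tonosToOxia(typed);
        System.out.println(searchable.equals("\u03B8\u03B5\u1F71"));
    }
}
```

Note that Unicode normalization treats each oxia character as canonically equivalent to its tonos counterpart, so a sketch like this only works if the strings are not normalized afterwards.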